Bose321

Crashed volume after 6.2.3 update

Question

I've updated to 6.2.3 and now one of my 4 volumes has crashed. I'm running on VMware and I applied the synoboot fix from here:

The SATA drives that wrongly appeared are gone, so that's good, but my volume1 is still crashed. The pool and disk both still show as healthy. All volumes and pools are on separate disks and are Basic, no RAID or anything; I believe SHR. IIRC it was btrfs.

 

I can still cd to /volume1 via SSH, but I only see a `@database` folder, seemingly with mariadb10 inside. So can this be fixed somehow? Or is everything gone?

 

Thanks in advance.


10 answers to this question

Posted (edited)

Your volume is probably not mounted: the mount point on the underlying root filesystem still exists, and a process (MariaDB, by the look of it) may have tried to reinitialize its files there.

 

Please run the following commands from ssh root:

# synodisk --enum
# synodisk --detectfs /volume1
# cat /etc/fstab
# cat /proc/mdstat

Post the results

Edited by flyride


Thanks, here are the outputs:

 

synodisk --enum

************ Disk Info ***************
>> Disk id: 2
>> Slot id: -1
>> Disk path: /dev/sdb
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 2794.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 3
>> Slot id: -1
>> Disk path: /dev/sdc
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 2794.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 4
>> Slot id: -1
>> Disk path: /dev/sdd
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 50.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 5
>> Slot id: -1
>> Disk path: /dev/sde
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 3500.00 GB
>> Tempeture: -1 C

synodisk --detectfs /volume1

Partition [/volume1] unknown

cat /etc/fstab

none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/md2 /volume1 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md4 /volume3 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md3 /volume2 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md5 /volume4 btrfs auto_reclaim_space,synoacl,relatime 0 0

cat /proc/mdstat

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md2 : active raid1 sdb3[0]
      2924899328 blocks super 1.2 [1/1] [U]

md3 : active raid1 sdc3[0]
      2924899328 blocks super 1.2 [1/1] [U]

md4 : active raid1 sdd3[0]
      47606784 blocks super 1.2 [1/1] [U]

md5 : active raid1 sde3[0]
      3665193984 blocks super 1.2 [1/1] [U]

md1 : active raid1 sdb2[0] sdc2[1] sdd2[2] sde2[3]
      2097088 blocks [12/4] [UUUU________]

md0 : active raid1 sdb1[0] sdc1[1] sdd1[3] sde1[2]
      2490176 blocks [12/4] [UUUU________]

unused devices: <none>

 


So this tells us a few things:

  1. Confirms your use of virtual disks, not physical
  2. Simple volumes, no RAID (aside from DSM's normal RAID1 for DSM and swap)
  3. btrfs filesystems
  4. /volume1 is NOT being mounted

So let's see what the error message is when the system tries to mount your volume:

# mount -v /dev/md2 /volume1

 


Thanks, but that tells me this:

mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

 

1 minute ago, Bose321 said:

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

Of course, sorry.

[ 2362.066098] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.066580] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.066953] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983552
[ 2362.067252] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983560
[ 2362.067536] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983576
[ 2362.067808] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983568
[ 2362.068123] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.068127] BTRFS error (device md2): BTRFS: md2 failed to repair parent transid verify failure on 2854282428416, mirror = 2

[ 2362.089368] BTRFS: open_ctree failed

 

Posted (edited)

There is some btrfs corruption that is preventing the volume from mounting (in case that wasn't obvious from the log dump).

btrfs tried to self-heal but can't in this case.  My advice is to recover all the files to new storage, then delete and recreate the volume.

 

Here's a thread with some methods of extracting the data from a damaged volume.  Start with post #9, read through the rest before doing anything, and you can ignore any "vgchange" commands, which do not apply to you.  There are a few other threads around on repairing/recovering btrfs corruption if you search for them.

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107979

 

Very likely that you can either mount read-only or restore (recover) files to other storage.
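A minimal sketch of that recovery path, run as root over SSH.  The /dev/md2 device for /volume1 comes from your /etc/fstab dump above; the /volume2 target directory is just an example (point it at any intact volume with enough space), and the `btrfs restore` fallback is my own suggestion, not something from the linked thread:

```shell
# Try a read-only mount with btrfs recovery mode.
# (On newer kernels this option is spelled "usebackuproot";
# DSM 6.x kernels still accept "recovery".)
mount -o recovery,ro /dev/md2 /volume1

# If the mount succeeds, copy everything to intact storage first:
rsync -a --progress /volume1/ /volume2/volume1-backup/

# If even the recovery mount fails, btrfs restore can try to pull
# files straight off the unmounted device:
umount /volume1 2>/dev/null
btrfs restore -v /dev/md2 /volume2/volume1-backup/
```

Only after the copy is verified should you delete and recreate the volume.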

Edited by flyride

Posted (edited)

I would add that if DSM were in a best-practice configuration (passthrough of disks to DSM, and use of RAID), btrfs might very well have fixed this problem by itself, given the log output.

Edited by flyride

Posted (edited)

Is it strange that /dev/vg1000 isn't available? I keep reading about it everywhere, but it doesn't exist for me. I can't really find commands I can use, since almost all of them reference it.

 

I used passthrough disks in the past, but that was later no longer advised, so I switched.

 

I've got it mounted with this command:

mount -o recovery,ro /dev/md2 /volume1

That works and I can see my files. But I can't mount it normally. I'll have to back everything up and then see if I can repair the volume, or just recreate it.

 

edit2: Okay, weird: I just shut down my VM, spun it up again, and my volume is back to normal!?

Edited by Bose321

Posted (edited)

Again, the vgchange commands don't apply to you, since you don't have a multi-disk SHR.  Your /volume1 device is /dev/md2, as shown by the /etc/fstab dump.

 

Great that it's working now.  You may want to rethink your storage configuration to make things more resilient in the future.

 

 

Edited by flyride
