Bose321

Crashed volume after 6.2.3 update

Question

I've updated to 6.2.3 and now one of my 4 volumes has crashed. I'm running on VMware and I applied the synoboot fix from here:

The SATA drives that wrongly appeared are gone, so that's good, but my volume1 is still crashed. The pool and disk both still show as healthy. All volumes and pools are on separate disks and are Basic, no RAID or anything; I believe SHR. IIRC it was btrfs.

 

I can still cd to /volume1 via SSH, but I only see a `@database` folder, seemingly with mariadb10 inside. So can this be fixed somehow? Or is everything gone?

 

Thanks in advance.


10 answers to this question

Posted (edited)

Your volume is probably not mounted: the mount point on the underlying root filesystem still exists, and a process (MariaDB, by the look of it) may have tried to reinitialize its files there.

 

Please run the following commands from ssh root:

# synodisk --enum
# synodisk --detectfs /volume1
# cat /etc/fstab
# cat /proc/mdstat

Post the results

Edited by flyride


Thanks, here are the outputs:

 

synodisk --enum

************ Disk Info ***************
>> Disk id: 2
>> Slot id: -1
>> Disk path: /dev/sdb
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 2794.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 3
>> Slot id: -1
>> Disk path: /dev/sdc
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 2794.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 4
>> Slot id: -1
>> Disk path: /dev/sdd
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 50.00 GB
>> Tempeture: -1 C
************ Disk Info ***************
>> Disk id: 5
>> Slot id: -1
>> Disk path: /dev/sde
>> Disk model: Virtual SATA Hard Drive
>> Total capacity: 3500.00 GB
>> Tempeture: -1 C

synodisk --detectfs /volume1

Partition [/volume1] unknown

cat /etc/fstab

none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/md2 /volume1 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md4 /volume3 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md3 /volume2 btrfs auto_reclaim_space,synoacl,relatime 0 0
/dev/md5 /volume4 btrfs auto_reclaim_space,synoacl,relatime 0 0

cat /proc/mdstat

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md2 : active raid1 sdb3[0]
      2924899328 blocks super 1.2 [1/1] [U]

md3 : active raid1 sdc3[0]
      2924899328 blocks super 1.2 [1/1] [U]

md4 : active raid1 sdd3[0]
      47606784 blocks super 1.2 [1/1] [U]

md5 : active raid1 sde3[0]
      3665193984 blocks super 1.2 [1/1] [U]

md1 : active raid1 sdb2[0] sdc2[1] sdd2[2] sde2[3]
      2097088 blocks [12/4] [UUUU________]

md0 : active raid1 sdb1[0] sdc1[1] sdd1[3] sde1[2]
      2490176 blocks [12/4] [UUUU________]

unused devices: <none>

 


So this tells us a few things:

  1. Confirms your use of virtual disks, not physical
  2. Simple volumes, no RAID (aside from DSM's normal RAID1 for DSM and swap)
  3. btrfs filesystems
  4. /volume1 is NOT being mounted

So let's see what the error message is when the system tries to mount your volume:

# mount -v /dev/md2 /volume1

 


Thanks, but that tells me this:

mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

 

1 minute ago, Bose321 said:

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

Of course, sorry.

[ 2362.066098] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.066580] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.066953] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983552
[ 2362.067252] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983560
[ 2362.067536] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983576
[ 2362.067808] md/raid1:md2: syno_raid1_self_heal_set_and_submit_read_bio(1226): No suitable device for self healing retry read at round 2 at sector 1563983568
[ 2362.068123] parent transid verify failed on 2854282428416 wanted 5246133 found 5245893
[ 2362.068127] BTRFS error (device md2): BTRFS: md2 failed to repair parent transid verify failure on 2854282428416, mirror = 2

[ 2362.089368] BTRFS: open_ctree failed

 

Posted (edited)

There is some btrfs corruption that is preventing the volume from mounting (in case that wasn't obvious from the log dump).

btrfs tried to self-heal but can't in this case.  My advice is to recover all the files to new storage, then delete and recreate the volume.

 

Here's a thread with some methods of extracting the data from a damaged volume.  Start with post #9, read through the rest before doing anything, and you can ignore any "vgchange" commands, which do not apply to you.  There are a few other threads around on repairing/recovering btrfs corruption if you search for them.

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107979

 

Very likely that you can either mount read-only or restore (recover) files to other storage.
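A minimal sketch of that recovery path, run as root over SSH.  The /dev/md2 device for /volume1 comes from your /etc/fstab dump above; the /volume2 target directory is just an example (point it at any intact volume with enough space), and the `btrfs restore` fallback is my own suggestion, not something from the linked thread:

```shell
# Try a read-only mount with btrfs recovery mode.
# (On newer kernels this option is spelled "usebackuproot";
# DSM 6.x kernels still accept "recovery".)
mount -o recovery,ro /dev/md2 /volume1

# If the mount succeeds, copy everything to intact storage first:
rsync -a --progress /volume1/ /volume2/volume1-backup/

# If even the recovery mount fails, btrfs restore can try to pull
# files straight off the unmounted device:
umount /volume1 2>/dev/null
btrfs restore -v /dev/md2 /volume2/volume1-backup/
```

Only after the copy is verified should you delete and recreate the volume.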

Edited by flyride

Posted (edited)

I would add that if DSM were in a best-practice configuration (passthrough of disks to DSM, and use of RAID), btrfs might very well have fixed this problem by itself, given the log output.

Edited by flyride

Posted (edited)

Is it strange that /dev/vg1000 isn't available? I keep reading about it everywhere, but it doesn't exist for me. I can't really find commands I can use, since almost all of them reference it.

 

I used passthrough disks in the past, but that was later no longer advised, so I switched.

 

I've got it mounted with this command:

mount -o recovery,ro /dev/md2 /volume1

That works and I can see my files. But I can't mount it normally. I'll have to back everything up and then see if I can repair the volume, or just recreate it.

 

edit2: Okay, weird: I just shut down my VM, spun it up again, and my volume is back to normal!?

Edited by Bose321

Posted (edited)

Again, the vgchange commands don't apply to you, since you don't have a multi-disk SHR.  Your /volume1 device is /dev/md2, as shown by the /etc/fstab dump.

 

Great that it's working now.  You may want to rethink your storage configuration to make things more resilient in the future.

 

 

Edited by flyride
