
crashed btrfs volume please help


Question

I'm probably screwed and need to do a btrfs restore to a volume that is big enough. The problem is that my volume is nearly 64TB (and completely full), and I don't have anything now, or any time soon, that I could dump the btrfs volume to. Even more concerning, I am getting kernel panics when running btrfs commands. I am running DSM 6.2.3 Update 2.

 

What happened:

I was adding more drives to my NAS and might have bumped the SAS to 4x SATA breakout cable in my machine. When I booted up, a few drives showed up as disconnected, needing to be initialized, having a bad system partition, etc. I shut down, wiggled a few cables and rebooted, and still had issues, but with a different set of drives. I shut down once again and booted again. This time, one drive shows up as disconnected and needing to be added to the volume, while three others say they have a bad system partition.

 

I started the repair process in DSM and added the disconnected drive back to the volume, which started a consistency check. It is still checking parity consistency.

I also repaired the system partition, which completed, and all 8 drives are now showing healthy.

 

The pool says 64TB used / 64TB total, the volume says 0 bytes used / 1 byte total. This is the part that has me worried. Usually checking consistency is a background process. At this point the volume should be usable.

 

I started googling and found my way back to the xpenology forums.

 

I followed the advice here: https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability

 

I tried mount -o recovery,ro /dev/vg1/volume_1 /volume1 and got these errors in dmesg:

 

[34649.993061] BTRFS error (device dm-1): parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.003239] BTRFS error (device dm-1): parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013465] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013490] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013521] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013541] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013563] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013584] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013602] BTRFS error (device dm-1): failed to read block groups: -5
[34650.039212] BTRFS: open_ctree failed
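For anyone reading a longer dmesg dump like this, here is a rough Python sketch that collapses the repeated "parent transid verify failed" lines into one summary per block. The function name and the regex are my own and only illustrate the idea. It assumes the message text looks exactly like the lines pasted above.

```python
import re
from collections import Counter

# Matches the message with or without the "BTRFS error (device ...)" prefix.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)

def summarize_transid_errors(dmesg_text):
    """Count transid failures per (block, wanted, found) triple."""
    counts = Counter()
    for match in TRANSID_RE.finditer(dmesg_text):
        block, wanted, found = (int(g) for g in match.groups())
        counts[(block, wanted, found)] += 1
    return counts

# Three of the lines from the dmesg output above, used as sample input.
sample = """\
[34649.993061] BTRFS error (device dm-1): parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.003239] BTRFS error (device dm-1): parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013465] parent transid verify failed on 10053299585024 wanted 154500 found 154497
"""

summary = summarize_transid_errors(sample)
for (block, wanted, found), n in summary.items():
    print(f"block {block}: wanted {wanted}, found {found}, "
          f"{wanted - found} generations behind, seen {n}x")
```

In my log every failure points at the same block (10053299585024), and the copy on disk is 3 generations older than what the parent expects (154500 vs 154497).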

 

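I also ran btrfs check --check-data-csum /dev/vg1/volume_1, but it crashed. I took it for a kernel panic, although the last line of dmesg below shows a segfault in the btrfs binary itself: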

[34650.003239] BTRFS error (device dm-1): parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013465] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013490] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013521] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013541] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013563] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013584] parent transid verify failed on 10053299585024 wanted 154500 found 154497
[34650.013602] BTRFS error (device dm-1): failed to read block groups: -5
[34650.039212] BTRFS: open_ctree failed
[34976.628348] btrfs[5977]: segfault at f9f80e81 ip 000000000042edff sp 00007ffc1ee25880 error 4 in btrfs[400000+9d000]
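To make sense of that last line, here is a small Python sketch that picks apart the segfault message. It assumes the usual x86 layout of the line (process name, pid, faulting address, instruction pointer, page-fault error code, mapped object with base+size), and the helper name is made up for illustration.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def parse_segfault(line):
    """Decode one 'segfault at ...' dmesg line into a dict (x86 assumed)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        # x86 page-fault error code bits: 0 = protection fault (else page
        # not present), 1 = write access (else read), 2 = user mode.
        "protection_fault": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["object"] = m["obj"]
        info["ip_offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info

line = ("[34976.628348] btrfs[5977]: segfault at f9f80e81 ip 000000000042edff "
        "sp 00007ffc1ee25880 error 4 in btrfs[400000+9d000]")
info = parse_segfault(line)
print(info["comm"], info["pid"], hex(info["ip_offset"]), info["user_mode"])
```

Error code 4 decodes to a user-mode read of a page that was not mapped, so this looks like the btrfs userspace tool crashing rather than the kernel itself.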

 

yikes

 

Some of the other commands that were suggested failed with the error you get when Synology's btrfs tools don't match the open source version:

btrfs check --init-extent-tree /dev/vg1/volume_1
couldn't open RDWR because of unsupported option features (3).
Couldn't open file system

 

So yeah, I'm thinking I'm kind of screwed. I'm hoping the parity consistency check will finish and magically everything will be back, but I kind of doubt it.

 

I'm thinking maybe I should boot from a live USB drive and try to mount the partition, see what data is actually recoverable before I decide if I should spend money to build an array that large just so I can recover a bunch of corrupted data.
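One way I could check what is recoverable without needing 64TB of scratch space is a dry run of btrfs restore, which only lists files instead of copying them. Below is a minimal Python sketch that builds such a command. The helper name and the scratch directory are hypothetical, the device path is the one from my NAS and may well be different under a live USB, and I don't know whether upstream btrfs-progs will accept a Synology-formatted volume at all.

```python
import shlex

def restore_dry_run_cmd(device, scratch_dir="/tmp/restore-dry-run"):
    """Build a 'btrfs restore' dry-run command line (nothing is executed).

    -D  dry run: only list the files that would be restored
    -v  verbose
    -i  ignore errors and keep going
    btrfs restore still expects a destination path, even for a dry run;
    scratch_dir is a hypothetical placeholder.
    """
    return ["btrfs", "restore", "-D", "-v", "-i", device, scratch_dir]

# Device path as seen on the NAS; an assumption for any other environment.
cmd = restore_dry_run_cmd("/dev/vg1/volume_1")
print(shlex.join(cmd))
```

The sketch only prints the command so it can be reviewed before running it by hand as root. btrfs restore reads from the unmounted device and writes only to the destination path.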

 

If I go the live USB route, which distro and version would you recommend?

Anything else I can try?

 

 

 

Edited by bmacklin