XPEnology Community

DANGER : Raid crashed. Help me restore my data!


Recommended Posts

17 minutes ago, flyride said:

Ok, the goal here is to flag the out of sequence drive for use.  Try this:


# mdadm --stop /dev/md2
# mdadm --assemble --run /dev/md2 /dev/sd[bcd]5

Post results as before.

 

Here is the result

 

root@DiskStation:~# mdadm --assemble --run /dev/md2 /dev/sd[bcd]5
mdadm: /dev/md2 has been started with 2 drives (out of 4).

 

Link to comment
Share on other sites

5 or 10 mins after I did that. Hope it helps.

 

root@DiskStation:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md2 : active raid5 sdb5[2] sdd5[4]
      8776594944 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/2] [__UU]

md1 : active raid1 sdb2[0] sdc2[1] sdd2[2]
      2097088 blocks [12/3] [UUU_________]

md0 : active raid1 sdb1[2] sdd1[3]
      2490176 blocks [12/2] [__UU________]

unused devices: <none>
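For anyone reading along: the bracketed fields in /proc/mdstat are the key here. `[4/2]` means 4 member devices are expected but only 2 are active, and `[__UU]` shows which slots are up. A small sketch of pulling those counts out of a status line with sed; the sample line is copied from the output above, not read from a live array:

```shell
# Sample status line copied from the mdstat output above (illustrative only)
status='      8776594944 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/2] [__UU]'

# [expected/active] member counts; [__UU] maps each slot: _ = missing, U = up
expected=$(echo "$status" | sed -n 's#.*\[\([0-9][0-9]*\)/\([0-9][0-9]*\)\].*#\1#p')
active=$(echo "$status" | sed -n 's#.*\[\([0-9][0-9]*\)/\([0-9][0-9]*\)\].*#\2#p')
echo "expected=$expected active=$active"   # expected=4 active=2
```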


 

Link to comment
Share on other sites

Hello, sorry, I had to work (I work in health care, so it's very busy lately).

 

These commands we are trying have not started the array yet, but you are no worse off.  I don't quite understand why the drive hasn't unflagged, but let's try one more combination before telling it to create a new array metadata.

# mdadm --stop /dev/md2
# mdadm --assemble --run --force /dev/md2 /dev/sd[bcd]5

 

Link to comment
Share on other sites

1 hour ago, flyride said:


Hi there! I totally understand, since you work in health care. I am from Europe and we are in the same ******* here! Even if it is slowing down in some countries. Stay safe :-)

Here are the results

root@DiskStation:~# mdadm --stop /dev/md2
mdadm: stopped /dev/md2
root@DiskStation:~# mdadm --assemble --run --force /dev/md2 /dev/sd[bcd]5
mdadm: /dev/md2 has been started with 2 drives (out of 4).


 

 

 

 

Link to comment
Share on other sites

On 4/4/2020 at 7:12 PM, IG-88 said:

if flyride has some time to help you here, that's good for you; he's definitely better at this than I am

hi IG-88, do you mind stepping back into this thread, as flyride is super busy these days? So far, it is not too complicated, even for me :-) I just want to be sure that I am not making any mistake... Thanks!

Link to comment
Share on other sites

i haven't done it this often and have not seen anything like this; /dev/sdc5 looked like it would be easy to force back into the array by stopping /dev/md2

and then "force"-ing the drive back into the raid - that would have been my approach, and it's the same as you already tried

mdadm --stop /dev/md2

mdadm --assemble --force /dev/md2 /dev/sd[bcd]5

mdadm --detail /dev/md2

doing more advanced steps would be experimental for me and i don't like suggesting stuff i haven't tried myself before

here is the procedure for recreating the whole /dev/md2 instead of assemble it

https://raid.wiki.kernel.org/index.php/RAID_Recovery

 

drive 0 is missing (sda5); sdc5 is device 1 (odd, but that's what the status in examine says), sdb5 is device 2 and sdd5 is device 3

Used Dev Size : 5851063296 (2790.00 GiB 2995.74 GB) from the status -> divided by two that's 2925531648

 

i came up with this to do it

mdadm --create --assume-clean --level=5 --raid-devices=4 --size=2925531648 /dev/md2 missing /dev/sdc5 /dev/sdb5 /dev/sdd5

it's a suggestion, nothing more (or is it less than a suggestion? - what would be the name for that?)
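A note on where that --size value comes from, for anyone retracing the math: `mdadm --examine` reports Used Dev Size in 512-byte sectors, while --size takes KiB, so the sector count is halved. A quick check of the arithmetic with the numbers quoted above:

```shell
# "Used Dev Size" from mdadm --examine, in 512-byte sectors (value quoted above)
used_dev_size_sectors=5851063296
# --size expects KiB; 1 KiB = 2 sectors of 512 bytes
size_kib=$((used_dev_size_sectors / 2))
echo "$size_kib"   # 2925531648, matching the --size in the --create suggestion
```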

 

edit: maybe try this before trying a create

mdadm --assemble --force --verbose /dev/md2 /dev/sdc5 /dev/sdb5 /dev/sdd5

 

 

Edited by IG-88
Link to comment
Share on other sites

 

7 hours ago, IG-88 said:


Hello IG-88, and thanks for spending time looking into my problem.

Even if I cannot wait to find the solution to this problem, I am ready to give it the time needed to find the correct one. Following your first recommendation, I will try not to do any irreversible action, since, as you said, it might damage the array further.

So, if I understand your post correctly, it is OK for me to try the first 3 commands below now without creating more issues for the array. In other words, it is not irreversible and cannot damage it more :-)

Do you recommend that I also get flyride's opinion on the steps below?

And another question for you: do you recommend that I turn off my NAS when I am not using it for these commands? If so, will it affect the array?

mdadm --stop /dev/md2

mdadm --assemble --force /dev/md2 /dev/sd[bcd]5

mdadm --detail /dev/md2
Link to comment
Share on other sites

11 hours ago, jbesclapez said:

So, if I understand your post correctly, it is OK for me to try the first 3 commands below now without creating more issues for the array. In other words, it is not irreversible and cannot damage it more

 

that was what you already did with flyride, so nothing new to expect if you repeat it

in my edit from the last post i suggested a slightly different assemble attempt: it gives the /dev/sdX devices in the order they appear in the examine output, sdc first and the other two after it. i'm not sure whether that makes a difference when using --force; it also adds --verbose, so maybe we get more information when it fails to assemble

it's just a slightly different variation of what you already tried; it can't make things worse

so you should try this next

mdadm --stop /dev/md2

mdadm --assemble --force --verbose /dev/md2 /dev/sdc5 /dev/sdb5 /dev/sdd5

mdadm --detail /dev/md2

the --create command would come only if the above does not work and we can't figure out why; i would like to know why --assemble --force does not work as it should

 

11 hours ago, jbesclapez said:

Do you recommend me to have also the opinion of Flyride on those steps below?

for the --create command, yes; the other try with --verbose should be ok

 

11 hours ago, jbesclapez said:

And another question for you : Do you recommend that I turn off my NAS when not using it to do those commands? If so, will it affect the array?

 

when you are sure your problem is located and the reason why sdc also dropped out is removed

sda seems to be a bad drive already and is not used anymore

if the s.m.a.r.t. info of the other three drives is ok it might be safe to shut down, but if i were in your place i would leave it running in the state it is in now; if there were any indication that ram, board, controller or psu is the source of the problem, i would shut down to get a stable system - that's key for a recovery

even if the assemble or create is successful you would not shut down; a reboot at most

for me it is still unclear why sdc dropped out of the raid

did you check the logs to see when sda and sdc dropped out? did they fail at the same time, or had sda already failed a while ago without you noticing?

 

in a more professional recovery environment (much more money involved) i guess one would make an image file of every disk and work with those (on a tested, stable system)

 

 

 

Link to comment
Share on other sites

1 hour ago, IG-88 said:

 


mdadm --stop /dev/md2

mdadm --assemble --force --verbose /dev/md2 /dev/sdc5 /dev/sdb5 /dev/sdd5

mdadm --detail /dev/md2

 

 

Hi IG-88!

 

As you said, I think the bad drive broke and I never noticed. And I have no idea why my other drive got out of sync.

Here are the commands I just ran:

 

root@DiskStation:~# mdadm --stop /dev/md2
mdadm: stopped /dev/md2
root@DiskStation:~# mdadm --assemble --force --verbose /dev/md2 /dev/sdc5 /dev/sdb5 /dev/sdd5
mdadm: looking for devices for /dev/md2
mdadm: /dev/sdc5 is identified as a member of /dev/md2, slot 1.
mdadm: /dev/sdb5 is identified as a member of /dev/md2, slot 2.
mdadm: /dev/sdd5 is identified as a member of /dev/md2, slot 3.
mdadm: no uptodate device for slot 0 of /dev/md2
mdadm: added /dev/sdc5 to /dev/md2 as 1 (possibly out of date)
mdadm: added /dev/sdd5 to /dev/md2 as 3
mdadm: added /dev/sdb5 to /dev/md2 as 2
mdadm: /dev/md2 assembled from 2 drives - not enough to start the array.
root@DiskStation:~# mdadm --detail /dev/md2
mdadm: cannot open /dev/md2: No such file or directory

And unfortunately, as you can see, it is not working. I will follow your other recommendation and wait for Flyride too!

I will also leave the NAS on.
 

 

 

 

 

Link to comment
Share on other sites

ignore the gui, it can't help you with your real raid5 problem

md0 and md1 are system and swap partitions in a raid1; as long as even one drive is working it's not going to fail, and you can repair that later from the gui

stick to your real problem, that's what's important; the raid1 problems are easy to repair later, and i would not risk letting any automatic process you don't understand do anything

from what i've found on the internet it seems the superblocks do not match, and you might have to recreate the array by resetting the superblocks and recreating it (in the same order as before); also, in the examine output there is a Recovery Offset flag set for sdc

i'd suggest getting a second opinion

(in theory) my next steps would be like this (and thats something you can't undo so be careful)

mdadm --stop /dev/md2
mdadm --zero-superblock /dev/sd[bcd]5
mdadm --create --assume-clean --level=5 --raid-devices=4 --size=2925531648 /dev/md2 missing /dev/sdc5 /dev/sdb5 /dev/sdd5
mdadm --detail /dev/md2

 

Link to comment
Share on other sites

I'm also a bit perplexed about /dev/sdc not coming online with the commands we've used.  But I think I know why it isn't joining the array - it has a "feature map" bit set which flags the drive as being in the middle of an array recovery operation.  So it is reluctant to include the drive in the array assembly.

 

In my opinion, zapping the superblocks is a last resort, only when nothing else will work.  There is a lot of consistency information embedded in the superblock (evidenced by the --detail command output), plus the positional information of the disk within the stripe, and all of that is lost when we zero a superblock.

 

Before we go doing that, let's repeat the last command with verbose mode on and change the syntax a bit:

mdadm --stop /dev/md2
mdadm -v --assemble --scan --force --run /dev/md2 /dev/sdb5 /dev/sdc5 /dev/sdd5

If that doesn't work, we'll come up with something to clear the feature map bit.
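For reference, the bit flyride describes is visible in `mdadm --examine` output as the `Feature Map` field; as I understand it, a value of 0x2 marks a recorded recovery offset, i.e. a rebuild was in progress. A sketch of spotting it in a sample of examine output (the sample text below is illustrative, not captured from this system):

```shell
# Illustrative fragment of `mdadm --examine /dev/sdc5` output (not from a live disk)
examine='    Feature Map : 0x2
Recovery Offset : 0 sectors
         Events : 15357'

# A feature map of 0x2 indicates a recovery offset is recorded for this member
feature=$(echo "$examine" | sed -n 's/.*Feature Map : //p')
echo "feature=$feature"   # feature=0x2
```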

Edited by flyride
Link to comment
Share on other sites

3 hours ago, flyride said:


 

Oh god... nothing gets the job done.  :-(

 

root@DiskStation:~# mdadm --stop /dev/md2
mdadm: stopped /dev/md2
root@DiskStation:~# mdadm -v --assemble --scan --force --run /dev/md2 /dev/sdb5 /dev/sdc5 /dev/sdd5
mdadm: /dev/md2 not identified in config file.
mdadm: /dev/sdb5 not identified in config file.
mdadm: /dev/sdc5 not identified in config file.
mdadm: /dev/sdd5 not identified in config file.


 

Link to comment
Share on other sites

That was going to be my next suggestion.  But, are you sure there was not more output from the last command?  For verbose mode, it sure didn't say very much.  Can you post a mdstat please?

 

After that, if it still only has assembled with two instead of three drives, let's try:

# mdadm --stop /dev/md2
# mdadm -v --assemble --force /dev/md2 --uuid 75762e2e:4629b4db:259f216e:a39c266d

 

Edited by flyride
Link to comment
Share on other sites

6 hours ago, flyride said:


 

You asked me to do the mdstat. I restarted my server, as md2 was stopped, and ran this:

root@DiskStation:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md2 : active raid5 sdb5[2] sdd5[4]
      8776594944 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/2] [__UU]

md1 : active raid1 sdb2[0] sdc2[1] sdd2[2]
      2097088 blocks [12/3] [UUU_________]

md0 : active raid1 sdb1[2] sdd1[3]
      2490176 blocks [12/2] [__UU________]

So, should I retry post 39 and then try post 41 after a reboot?
 

Link to comment
Share on other sites

Here is the result

 

root@DiskStation:~# mdadm --stop /dev/md2
mdadm: stopped /dev/md2
root@DiskStation:~# mdadm -v --assemble --force /dev/md2 --uuid 75762e2e:4629b4db:259f216e:a39c266d
mdadm: looking for devices for /dev/md2
mdadm: no recogniseable superblock on /dev/synoboot3
mdadm: Cannot assemble mbr metadata on /dev/synoboot2
mdadm: Cannot assemble mbr metadata on /dev/synoboot1
mdadm: Cannot assemble mbr metadata on /dev/synoboot
mdadm: no recogniseable superblock on /dev/md1
mdadm: no recogniseable superblock on /dev/md0
mdadm: No super block found on /dev/sdd2 (Expected magic a92b4efc, got 31333231)
mdadm: no RAID superblock on /dev/sdd2
mdadm: No super block found on /dev/sdd1 (Expected magic a92b4efc, got 00000131)
mdadm: no RAID superblock on /dev/sdd1
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: No super block found on /dev/sdc2 (Expected magic a92b4efc, got 31333231)
mdadm: no RAID superblock on /dev/sdc2
mdadm: No super block found on /dev/sdc1 (Expected magic a92b4efc, got 00000131)
mdadm: no RAID superblock on /dev/sdc1
mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdc
mdadm: No super block found on /dev/sdb2 (Expected magic a92b4efc, got 31333231)
mdadm: no RAID superblock on /dev/sdb2
mdadm: No super block found on /dev/sdb1 (Expected magic a92b4efc, got 00000131)
mdadm: no RAID superblock on /dev/sdb1
mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdb
mdadm: Cannot read superblock on /dev/sda
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sdd5 is identified as a member of /dev/md2, slot 3.
mdadm: /dev/sdc5 is identified as a member of /dev/md2, slot 1.
mdadm: /dev/sdb5 is identified as a member of /dev/md2, slot 2.
mdadm: no uptodate device for slot 0 of /dev/md2
mdadm: added /dev/sdc5 to /dev/md2 as 1 (possibly out of date)
mdadm: added /dev/sdd5 to /dev/md2 as 3
mdadm: added /dev/sdb5 to /dev/md2 as 2
mdadm: /dev/md2 assembled from 2 drives - not enough to start the array.


 

Link to comment
Share on other sites

root@DiskStation:~# mdadm --examine /dev/sd[abcd]5 | egrep 'Event|/dev/sd'
/dev/sdb5:
         Events : 15417
/dev/sdc5:
         Events : 15357
/dev/sdd5:
         Events : 15417
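These event counters are the crux: md stamps every member with a counter on each array update, and a member whose counter lags is considered stale. Here sdc5 trails the others, which is presumably why plain assembly keeps rejecting it as "possibly out of date". A quick check of the gap, using the numbers above:

```shell
# Event counters from the --examine output above
events_sdb5=15417
events_sdc5=15357
events_sdd5=15417

# sdc5 lags the up-to-date members; mdadm flags it "possibly out of date"
gap=$((events_sdb5 - events_sdc5))
echo "sdc5 is $gap events behind"   # sdc5 is 60 events behind
```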

 

root@DiskStation:~# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Mon May 13 08:39:01 2013
     Raid Level : raid5
     Array Size : 8776594944 (8370.01 GiB 8987.23 GB)
  Used Dev Size : 2925531648 (2790.00 GiB 2995.74 GB)
   Raid Devices : 4
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Apr 10 08:00:50 2020
          State : clean, FAILED
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : VirDiskStation:2
           UUID : 75762e2e:4629b4db:259f216e:a39c266d
         Events : 15417

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       2       8       21        2      active sync   /dev/sdb5
       4       8       53        3      active sync   /dev/sdd5

 

I have done a bit of research, and your "--re-add" should have worked, as the Events counts are not far from each other. Maybe not close enough... no idea.

Link to comment
Share on other sites

13 hours ago, flyride said:

Let's try this:


# mdadm /dev/md2 --re-add /dev/sdc5

 

Hi Flyride,

I think it would be easier if you give me a time when you will be online, and I will be there posting your ideas and giving you instant feedback. FYI, I am in the Netherlands. What time and day would suit you?
Cheers!

Nic

 

Link to comment
Share on other sites

Still quite perplexed about this drive's refusal to play, but we're probably out of non-invasive options and need to do a create - the path that IG-88 charted out. Before we do that, let's get the current state of your system. Please do not reboot or do anything else to change the system state once we start using this information, or your data is at risk.  If anything changes at all, please advise.

# mdadm --detail /dev/md2 | fgrep "/dev/"
# mdadm --examine /dev/sdb5 /dev/sdc5 /dev/sdd5 | egrep "/dev|Role|Events|UUID"

 

Link to comment
Share on other sites
