fitman

[Urgent] RAID 1 SHR volume is missing

Question

Hi there,

 

My Xpenology NAS suddenly stopped sharing files. After rebooting the NAS, the RAID 1 group and volume have disappeared.

I can also see bad-sector errors for the 2 drives in the event log, and both drives are now in "Unused Disks" status.

 

One drive's status is "Initialized", and the other is "Warning".

 

Can anyone advise how to recover the data?

 

Many thanks.

Louie

 


7 answers to this question


You can try to force the arrays online with mdadm --assemble --force

 

That's not the exact syntax; you will need to do some investigation first.  Google "recover mdadm array" for some examples.  Because of SHR, you may need to reboot again once the arrays are online in order for the volume to be visible.  This is one of the reasons I personally don't use SHR (it makes recovery more complicated in this scenario).

 

Once the volume is back online, you can force a resync.  You will undoubtedly lose some data, and you won't know what files it affects.

 

Sorry, there is no easy step-by-step solution to this.  You also need to figure out the original cause...  Got a backup, right?
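As a rough sketch only, the investigate-then-assemble sequence might look like the following. The device names are taken from a typical two-drive setup and may not match yours; the `run` wrapper just prints each command so you can review it before executing anything (delete the `echo` once you are sure of the device names):

```shell
# Print-only wrapper: shows each command instead of executing it.
run() { echo "+ $*"; }

# 1. Inspect the md superblocks on the data partitions:
run mdadm --examine /dev/sda3 /dev/sdb3

# 2. Stop any half-assembled array so it can be reassembled:
run mdadm --stop /dev/md2

# 3. Force the array back together from its members:
run mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdb3

# 4. Verify:
run cat /proc/mdstat
```

With --force, mdadm ignores stale event counters on the members, so expect to follow up with a resync once the array is up.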

3 minutes ago, flyride said:

You can try to force the arrays online with mdadm --assemble --force [...]

Do you think it is caused by the bad hard disks?

Btw, my backup is not the latest 😓

On 9/10/2019 at 8:34 AM, fitman said:

Do you think it is caused by the bad hard disks?

 

Probably, but it is not guaranteed.  "Warning" disk status means that the disk has been flagged but is otherwise working.  "Initialized" means that the disk is not currently part of the array but is recognized by DSM and has the Synology disk structure (partitions) configured on it.

 

Generally, a disk that has been part of an array and is now in Initialized status has either been disconnected from its controller (bad data cable or power), or been in a read-error state (bad sectors) for so long that DSM decided it was no longer functioning.  Standard hard disks (in contrast with "NAS" hard disks like WD Reds) have internal firmware that tries to recover disk errors for a very long time, which can cause the RAID to fail the disk.

 

In an SHR of three or more drives, two drives must be "offline" for the array to be down.
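One way to confirm whether bad sectors are the original cause is to look at the drives' SMART counters with smartctl (from smartmontools, if it is available in your shell). The attribute names below are standard SMART IDs; the sample values are made up purely to demonstrate the filter:

```shell
# On a live system you would run (as root):
#   smartctl -a /dev/sda | grep -E 'Reallocated|Pending|Uncorrectable'
# The grep below runs against a made-up sample so the filter itself is shown:
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   010  Pre-fail  Always  -  24
197 Current_Pending_Sector  0x0012   100   100   000  Old_age   Always  -  8'
printf '%s\n' "$sample" | grep -E 'Reallocated|Pending|Uncorrectable'
```

Non-zero reallocated or pending sector counts would line up with the "bad sector errors in the event log" symptom described in the question.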

 


I have tried a couple of mdadm options, such as --examine and --assemble, but I still cannot get one of the RAID arrays online.

Here are the outputs for the two hard disks:

 

1. cat /proc/mdstat - it seems the RAID device /dev/md2 is missing:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [12/2] [UU__________]

md0 : active raid1 sda1[0](E)
      2490176 blocks [12/1] [E___________]

unused devices: <none>

2. fdisk -l /dev/sda /dev/sdb

Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x9c724db7

Device     Boot   Start       End   Sectors   Size Id Type
/dev/sda1          2048   4982527   4980480   2.4G fd Linux raid autodetect
/dev/sda2       4982528   9176831   4194304     2G fd Linux raid autodetect
/dev/sda3       9437184 976568351 967131168 461.2G fd Linux raid autodetect
Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xb07b6a04

Device     Boot   Start       End   Sectors   Size Id Type
/dev/sdb1          2048   4982527   4980480   2.4G fd Linux raid autodetect
/dev/sdb2       4982528   9176831   4194304     2G fd Linux raid autodetect
/dev/sdb3       9437184 976568351 967131168 461.2G fd Linux raid autodetect

3. mdadm -D /dev/md0 /dev/md1 /dev/md2

/dev/md0:
        Version : 0.90
  Creation Time : Fri Jan 13 17:25:52 2017
     Raid Level : raid1
     Array Size : 2490176 (2.37 GiB 2.55 GB)
  Used Dev Size : 2490176 (2.37 GiB 2.55 GB)
   Raid Devices : 12
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Sep 11 13:11:48 2019
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 910e59bb:a575e17c:3017a5a8:c86610be
         Events : 0.5875244

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed
       2       0        0        2      removed
       3       0        0        3      removed
       4       0        0        4      removed
       5       0        0        5      removed
       6       0        0        6      removed
       7       0        0        7      removed
       8       0        0        8      removed
       9       0        0        9      removed
      10       0        0       10      removed
      11       0        0       11      removed
/dev/md1:
        Version : 0.90
  Creation Time : Wed Sep 11 12:08:28 2019
     Raid Level : raid1
     Array Size : 2097088 (2048.28 MiB 2147.42 MB)
  Used Dev Size : 2097088 (2048.28 MiB 2147.42 MB)
   Raid Devices : 12
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Sep 11 12:55:40 2019
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 902be1b8:ec776a6f:24da047d:d8682150 (local to host xxxxxx)
         Events : 0.22

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       0        0        2      removed
       3       0        0        3      removed
       4       0        0        4      removed
       5       0        0        5      removed
       6       0        0        6      removed
       7       0        0        7      removed
       8       0        0        8      removed
       9       0        0        9      removed
      10       0        0       10      removed
      11       0        0       11      removed
mdadm: md device /dev/md2 does not appear to be active.

4. mdadm --examine --scan -v

ARRAY /dev/md0 level=raid1 num-devices=12 UUID=910e59bb:a575e17c:3017a5a8:c86610be
   spares=1   devices=/dev/sdb1,/dev/sda1
ARRAY /dev/md1 level=raid1 num-devices=12 UUID=902be1b8:ec776a6f:24da047d:d8682150
   devices=/dev/sdb2,/dev/sda2
ARRAY /dev/md/2 level=raid1 metadata=1.2 num-devices=2 UUID=b0e75a68:4e614e29:41c8ef67:8417ce3a name=wowhififever:2
   devices=/dev/sdb3,/dev/sda3

5. mdadm --assemble --scan -v

mdadm: looking for devices for further assembly
mdadm: cannot open device /dev/zram3: Device or resource busy
mdadm: cannot open device /dev/zram2: Device or resource busy
mdadm: cannot open device /dev/zram1: Device or resource busy
mdadm: cannot open device /dev/zram0: Device or resource busy
mdadm: no recogniseable superblock on /dev/synoboot3
mdadm: no recogniseable superblock on /dev/synoboot2
mdadm: no recogniseable superblock on /dev/synoboot1
mdadm: no recogniseable superblock on /dev/synoboot
mdadm: cannot open device /dev/md1: Device or resource busy
mdadm: cannot open device /dev/md0: Device or resource busy
mdadm: cannot open device /dev/sdb2: Device or resource busy
mdadm: no RAID superblock on /dev/sdb1
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/md/2 exists - ignoring
mdadm: /dev/sdb3 is identified as a member of /dev/md2, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md2, slot 0.
mdadm: failed to add /dev/sda3 to /dev/md2: Invalid argument
mdadm: failed to add /dev/sdb3 to /dev/md2: Invalid argument
mdadm: /dev/md2 assembled from -1 drives - not enough to start the array.
mdadm: looking for devices for further assembly
mdadm: /dev/md/0_1 exists - ignoring
mdadm: /dev/sdb1 is identified as a member of /dev/md126, slot 12.
mdadm: No suitable drives found for /dev/md126
mdadm: looking for devices for further assembly
mdadm: No arrays found in config file or automatically

6. In the dmesg output, it seems there are a lot of hard disk errors on sda and sdb:

[ 1144.879250] ata1.00: read unc at 9437194
[ 1144.883333] lba 9437194 start 9437184 end 976568351
[ 1144.883336] sda3 auto_remap 0
[ 1144.883341] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 1144.890026] ata1.00: irq_stat 0x40000001
[ 1144.894105] ata1.00: failed command: READ DMA
[ 1144.898652] ata1.00: cmd c8/00:08:08:00:90/00:00:00:00:00/e0 tag 10 dma 4096 in
                        res 51/40:06:0a:00:90/00:00:00:00:00/e0 Emask 0x9 (media error)
[ 1144.914175] ata1.00: status: { DRDY ERR }
[ 1144.918325] ata1.00: error: { UNC }
[ 1144.925207] ata1.00: configured for UDMA/100
[ 1144.925226] ata1: UNC RTF LBA Restored
[ 1144.925266] sd 0:0:0:0: [sda] Unhandled sense code
[ 1144.925281] sd 0:0:0:0: [sda]
[ 1144.925289] Result: hostbyte=0x00 driverbyte=0x08
[ 1144.925301] sd 0:0:0:0: [sda]
[ 1144.925310] Sense Key : 0x3 [current] [descriptor]
[ 1144.925326] Descriptor sense data with sense descriptors (in hex):
[ 1144.925334]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 1144.925363]         00 90 00 08
[ 1144.925367] sd 0:0:0:0: [sda]
[ 1144.925369] ASC=0x11 ASCQ=0x4
[ 1144.925371] sd 0:0:0:0: [sda] CDB:
[ 1144.925373] cdb[0]=0x28: 28 00 00 90 00 08 00 00 08 00
[ 1144.925381] end_request: I/O error, dev sda, sector 9437192
[ 1144.931190] md: disabled device sda3, could not read superblock.
[ 1144.931194] md: sda3 does not have a valid v1.2 superblock, not importing!
[ 1144.931197] ata1: EH complete
[ 1144.931244] md: md_import_device returned -22
[ 1145.029537] ata2.00: read unc at 9437194
[ 1145.033668] lba 9437194 start 9437184 end 976568351
[ 1145.033671] sdb3 auto_remap 0
[ 1145.033675] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 1145.040380] ata2.00: irq_stat 0x40000001
[ 1145.044492] ata2.00: failed command: READ DMA
[ 1145.049036] ata2.00: cmd c8/00:08:08:00:90/00:00:00:00:00/e0 tag 7 dma 4096 in
                        res 51/40:06:0a:00:90/00:00:00:00:00/e0 Emask 0x9 (media error)
[ 1145.064506] ata2.00: status: { DRDY ERR }
[ 1145.068703] ata2.00: error: { UNC }
[ 1145.075723] ata2.00: configured for UDMA/100
[ 1145.075744] ata2: UNC RTF LBA Restored
[ 1145.075783] sd 1:0:0:0: [sdb] Unhandled sense code
[ 1145.075797] sd 1:0:0:0: [sdb]
[ 1145.075806] Result: hostbyte=0x00 driverbyte=0x08
[ 1145.075818] sd 1:0:0:0: [sdb]
[ 1145.075826] Sense Key : 0x3 [current] [descriptor]
[ 1145.075843] Descriptor sense data with sense descriptors (in hex):
[ 1145.075851]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 1145.075886]         00 90 00 08
[ 1145.075890] sd 1:0:0:0: [sdb]
[ 1145.075891] ASC=0x11 ASCQ=0x4
[ 1145.075893] sd 1:0:0:0: [sdb] CDB:
[ 1145.075895] cdb[0]=0x28: 28 00 00 90 00 08 00 00 08 00
[ 1145.075903] end_request: I/O error, dev sdb, sector 9437192
[ 1145.081702] ata2: EH complete
[ 1145.081878] md: disabled device sdb3, could not read superblock.
[ 1145.081881] md: sdb3 does not have a valid v1.2 superblock, not importing!
[ 1145.081892] md: md_import_device returned -22
[ 1145.081948] md: md2 stopped.
[ 1145.131671] md: md126 stopped.

Thus, I am afraid both hard drives have problems. Is it possible to copy the data out?

 

Can anyone advise how to recover it?

 

Many thanks,

Louie


This is a two-drive RAID1.  If both members have problems, you are not in great shape for recovery.  But try the mdadm assemble command with force (-f) instead of -v

Just now, flyride said:

This is a two-drive RAID1.  If both members have problems, you are not in great shape for recovery.  But try the mdadm assemble command with force (-f) instead of -v

I also tried --force - no luck! Do you think it is officially dead 😓


If you are writing it off as dead, you have nothing to lose by trying more advanced options, like re-creating the array.  Google is your friend, but here's a thread to start with:

http://paregov.net/blog/21-linux/25-how-to-recover-mdadm-raid-array-after-superblock-is-zeroed

 

Don't mess with /dev/md0, /dev/md1 and the members (/dev/sda1, /dev/sdb1 and /dev/sda2, /dev/sdb2).  Those are your DSM and swap partitions, respectively.

 

What you would be re-creating is /dev/md2 and its members (/dev/sda3 and /dev/sdb3) - your RAID group and volume.
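If it does come to re-creating the array, the general shape is sketched below. This is a last resort: it overwrites the old superblocks, so the member order, RAID level, and metadata version must match the original exactly (check the mdadm --examine output first). The `run` wrapper only prints the commands; remove the `echo` to actually execute them:

```shell
# Print-only wrapper: review before executing anything destructive.
run() { echo "+ $*"; }

# Re-create the RAID1 with the same members. --assume-clean tells mdadm
# the data is already in place and prevents an initial sync that could
# overwrite good data. metadata=1.2 matches the --examine output above.
run mdadm --create /dev/md2 --level=1 --raid-devices=2 \
    --metadata=1.2 --assume-clean /dev/sda3 /dev/sdb3

# Then check whether the array and volume are visible again:
run cat /proc/mdstat
```

Given that both disks are throwing media errors in dmesg, it would also be prudent to image each disk first (for example with GNU ddrescue) before attempting anything that writes to them.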

