DSM does not recognise all HDDs & volume crashed now

Agent-Orange · May 21, 2016

Hi all

I just wanted to add two 8TB Seagate drives to create a new RAID 0. I already have a RAID 5 with six 4TB WD drives. They are all connected to a Fujitsu CP400i 12Gb/s (SAS3108) SAS controller. After I added the two 8TB drives, two of the 4TB drives were no longer recognised in the DSM Storage Manager. In the controller BIOS I can see all drives.

I have now connected the two 8TB drives directly to the SATA ports on my mainboard. That works, except that my RAID 5 is degraded and I have to repair it now.

But is that a problem with DSM or with the RAID controller? It still runs IR firmware, because there isn't an IT firmware for that controller. But at least I can switch to JBOD mode in the controller BIOS.

Any help would be great, because I want to run all drives on the SAS controller.

EDIT: Unfortunately my existing RAID 5 crashed because of that. DSM said one disk wasn't initialized anymore, so I repaired it and ran a consistency check overnight. Now the disk group status is normal and all drives are normal (no SMART errors), but the status of the volume on this disk group is "crashed". What can I do so that I don't lose my data? The data on this volume isn't important enough that I would back it up externally; it's just some movies and series.

Thanks for any help.

Edited May 22, 2016 by Guest

AllGamer · May 21, 2016

It's possible the Fujitsu controller is not compatible with the 8TB drives; you might need new firmware for it.

If your motherboard BIOS can see them, then Synology should be able to show at least the two 8TB HDDs in Storage Manager.

If Storage Manager → HDD/SSD does not show the two 8TB drives, then you should boot an Ubuntu / Fedora live USB and see whether you can access those two 8TB drives under Linux.

Agent-Orange · May 21, 2016

Running them on the SAS controller worked and I could create a disk group; that wasn't the problem. The problem is that DSM no longer recognises the two WD drives that are on the same SAS port as the 8TB drives, even though the controller BIOS recognises all drives.

A firmware update might help, but I couldn't find newer firmware on the Fujitsu page, and the firmware from the LSI page doesn't work with this controller.

At the moment the 8TB drives are connected directly to the motherboard SATA ports, and DSM now recognises all 8 drives.

AllGamer · May 21, 2016

Well, if your current goal is to recover the degraded RAID with the two WD drives, remove the two 8TB drives.

Let Synology finish repairing the degraded RAID.

Then, when the system is back to normal, add the two new 8TB drives, keep them in a separate disk group, and create a separate volume on them.

The key is to pay attention to the disk layout and to which disk is plugged into which slot.

Under the hood Synology uses mdadm, which has an annoying flaw: it associates the drives with the order in which the controllers discovered them at boot.

There is a topic about that issue here: viewtopic.php?f=2&t=11672

It is not as evident on good controllers, but on some controllers it becomes a real annoyance.

The same happened to me when I was running Linux with mdadm: I swapped the disks around and the RAID got messed up because the order of the HDDs changed, and it believed the RAID was corrupted. When I swapped the HDDs back to their original locations, all was well.

So make a note of which drive is plugged into which SATA port and, inside Synology, figure out which drive is in which slot, as it might differ from the physical layout.
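One way to keep such a record is to save the device table that mdadm --detail prints and turn it into a slot-to-device map. The following is only a minimal sketch: the function name slot_map is made up for illustration, the sample rows merely follow the format mdadm --detail prints, and it assumes Python is available on the machine where the saved output is inspected. Kernel names such as /dev/sdX can change between boots, so noting drive serial numbers alongside would be more robust.

```python
import re

# Matches device rows of `mdadm --detail`, for example:
#   0 8 179 0 active sync /dev/sdl3
# Columns: Number, Major, Minor, RaidDevice, State, device path.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*?)\s+(/dev/\S+)\s*$")

def slot_map(detail_text):
    """Return {raid_slot: device_path} parsed from `mdadm --detail` text.

    Rows without a device path (such as "removed" slots) are skipped.
    """
    slots = {}
    for line in detail_text.splitlines():
        match = ROW.match(line)
        if match:
            slots[int(match.group(4))] = match.group(6)
    return slots

# Illustrative rows in the format mdadm prints.
sample = """\
0 8 179 0 active sync /dev/sdl3
1 8 115 1 active sync /dev/sdh3
6 8 99 5 active sync /dev/sdg3
4 0 0 4 removed
"""

for slot, device in sorted(slot_map(sample).items()):
    print(f"RAID slot {slot}: {device}")
```

Comparing two such maps, one saved before and one after moving cables, shows at a glance whether any member changed its device name.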

Agent-Orange · May 22, 2016

Yeah, it seems that I smashed my existing RAID 5. DSM said that one disk failed; in the HDD/SSD overview one of the six WD drives wasn't initialized, or something like that. So I had to run a "consistency check or repair process" on this disk group, which took about 9 hours. Now the disk group status is normal, but the status of the volume is still crashed :-|

When I click on Manage, I get the option to run a "data scrubbing" process, which I started about 3 hours ago. When I run cat /proc/mdstat, it seems that this is a resync process, because it says "resync = 37.8% (1478646144/3902163776) finish=322.8min speed=125113K/sec".
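The numbers in that status line can be cross-checked by hand: the two values in parentheses are blocks done and total blocks, and the remaining blocks divided by the speed give the time left. A minimal sketch of that arithmetic follows; the function name resync_progress is hypothetical, and it assumes the block counts are in 1 KiB units and the speed is in KiB per second, as /proc/mdstat reports them.

```python
import re

def resync_progress(status_line):
    """Recompute percent done and minutes remaining from an md resync line.

    Assumes block counts in KiB and speed in KiB/s, so
    remaining_blocks / speed is the remaining time in seconds.
    """
    match = re.search(r"\((\d+)/(\d+)\).*?speed=(\d+)K/sec", status_line)
    if match is None:
        raise ValueError("not a resync status line")
    done, total, speed = (int(group) for group in match.groups())
    percent = 100.0 * done / total
    eta_minutes = (total - done) / speed / 60.0
    return percent, eta_minutes

line = "resync = 37.8% (1478646144/3902163776) finish=322.8min speed=125113K/sec"
percent, eta = resync_progress(line)
print(f"{percent:.1f}% done, about {eta:.1f} minutes left")
```

The recomputed estimate agrees with the finish value in the line to within rounding, so roughly five and a half hours remained at that point if the speed stayed constant.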

Agent-Orange · May 22, 2016

Unfortunately my existing RAID 5 crashed because of that. DSM said one disk wasn't initialized anymore, so I repaired it and ran a consistency check overnight. Now the disk group status is normal and all drives are normal (no SMART errors), but the status of the volume on this disk group is "crashed". What can I do so that I don't lose my data? The data on this volume isn't important enough that I would back it up externally; it's just some movies and series.

The crashed volume is volume_2 on vg1, which sits on /dev/md2.

That's what I can say so far:

cat /proc/mdstat:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]

md3 : active linear sda3[0] sdc3[1]

15618409088 blocks super 1.2 64k rounding [2/2] [UU]

md2 : active raid5 sdl3[0] sdg3[6] sdk3[4] sdj3[3] sdi3[2] sdh3[1]

19510818880 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]

md4 : active raid1 sdf5[0]

1948683456 blocks super 1.2 [1/1] [U]

md1 : active raid1 sda2[0] sdc2[1] sdf2[2] sdg2[3] sdh2[4] sdi2[5] sdj2[6] sdk2[7] sdl2[8]

2097088 blocks [12/9] [UUUUUUUUU___]

md0 : active raid1 sda1[6] sdc1[8] sdf1[7] sdg1[0] sdh1[3] sdi1[2] sdj1[4] sdk1[5] sdl1[1]

2490176 blocks [12/9] [UUUUUUUUU___]
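For reference, the [n/m] pair on each status line is the number of member slots the array expects and the number currently active. A minimal sketch of a reader for that field follows; array_health is a made-up helper name, and the sample is a two-array excerpt in the same format as the output above.

```python
import re

def array_health(mdstat_text):
    """Map each md array name to (expected_members, active_members)."""
    health = {}
    current = None
    for line in mdstat_text.splitlines():
        head = re.match(r"^(md\d+)\s*:", line)
        if head:
            current = head.group(1)
            continue
        # The [n/m] field sits on the status line that follows the array name.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts and current:
            health[current] = (int(counts.group(1)), int(counts.group(2)))
            current = None
    return health

sample = """\
md2 : active raid5 sdl3[0] sdg3[6] sdk3[4] sdj3[3] sdi3[2] sdh3[1]
19510818880 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
md1 : active raid1 sda2[0] sdc2[1] sdf2[2] sdg2[3] sdh2[4] sdi2[5] sdj2[6] sdk2[7] sdl2[8]
2097088 blocks [12/9] [UUUUUUUUU___]
"""

for name, (expected, active) in array_health(sample).items():
    state = "complete" if active == expected else "missing members"
    print(name, f"{active}/{expected}", state)
```

Read this way, md2 has all six members active, which suggests (without proving it) that the "crashed" status comes from a layer above the md array, such as LVM or the filesystem.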

mdadm --detail /dev/md2

/dev/md2:

Version : 1.2

Creation Time : Sun Feb 14 21:29:19 2016

Raid Level : raid5

Array Size : 19510818880 (18606.97 GiB 19979.08 GB)

Used Dev Size : 3902163776 (3721.39 GiB 3995.82 GB)

Raid Devices : 6

Total Devices : 6

Persistence : Superblock is persistent

Update Time : Sun May 22 22:55:33 2016

State : clean

Active Devices : 6

Working Devices : 6

Failed Devices : 0

Spare Devices : 0

Layout : left-symmetric

Chunk Size : 64K

Name : MatternetNAS1:2

UUID : 62ecb900:75a271c3:e16afda9:7aadf653

Events : 309

Number Major Minor RaidDevice State

0 8 179 0 active sync /dev/sdl3

1 8 115 1 active sync /dev/sdh3

2 8 131 2 active sync /dev/sdi3

3 8 147 3 active sync /dev/sdj3

4 8 163 4 active sync /dev/sdk3

6 8 99 5 active sync /dev/sdg3
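The Array Size and Used Dev Size figures above can be sanity-checked with the usual RAID 5 arithmetic: one member's worth of space goes to parity, so the usable size is (members - 1) times the per-member size. A minimal sketch, with a hypothetical function name and sizes in KiB as mdadm prints them:

```python
def raid5_array_size(member_count, used_dev_size):
    """Usable size of a RAID 5 array: one member's worth of space is parity."""
    if member_count < 2:
        raise ValueError("need at least two members")
    return (member_count - 1) * used_dev_size

# Figures taken from the mdadm --detail output above (KiB).
size_kib = raid5_array_size(6, 3902163776)
print(size_kib, f"({size_kib / 1024**2:.2f} GiB)")
```

The result matches the reported Array Size of 19510818880, so the array geometry itself looks consistent.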

lvm vgscan:

Reading all physical volumes. This may take a while...

Found volume group "vg1000" using metadata type lvm2

Found volume group "vg2" using metadata type lvm2

Found volume group "vg1" using metadata type lvm2

lvdisplay:

--- Logical volume ---

LV Name /dev/vg1/volume_2

VG Name vg1

LV UUID uAJ200-oQvN-3Fo2-heSS-A3lN-mfq5-IjzE2k

LV Write Access read/write

LV Status available

# open 0

LV Size 18.17 TB

Current LE 4763380

Segments 1

Allocation inherit

Read ahead sectors auto

- currently set to 4096

Block device 253:4

lvm pvscan:

PV /dev/md4 VG vg1000 lvm2 [1.81 TB / 0 free]

PV /dev/md3 VG vg2 lvm2 [14.55 TB / 0 free]

PV /dev/md2 VG vg1 lvm2 [18.17 TB / 0 free]

Total: 3 [2.53 TB] / in use: 3 [2.53 TB] / in no VG: 0 [0 ]

lvm lvscan:

ACTIVE '/dev/vg1000/lv' [1.81 TB] inherit

ACTIVE '/dev/vg2/syno_vg_reserved_area' [12.00 MB] inherit

ACTIVE '/dev/vg2/volume_3' [14.55 TB] inherit

ACTIVE '/dev/vg1/syno_vg_reserved_area' [12.00 MB] inherit

ACTIVE '/dev/vg1/volume_2' [18.17 TB] inherit

mount:

1. mkdir /mnt/syno

2. mount /dev/vg1/volume_2 /mnt/syno

mount: mounting /dev/vg1/volume_2 on /mnt/syno failed: No such device

# e2fsck -vnf /dev/md2

e2fsck 1.42.6 (21-Sep-2012)

Warning! /dev/md2 is in use.

ext2fs_open2: Bad magic number in super-block

e2fsck: Superblock invalid, trying backup blocks...

e2fsck: Bad magic number in super-block while trying to open /dev/md2

The superblock could not be read or does not describe a correct ext2

filesystem. If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

e2fsck -b 8193
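A note on the "Bad magic number" message: e2fsck looks for the ext2/3/4 magic value 0xEF53, which is stored 56 bytes into the primary superblock, and that superblock starts 1024 bytes into the device. According to the pvscan output above, /dev/md2 is an LVM physical volume, so an ext filesystem would be expected on /dev/vg1/volume_2 rather than directly on /dev/md2; the check above may therefore say little about the filesystem itself. The sketch below performs the same magic-number test read-only. The function name has_ext_magic is made up, it assumes Python is available on the box, and it assumes the volume is ext-formatted at all, which the thread does not confirm. The demonstration uses a synthetic image file, not a real device.

```python
import os
import struct
import tempfile

EXT_SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1 KiB into the device
EXT_MAGIC_OFFSET = 56         # position of the s_magic field in the superblock
EXT_MAGIC = 0xEF53            # stored little-endian on disk

def has_ext_magic(path):
    """Return True if the file or device at `path` carries the ext magic."""
    with open(path, "rb") as dev:
        dev.seek(EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET)
        raw = dev.read(2)
    return len(raw) == 2 and struct.unpack("<H", raw)[0] == EXT_MAGIC

# Demonstration on a synthetic 2 KiB image with the magic in the right place.
with tempfile.NamedTemporaryFile(delete=False) as img:
    img.write(b"\0" * 1080 + struct.pack("<H", EXT_MAGIC) + b"\0" * 966)
    image_path = img.name

print(image_path, "ext magic found" if has_ext_magic(image_path) else "no ext magic")
os.remove(image_path)
```

On the real system the call would be has_ext_magic("/dev/vg1/volume_2"), run as root; it only reads two bytes and writes nothing.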

dmesg:

[ 2.509976] md: Autodetecting RAID arrays.

[ 2.555128] md: invalid raid superblock magic on sda3

[ 2.555130] md: sda3 does not have a valid v0.90 superblock, not importing!

[ 2.601175] md: invalid raid superblock magic on sdc3

[ 2.601176] md: sdc3 does not have a valid v0.90 superblock, not importing!

[ 2.602861] md: invalid raid superblock magic on sdf5

[ 2.602862] md: sdf5 does not have a valid v0.90 superblock, not importing!

[ 2.660901] md: invalid raid superblock magic on sdg3

[ 2.660902] md: sdg3 does not have a valid v0.90 superblock, not importing!

[ 2.715268] md: invalid raid superblock magic on sdh3

[ 2.715287] md: sdh3 does not have a valid v0.90 superblock, not importing!

[ 2.768305] md: invalid raid superblock magic on sdi3

[ 2.768323] md: sdi3 does not have a valid v0.90 superblock, not importing!

[ 2.840546] md: invalid raid superblock magic on sdj3

[ 2.840547] md: sdj3 does not have a valid v0.90 superblock, not importing!

[ 2.903760] md: invalid raid superblock magic on sdk3

[ 2.903761] md: sdk3 does not have a valid v0.90 superblock, not importing!

[ 2.976482] md: invalid raid superblock magic on sdl3

[ 2.976482] md: sdl3 does not have a valid v0.90 superblock, not importing!

[ 2.976484] md: Scanned 27 and added 18 devices.

[ 2.976485] md: autorun ...

[ 2.976485] md: considering sda1 ...

[ 2.976487] md: adding sda1 ...

[ 2.976488] md: sda2 has different UUID to sda1

[ 2.976490] md: adding sdc1 ...

[ 2.976491] md: sdc2 has different UUID to sda1

[ 2.976492] md: adding sdf1 ...

[ 2.976493] md: sdf2 has different UUID to sda1

[ 2.976495] md: adding sdg1 ...

[ 2.976496] md: sdg2 has different UUID to sda1

[ 2.976497] md: adding sdh1 ...

[ 2.976499] md: sdh2 has different UUID to sda1

[ 2.976500] md: adding sdi1 ...

[ 2.976501] md: sdi2 has different UUID to sda1

[ 2.976502] md: adding sdj1 ...

[ 2.976504] md: sdj2 has different UUID to sda1

[ 2.976505] md: adding sdk1 ...

[ 2.976506] md: sdk2 has different UUID to sda1

[ 2.976507] md: adding sdl1 ...

[ 2.976508] md: sdl2 has different UUID to sda1

[ 2.976517] md: created md0

Edited May 22, 2016 by Guest

AllGamer · May 22, 2016

If it's not important data that was lost, I'd suggest you simply build a new RAID instead, and don't go for RAID 5; it's just a disaster waiting to happen.

RAID 6 at minimum, or better.

I use RAID 10 as a minimum, for performance and for safety. RAID 6 is rather slow due to all the overhead of the double parity calculations.

The extra-paranoid people will run RAID 50 or RAID 60, which stripe (RAID 0) across several RAID 5 or RAID 6 sets... now that is really going overboard with security :P

For my really important data, I keep a mirror in Box.net / Dropbox / Google Drive... that's triple redundancy :P

My home Synology / XPEnology boxes are mostly just for game files, videos, TV shows, movies, VMware snapshots, and backups of all my physical machines.

Agent-Orange · May 22, 2016

Is there nothing else I could try? It's not important data, but it would piss me off to have to download all the movies/series again...

AllGamer · May 22, 2016

Agent-Orange wrote: "Is there nothing else I could try? It's not important data, but it would piss me off to have to download all the movies/series again..."

I doubt it; you already tried everything that's available by the book.

Did you try using TestDisk or R-Linux on a Fedora / Ubuntu machine to try to read and recover the missing data from the drives?

That is the flaw of RAID 5: if you lose 1 disk you can still recover, but if you lose 2 disks (which is what happened), then the RAID is corrupted beyond repair.

With RAID 6 you can lose 2 disks and still be able to recover, but if you lose 3 disks it's the same story: the RAID is beyond repair once you lose more than 2 disks.
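The failure limits described above can be written down as a small lookup. This is only an illustrative sketch: the table and the function name survives are made up here, it covers guaranteed tolerance only, and the RAID 10 entry assumes two-way mirrors (more failures can be survived there, but only if they land in different mirror pairs).

```python
# Number of simultaneous member failures that is always survivable.
GUARANTEED_FAILURES = {
    "raid0": 0,
    "raid1": None,  # depends on member count: all but one copy may fail
    "raid5": 1,
    "raid6": 2,
    "raid10": 1,    # assumes two-way mirrors
}

def survives(level, members, failed):
    """Return True if the array is guaranteed to survive `failed` lost members."""
    if level not in GUARANTEED_FAILURES:
        raise ValueError(f"unknown RAID level: {level}")
    limit = GUARANTEED_FAILURES[level]
    if limit is None:
        limit = members - 1
    return failed <= limit

for level, failed in [("raid5", 1), ("raid5", 2), ("raid6", 2), ("raid6", 3)]:
    verdict = "recoverable" if survives(level, 6, failed) else "beyond repair"
    print(f"{level} with {failed} failed disk(s): {verdict}")
```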

Agent-Orange · May 22, 2016

Nope, I haven't tried a Linux live distribution to read the missing data yet; that's something I could try.

Agent-Orange · May 23, 2016

I can't even run an Ubuntu live version from a USB stick. I always get a black screen after I make a choice in the GRUB menu. Is that a driver problem?

mdj156 · May 24, 2016

I don't mean to hijack your thread, but I am having the same issue as well, only the GUI errors were different: the system partition was corrupted.

Did you try the latest Ubuntu from http://www.ubuntu.com/download/desktop ? I was able to boot with that perfectly after creating the stick with the Rufus USB tool.

My mdadm output is this:

BusyBox v1.16.1 (2015-11-12 18:06:25 CST) built-in shell (ash)

Enter 'help' for a list of built-in commands.

NAS> cat /proc/mdstat

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]

md2 : active raid5 sda5[0] sdc3[2] sdb3[1]

5846049792 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md3 : active raid1 sdb4[0] sdc4[1]

976702592 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdd2[3] sdc2[2] sdb2[1] sda2[0]

2097088 blocks [12/4] [UUUU________]

md0 : active raid1 sdd1[3] sda1[0] sdb1[1] sdc1[2]

2490176 blocks [12/4] [UUUU________]

unused devices: <none>

NAS> mdadm --detail /dev/md0

/dev/md0:

Version : 0.90

Creation Time : Tue May 3 20:08:11 2016

Raid Level : raid1

Array Size : 2490176 (2.37 GiB 2.55 GB)

Used Dev Size : 2490176 (2.37 GiB 2.55 GB)

Raid Devices : 12

Total Devices : 4

Preferred Minor : 0

Persistence : Superblock is persistent

Update Time : Tue May 24 22:26:13 2016

State : clean, degraded

Active Devices : 4

Working Devices : 4

Failed Devices : 0

Spare Devices : 0

UUID : 2ff4a715:bb0bc725:3017a5a8:c86610be

Events : 0.42971

Number Major Minor RaidDevice State

0 8 1 0 active sync /dev/sda1

1 8 17 1 active sync /dev/sdb1

2 8 33 2 active sync /dev/sdc1

3 8 49 3 active sync /dev/sdd1

4 0 0 4 removed

5 0 0 5 removed

6 0 0 6 removed

7 0 0 7 removed

8 0 0 8 removed

9 0 0 9 removed

10 0 0 10 removed

11 0 0 11 removed

NAS> mdadm --detail /dev/md1

/dev/md1:

Version : 0.90

Creation Time : Tue May 24 22:16:55 2016

Raid Level : raid1

Array Size : 2097088 (2048.28 MiB 2147.42 MB)

Used Dev Size : 2097088 (2048.28 MiB 2147.42 MB)

Raid Devices : 12

Total Devices : 4

Preferred Minor : 1

Persistence : Superblock is persistent

Update Time : Tue May 24 22:17:36 2016

State : active, degraded

Active Devices : 4

Working Devices : 4

Failed Devices : 0

Spare Devices : 0

UUID : 0c471b1c:73a4ead1:cced5de7:ca715931 (local to host NAS)

Events : 0.19

Number Major Minor RaidDevice State

0 8 2 0 active sync /dev/sda2

1 8 18 1 active sync /dev/sdb2

2 8 34 2 active sync /dev/sdc2

3 8 50 3 active sync /dev/sdd2

4 0 0 4 removed

5 0 0 5 removed

6 0 0 6 removed

7 0 0 7 removed

8 0 0 8 removed

9 0 0 9 removed

10 0 0 10 removed

11 0 0 11 removed

NAS> mdadm --detail /dev/md2

/dev/md2:

Version : 1.2

Creation Time : Fri Feb 20 19:52:31 2015

Raid Level : raid5

Array Size : 5846049792 (5575.23 GiB 5986.35 GB)

Used Dev Size : 1948683264 (1858.41 GiB 1995.45 GB)

Raid Devices : 4

Total Devices : 3

Persistence : Superblock is persistent

Update Time : Tue May 24 22:23:19 2016

State : clean, degraded

Active Devices : 3

Working Devices : 3

Failed Devices : 0

Spare Devices : 0

Layout : left-symmetric

Chunk Size : 64K

Name : NAS:2 (local to host NAS)

UUID : 8fcc7e7b:1ec18766:628f40c5:d73e88ef

Events : 249

Number Major Minor RaidDevice State

0 8 5 0 active sync /dev/sda5

1 8 19 1 active sync /dev/sdb3

2 8 35 2 active sync /dev/sdc3

3 0 0 3 removed

NAS> mdadm --detail /dev/md3

/dev/md3:

Version : 1.2

Creation Time : Fri Feb 20 19:52:31 2015

Raid Level : raid1

Array Size : 976702592 (931.46 GiB 1000.14 GB)

Used Dev Size : 976702592 (931.46 GiB 1000.14 GB)

Raid Devices : 2

Total Devices : 2

Persistence : Superblock is persistent

Update Time : Tue May 24 22:23:21 2016

State : clean

Active Devices : 2

Working Devices : 2

Failed Devices : 0

Spare Devices : 0

Name : NAS:3 (local to host NAS)

UUID : 7f9dd02e:c98c9a71:c6e96c71:e8d6eae6

Events : 2

Number Major Minor RaidDevice State

0 8 20 0 active sync /dev/sdb4

1 8 36 1 active sync /dev/sdc4

NAS>

Agent-Orange · May 26, 2016

It was a problem with the CPU; Ubuntu has some problems with Skylake CPUs.

I ended up deleting the RAID and creating a new one.

AllGamer · May 26, 2016

Agent-Orange wrote: "It was a problem with the CPU; Ubuntu has some problems with Skylake CPUs. I ended up deleting the RAID and creating a new one."

This doesn't compute... how can a CPU affect the HDD RAID order?

Well, I guess I'll find out soon; I just ordered a new motherboard and a Core i3 Skylake to run a new 24-HDD server I'm building.

If I start to see the weird issues you found, then it might very well be a Skylake chipset bug; after all, these are very new.

Agent-Orange · May 29, 2016

I meant that I tried to boot from an Ubuntu live USB stick (v. 16.04), but it always hung after the GRUB menu. It seems that Ubuntu has a problem with Skylake CPUs; there are several threads about it on the net.
