
No Reboot after Volume Crash


Question

Hey guys,

I've finally brought myself to write my problem down. I hope you can help me.

I have a DS918+ build with DSM version 6.2-23739. I am using 8x 4 TB WD Reds plugged into the mainboard and 2x 4 TB WD Reds + 2 SSDs on an HBA controller.

I use the 8 WD Reds as an SHR volume with 1-disk redundancy, plus an SSD cache made of two 128 GB SSDs.

Now to my problem...

I wanted to grow my SHR from 8x 4 TB to 10x 4 TB.

Since it's not the first time I've added new disks to the SHR, I thought it was a good idea to add the two disks simultaneously (maybe this was the first step in my series of errors).

At first everything seemed fine, but the reshape was extremely slow (cat /proc/mdstat said it would take 140 days to complete). After some research I raised the stripe cache size to the maximum of 16384.

That worked instantly, and cat /proc/mdstat then said it would take less than a week. At about 40% the NAS suddenly became unreachable over the network; PuTTY didn't work either.
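For anyone who wants to monitor and tune a reshape the same way, here is a minimal sketch of the commands involved. It assumes the SHR data array is md2; the md device name varies per system, so check /proc/mdstat first.

```shell
# show reshape progress, speed, and the estimated finish time
cat /proc/mdstat

# check the current stripe cache size (in pages of 4 KB each);
# md2 is an assumption -- use whatever array /proc/mdstat shows reshaping
cat /sys/block/md2/md/stripe_cache_size

# raise it to 16384 to speed up the reshape; the cost is RAM
# (roughly pages * 4 KB * number of disks) and this setting
# does not survive a reboot, so it must be re-applied afterwards
echo 16384 > /sys/block/md2/md/stripe_cache_size
```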

After one day it was still unreachable, so I restarted the NAS completely (yes, I cut power and restarted). After rebooting, the reshape resumed from about 35%, and I raised the stripe cache size again so it would be faster. During the reshape I also used the NAS normally, opened a few files and copied things around, but performance was very poor; in the end files copied at just 1 MB/s.

About a week later it was at roughly 95%, and the next day DSM said my volume had crashed. The log said that both new drives had an error shortly before the reshape finished (I don't know which error and I don't have screenshots). SMART tests of both new 4 TB drives came back okay. The volume was still reachable over the network, but most of the files were corrupt and unreadable. The volume's Repair function didn't work: after clicking Repair, nothing happened. So I restarted the NAS, and it has now been rebooting for 16 days and I don't know what to do. htop says the volume has been reading at 160 MB/s for 16 days! What is it reading??

I hope you can help me fix this and restore my volume!

 

Thanks for helping

 

PuTTY screenshot here: (jbd2/md0-8, /usr/bin/syslog and synologrotated are writing a few bytes every once in a while)


 

 


4 answers to this question


OK, we made it.

First we booted from a live Ubuntu USB stick and followed this tutorial:

https://techwiztime.com/guide/synology-nas-data-recovery-ubuntu/

But the folder is encrypted, so we then followed this tutorial:

https://robertcastle.com/2012/10/howto-recover-synology-encrypted-folders-in-linux/

After this we saved the files to external drives.

Once that is done, we'll try to recover the volume in Xpenology. Just for research.
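For anyone following along, the linked tutorials boil down to roughly these steps. This is a hedged sketch, assuming a standard Synology md/LVM layout and an eCryptfs-encrypted shared folder; device names, the volume group name (vg1000), and the @sharename@ directory are typical defaults and may differ on your system.

```shell
# install the RAID, LVM, and eCryptfs tools on the live Ubuntu system
sudo apt-get update && sudo apt-get install -y mdadm lvm2 ecryptfs-utils

# scan the attached disks, assemble all Synology md arrays found,
# and activate any LVM volume groups on top of them
sudo mdadm -Asf
sudo vgchange -ay

# inspect what was assembled, then mount the data volume read-only
# (vg1000/lv is the usual Synology logical volume name)
cat /proc/mdstat
sudo lvs
sudo mkdir -p /mnt/syno
sudo mount -o ro /dev/vg1000/lv /mnt/syno

# decrypt an encrypted shared folder, stored on disk as @sharename@;
# this prompts for the passphrase and the eCryptfs options
# (see the second tutorial for the exact values Synology uses)
sudo mkdir -p /mnt/decrypted
sudo mount -t ecryptfs /mnt/syno/@sharename@ /mnt/decrypted
```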


Hi! I just read this thread and the first thing that came to my mind was: "Maybe the new WD Red hard disks are 'new ones' with SMR."
The WD40EFAX is SMR (shingled magnetic recording), which causes many problems during a RAID rebuild when it is mixed with the good old, fast WD40EFRX (which uses CMR).

If you have such a mixed environment and are getting performance problems, you can say "thank you" to Western Digital.

Just google it.
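One way to check which variant you have is to read each drive's model string, since the suffix is the distinguishing part: EFAX is the SMR line, EFRX the older CMR line. A small sketch (the helper function and example model strings are illustrative):

```shell
# classify a WD Red 3.5" model string by its suffix:
# EFAX = SMR (shingled), EFRX = CMR (conventional)
wd_red_type() {
  case "$1" in
    WD*EFAX*) echo "SMR" ;;
    WD*EFRX*) echo "CMR" ;;
    *)        echo "unknown" ;;
  esac
}

# get the model string of each drive with smartctl (from smartmontools):
#   smartctl -i /dev/sda | grep 'Device Model'

wd_red_type "WD40EFAX-68JH4N0"   # prints SMR
wd_red_type "WD40EFRX-68N32N0"   # prints CMR
```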

Well, I analyzed one of the "new" WD Red 4 TB drives with my PC, and an extended SMART test showed "Too many bad sectors" at about 91% of the disk. That was the same percentage at which the NAS reshape crashed, and I think that's why the rebuilding process went on forever... I had used a bad drive for the reshape without running an extended test first (and it was a used one from eBay). So guys, please check all your drives before adding them to the volume. Even replacing the faulty disk with a completely new one did not help; the volume was still crashed. Synology is really, really sensitive... Thank God we could restore everything with a live Ubuntu stick.

@Balrog every single one of my WD Red drives is the same model, so in this case that was not the problem.

Ah well, that makes sense too. A broken hard disk is also a good reason for a broken RAID.
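The "extended SMART test" mentioned above corresponds to smartmontools' long self-test. A minimal sketch of the pre-flight check worth running on every drive before adding it to a volume, assuming the drive shows up as /dev/sdX:

```shell
# start an extended (long) surface self-test;
# this takes several hours on a 4 TB disk and runs in the background
sudo smartctl -t long /dev/sdX

# later: check the self-test log for
# "Completed without error" vs. a failing LBA
sudo smartctl -l selftest /dev/sdX

# the attributes that matter most for bad sectors
sudo smartctl -A /dev/sdX | grep -E 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'
```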
