Volume Crashed but RAID and all drives healthy - help?


Question

Hi,

 

Randomly yesterday I got an email from my Xpenology box saying my volume had crashed! After logging in to DSM I can see that the volume is indeed in a crashed state and appears to have been set to read-only. However, checking my HDD status, all drives are healthy, as is the RAID5 Storage Pool. This is very out of the ordinary, as everything has been running well for years on end now. I did recently attempt to update to TinyCore but couldn't get the drive mapping correct, so I gave up and went back to my Jun loader USB, which booted fine as usual, and everything has been working flawlessly since!

 

I have been checking similar recent posts from search (https://xpenology.com/forum/topic/57921-volume-crashed-after-power-outage/?tab=comments#comment-269420 and https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/#comment-107979), but after enabling SSH and running some of the check commands I'm not getting the same results.

I'm currently backing up what data I can to a USB drive just in case, but with 12TB of data this is going to take some time.

 

Some more details on my setup:

Jun's 1.03b DS3615xs loader running DSM 6.2.3 u3

Core i3-4130

8GB RAM

Gigabyte H81M-S2V motherboard

4x 6TB HGST Enterprise HUS726T6TALE6L4 in RAID5

Btrfs Volume

 

If anyone is able to offer some assistance on a possible repair of the volume that would be fantastic. I haven't tried simple things like a reboot yet as I wanted to recover as much data while I can!

Thanks

 

Matt

 

 


4 answers to this question


You gave no details about the btrfs stats, but if the disks and the mdadm RAID are OK, I'd expect btrfs to be the problem, and if that's the case there is often not much that can be done in the way of repairs. Usually I'd suspect RAM or disk/cable problems. Check /var/log/messages, and boot a live Linux and run memtest to check the RAM.
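For anyone landing here from search, the checks mentioned above can be run from an SSH session. A sketch assuming DSM's usual device naming (/dev/md2 for the data array, /volume1 for the btrfs volume); adjust for your own layout:

```shell
# RAID layer: confirm the mdadm array is clean with all members present
cat /proc/mdstat
mdadm --detail /dev/md2        # data array on a typical DSM install

# Filesystem layer: per-device btrfs error counters
# (non-zero read/write/corruption counts point at disks, cables or RAM)
btrfs device stats /volume1

# Kernel and system logs usually record why the volume
# was forced read-only in the first place
dmesg | grep -iE 'btrfs|md2'
grep -i btrfs /var/log/messages | tail -n 50
```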

 

 

 


Ok so I just finished backing up all of the data I could (the Docker and Downloads folders appeared to be inaccessible) - it took over a week via USB onto multiple drives!!


Rebooted the NAS after completing the backups before attempting a fresh install..... and everything now shows healthy!! What the hell 🤯😂 

Is there anything else I should be checking? The only anomaly I know of on my install is that the "Data Scrubbing" task won't run: it gets to around 1.7%, then stops and shows "Never performed" as the status. Nothing I've tried has been able to resolve this, and I even ran the scrub task via SSH. I'm wondering if the lack of data scrubbing could have been the cause of the potentially incorrect "Crashed volume"?
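For what it's worth, a scrub can also be started and monitored directly over SSH. A sketch assuming the volume is mounted at /volume1:

```shell
# Start a scrub, then poll its progress; if it consistently dies at
# the same point, the status output narrows down where it stops
btrfs scrub start /volume1
btrfs scrub status /volume1

# Checksum errors found by the scrub are logged by the kernel
dmesg | grep -i 'btrfs.*csum'
```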

 

Again - any assistance greatly appreciated


I've found the root cause - some files I was downloading at the time of the initial crash appear to be corrupt and seem to be causing the volume to crash. Trying to delete them sends the volume into the crashed state again, which can be resolved by rebooting.

 

I've tried deleting the suspect files via SMB and File Station with the same result. Is there any way I can delete them via SSH or similar to bypass the issues? Or run a file system check somehow? 
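In case it helps others, deleting over SSH and a read-only filesystem check can be sketched like this. The file path here is just a placeholder, and it assumes the data array is /dev/md2 as on a typical DSM install; adjust for your setup:

```shell
# Try removing the suspect files as root, bypassing SMB/File Station
sudo rm -v "/volume1/share/path/to/suspect-file"

# If the delete still crashes the volume, check the filesystem.
# btrfs check must run on an unmounted filesystem, so do this after
# stopping services and unmounting, or from a live Linux USB.
umount /volume1
btrfs check --readonly /dev/md2   # read-only: reports problems, changes nothing
```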

