CiViCKiDD Posted January 2, 2020 #1
I set up an Xpenology system a while back with Synology's SHR solution and btrfs, combining six mostly mismatched-size drives into one volume. So one drive is a parity drive. Every now and then I get checksum mismatch errors, which sucks because I built this primarily as a robust, semi-reliable backup solution running in parallel with a cloud backup. I'm on the latest DSM with Jun's loader. Googling has turned up no solutions - before I give up and plan my next move, any advice on things to look into? A few thoughts I have:
- check the HDDs for bad sectors / issues (any bootable tools you guys recommend?)
- check the RAM for issues (Memtest86)
While typing this I also realized I have not done "data scrubbing" in 4 months - is this something that can repair files that have been identified as having checksum errors?
bearcat Posted January 2, 2020 #2
Seems like you should do a "scrubbing".
CiViCKiDD Posted January 2, 2020 (Author) #3
Thanks, it's been running since I posted and will probably take a few hours. Is it common / normal to see checksum errors? Also, does the log identify what was repaired vs. what couldn't be?
flyride Posted January 2, 2020 #4
45 minutes ago, CiViCKiDD said: I set up an Xpenology system a while back ... is this something that can repair files that have been identified as having checksum errors?
You're looking at this the wrong way: on another system that was not using btrfs, you would never know when your files had bit-level errors, because there would be no checksum. This is the dirty reality of the storage industry - spinning disk drives encounter a statistically measurable number of write errors that go unnoticed in standard, non-redundant applications. A btrfs scrub lets the system use the redundancy of the RAID array to correct the errors detected by checksum. So do that; your system is working as designed.
Also, you don't have "one parity drive." With mdraid, parity is spread across all the drives. I realize this is somewhat semantic, but I see people describe the system this way on a regular basis, and it can lead to loss of data if the wrong decision is made.
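The repair mechanism described above can be sketched with a toy example: a checksum detects silent corruption, and a redundant copy supplies known-good data to fix it. This is purely a conceptual illustration in Python (a simple mirror copy standing in for RAID redundancy), not actual btrfs code; all function names here are hypothetical.

```python
# Toy illustration of checksum-plus-redundancy repair, loosely analogous to
# what a btrfs scrub does: a checksum detects silent corruption, and a
# redundant copy (here, a naive mirror) supplies the correct data.
import zlib

def store(data: bytes):
    """Keep the data, a mirror copy, and a CRC32 checksum of the original."""
    return {"primary": bytearray(data),
            "mirror": bytearray(data),
            "checksum": zlib.crc32(data)}

def scrub(block):
    """Verify the primary copy against its checksum; repair from the mirror
    if the mirror still matches. Returns a short status string."""
    if zlib.crc32(bytes(block["primary"])) == block["checksum"]:
        return "ok"
    if zlib.crc32(bytes(block["mirror"])) == block["checksum"]:
        block["primary"][:] = block["mirror"]   # repair from redundancy
        return "repaired"
    return "unrecoverable"                      # both copies corrupt

blk = store(b"important backup data")
blk["primary"][0] ^= 0xFF                       # simulate a silent bit flip
print(scrub(blk))                               # -> "repaired"
print(scrub(blk))                               # -> "ok"
```

Without the checksum, the first read of the flipped byte would simply return bad data; without the redundancy, the error could be detected but not fixed. A scrub needs both, which is why it works on a btrfs volume sitting on a redundant array.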
IG-88 Posted January 2, 2020 #5
48 minutes ago, CiViCKiDD said: Every now and then I get checksum mismatch errors,
Where do you see these errors? Did you check the files in /var/log/? You should also check the S.M.A.R.T. values of every disk. These are the values I look for:
- Read Error Rate: should be low
- Reallocated Sectors Count: should be 0; anything else is alarming (at least for me)
- UltraDMA CRC Error Count: should be low; can be an indicator of connection or cable problems
2 minutes ago, flyride said: Also, you don't have "one parity drive." With MDRAID, parity is spread across all the drives.
I guess that's because it's the easy way to count the amount of usable storage in an SHR-1 scenario: add up all the drive sizes and subtract the biggest drive (for parity information). That works, but it does not even remotely reflect the complexity of the real RAID/LVM structure in an SHR scenario. Not only is the parity spread within each RAID set - there is more than one RAID set, and those RAID sets are then glued together with LVM.
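The capacity shortcut above (sum of all drives minus the largest) can be written out as a small helper. The drive sizes below are hypothetical, and as noted this is only an estimate of usable space - it says nothing about the actual multi-RAID-set/LVM layout underneath.

```python
# Quick estimate of usable SHR-1 capacity using the shortcut described above:
# total of all drive sizes minus the largest drive. This matches the usable
# number but glosses over the real layout (multiple md RAID sets spanning
# different drive subsets, concatenated with LVM).
def shr1_usable(sizes_tb):
    """Approximate usable capacity (TB) of an SHR-1 volume with one-drive
    redundancy: sum of all drives minus the largest."""
    if len(sizes_tb) < 2:
        raise ValueError("SHR-1 needs at least two drives")
    return sum(sizes_tb) - max(sizes_tb)

# Example: six mismatched drives (sizes in TB are made up for illustration)
drives = [1, 2, 3, 4, 6, 8]
print(shr1_usable(drives))   # -> 16
```

With identical drives this reduces to the familiar RAID-5 figure (n-1 drives of capacity); the point of SHR is that the same one-drive redundancy is preserved even when the sizes differ.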