Checksum mismatch errors



I'm truly at a loss on what to do about this. I've attached a log file covering Nov 1, 2020 to the present.

The only thing I can think of is that I installed 2 memory modules around Nov 4 or 5. I removed them in mid-January, but the problem still happens. I've also expanded my pool, but that was after I was already getting checksum errors; I believe I did the expansion around the 10th-12th of December.

I've tried with the Plex folder, as it's the worst: I uninstalled Plex, deleted the folder, reinstalled Plex and let it rebuild the database/metadata (which takes so long), and the problem came back.

Any suggestions on where to go next?

Thanks

syslog_2021-2-8-16 50 55.html


BTRFS checksum errors indicate corrupted files.  If they can be fixed with redundancy, it should show that in the log.  These appear not to be correctable and need to be restored from backup.  It's pretty unusual but I have personally encountered BTRFS checksum errors twice.  In each case I was able to restore the affected file from a Snapshot located on the same machine.

 

It's worth noting that if corruption of this type happened on a filesystem without data checksums (e.g. ext4), there would be no way to know about it.
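To make that point concrete, here is a minimal sketch of how a stored checksum exposes a single flipped bit. It is only an illustration, not btrfs's actual code: it uses `zlib.crc32` as a stand-in (btrfs defaults to CRC32C, which is a different polynomial), and the block contents and the helper names `checksum`, `flip_bit` and `verify` are made up for the example.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in checksum: plain CRC-32 from zlib.
    # Assumption for illustration only; btrfs defaults to CRC32C.
    return zlib.crc32(data)


def flip_bit(data: bytes, bit_index: int) -> bytes:
    # Return a copy of data with exactly one bit inverted.
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def verify(data: bytes, stored_checksum: int) -> bool:
    # A read is trusted only if the recomputed checksum matches the stored one.
    return checksum(data) == stored_checksum


block = b"plex metadata block" * 100  # hypothetical file contents
stored = checksum(block)              # recorded when the block was written
damaged = flip_bit(block, 12345)      # simulate one bad bit from disk or RAM

print(verify(block, stored))    # True
print(verify(damaged, stored))  # False
```

A filesystem that stores no data checksum has nothing to compare against, so the damaged block would simply be handed back to the application as if it were fine.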

 

It can happen spontaneously because of gamma rays or bitrot, but not with this frequency. So why is it happening in the first place?  Is it still happening?  Unknown, you need to figure it out.  If you suspect you have memory problems then test the bejeezus out of it before subjecting your data to more errors.


I understand that if there was some sort of file corruption prior to migrating to btrfs, it wouldn't fix it. But the ******* ton of metadata errors with Plex, even after I wiped it all, and they still come back?? That has me confused. I mentioned memory because it's what I read as a potential issue; HDDs were also a possibility, but they don't report any abnormalities. Another reason I thought of memory was that the issues started about a month after adding more modules. Maybe it was a coincidence, as the problem persists after removing them.

21 hours ago, merve04 said:

Any suggestions on where to go next?

 

what's the hardware and dsm type?

any cache drives involved? - i did have btrfs trouble with my main system after trying write cache with two sata ssd's and had to revert to my last backup

anything in /var/log/ about disk errors? the logs in the web gui are not that helpful when it comes to analyzing problems

 

4 hours ago, IG-88 said:

 

what's the hardware and dsm type?

any cache drives involved? - i did have btrfs trouble with my main system after trying write cache with two sata ssd's and had to revert to my last backup

anything in /var/log/ about disk errors? the logs in the web gui are not that helpful when it comes to analyzing problems

 

Gigabyte B365M DS3H, i5 8400, 2x JMB585

 

I've never implemented any read/write caches.

How do I go about accessing /var/log?

On 2/13/2021 at 3:35 AM, IG-88 said:

dmesg and messages

So I SSH'd in with my admin account, ran "cd /var/log", then typed "dmesg", got a bunch of info, and attached a copy here. When I try "message" I get "command not found". I ran "ls" and see there is indeed a "messages" file, but I'm not sure how much it helps in finding clues to my issue.

dmesg log.html


 

On 2/14/2021 at 4:31 PM, merve04 said:

So I SSH'd in with my admin account, ran "cd /var/log", then typed "dmesg", got a bunch of info, and attached a copy here. When I try "message" I get "command not found". I ran "ls" and see there is indeed a "messages" file, but I'm not sure how much it helps in finding clues to my issue.

the files are named like that

/var/log/dmesg

/var/log/messages
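Since these are plain text files rather than commands, they can be read or searched like any other file. As a rough sketch (not a Synology tool), the Python below counts btrfs lines that mention checksum problems, grouped by device. The function names are invented for this example, the matching is deliberately loose because the exact kernel message wording varies between versions, and the sample lines are hypothetical, not taken from the attached logs.

```python
import re
from collections import Counter

# Assumption: btrfs kernel lines tag the volume as "(device NAME)".
# Adjust the pattern if your log uses a different layout.
DEVICE_RE = re.compile(r"\(device\s+([^)\s]+)\)")


def count_checksum_lines(lines):
    """Count btrfs lines that mention checksum trouble, per device."""
    counts = Counter()
    for line in lines:
        lower = line.lower()
        if "btrfs" not in lower:
            continue
        if "csum" not in lower and "checksum" not in lower:
            continue
        match = DEVICE_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


def scan_file(path):
    """Scan one log file, tolerating odd bytes in the text."""
    with open(path, errors="replace") as handle:
        return count_checksum_lines(handle)


# Hypothetical sample lines, only to show the shape of the result.
sample = [
    "kernel: BTRFS warning (device md2): csum failed ino 1 off 0",
    "kernel: usb 1-1: new high-speed USB device",
    "kernel: BTRFS error (device md3): checksum mismatch",
]
print(count_checksum_lines(sample))
```

On the NAS itself, something like `scan_file("/var/log/messages")` would be the real call, assuming a Python interpreter is available there and the account has permission to read the file. A count that keeps growing between runs would suggest the corruption is still ongoing rather than left over from earlier.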

 

there are tons of corrected and uncorrected btrfs errors

also two docker crashes

you should have a backup, and you should back up whatever you still need that is not already in your latest backup

check the hardware, there might be ram problems or something else, the file system should not break down like this

 

i use a very similar hardware, gigabyte b365m-hd3, i3-9100, JMB585 added, no problems like that

a few days ago i found out that you can have NCQ problems with WD disks and the jmb585, but there were no problems visible besides some log entries

Edited by IG-88

“message” does not work for me. 
All drives in my pool are Seagate 4 and 8 TB drives. I have one hitachi running standalone for VM and surveillance recordings. I could flash a USB key with memtest I suppose. I replaced a dozen files that reported checksum errors, just tv shows though. Nearly all other errors are plex metadata issues which I don’t think is critical and dockers can be rebuilt if need be. 

ADA501DB-8804-46EA-B23B-3DE4197A1BCB.png
