XPEnology Community

Suppress virtual disk SMART errors from /var/log/messages


flyride


One annoyance when running DSM under ESXi is that virtual disks can't properly handle its SMART interrogations. This is because Synology embedded a custom version of the smartctl binary into its own libraries and utilities, ignoring standard config files that could generate compatible queries or suppress them. The result is spurious error messages logged to /var/log/messages every few seconds, wasting disk space and SSD lifecycle, and making it hard to see what's happening. If you use virtual disks and are not familiar with this, monitor the messages logfile with the command below to see how frequently DSM tries (and fails) to query the drives.

 

# tail -f /var/log/messages
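To put a number on the chatter rather than watching it scroll by, a small helper along these lines can count the failures per device. This is only a sketch: the function name `count_smart_failures` is made up for illustration, and it assumes the messages contain the text "read value /dev/sdX fail", which may differ between DSM versions.

```shell
# Count SMART read failures per device in a syslog-style file.
# Assumes messages of the form "... read value /dev/sdX fail".
count_smart_failures() {
    logfile="$1"
    grep -o 'read value /dev/sd[a-z]* fail' "$logfile" |
        awk '{ print $3 }' | sort | uniq -c
}

# Usage (on the DSM host):
# count_smart_failures /var/log/messages
```

Each output line is a count followed by the device name, so the noisiest virtual disks stand out immediately.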

 

The problem has been around for a long time and is well-documented here. An indirect fix was discovered when the virtual disks were attached to the LSI Logic SAS dialect of the ESXi virtual SCSI controller, but this solution worked reliably only under 6.1.x.  On 6.2.x, the virtual SCSI controller tends to result in corrupted (but recoverable) arrays.

 

I recently migrated my 6.1.7 system to 6.2.3, so I had to convert my virtual SCSI controller to SATA, and of course, the logfile chatter was back. I don't really care about SMART on virtual disks (and you probably don't either) so I decided to get rid of the log messages once and for all.  Syslog-ng has a lot of capability to manage the log message stream, so I knew it was possible. The results follow:

 

We need to install two files, first a syslog-ng filter:

# ESXiSmart.conf
# edit the [bracket values] with drive slots where SMART should be suppressed
# in this example /dev/sda through /dev/sdl are suppressed

filter fs_disks { match("/sd[a-l]" value("MESSAGE")); };

filter fs_badsec { match("/exc_bad_sec_ct$" value("MESSAGE")); };
filter fs_errcnt { match("disk_monitor\.c:.*Failed\ to\ check" value("MESSAGE")); };
filter fs_tmpget { match("disk/disk_temperature_get\.c:" value("MESSAGE")); };
filter fs_health { match("disk/disk_current_health_get\.c:" value("MESSAGE")); };
filter fs_sdread { match("SmartDataRead.*read\ value\ /dev/.*fail$" value("MESSAGE")); };
filter fs_stests { match("SmartSelfTestExecutionStatusGet.*read\ value\ /dev/.*fail$" value("MESSAGE")); };
filter fs_tstget { match("smartctl/smartctl_test_status_get\.c:" value("MESSAGE")); };

filter fs_allmsgs { filter(fs_badsec) or filter(fs_errcnt) or filter(fs_tmpget) or filter(fs_health) or filter(fs_sdread) or filter(fs_stests) or filter(fs_tstget); };
filter f_smart { filter(fs_disks) and filter(fs_allmsgs); };

log { source(src); filter(f_smart); };

Save this to /usr/local/etc/syslog-ng/patterndb.d/ESXiSmart.conf

 

You will need to edit the string inside the brackets on the first "fs_disks" line to refer to the disks whose SMART messages should be suppressed. If you want SMART errors suppressed for every disk from /dev/sda through /dev/sdl, just leave it as is. In my system, I have both virtual and passthrough disks, and the passthrough disks report SMART correctly. So as an example, I have [ab] selected for the virtual disks /dev/sda and /dev/sdb, leaving SMART log messages intact for the passthrough disks.
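Before touching the live configuration, the bracket expression can be tried against sample messages in the shell. The sketch below uses `grep -E` as a stand-in for syslog-ng's regex engine, which is an approximation, and the function name `matches_fs_disks` is hypothetical.

```shell
# Return 0 if a message would match the fs_disks pattern for the given slot letters.
# grep -E stands in for syslog-ng's regex engine here (an approximation).
matches_fs_disks() {
    slots="$1"      # bracket contents, e.g. "ab" or "a-l"
    message="$2"    # a sample log message
    printf '%s\n' "$message" | grep -Eq "/sd[$slots]"
}

# Usage:
# matches_fs_disks ab "SmartDataRead(108) read value /dev/sdb fail" && echo suppressed
# matches_fs_disks ab "SmartDataRead(108) read value /dev/sdc fail" || echo kept
```

Note that the pattern is not anchored at the end, so "/sd[a]" would also match a device such as /dev/sdaa.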

 

Please note that the file is extremely sensitive to syntax. A missing semicolon, slash or backslash error, or an extra space will cause syslog-ng to fail completely and you will have no logging. To make sure it doesn't suppress valid log messages, this filter matches SMART-related error messages with references to the selected disks. However, it cannot actually remove them from the log file because there is a superseding match command embedded in DSM's syslog-ng configuration.
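Given how fragile the file is, it may be worth asking syslog-ng to parse the configuration before anything is restarted. The sketch below relies on syslog-ng's standard `--syntax-only` option; that DSM's build accepts it and reads the default configuration path is an assumption, not something verified here. `SYSLOG_NG_BIN` is a hypothetical override variable added only so the helper can be pointed at a different binary.

```shell
# Validate the syslog-ng configuration before restarting the service.
# SYSLOG_NG_BIN is a hypothetical override; by default the syslog-ng in PATH is used.
check_syslog_ng_config() {
    bin="${SYSLOG_NG_BIN:-syslog-ng}"
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "syslog-ng binary not found: $bin" >&2
        return 2
    fi
    if "$bin" --syntax-only; then
        echo "syntax OK"
    else
        echo "syntax check failed; fix or remove ESXiSmart.conf before restarting" >&2
        return 1
    fi
}

# Usage (on the DSM host, as root):
# check_syslog_ng_config
```

If the check fails, moving ESXiSmart.conf out of patterndb.d should bring logging back to its previous state.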

 

The second file adds our filter to a dynamic exclusion list that DSM's syslog-ng configuration compiles from a special folder. There is only one line:

and not filter(f_smart)

Save it to /usr/local/etc/syslog-ng/patterndb.d/include/not2msg/ESXiSmart
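Since a stray character in this file can break logging, writing it from the shell avoids editor slips. A minimal sketch follows; the function name `install_smart_exclusion` is made up for illustration, and creating the directory with `mkdir -p` is a precaution in case it does not exist yet.

```shell
# Write the one-line exclusion file into the given not2msg directory.
install_smart_exclusion() {
    dir="$1"
    mkdir -p "$dir" || return 1
    printf '%s\n' 'and not filter(f_smart)' > "$dir/ESXiSmart"
}

# Usage (on the DSM host, as root):
# install_smart_exclusion /usr/local/etc/syslog-ng/patterndb.d/include/not2msg
```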

 

Reboot to activate the new configuration, or just restart syslog-ng with this command:

 

# synoservice --restart syslog-ng

 

If you want to make sure that your syslog-ng service is working correctly, generate a test log:

 

# logger -t "test" -p error "test"

 

And then check /var/log/messages as above. If you have made no mistakes in the filter files, you should see the test log entry and the bogus SMART messages should stop. As this solution only modifies extensible structures under /usr/local, it should survive an upgrade as long as there is no major change to message syntax.
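The same check can be scripted: look for the test entry and count any SMART read failures logged after it. This is a sketch only. The function name `check_log_after_marker` is hypothetical, and the "read value /dev/sdX fail" pattern covers just one of the message types the filter targets.

```shell
# Report whether a marker line reached the log, and how many SMART read
# failures were logged after it.
check_log_after_marker() {
    logfile="$1"
    marker="$2"
    awk -v m="$marker" '
        index($0, m) { seen = 1; count = 0; next }
        seen && /read value \/dev\/sd.* fail/ { count++ }
        END {
            if (!seen) { print "marker not found"; exit 1 }
            print "marker found, SMART failures after it: " (count + 0)
        }' "$logfile"
}

# Usage (on the DSM host; wait a minute or so between the two commands):
# logger -t "test" -p error "smart-filter-check"
# check_log_after_marker /var/log/messages "smart-filter-check"
```

A missing marker suggests syslog-ng is not logging at all, which usually points to a syntax problem in one of the two files.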


Thank you very much! Your instructions are working very well! :-)

 

I applied it in the same way as yours (only for suppressing SMART errors on /dev/sda and /dev/sdb) and /var/log/messages is a lot quieter now!!

 

EDIT:

After enabling your fix for suppressing the SMART error messages, I noticed a new error message that is logged in /var/log/messages every minute:

2020-05-19T06:17:38+02:00 diskstation ovs-appctl: ovs|00001|daemon_unix|WARN|/var/run/openvswitch/ovs-vswitchd.pid: open: No such file or directory

 

But the openvswitch is not in use or even activated/configured! I never touched it or used docker on this DSM.

 

Workaround:

mkdir -p /var/run/openvswitch
touch /var/run/openvswitch/ovs-vswitchd.pid

Then the error messages stop instantly. :-)

 

...just for others who might get the error every minute also...

 

Edited by Balrog

  • 3 months later...
  • 1 month later...

I have some news about other log entries that are annoying and useless:

 

The open-vm-tools package logs this every 30 seconds:

root@ds:~# tail -f /var/log/vmware-vmsvc.log
[Oct 09 19:50:17.681] [ warning] [vmsvc] HostinfoOSData: Error: no distro file found
[Oct 09 19:50:17.681] [ warning] [guestinfo] Failed to get OS info.
[Oct 09 19:50:17.683] [ warning] [vmsvc] HostinfoOSData: Error: no distro file found

Solution:

root@ds:~# cat /proc/version > /etc/release

If this file exists, open-vm-tools is happy and no longer throws any errors. :-)
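If there is any chance that /etc/release already exists, a guarded variant avoids overwriting it. This is a minimal sketch; the function name `ensure_release_file` and its optional arguments are made up for illustration.

```shell
# Create the release file only when it is missing, so an existing one is never overwritten.
ensure_release_file() {
    target="${1:-/etc/release}"
    source_file="${2:-/proc/version}"
    [ -e "$target" ] && return 0
    cat "$source_file" > "$target"
}

# Usage (on the DSM host, as root):
# ensure_release_file
```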

________________________________________________________________

 

I have mapped a 200 GB vmdk on my NVMe SSD into the XPEnology VM, where it appears as /dev/sdb. Of course, it can't provide any temperature information. So every 30 seconds the scemd daemon logs the failure to /var/log/scemd.log.

Because /var/log lives on a RAID1 spread over all hard disks, this also causes a write to the main hard disks every 30 seconds, in addition to spamming /var/log/scemd.log:

root@ds:~# tail -f /var/log/scemd.log
2020-10-09T20:14:44+02:00 ds scemd: SmartDataRead(108) read value /dev/sdb fail
2020-10-09T20:14:44+02:00 ds scemd: disk/disk_temperature_get.c:104 read value /dev/sdb fail
2020-10-09T20:14:44+02:00 ds scemd: disk_temperature_update.c:63 Temperature Update Fail
2020-10-09T20:14:50+02:00 ds scemd: SmartDataRead(108) read value /dev/sdb fail

Solution:

Change the log file to /tmp/scemd.log (which is held in memory and lost after a reboot). As scemd.log does not contain any useful information at all, this is a quick and working solution.

You have to edit two files to make it work:

vi /etc/syslog-ng/patterndb.d/scemd.conf
vi /etc.defaults/syslog-ng/patterndb.d/scemd.conf

# original line:
destination d_scemd { file("/var/log/scemd.log"); };

# modified line:
destination d_scemd { file("/tmp/scemd.log"); };
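For anyone who prefers not to edit by hand, the same change can be made with sed while keeping a copy of the original. This is a sketch under assumptions: the function name `redirect_scemd_log` and the backup paths under /root are illustrative only, and the backups are deliberately kept outside patterndb.d in case syslog-ng would otherwise read them as configuration.

```shell
# Point the d_scemd destination at /tmp, keeping a copy of the original file.
redirect_scemd_log() {
    conf="$1"      # scemd.conf to modify
    backup="$2"    # where to store the untouched original
    cp -p "$conf" "$backup" || return 1
    sed 's#file("/var/log/scemd.log")#file("/tmp/scemd.log")#' "$backup" > "$conf"
}

# Usage (on the DSM host, as root; backup paths are examples):
# redirect_scemd_log /etc/syslog-ng/patterndb.d/scemd.conf /root/scemd.conf.etc.orig
# redirect_scemd_log /etc.defaults/syslog-ng/patterndb.d/scemd.conf /root/scemd.conf.defaults.orig
```

The change should only take effect once syslog-ng is restarted or the system is rebooted.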

I hope it's clear what I mean. :-)

 

I think it fits the original topic pretty well, as it's also about keeping unwanted content out of the logs.

Additions and improvements to this are very welcome!

 


  • 3 months later...

Hi,

 

I applied your fix and it works quite well, but may I ask your opinion on these messages?

 

synoscgi_SYNO.Core.System_1_info[1419]: fan/fan_table_type_disk_temperature_ops.c:242 Invalid Disks temperature -1
synoscgi_SYNO.Core.System_1_info[1419]: fan/fan_temp_hw_sys_degree.c:50 Failed to get disks temperature
synoscgi_SYNO.Core.System_1_info[1419]: fan/fan_temp_hw_sys_degree.c:54 Get wrong sys temperature

 

Thank you
