mcdull

What if SMART is not supported?


Will I be able to detect a drive failure if my HDDs do not support SMART in Synology?

There is still no way to enable SMART in an ESXi environment, although the HDD serial number is now recognized by DSM.

If that is not possible, is there any plugin that will do a regular surface scan?

 

Thanks.


Hi,

 

I'm facing the same issue, as I run xpenology 4.3 on ESXi 5.5.

 

SMART is not handled on virtual disks, sadly :sad:

However, the ESXi system itself has tools to get SMART info from the command line.

http://kb.vmware.com/selfservice/micros ... Id=2040405

 

I'm currently working on a way to transmit SMART data from ESXi to a custom web application (hosted by xpenology) to track and analyze disk status in real time.


How's it going with your web application? Have you made any progress?


Hi.

 

The web application is basically there; I just need to find a way to send SMART data from ESXi to the running instance.

Not much progress so far, admittedly; hopefully the holidays (in a few days!) will finally give me time to work on it.

 

Thanks for your interest in this tool. I guess I will have to create a thread to keep people informed :smile:

 

The project is hosted here: https://github.com/djey47/smartX


I wonder why ESXi cannot pass through the SMART status.

In fact, I used to get the SMART status of an RDM SATA disk in Windows using HD Tune Pro (the free HD Tune is not capable of this).

So there is a means of obtaining SMART data in a VM; it is just not implemented correctly in most software.


Of course it's a software issue. Something is missing in the virtual SCSI controller driver, preventing SMART data from reaching DSM.

So I prefer making my own solution rather than waiting for someone to code a proper driver for it. Moreover, SMART is just raw data and should be processed a bit more smartly to really anticipate HDD problems. The SMART status flag may only cry out when it's too late...


I think you misunderstood my question. The virtual SCSI controller passes ALL SMART data to DSM, but DSM just cannot handle it. I am not sure whether it is a protocol or a parameter issue.

For example, on a DSM box running inside ESXi with a physical hard disk passed to DSM via RDM:

 

DiskStation> smartctl --all /dev/sdd

smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.2.40] (local build)

Copyright © 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

 

=== START OF INFORMATION SECTION ===

Model Family: Hitachi Deskstar 7K3000

Device Model: Hitachi HDS723030ALA640

Serial Number:

LU WWN Device Id: 5 000cca 225c186bb

Firmware Version: MKAOA3B0

User Capacity: 3,000,592,982,016 bytes [3.00 TB]

Sector Size: 512 bytes logical/physical

Device is: In smartctl database [for details use: -P show]

ATA Version is: 8

ATA Standard is: ATA-8-ACS revision 4

Local Time is: Tue Dec 24 10:27:56 2013 CST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== Detail are all listed below ===

 

The problem is how DSM can make use of this data. It shouldn't be that difficult, as smartctl already reads all the parameters.

In fact, on the DSM page, the SMART status for all RDM drives shows Normal. But when I clicked into SMART info, it said not supported.
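
Since smartctl already returns everything, its text output could be consumed outside the DSM GUI. Here is a minimal sketch in Python, assuming Python is available on the box (not a given on DSM). The function names are made up for illustration, and the sample input is a few lines copied from the output above.

```python
def parse_smartctl_info(text):
    """Collect 'Key: value' pairs from smartctl's information section.

    Keys can repeat (e.g. 'SMART support is'), so each key maps to a list.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner or section lines without a colon
        info.setdefault(key.strip(), []).append(value.strip())
    return info


def smart_enabled(info):
    """True when smartctl reports SMART as both available and enabled."""
    states = info.get("SMART support is", [])
    return any(s.startswith("Available") for s in states) and "Enabled" in states


# Sample lines taken from the smartctl --all output quoted in this post.
SAMPLE = """\
=== START OF INFORMATION SECTION ===
Device Model: Hitachi HDS723030ALA640
Firmware Version: MKAOA3B0
Local Time is: Tue Dec 24 10:27:56 2013 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
"""

info = parse_smartctl_info(SAMPLE)
print(info["Device Model"][0], smart_enabled(info))
```

In practice the text would come from running smartctl with the device type that works on the box, rather than from a hard-coded string.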


As a wild guess, DSM may be calling smartctl with the -d scsi parameter, in which case most SMART data is lost:

 

DiskStation> smartctl -a /dev/sdd -d scsi

smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.2.40] (local build)

Copyright © 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

 

User Capacity: 3,000,592,982,016 bytes [3.00 TB]

Logical block size: 512 bytes

Serial number:

Device type: disk

Local Time is: Tue Dec 24 11:06:18 2013 CST

Device does not support SMART

 

Current Drive Temperature:

 

Error Counter logging not supported

 

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']

Device does not support Self Test logging

 

This is very similar to what I can see in the GUI. Is there any means of forcing smartctl to override the -d parameter?

Using smartctl -a /dev/sdd -d sat works perfectly fine.

Recompile it?


Thanks mcdull, I did not know about the smartctl tool...

So basically, DSM should invoke smartctl without the -d scsi parameter.

Is it possible to write a fake smartctl script/alias that ignores the -d switch and then calls the real smartctl in the background? DSM would call the script instead of the binary directly.

 

 

---

- To fill the HDD table, this script is invoked:

https://<>/webman/modules/StorageManager/storagehandler.cgi?_dc=<>&SynoToken=<>&action=load_smart

 

On my box, it answers:

{
  "disks" : [
     {
        "container" : {
           "str" : "DS3612xs",
           "supportPwrBtnDisable" : false,
           "type" : "internal"
        },
        "device" : "/dev/sda",
        "diskType" : "SSD",
        "id" : "sda",
        "isSsd" : true,
        "longName" : "Disque 1",
        "model" : "ST3000VN000-1H4167      ",
        "name" : "Disque 1",
        "num_id" : 1,
        "order" : 0,
        "portType" : "normal",
        "rpm" : 0,
        "serial" : "",
        "size_total" : "3000592982016",
        "smart_status" : "unknown",
        "status" : "normal",
        "support" : false,
        "temp" : -1,
        "used_by" : "volume_1"
     },
     {
        "container" : {
           "str" : "DS3612xs",
           "supportPwrBtnDisable" : false,
           "type" : "internal"
        },
        "device" : "/dev/sdb",
        "diskType" : "SSD",
        "id" : "sdb",
        "isSsd" : true,
        "longName" : "Disque 2",
        "model" : "ST3000VN000-1H4167      ",
        "name" : "Disque 2",
        "num_id" : 2,
        "order" : 0,
        "portType" : "normal",
        "rpm" : 0,
        "serial" : "",
        "size_total" : "3000592982016",
        "smart_status" : "unknown",
        "status" : "normal",
        "support" : false,
        "temp" : -1,
        "used_by" : "volume_1"
     },
     {
        "container" : {
           "str" : "DS3612xs",
           "supportPwrBtnDisable" : false,
           "type" : "internal"
        },
        "device" : "/dev/sdc",
        "diskType" : "SSD",
        "id" : "sdc",
        "isSsd" : true,
        "longName" : "Disque 3",
        "model" : "ST2000DL003-9VT166      ",
        "name" : "Disque 3",
        "num_id" : 3,
        "order" : 0,
        "portType" : "normal",
        "rpm" : 0,
        "serial" : "",
        "size_total" : "2000398934016",
        "smart_status" : "unknown",
        "status" : "normal",
        "support" : false,
        "temp" : -1,
        "used_by" : "volume_1"
     },
     {
        "container" : {
           "str" : "DS3612xs",
           "supportPwrBtnDisable" : false,
           "type" : "internal"
        },
        "device" : "/dev/sdd",
        "diskType" : "SSD",
        "id" : "sdd",
        "isSsd" : true,
        "longName" : "Disque 4",
        "model" : "ST2000DL003-9VT166      ",
        "name" : "Disque 4",
        "num_id" : 4,
        "order" : 0,
        "portType" : "normal",
        "rpm" : 0,
        "serial" : "",
        "size_total" : "2000398934016",
        "smart_status" : "unknown",
        "status" : "normal",
        "support" : false,
        "temp" : -1,
        "used_by" : "volume_1"
     }
  ],
  "success" : true,
  "system_crashed" : false
}

 

- To get SMART details, this script is invoked:

https://<>/webman/modules/StorageManager/smart.cgi

 

POST data:

action:apply

operation:diskInfo

disk:/dev/sda

 

It answers:

{
  "errinfo" : {
     "key" : "error_system",
     "sec" : "common"
  },
  "success" : false
}
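
The load_smart response above can also be checked programmatically. This is a minimal sketch in Python; the function name is hypothetical, and the sample is a trimmed copy of the response above that keeps only a few fields. It lists the devices DSM flags as not supporting SMART.

```python
import json


def unsupported_disks(response_text):
    """Return the device paths whose 'support' flag is false in a load_smart reply."""
    payload = json.loads(response_text)
    return [d["device"] for d in payload.get("disks", []) if not d.get("support")]


# Trimmed copy of the response shown above (two disks, a few fields each).
SAMPLE = """
{
  "disks": [
    {"device": "/dev/sda", "smart_status": "unknown", "support": false, "temp": -1},
    {"device": "/dev/sdb", "smart_status": "unknown", "support": false, "temp": -1}
  ],
  "success": true,
  "system_crashed": false
}
"""

print(unsupported_disks(SAMPLE))
```

Fetching the real response would additionally need an authenticated session and the SynoToken value, which are left out here.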


So, to check whether (and how) smartctl is called by the CGI script, I renamed smartctl to smartctl1 and then created an sh script called smartctl.

This script dumps the parameter list ($*) to a plain text file, then calls smartctl1 (with the same parameters for the moment).

A solution might be to pass only the disk parameters on to smartctl1 and omit the -d scsi ones.
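
The filtering step could look like the sketch below. It is written in Python rather than sh purely for illustration, and the function name is made up. It drops `-d scsi` (or the long form `--device=scsi`) from an argument list and leaves everything else untouched. Whether DSM really goes through the smartctl binary is not established at this point, so this only shows the argument handling.

```python
def strip_device_type(args, unwanted="scsi"):
    """Return a copy of args without '-d <unwanted>' or '--device=<unwanted>'."""
    kept = []
    skip_next = False
    for i, arg in enumerate(args):
        if skip_next:
            # This is the value that followed a dropped '-d'.
            skip_next = False
            continue
        if arg == "-d" and i + 1 < len(args) and args[i + 1] == unwanted:
            skip_next = True
            continue
        if arg == "--device=" + unwanted:
            continue
        kept.append(arg)
    return kept


print(strip_device_type(["-a", "/dev/sdd", "-d", "scsi"]))
```

A wrapper would then run the renamed binary (smartctl1 here) with the filtered list, for example via `subprocess.call(["smartctl1"] + kept)`.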


The "synodisk" command is presumably what is used for the basic information shown in the interface.

At least that is how the temperature is read. But I am not sure which other binaries are called for the other disk info.


When I type :

$ synodisk --read_temp /dev/sda 
disk /dev/sda temp is -1

 

Putting a script in place to fool DSM does not work with this binary either; it looks like it's not the one being called.

 

 

But as smartctl works, I can rely on it to get SMART values to populate my web app :smile:


I am also looking for a web app that replaces the default app.

Please share if you make any progress.

BTW, is smartd able to run under syno?


I installed smartmontools from ipkg and found that smartd runs successfully.

It can be used to generate alarms or notifications for me.

However, I cannot send mail successfully, because mailx is missing on Synology and I cannot use the Synology mail server either.

I'm still trying to figure this part out.


I installed nailx to send mail from smartd, and apparently it sent the test message successfully.

However, I lowered the temperature thresholds to 2,20,23 and it is still not sending any warning email to me.

I wonder whether it is monitoring continuously, or whether smartd is not functioning correctly.

/dev/sdc -d sat -a -n standby,3 -W 2,20,23 -s S/../.././10 \

-m xxx@xxx.com -M test

 

This is the conf entry used for one of my disks.
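
For reference, the smartd.conf man page documents the temperature directive as -W DIFF[,INFO[,CRIT]]. It reports changes of at least DIFF degrees, logs an informational message at INFO or above, and logs a warning and sends mail at CRIT or above, with 0 disabling a limit. -M test only sends a single test email when smartd starts. The Python sketch below mirrors that reading of the thresholds; the function names are made up, and it is an illustration, not smartd's actual code.

```python
def parse_w_directive(value):
    """Split a '-W DIFF[,INFO[,CRIT]]' argument into three ints (0 = disabled)."""
    parts = [int(p) for p in value.split(",")]
    parts += [0] * (3 - len(parts))  # missing limits default to disabled
    diff, info, crit = parts[:3]
    return diff, info, crit


def temperature_level(temp_c, info, crit):
    """Classify a temperature reading against the INFO and CRIT limits."""
    if crit and temp_c >= crit:
        return "critical"  # the level at which smartd would send mail
    if info and temp_c >= info:
        return "info"  # logged only
    return "ok"


diff, info, crit = parse_w_directive("2,20,23")
print(temperature_level(35, info, crit))
```

With the 2,20,23 values above, any reading of 23 degrees or more falls in the mail-sending range.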


OK, I finally got an error report after waiting a few hours.

 

===== Email content ======

The following warning/error was logged by the smartd daemon:

Device: /dev/sdd [sAT], Failed SMART usage Attribute: 184 End-to-End_Error.

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.

No additional email messages about this problem will be sent.

========================

 

I got this error from a Seagate drive, but the drive seems to be running okay.

I'm not sure what the error is about (an internal cache parity issue, but it seems common on Seagate drives).

I also got all the temperature warnings.

BTW, I am not sure whether the SMART information from RDM is static (meaning it will not update) or not.

My drives are always reported as being in standby mode by smartctl. I can never wake them, but of course they are active.

The uptime does not change either. I hope someone can help figure this out.


Confirmed: smartd is working under DSM. I received an email today.

 

This email was generated by the smartd daemon running on:

 

host name: vDSM

DNS domain: [unknown]

NIS domain: (none)

 

The following warning/error was logged by the smartd daemon:

 

Device: /dev/sdd [sAT], Temperature 35 Celsius reached critical limit of 35 Celsius (Min/Max ??/35!)

 

 

For details see host's SYSLOG (default: /var/log/messages).

 

You can also use the smartctl utility for further investigation.

No additional email messages about this problem will be sent.
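
For feeding such warnings into a monitoring tool, the temperature line in the email above is regular enough to parse. This is a minimal sketch in Python, with a made-up function name. The pattern is derived only from that one sample line, so other smartd versions may word the message differently.

```python
import re

# Pattern based solely on the warning line quoted in the email above.
WARNING_RE = re.compile(
    r"Device: (?P<device>\S+) \[(?P<type>[^\]]+)\], "
    r"Temperature (?P<temp>\d+) Celsius reached critical limit of "
    r"(?P<limit>\d+) Celsius"
)


def parse_temperature_warning(line):
    """Return device, temperature and limit from a smartd warning, or None."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "temp_c": int(match.group("temp")),
        "limit_c": int(match.group("limit")),
    }


LINE = ("Device: /dev/sdd [sAT], Temperature 35 Celsius reached critical "
        "limit of 35 Celsius (Min/Max ??/35!)")
print(parse_temperature_warning(LINE))
```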


Thanks for all this information, I will try to get data from smartctl/smartd in a usable form.

When I can.


Hi All,

 

Is there any progress with that? I'm keen to get proper SMART monitoring for vSynology. I'm planning to have a nagios (monitoring tool) VM, that monitors my home lab, monitoring vSynology disks can be a task to implement.


Any progress on making SMART work natively in the synology web page while using RDM in ESXi? I'm trying to get rid of the following error messages in /var/log/messages

 

Mar 5 14:22:00 HPNAS rsrcmonitor2.cgi: smartctl_enable.c:92 AtaSmartEnable failed.

Mar 5 14:22:00 HPNAS rsrcmonitor2.cgi: SmartDataRead(107) enable smart /dev/sde fail

Mar 5 14:22:00 HPNAS rsrcmonitor2.cgi: disk_temperature_get.c:71 read value /dev/sde fail

Have you made any new progress regarding this?


Hi,

Sorry for the lack of news, I've been very busy lately.

There is some progress: I was able to get SMART data for any disk recognized by DSM 4.3 (tested with temperature). Parsing the hdparm output gave me the list of hard disks, and smartctl a bunch of SMART indicators.

The latest issue is that I upgraded to DSM 5 with gnoBoot two weeks ago, and now hdparm no longer produces the same output as before. I can't get the list of hard disks in the bay, so I have to find another way. Maybe by asking ESXi directly; the results should be more stable.

My current project is pi-control, a small set of REST services that can communicate with ESXi through SSH.

https://github.com/djey47/pi-control

pi-control is currently able to get the VM list and status, and to remotely start/stop the ESXi system.

I'm planning to integrate the disk list and SMART status into it; those services would be hosted either on the xpenology system or on a Raspberry Pi device.

xpenology would always host a graphical user interface (as a web site) to easily monitor disk state.


Hiya,

 

I'm very interested in this: now that the weather is getting better in the UK, I've become very aware of the noise from the fans on my N54L.

I don't really know enough about coding though, so the links you posted all look alien to me. Is this something relatively easy to do? I do have a spare Pi, so I can easily build something to pump out SMART data.

 

I can get SMART data via putty using

~ # esxcli storage core device smart get -d=t10.ATA_____WDC_WD40EFRX2D68WT0N0_________________________WD2DWCC4E0685966

 

although for me it gives the temperature in Fahrenheit and not Celsius, which is a pain. Either way, it's not hard to Google the conversion.

 

Any help would be appreciated.

 

Thanks


Hi.

 

Yes, that's rather easy to do with a tutorial and a Raspberry Pi. The tutorial does not exist yet; I'd like to be sure this is the right approach before raising too much hope and writing it.

 

What you got with esxcli is not a temperature in Fahrenheit (it looks similar, though). As you seem to have a Western Digital disk, it is the temperature as a normalized value that depends on the manufacturer. It has no unit.

... whereas Seagate disks report values in degrees Celsius and are good to go (I've put 4 Seagate + 1 small WD in my N54L).

In fact, we could get the real value by reading the SMART raw data, but esxcli does not provide it :sad: I could not find any practical way in ESXi

(the only way I know is to use the smartctl tool from xpenology or another Linux distribution).

So for now, I'm looking into how to interpret those normalized values (temperature, power-on hours, etc.) to trigger alerts. But there is no hope of getting a clean temperature reading with WD disks + ESXi this way...
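
To illustrate the difference between normalized and raw values: in a smartctl -A attribute table, the normalized value is the VALUE column and the raw value is the last column. Below is a minimal Python sketch of splitting one row. The function name is hypothetical, and the sample row is made up for illustration, not read from a real disk.

```python
def parse_attribute_row(row):
    """Split one 'smartctl -A' attribute row into id, name, normalized and raw value."""
    # Columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    fields = row.split(None, 9)
    if len(fields) < 10 or not fields[0].isdigit():
        return None  # header line or anything that is not an attribute row
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),  # normalized, vendor-specific, no unit
        "raw": fields[9].strip(),  # raw value, kept as text
    }


# Made-up sample row, for illustration only.
SAMPLE_ROW = "194 Temperature_Celsius 0x0022 114 103 000 Old_age Always - 36"
print(parse_attribute_row(SAMPLE_ROW))
```

The raw column is kept as a string because some drives append extra text to it.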
