XPEnology Community

TheExpert

Rookie
  • Posts

    7

TheExpert's Achievements

Newbie

Newbie (1/7)

2

Reputation

  1. Hi elmuziko, because a bash script runs on the ESXi host, it should be possible to start full SMART scans. My VBScript may not be able to parse the output if it differs from the output of /opt/smartmontools/smartctl -d sat --all /dev/disks/<drive>, which is the command I now run every night. So at the moment I do not scan the drives. The bash script only reads the overall health status, checks the Pre-failure values against their thresholds, and writes the status "PASSED" or "FAILED" for each Pre-failure value of each installed disk in each ESXi host. The VBScript then parses the output of the bash script for "FAILED" and, if it finds even one, sends me a mail with high priority/importance. When everything is "PASSED", the script sends a mail with normal priority/importance so I can see that the disk check task is running. Now that I know how to handle this, it's no problem to do more health checks on the hard disks with smartctl on ESXi and to get the output by email from a VBScript running on a Windows system. I never thought about doing regular scans of my hard drives on the ESXi host. Maybe this is done by ESXi's smartd? Or should I do this with smartctl, too? Kind Regards
  2. Hi elmuziko, I have now finished the VBScript that gets the output of plink running the bash script and parses it for error messages in order to send a mail about the disk health. The most difficult part was setting the mail priority in VBScript (with PowerShell this is so easy). Tonight will be the first run as a scheduled task, which is planned to run once a day. Kind Regards
  3. Hi elmuziko, yes, this is the smartmontools version for ESXi I mentioned. It runs even with ESXi 7.0 U3d. Unfortunately, there is no newer build available for ESXi. Newer builds of smartmontools can create JSON-formatted output, which is much easier for other programs or scripts to read. I have now created a little bash script that reads the important SMART parameters that are known as pre-fail indicators. If there are errors, the bash script writes them into the output. The next step will be to create a VBScript that gets the output of plink running the bash script and parses it for error messages in order to send a mail about the disk health. It's frustrating that we have to do such complex scripting in the 2020s :-(. I thought there were better solutions for my homelab, e.g. Proxmox. But in my opinion Proxmox doesn't fit better. I tried to recover one of my Windows VMs with the Veeam agent, but the VM wouldn't boot. So I think it's much easier to stay with VMware ESXi and create a monitoring tool that reads the SMART information of the hard disks correctly and sends me a mail if there's something wrong with the disks. I also tried to use RDMs to pass the SATA disks through to my Windows VM, because under Windows I can read the SMART information with CrystalDiskInfo, which is a really perfect tool. But this didn't work: the disks aren't passed through as physical disks, so the software can't read the SMART information. Kind Regards
  4. Hi elmuziko, the script itself still works fine, but you have to trust the SMART values ESXi is reading. And I found out that these values aren't reliable when read by ESXi! I use enterprise storage disks from WD, and ESXi doesn't read their values correctly. The normalized values of some important SMART parameters, e.g. Reallocated_Sector_Ct, are 200 if your disk is OK. But ESXi reads 0. A disk is OK when the value is greater than the threshold, which is 140 here. Because the value is read as 0, you get an error message. You see warnings in the syslog of ESXi:

     2022-04-04T17:58:55.592Z smartd[526738]: [warn] t10.ATA_____WDC_WD3000F9YZ2D09N20L1_______________________WD2DWCC13DRRJZJP: REALLOCATED SECTOR CT below threshold (0 < 140)
     2022-04-04T17:58:55.680Z smartd[526738]: [warn] t10.ATA_____WDC_WD3000F9YZ2D09N20L1_______________________WD2DWCC136FZX9CU: REALLOCATED SECTOR CT below threshold (0 < 140)
     2022-04-04T17:58:55.927Z smartd[526738]: [warn] t10.ATA_____WDC_WD3000F9YZ2D09N20L1_______________________WD2DWCC130NF3EEZ: REALLOCATED SECTOR CT below threshold (0 < 140)
     2022-04-04T17:58:56.016Z smartd[526738]: [warn] t10.ATA_____WDC_WD3000F9YZ2D09N20L1_______________________WD2DWMC130E1ZMKA: REALLOCATED SECTOR CT below threshold (0 < 140)

     and my script always sends an error notification. So I bought new disks, believing the old ones were aging and might soon cause disk errors and data corruption, which wasn't the case. By the way, ESXi sees the new WD disks as faulty, too. When I read the SMART values of these disks under Windows with CrystalDiskInfo, or under ESXi with smartmontools (an older smartmontools build is available for ESXi), the correct value 200 is read. Then I found out that this issue seems to happen with all WD disks, e.g. WD Red etc. I saw a lot of examples of this issue. With disks from other manufacturers, ESXi reads the values correctly. WD doesn't see a problem in its disks reporting wrong SMART values under VMware ESXi, and they won't help to solve the issue. Now I'm trying to find a solution that reads the SMART values with smartmontools and sends error notifications only when there really are errors. Kind Regards
  5. Hi elmuziko, without your scripts I would never have found a way to read SMART data from ESXi with PowerShell. I had been searching for a solution for a very long time. Thank you for giving me an idea of how to do this. I created a new script based on another example I found during further investigation. With my script I can now read all available SMART data of each device from ESXi, write all data and the data with warnings to text files, and send a warning mail if any values are below their thresholds. It took a long time to adapt this example script to my requirements and get it running, but now it's working great. This script doesn't need a SQL database or a web server. If someone is interested in my script, I can give you a copy. Keep in mind that my script can't get the TBW for better monitoring of SSDs, because ESXi resets this data when it is rebooted. Kind Regards TheExpert
  6. Hi elmuziko, this is a great script, thank you. I want to give it a try, and I have some further ideas: 1. Option to only send email alerts 2. Website with IIS 3. Database on SQL Express. How can I change your script to only send email alerts for the SMART data? I'm not an experienced PowerShell user, but I'm good at programming Visual Basic scripts.
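
The check described in posts 1 and 4 (compare each Pre-fail value with its threshold, write "PASSED" or "FAILED", and pick the mail priority from the result) could be sketched roughly as follows. This is not the bash script or VBScript from the posts, just a minimal Python illustration of the same logic. It assumes the usual smartctl attribute table layout; the sample rows are made up for illustration (only the value 200 and threshold 140 for Reallocated_Sector_Ct come from the posts), and the names `check_prefail` and `mail_priority` are hypothetical.

```python
import re

# Illustrative sample of a smartctl attribute table. Only the
# Reallocated_Sector_Ct value (200) and threshold (140) are taken from
# the posts; the other rows are invented for the example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7500
"""

# One attribute row: ID, name, flag, VALUE, WORST, THRESH, TYPE.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+(Pre-fail|Old_age)\b"
)


def check_prefail(text):
    """Return {attribute_name: 'PASSED' | 'FAILED'} for Pre-fail rows only."""
    results = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group(6) == "Pre-fail":
            value, thresh = int(m.group(3)), int(m.group(5))
            # A disk is OK when the normalized value is above the threshold.
            results[m.group(2)] = "PASSED" if value > thresh else "FAILED"
    return results


def mail_priority(results):
    """A single FAILED attribute is enough for a high-priority mail."""
    return "high" if "FAILED" in results.values() else "normal"


if __name__ == "__main__":
    status = check_prefail(SAMPLE)
    for name, state in status.items():
        print(f"{name}: {state}")
    print("mail priority:", mail_priority(status))
```

If the value column were read as 0 instead of 200, as ESXi's smartd does for the WD disks in post 4, the same check would report FAILED for Reallocated_Sector_Ct and the mail would go out with high priority.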