Intel SSD wearout not reported when almost dead #86
How do you see that the drive is failing now? Any indicators, failures, logs, etc.? As you correctly mentioned, this is the same problem as the linked issue #73. check_smart currently can only read and interpret the "raw values". In this case, the plugin would need to read the "normalized values", which can be either an increasing or a decreasing counter (this makes it even more tricky).
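For illustration, here is a minimal sketch of reading the normalized VALUE column from `smartctl -A` output. The device path and the `Media_Wearout_Indicator` attribute name are assumptions for the example; this is not check_smart's actual code.

```python
import subprocess

def normalized_value(device: str, attribute: str) -> int | None:
    """Return the normalized VALUE column of a SMART attribute, or None if absent."""
    out = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        fields = line.split()
        # ATA attribute rows have the layout:
        # ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == attribute:
            return int(fields[3])  # normalized VALUE, not the RAW_VALUE column
    return None

# E.g. 001 on the failing drive in this thread, 092 on the replacement:
print(normalized_value("/dev/sda", "Media_Wearout_Indicator"))
```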
Same disks in RAID1, both at 1% lifetime, and the system is extremely slow: writes of about 40M and a load average of about 80 on a 6-core machine (waiting for IOPS).
Where do you see 1% lifetime in the SMART table?
Sorry, I posted the wrong SMART output.
So a value of 001 means 1% remaining? And is this one the replacement drive, with 92% remaining?
Yes, the attribute's value counts down from 100; the number is the percent of lifetime remaining.
As the raw value remains 0, this is tricky and cannot easily be integrated into the existing (raw-value) checks. We would have to add a new check with its own command-line option (sketched below).
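A minimal sketch of what such a dedicated check could look like, assuming a decreasing "percent remaining" counter. The thresholds and the check's behavior are hypothetical; the real option name and defaults in check_smart may differ.

```python
import sys

# Hypothetical thresholds for a dedicated SSD-lifetime check.
WARN_PCT = 10  # WARNING at 10% lifetime remaining or less
CRIT_PCT = 5   # CRITICAL at 5% or less

def check_lifetime(value: int) -> int:
    """Nagios-style exit code for a decreasing 'percent remaining' counter."""
    if value <= CRIT_PCT:
        print(f"CRITICAL - only {value}% SSD lifetime remaining")
        return 2
    if value <= WARN_PCT:
        print(f"WARNING - {value}% SSD lifetime remaining")
        return 1
    print(f"OK - {value}% SSD lifetime remaining")
    return 0

sys.exit(check_lifetime(1))  # the drive from this thread would be CRITICAL
```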
I'm absolutely fine with that. When it happens, it happens.
I tried to scan all our servers; here are the attributes whose values can be reported as wear level in percent: 177 Wear_Leveling_Count (a scan sketch follows below).
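A sketch of such a scan across local drives, reusing the `normalized_value` helper from the first sketch. Only 177 Wear_Leveling_Count comes from this thread; including 233 Media_Wearout_Indicator (common on Intel drives) is an assumption, as is the `/dev/sd?` device pattern.

```python
import glob

# Attribute names that expose wear level as a normalized percentage.
WEAR_ATTRIBUTES = ["Wear_Leveling_Count", "Media_Wearout_Indicator"]

for device in sorted(glob.glob("/dev/sd?")):
    for attr in WEAR_ATTRIBUTES:
        value = normalized_value(device, attr)  # helper from the first sketch
        if value is not None:
            print(f"{device}: {attr} = {value}% remaining")
```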
Similar to #73: the disk is failing now but is not reported as CRITICAL.
The SMART output:
The disk info:
The plugin output: