NVMe Flash Health

In this blog post, I’ll look at the types of NVMe flash health information you can get using the NVMe command line tools.

Checking SATA-based drive health is easy. Whether it’s an SSD or older spinning drive, you can use the smartctl command to get a wealth of information about the device’s performance and health. As an example:

While smartctl might not know all vendor-specific smart values, typically you can Google the drive model along with “smart attributes” and find documents like this to get more details.

Checking NVMe Flash Health

If you move to newer-generation NVMe-based flash storage, smartctl won’t work anymore, at least not with the packages available for Ubuntu 16.04 (which is what I’m running). It looks like NVMe support in smartmontools is coming, and it would be great to have a single tool that supports both SATA and NVMe flash storage.

In the meantime, you can use the nvme tool available from the nvme-cli package. It provides some basic information for NVMe devices.

To get information about the NVMe devices installed:

To get SMART information:

To get additional SMART information (not all devices support it):

Some of this information is self-explanatory, and some of it isn’t. After looking at the NVMe specification document, here is my read on some of the data:

Available Spare. Contains a normalized percentage (0 to 100%) of the remaining spare capacity that is available.

Available Spare Threshold. When the Available Spare capacity falls below the threshold indicated in this field, an asynchronous event completion can occur. The value is indicated as a normalized percentage (0 to 100%).

(Note: I’m not quite sure what the practical meaning of “asynchronous event completion” is, but it looks like something to avoid!)

Percentage Used. Contains a vendor specific estimate of the percentage of the NVM subsystem life used, based on actual usage and the manufacturer’s prediction of NVM life.

(Note: the number can be more than 100% if you’re using storage for longer than its planned life.)
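As a rough illustration of how the three wear-related fields above might be checked together, here is a minimal Python sketch. The function name and the sample readings are hypothetical, and the two conditions simply follow my reading of the fields described above, not any official health rule.

```python
def wear_warnings(available_spare, spare_threshold, percentage_used):
    """Return warning strings for the three wear-related SMART fields.

    All arguments are percentages as reported in the SMART log.
    """
    warnings = []
    # Spare capacity below the drive's own threshold is the condition the
    # specification ties to an asynchronous event.
    if available_spare < spare_threshold:
        warnings.append(
            f"available spare {available_spare}% is below threshold {spare_threshold}%"
        )
    # Percentage Used may exceed 100 once the drive outlives its planned life.
    if percentage_used >= 100:
        warnings.append(
            f"percentage used is {percentage_used}%, at or past planned life"
        )
    return warnings


# Hypothetical readings, not taken from a real device.
print(wear_warnings(available_spare=100, spare_threshold=10, percentage_used=3))
print(wear_warnings(available_spare=5, spare_threshold=10, percentage_used=120))
```

An empty list means neither condition was hit; it does not mean the drive is guaranteed healthy.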

Data Units Read/Data Units Written. This is the number of 512-byte data units read or written, but it is reported in an unusual way: each unit of the reported value corresponds to 1000 of the 512-byte units. So you can multiply the reported value by 512,000 to get the value in bytes. It does not include metadata accesses.
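The conversion described above is simple arithmetic; a small Python sketch makes it concrete. The function name and the sample counter value are made up for illustration.

```python
# One reported data unit stands for 1000 units of 512 bytes each.
BYTES_PER_DATA_UNIT = 1000 * 512


def data_units_to_bytes(data_units):
    """Convert a Data Units Read/Written counter to bytes."""
    return data_units * BYTES_PER_DATA_UNIT


# Hypothetical counter value: 2,000,000 data units.
total_bytes = data_units_to_bytes(2_000_000)
print(total_bytes)          # 1024000000000
print(total_bytes / 1e12)   # the same amount in terabytes (decimal)
```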

Host Read/Write Commands. The number of read or write commands issued by the host. Using this value together with the Data Units values above, you can compute the average IO size for “physical” reads and writes.
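To show how the average IO size falls out of the command count and the matching Data Units counter, here is a short Python sketch. The function name and the sample numbers are hypothetical; the 512,000-byte scale is the one described for Data Units above.

```python
# One reported data unit stands for 1000 units of 512 bytes each.
BYTES_PER_DATA_UNIT = 1000 * 512


def average_io_size_bytes(data_units, commands):
    """Average bytes per command, given a Data Units counter and the
    matching Host Read or Host Write Commands counter."""
    if commands == 0:
        return 0.0  # no commands issued yet, so no meaningful average
    return data_units * BYTES_PER_DATA_UNIT / commands


# Hypothetical counters: 1000 data units moved by 32,000 commands.
print(average_io_size_bytes(1000, 32_000))  # 16000.0 bytes per command
```

Use the read counters together for the read average and the write counters together for the write average; mixing them gives a meaningless number.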

Controller Busy Time. Time in minutes that the controller was busy servicing commands. This can be used to gauge long-term storage load trends.

Unsafe Shutdowns. The number of times a power loss happened without a shutdown notification being sent. Depending on the NVMe device you’re using, an unsafe shutdown might corrupt user data.

Warning Temperature Time/Critical Temperature Time. The time in minutes a device operated above a warning or critical temperature. Both values should be zero.

Wear_Leveling. This shows how much of the rated cell life was used, as well as the min/max/avg write count for different cells. In this case, it looks like the cells are rated for 1800 writes and about 1100 on average were used.
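Taking the reading above at face value (cells rated for 1800 writes, about 1100 used on average), the share of rated cell life consumed can be worked out as below. This is only a sketch of that arithmetic; the function name is hypothetical, and the interpretation of the two numbers is my own reading of the output.

```python
def cell_life_used(avg_writes, rated_writes):
    """Fraction of the rated write cycles consumed, on average, per cell."""
    return avg_writes / rated_writes


# Figures from the reading above: about 1100 of 1800 rated writes.
print(f"{cell_life_used(1100, 1800):.0%}")  # 61%
```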

Timed Workload Media Wear. The media wear by the current “workload.” This device allows you to measure some statistics from the time you reset them (called the “workload”) in addition to showing the device lifetime values.

Timed Workload Host Reads. The percentage of IO operations that were reads (since the workload timer was reset).

Thermal Throttle Status. This shows if the device is throttled due to overheating, and when there were throttling events in the past.

Nand Bytes Written. The bytes written to NAND cells. For this device, the value seems to be reported in units of 32MB. It might be different for other devices.

Host Bytes Written. The bytes written to the NVMe storage from the system. This value is also reported in units of 32MB. The scale of these values is not very important, as they are most helpful for finding the write amplification of your workload: the ratio of bytes written to NAND to bytes written by the host. For this example, the Write Amplification Factor (WAF) is 16185227 / 6405605 = 2.53.
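The WAF calculation above can be written as a tiny Python helper. The function name is hypothetical; the two input numbers are the ones from the example. Because both counters use the same unit, the unit cancels out and no conversion is needed.

```python
def write_amplification_factor(nand_bytes_written, host_bytes_written):
    """Ratio of writes that reached NAND to writes issued by the host.

    Both counters must be in the same unit (here, 32MB units).
    """
    if host_bytes_written == 0:
        raise ValueError("host_bytes_written is zero; WAF is undefined")
    return nand_bytes_written / host_bytes_written


waf = write_amplification_factor(16185227, 6405605)
print(f"{waf:.2f}")  # 2.53
```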

As you can see, the NVMe command line tools provide a lot of good information for understanding the health and performance of NVMe devices. You don’t need to use vendor-specific tools (like isdct).

8 Comments
Jörg Brühe

Hi Peter!
Sadly, you cannot rely on the values reported by smartctl giving any good indication of the SSD’s wearout.
German magazine c’t had an article in its issue 1/2017 where they had done a long-term test of SSDs, continuously writing data to them and checking the smartctl values – the results were very disappointing.
And even worse, when an SSD was worn out, it would not only fail to write anything but also deny read access.
All this was about SATA SSDs, but I wouldn’t be surprised if it were no better for NVMe devices.
So: Beware!
Regards,
Jörg

Jörg Brühe

Hi Peter!
Sorry about the late reply …
The article I refer to appeared in the 1/2017 issue of c’t, a German IT paper. It seems the full text is not available online, just the start; this is the URL:
https://www.heise.de/ct/ausgabe/2017-1-Flash-Speicher-im-Langzeittest-3573503.html
To get the full article, you have to buy it (online, from that link), and it is German only.

You are right about the SSDs operating much longer than specified by the manufacturer, c’t also reported that. So my “disappointing” does not refer to the overall quality, it refers to the lack of clear diagnostics that would give a meaningful warning sign to the admin. I’m sorry my comment didn’t make that obvious, my fault.

I followed your link, that text sounds more positive on the helpfulness of SMART data than c’t is. I suspect that is an area where manufacturers differ (or: where quality differs). Or might it have changed from that 2015 report to the c’t 2017 one?
c’t had tried these SSDs of 240 – 256 GB: Crucial BX 200, Samsung 750 Evo and 850 Pro, SanDisk Extreme Pro, SanDisk Ultra II, and Toshiba OCZ TR. It might be significant that in both reports, the Samsung Pro could stand the largest amount of data written.

Also, both reports agree that SSDs fail completely and do not fall back into a read-only mode, and IMO that is a really sad fact.
Regards,
Jörg

Peter Zaitsev

Hi,

Oh yes. I would not count on SSDs becoming read-only after they wear out. I think the idea is that when you’re operating an SSD beyond its operational parameters, you’re using it at your own risk.

Samthedev

Hi Peter,

Thank you for the great article. I have a question about write amplification, please.

My Intel NVMe drive reports a bigger value for the host_bytes_written attribute than for nand_bytes_written, which doesn’t make sense. Even when I write to the disk, host_bytes_written seems to increase more than nand_bytes_written! Do you have any explanation for that? nand_bytes_written should be the bigger one, since it includes both the data written by the host and the overhead of the garbage collector and wear leveler.

AnjanaDhanvi

Thanks for this post. I’d like more details on checking NVMe flash health.

AnjanaDhanvi

Thanks for this post. Could you give more details on Thermal Throttle Status?

anp

Where can I find documents related to the discovery of NVMe drives connected to a Serial Attached SCSI controller?