Write Amplification in True Embedded Applications
When it comes to embedded storage, there are many differences between solid state drives, or SSDs, and hard disk drives, or HDDs. For today’s embedded applications, these differences mean that SSDs are usually a better fit than HDDs. The faster speeds and greater reliability of SSDs make them ideal for today’s sophisticated devices. This is particularly true in the case of industrial embedded applications, where embedded storage failures can cause significant disruptions. However, there are some downsides with SSDs that all designers and OEMs of embedded applications need to know about. One of the largest of these is write amplification. Write amplification can have serious impacts on the longevity of your embedded flash memory, and thus on the longevity of your entire device. Understanding write amplification will allow you to plan for it and will help to inform design decisions you make for your true embedded applications. Here are the facts you need to know.
What is write amplification?
Although amplification is often considered to be a positive thing, write amplification is one kind of amplification you want to avoid as much as possible. Write amplification, or WA, refers to the phenomenon in flash memory and SSDs in which the amount of data physically written to the memory device is greater than the amount of data the host is actually trying to store. WA is usually expressed as a ratio: the amount of data committed to the flash memory divided by the amount of data the host system asked the memory device to store. A ratio of one means there is no amplification, and higher values mean the flash is absorbing extra writes.
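As a small illustration of that ratio, the sketch below computes a write amplification factor from two byte counters. The function name and the example figures are hypothetical, chosen only to show the arithmetic; real drives expose such counters in vendor-specific ways.

```python
def write_amplification_factor(flash_bytes: int, host_bytes: int) -> float:
    """Ratio of data physically written to flash to data the host asked to store."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


GIB = 2**30

# Hypothetical counters: the host sent 100 GiB, the flash absorbed 300 GiB.
waf = write_amplification_factor(flash_bytes=300 * GIB, host_bytes=100 * GIB)
print(f"WAF = {waf:.1f}")  # prints "WAF = 3.0"
```

In this made-up case, every gigabyte the host stores costs the flash three gigabytes of writes.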
One of the greatest advantages of SSDs over HDDs is that they are not subject to issues caused by mechanical errors. HDDs have spinning disks that have to move in order for data to be stored. These mechanical components frequently fail, and are even more prone to error in industrial environments in which high levels of shock and vibration can interfere with the mechanical operations. SSDs don’t have any moving parts, so they are more reliable than HDDs. However, they do have a finite lifespan. The lifespan of an SSD, and that of any kind of flash memory, is a function of how many program and erase cycles it can perform. Thus, write amplification can have a dramatic impact on the overall longevity of a storage device.
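To make the link between write amplification and lifespan concrete, here is a rough sketch of how many host writes a device can absorb before its program and erase cycles run out. It is a simplification that assumes perfectly even wear across the flash, and the capacity, cycle rating, and WA values are illustrative assumptions rather than figures for any particular product.

```python
def host_write_budget_tb(capacity_gb: float, pe_cycles: int, waf: float) -> float:
    """Approximate terabytes the host can write before the flash wears out.

    Assumes ideal wear leveling: every cell is cycled equally often.
    """
    total_flash_writes_gb = capacity_gb * pe_cycles  # what the flash can absorb
    return total_flash_writes_gb / waf / 1000        # what the host gets, in TB


# Hypothetical 64 GB device rated for 3,000 program/erase cycles.
ideal = host_write_budget_tb(64, 3000, waf=1.0)
amplified = host_write_budget_tb(64, 3000, waf=4.0)
print(ideal, amplified)  # prints "192.0 48.0"
```

Under these assumptions, a WA of four cuts the usable write budget to a quarter of the ideal figure.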
Why does write amplification occur?
In HDDs, new data can be written on top of existing data, allowing the new data to be stored in the old data’s place. In SSDs, existing data has to be erased to make room for new data. The erase process that occurs has a major role in write amplification.
To understand how amplification occurs during this process, it’s important to understand how SSDs store data. Information within an SSD is stored in cells, which are grouped together into pages, which are then grouped together into blocks. Each cell may store as little as one bit of data, while each page typically holds several kilobytes (4 KB is a common size). Blocks commonly contain 128 or more pages each, though the exact figures vary by device.
When something is written to an SSD, it may be written into different parts of the memory. In other words, different bits of data are stored on different pages and in different blocks, which complicates the process of erasing it. Additionally, while new data can be written into cells and pages, the erase cycle of SSDs only allows blocks of data to be erased, rather than smaller chunks.
To manage this, the drive uses a process called garbage collection. Garbage collection is needed whenever a flash block holds a mix of data to be saved and data to be erased. In this process, the data that are to be saved are rewritten to another block, which has already been erased to make room for new data. Next, the original block, which now holds only data that need to be deleted, is fully erased. During this process, multiple copies of the same data (both good data and undesired data that are destined for deletion) will exist within the memory in more than one location, which means the data will be taking up extra space within the memory.
There are two issues with this process. One is that erasing data to write new data requires rewriting existing data to new locations. This amplifies the number of program and erase cycles, which are finite in SSDs. The other issue is that there must be space for the duplicate data on the drive, as the amount of data being written is amplified.
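The copy-then-erase sequence described above can be sketched for a single block. This is a toy model, not how any particular controller is implemented: the page states, the 128-page block size taken from the text, and the function name are all assumptions made for illustration.

```python
VALID, INVALID, FREE = "valid", "invalid", "free"
PAGES_PER_BLOCK = 128  # figure used in the text; real devices vary


def garbage_collect(victim: list, spare: list) -> int:
    """Copy valid pages from victim into an erased spare block, then erase victim.

    Returns the number of pages that had to be rewritten.
    """
    copied = 0
    for page in victim:
        if page == VALID:
            spare[copied] = VALID  # rewrite the data we must keep
            copied += 1
    for i in range(len(victim)):
        victim[i] = FREE           # erase works on the whole block at once
    return copied


# Hypothetical block: 96 pages still needed, 32 pages stale.
victim = [VALID] * 96 + [INVALID] * 32
spare = [FREE] * PAGES_PER_BLOCK

copied = garbage_collect(victim, spare)
reclaimed = spare.count(FREE)  # room left in the spare block for new host data
local_waf = (copied + reclaimed) / reclaimed
print(copied, reclaimed, local_waf)  # prints "96 32 4.0"
```

In this example the drive rewrites 96 pages just to gain room for 32 pages of new data, so filling that room costs 128 page writes for 32 pages of host data.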
Are there processes in SSDs to mitigate write amplification?
There are several features integrated into SSDs to help reduce write amplification and improve the longevity of the device. One of these is over-provisioning. Over-provisioning exists to ensure there is always enough capacity left in the drive to store the multiple copies of data that are created during the garbage collection process. Essentially, over-provisioning refers to the creation of storage capacity that is not available to the host system. Whatever the stated storage capacity of a flash drive is, there will be some additional capacity in the drive that is withheld from the host system purely for use during garbage collection to store the extra copies of the data.
When you purchase an SSD, it may state what percent of the storage capacity is available for over-provisioning, but many designers and OEMs do not realize that this additional storage can easily be customized. You can opt to increase the amount of storage available for over-provisioning by dedicating some of the storage that is available to the host system to over-provisioning. Although reducing the available storage may not seem desirable, there are actually instances in which doing so can be advantageous. SSDs are at their most efficient when there is ample storage space available, so dedicating additional storage to over-provisioning functionally limits your ability to fill up the drive completely, thus ensuring it works as fast and reliably as possible. This scenario is not ideal for every user, but depending on the way the host device is being used, it is beneficial for some designs.
TRIM is another feature that can help to control write amplification. TRIM is part of the ATA command set used by SATA drives, and both the host system in which the SSD is being used and the SSD itself must support it. TRIM prevents a problem that can occur when the drive repeatedly includes Logical Block Addresses, or LBAs, that no longer hold useful data in the garbage collection process.
Typically, when the host system overwrites a file, it writes new data to the same LBAs, so the drive knows to mark the old copies as invalid and will not preserve them or copy them during garbage collection. However, when the host system deletes a file instead of overwriting it, the operating system only marks the file as deleted in its own file system records, and nothing tells the SSD that those LBAs no longer hold useful data. Because the data still look valid to the drive, it does not understand that the file does not have to be copied during garbage collection, so it continues to copy it. Those needless copies are write amplification.
TRIM helps to prevent this problem from occurring. The TRIM command tells the drive that the LBAs can be erased for reuse when a file is deleted, so that the system will no longer move the LBAs around and copy the data during garbage collection. This reduces write amplification and increases the amount of free space available to the user, so that the overall performance of the SSD is improved.
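The effect can be illustrated with a very small model of the drive's view of which LBAs are still valid. The class and method names here are hypothetical and exist only for this sketch; they do not correspond to any real drive firmware or operating system API.

```python
class TinyDrive:
    """Toy model: the drive only knows an LBA is stale if the host tells it."""

    def __init__(self):
        self.valid_lbas = set()

    def write(self, lba: int) -> None:
        self.valid_lbas.add(lba)

    def trim(self, lbas) -> None:
        # The host announces that these LBAs no longer hold useful data.
        self.valid_lbas -= set(lbas)

    def pages_to_copy(self, block_lbas) -> int:
        # Garbage collection must relocate every page the drive believes is valid.
        return sum(1 for lba in block_lbas if lba in self.valid_lbas)


block = range(128)       # one block holding LBAs 0..127
deleted_file = range(64) # the host deletes a file stored in LBAs 0..63

drive = TinyDrive()
for lba in block:
    drive.write(lba)

without_trim = drive.pages_to_copy(block)  # drive was never told about the delete
drive.trim(deleted_file)
with_trim = drive.pages_to_copy(block)
print(without_trim, with_trim)  # prints "128 64"
```

Without TRIM, the drive in this sketch relocates all 128 pages; once the deleted range is trimmed, only the 64 pages that still matter are copied.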
There are some limitations of TRIM. In some cases, SSDs don’t support the TRIM command, and so users have to perform a firmware update in order to gain the benefits. Even when the TRIM command is supported, it does not always lead to immediate user benefits. This is because the additional space that is freed up by the TRIM command is not usually centered in one location, but is scattered throughout the drive. Because of this, the increase in speed may not be noticeable until multiple garbage collection cycles have occurred and a substantial amount of spare space has been freed up in a single location on the drive.
It is important to note that over-provisioning and TRIM are linked. TRIM cannot create more free space on the drive than is available via the space that is created for over-provisioning. For example, in a drive with 200GB in user capacity and 5GB of capacity reserved for over-provisioning, TRIM will never create more than 5GB of additional storage space. However, a user with this kind of SSD can see better results by customizing the over-provisioning to 10GB, so there is a larger cushion of free space, even when the TRIM command is not in use. This allows the drive to function more efficiently overall.
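Over-provisioning is commonly quoted as a percentage of user capacity. The sketch below works through the numbers in the example above, assuming the extra 5GB of over-provisioning is taken out of user capacity, so the drive's physical capacity stays at 205GB. The function name is hypothetical.

```python
def over_provisioning_percent(physical_gb: float, user_gb: float) -> float:
    """Spare capacity expressed as a percentage of user-visible capacity."""
    return 100 * (physical_gb - user_gb) / user_gb


PHYSICAL_GB = 205  # 200 GB user capacity + 5 GB reserved, per the example

default_op = over_provisioning_percent(PHYSICAL_GB, user_gb=200)
custom_op = over_provisioning_percent(PHYSICAL_GB, user_gb=195)  # 10 GB reserved
print(default_op, round(custom_op, 2))  # prints "2.5 5.13"
```

Giving up 5GB of user space in this scenario roughly doubles the over-provisioning percentage.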
It is also important to note that there is a risk associated with using the TRIM command, as theoretically, data that are erased through a TRIM command process cannot be recovered. For industrial applications with critical data, this can be particularly problematic, so erases must be done with caution. Note that in some cases, a data recovery specialist can recover data that have been erased after a TRIM command has been executed if the data were already written in a different location on the drive or if the command was not fully executed.
How do write processes affect write amplification?
Different write processes have a different impact on drives and write amplification. Sequential writes are the most “drive-friendly” form of writes. These writes occur when large amounts of information are written at one time, in a sequential order. When this happens, the data can fill up blocks with related, sequential information, so when the information is set to be deleted, the entire block can simply be erased. The drive doesn’t have to write any part of the block to another location, so there is little or no write amplification. This translates to a write amplification factor at or near one, which means the flash writes about the same amount of data that the host sends.
Random writes do not follow this simple procedure. With random writes, the drive may have to go through a read-modify-erase-write sequence to absorb new data. Because data are not being stored on blocks sequentially, when blocks have to be erased to make space for new data, the valid data they still hold must be rewritten elsewhere on the drive, which creates write amplification. However, it is important to note that this process does not become an issue until every block on the drive has been fully written once. Until then, there is space for new data without garbage collection occurring. Once every block on the drive has been used, garbage collection and TRIM commands are necessary to make room for new data, and the drive will experience the highest levels of write amplification. The exact write amplification factor that the drive will experience will depend on a number of factors, from the size of the drive and the available space for over-provisioning to the host system and how the device is being used.
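The difference between the two write patterns can be explored with a toy simulation. This is a deliberately simplified sketch, not a model of any real controller: it assumes a small drive of 64 blocks with 32 pages each, 25% of the pages held back as spare space, and a "greedy" garbage collector that always reclaims the block with the fewest valid pages. All names and sizes are assumptions made for illustration.

```python
import random


def simulate(lbas_to_write, num_blocks=64, pages_per_block=32):
    """Return the write amplification factor for a stream of host LBA writes."""
    owner = [[None] * pages_per_block for _ in range(num_blocks)]  # LBA per page
    valid = [0] * num_blocks             # valid-page count per block
    location = {}                        # lba -> (block, page)
    free_blocks = list(range(1, num_blocks))
    active, next_page = 0, 0             # block and page currently being filled
    flash_writes = 0

    def program(lba):
        nonlocal active, next_page, flash_writes
        if next_page == pages_per_block:          # active block full: take a fresh one
            active, next_page = free_blocks.pop(), 0
        old = location.get(lba)
        if old is not None:                       # the previous copy becomes stale
            owner[old[0]][old[1]] = None
            valid[old[0]] -= 1
        owner[active][next_page] = lba
        valid[active] += 1
        location[lba] = (active, next_page)
        next_page += 1
        flash_writes += 1

    def collect_garbage():
        candidates = [b for b in range(num_blocks)
                      if b != active and b not in free_blocks]
        victim = min(candidates, key=lambda b: valid[b])  # greedy choice
        for lba in [x for x in owner[victim] if x is not None]:
            program(lba)                          # relocate data that must be kept
        owner[victim] = [None] * pages_per_block  # erase the whole block
        valid[victim] = 0
        free_blocks.append(victim)

    host_writes = 0
    for lba in lbas_to_write:
        while len(free_blocks) < 2:               # keep one block in reserve for copies
            collect_garbage()
        program(lba)
        host_writes += 1
    return flash_writes / host_writes


USER_LBAS = 64 * 32 * 3 // 4  # 1536 LBAs visible to the host; the rest is spare

sequential = [lba for _ in range(20) for lba in range(USER_LBAS)]
rng = random.Random(0)
scattered = [rng.randrange(USER_LBAS) for _ in range(20 * USER_LBAS)]

seq_waf = simulate(sequential)
rand_waf = simulate(scattered)
print(f"sequential WAF: {seq_waf:.2f}, random WAF: {rand_waf:.2f}")
```

In this model, sequential overwrites leave whole blocks stale, so garbage collection never has to copy anything, while random overwrites leave valid and stale pages mixed in every block. Changing the spare fraction or the number of passes is an easy way to see how strongly the result depends on the factors listed above.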
What should I consider about write amplification when I am selecting a flash drive?
Although write amplification has a big impact on flash memory, it isn’t the only thing you should consider when selecting memory for an embedded application. It is important to consider the overall features of the drive, including the capacity, speed, and grade of flash. The grade of flash has significant implications for industrial applications in particular, since industrial grade flash drives can tolerate the extended temperature ranges and high levels of shock and vibration that may be present in rugged operating conditions.
At Delkin, we understand how complex choosing the right memory for your embedded application can be. Our customer service team is here to help you navigate your options and select a drive that meets all of the needs of your design. Contact us today, and let us help you learn more about write amplification and all of the factors that impact flash memory usability.