The Effects of Radiation on NAND Flash Based Devices
This article describes the effects of radiation on NAND flash devices and covers possible ways to minimize such effects.
Applications Involving Radiation Exposure
Most consumer products are not usually exposed to significant radiation. Even in security screening devices, the levels are very low. However, putting them through a full-fledged X-ray or CT scan is not recommended. Years ago, UV-erasable EPROMs (the type with an ultraviolet erase window), heavily used in embedded systems, could not be passed through airport security checks without risking data loss. As one can imagine, this caused all sorts of usage limitations and transport issues. We have come a long way since then, but we are still not totally invulnerable.
Devices used in aerospace, aircraft, and medical applications are some examples of cases where one must consider radiation effects on electronic devices.
Products Requiring Radiation Effect Considerations
Increasing demand for higher memory densities in space and aircraft electronics has generated significant interest in NAND flash based devices. However, a number of features of NAND flash memory require additional care in these applications. In addition to being susceptible to single-event effects (SEE) and total ionizing dose (TID) degradation, most NAND devices experience some data corruption during normal operation, and radiation exposure increases it. Because of this, NAND devices require error correction code (ECC) schemes even in terrestrial environments, where there is no radiation-induced degradation; radiation effects further reduce the overall lifespan of the device.
One obvious remedy is shielding, which always provides some protection. However, one must consider what happens if the shields are destroyed, as in the case of a flight recorder. The best design practice is therefore to design as if no shields were present. Another possible solution is the use of radiation-hardened (rad-hard) devices, which incorporate internal radiation shields.
Summary of Radiation Testing That Has Been Performed
Single-event effects and total ionizing dose testing were performed on a NAND flash device. The results are presented and their consequences for error correction architectures analyzed, taking into account the fact that even brand-new flash devices can be susceptible to data corruption with no external stress applied. The analysis indicates that many typical error correction architectures are not totally effective because of the variety of radiation effects that are possible.
TID and SEE testing has been performed on the NAND flash die used in the DDC 256 Gb (16-bit bus) and 192 Gb (24-bit bus) parts. The results of these tests make it possible to compare the effectiveness of various ECC architectures in reducing the upset rate for the device. The results are only representative of what could happen and are not intended to be all-encompassing.
Three modes were tested.
In the static test, the device was programmed, and the pattern was verified immediately prior to irradiation. The DUT was powered on, but no reads or writes were performed during irradiation. The DUT was irradiated to a fluence of 1×10⁷ ions/cm² and monitored for single event latch-up (SEL) or single event functional interrupt (SEFI). Following irradiation, the device was read again, and a final erase-write-verify (EWV) performed to verify
During read-only testing, the device was programmed and the pattern was verified immediately prior to irradiation. The DUT was powered on and read continually during the test. The log file recorded the number of blocks that were read during the test. During irradiation the device was monitored to verify functionality. Following a SEFI the beam was stopped, the current was recorded, and an attempt was made to recover the device, first through software intervention, and eventually by cycling device power. Following irradiation, the device was read again, and a final erase-write-verify (EWV) was performed to verify
For EWV testing, the DUT was powered on and a pattern was continually erased, written, and verified to each block in the device. The log file recorded the number of blocks that were accessed during the test. During irradiation the device was monitored to determine if a SEFI had occurred, and recovery was attempted following beam termination.
Testing was performed on die, not on shielded packages.
All modes displayed some bit errors resulting in incorrect data.
Some Architectures Used to Minimize Radiation Effects
A. Triple Modular Redundancy (TMR)
Failure rates for triple modular redundant (TMR) devices are based on a method that uses three 8-bit data buses and a voter module that selects the majority value of the three. This method is especially pertinent for a device with a 24-bit data bus, such as the 69F192G24, whose 24-bit-wide bus facilitates a triplicated 8-bit-wide architecture. Of course, three separate 8-bit devices can be used as well.
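The voting step can be illustrated with a short sketch. It assumes the three copies of each byte occupy bits 0-7, 8-15, and 16-23 of the 24-bit word; that lane assignment and the function names are illustrative only and are not taken from any device datasheet.

```python
def vote8(a: int, b: int, c: int) -> int:
    """Bitwise majority of three 8-bit values: each output bit takes
    the value held by at least two of the three inputs."""
    return ((a & b) | (a & c) | (b & c)) & 0xFF


def triplicate(byte: int) -> int:
    """Build a 24-bit word holding three copies of one byte.
    Assumed lane layout: bits 0-7, 8-15, 16-23."""
    byte &= 0xFF
    return byte | (byte << 8) | (byte << 16)


def vote_word24(word: int) -> int:
    """Split a 24-bit word into its three 8-bit lanes and vote them."""
    lane0 = word & 0xFF
    lane1 = (word >> 8) & 0xFF
    lane2 = (word >> 16) & 0xFF
    return vote8(lane0, lane1, lane2)


stored = triplicate(0xA5)

# A single upset in one lane is outvoted by the other two lanes.
one_upset = stored ^ (1 << 11)
print(hex(vote_word24(one_upset)))   # 0xa5

# The same bit upset in two lanes defeats the voter.
two_upsets = stored ^ (1 << 3) ^ (1 << 11)
print(hex(vote_word24(two_upsets)))  # 0xad
```

Because the vote is taken bit by bit, upsets in different bit positions of different lanes are still corrected; the voter fails only when the same bit is wrong in two lanes at once.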
B. SEC/DED
The single-error correct, double-error detect (SEC/DED) code uses 16 data bits with 6 check bits and is capable of correcting a single-bit error in a single address, or detecting a double-bit error in the same address location.
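One common way to build a 16-data-bit, 6-check-bit SEC/DED code is an extended Hamming code: five Hamming check bits plus one overall parity bit. The sketch below assumes that construction and an arbitrary bit layout; the tested devices and controllers may use a different SEC/DED code, and the function names are illustrative.

```python
PARITY_POS = (1, 2, 4, 8, 16)  # Hamming check bits; position 0 is overall parity
DATA_POS = tuple(p for p in range(1, 22) if p not in PARITY_POS)  # 16 data slots


def encode(data):
    """Encode a 16-bit value as a 22-bit codeword (list of 0/1 values)."""
    bits = [0] * 22
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (data >> i) & 1
    for p in PARITY_POS:
        # Check bit p covers every position whose index has bit p set.
        bits[p] = sum(bits[i] for i in range(1, 22) if i & p and i != p) % 2
    bits[0] = sum(bits[1:]) % 2  # overall parity over positions 1..21
    return bits


def decode(bits):
    """Return (data, status). data is None when the word is not correctable."""
    bits = list(bits)
    syndrome = 0
    for pos in range(1, 22):
        if bits[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    parity_odd = sum(bits) % 2 == 1
    if parity_odd:
        # Odd number of flips: treat as a single error at position `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        if syndrome > 21:
            return None, "uncorrectable"
        bits[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even number of flips with a nonzero syndrome: double error.
        return None, "double-error detected"
    else:
        status = "ok"
    data = sum(bits[pos] << i for i, pos in enumerate(DATA_POS))
    return data, status


word = encode(0xBEEF)
word[9] ^= 1                      # single upset
print(decode(word))               # (48879, 'corrected')
word[17] ^= 1                     # second upset in the same word
print(decode(word))               # (None, 'double-error detected')
```

Note that three or more upsets in one word can be miscorrected by a code of this strength, which is one reason multiple-bit upsets matter for this architecture.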
The method must be adjusted to accommodate the BCH architecture discussed in the next section. Error rate improvement beyond the MBU rate would require more robust error
C. BCH
The BCH code uses 13 bits of overhead for each bit of correction capability. Theoretically, any number of bits can be corrected so long as there is enough overhead area in the flash pages. That is not a problem in today's flash, where correction of 24 or more bits is common. Usually a CRC is also computed over the data and overhead area as a further safeguard against undetected errors.
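The overhead arithmetic is easy to work through. The sketch below assumes a binary BCH code with 13 parity bits per correctable bit, as stated above, applied to a 512-byte sector; the sector size and the function names are assumptions made for illustration, not values from the test report.

```python
def bch_parity_bits(t: int, m: int = 13) -> int:
    """Parity bits needed to correct t bit errors at m bits per corrected bit."""
    return m * t


def bch_parity_bytes(t: int, m: int = 13) -> int:
    """Parity overhead rounded up to whole bytes of spare area."""
    return (m * t + 7) // 8


def fits_in_codeword(data_bytes: int, t: int, m: int = 13) -> bool:
    """A binary BCH codeword built over GF(2^m) can be at most 2^m - 1 bits long."""
    return data_bytes * 8 + m * t <= 2 ** m - 1


# 24-bit correction over an assumed 512-byte sector:
print(bch_parity_bits(24))        # 312
print(bch_parity_bytes(24))       # 39
print(fits_in_codeword(512, 24))  # True
print(fits_in_codeword(1024, 24)) # False
```

Under these assumptions, 24-bit correction costs 39 bytes of spare area per 512-byte sector. A 1024-byte sector no longer fits in a 13-bit code, so a larger sector would need more than 13 bits of overhead per corrected bit.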
D. TMR + ECC
It is also possible to use multiple layers of correction, for example TMR combined with ECC codes.
In one arrangement, the TMR die are voted prior to applying ECC. In another, each die is separately subjected to ECC prior to TMR voting. In both cases, a system error will occur when there is a SEFI in two legs of the TMR. However, these architectures would be less vulnerable to the SEFI + SEU failures that dominate the TMR-only architecture. We note that the uncorrectable error (UE) rate in the SEFI + SEU case for both architectures is the product of the SEFI rate and the rate for the appropriate type of SEU.
With each of these architectures, periodic data scrubbing is included, which further enhances reliability.
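A scrubbing pass can be sketched as follows for a triplicated store. The three in-memory `bytearray` copies stand in for the three TMR legs and are purely an assumption for illustration. Real NAND cannot be overwritten in place, so an actual scrubber would write the corrected data to a freshly erased block instead.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority of three byte values."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies) -> int:
    """Vote every address across three copies and repair any copy that
    disagrees with the voted value. Returns the number of bytes repaired."""
    repaired = 0
    for addr in range(len(copies[0])):
        voted = majority(copies[0][addr], copies[1][addr], copies[2][addr])
        for copy in copies:
            if copy[addr] != voted:
                copy[addr] = voted  # simplification: in-place repair
                repaired += 1
    return repaired


legs = [bytearray(b"flash"), bytearray(b"flash"), bytearray(b"flash")]

legs[1][2] ^= 0x10   # upset in leg 1
print(scrub(legs))   # 1

legs[0][2] ^= 0x01   # later upset at the same address in leg 0
print(scrub(legs))   # 1
print(bytes(legs[0]))  # b'flash'
```

The second upset is recoverable only because the first was repaired before it arrived. Without the intervening scrub, two legs would disagree with the original data at the same address and upsets could accumulate until the voter is defeated, which is why the scrubbing rate matters.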
NAND flash devices are subject to data corruption even under terrestrial conditions. The data indicate that the error correction methods used in terrestrial applications can attain relatively low error rates even in the harsh environment of space. Determining the NAND UE rate in space requires careful attention to the interplay of the baseline upset rates for multiple effects, as well as architectural considerations such as ECC precedence and scrubbing rates. Because multiple SEE must be accounted for, care must be taken to ensure that a specific error correction architecture is robust to all possible failure modes. As an example, BCH ECC is robust in correcting bit errors; however, it can be defeated by a single SEFI, and as a result the UE rate for the system is limited by the SEFI rate. Thus, SEFI must be considered in any NAND architecture intended for space.
Please note that this research and article is for general information and was independently completed by Carmine Cupani of CTech Electronics, not by Delkin Engineers.