Revision History

Published: 27 Dec 2013 - first published
Updated: 28 Dec 2013 - add TODO list of Samsung 840 and Crucial M500
Updated: 29 Dec 2013 - add Editor's note after Slashdot article

Editor's note 29Dec2013

Thank you for everyone's input from the Slashdot story. The additional drives for consideration are extremely useful, but they will have to go through the same process of cost-benefit analysis - followed only then by reliability analysis - that the other drives went through, with the additional handicap that the Intel S3500 has already "won" and been selected for live deployment.

Which brings me to a keen point that is difficult to express when there are 275 Slashdot comments to contend with. The belief that Intel paid for this report comes through loud and clear. Those who believe that are severely mistaken. Let's look at it again.

Statement of fact: the S3500 SSD happens to be the sole drive which

a) is cost-effective
b) passed all the extreme tests
c) is within budget
d) was clearly marked in the online marketing as "having power loss protection"
e) is not end-of-life

So let us be absolutely clear:

Fact: the Intel S3500 was the only drive which matched the requirements

That it did so comprehensively, despite the extreme nature of the testing, which lasted several days whilst all other drives failed within minutes, is the real key point of this report. However that point - that success - is itself also completely irrelevant beside the fact that the testing itself provided the company that commissioned the work with an amazingly high level of confidence in "an SSD", despite the complete paranoia which had driven them to commission the testing in the first place. To make that clear:

The company doesn't care about Intel: they care about a reliable drive

If there were other drives that had passed, or were known about, or could have been found, they would have been added to the list already.

Analysis of SSD Reliability during power-outages

This report was originally commissioned because the remote deployment of over 200 32GB OCZ SSDs resulted in severe data corruption in over 50% of the units. The recovery costs were far in excess of the costs saved by purchasing the cheaper OCZ units. They were replaced over a period of years by Intel SSD 320s where, despite remote deployment of over 500 units, there have only ever been three unrecoverable failures.

However, the Intel 320 SSD has reached end-of-life, so a replacement was sought. Due to paranoia over the OCZs an in-depth analysis was requested. Around the time that the paranoia was hitting, a report had come out on Slashdot, covering power-related corruption. It made sense therefore to attempt to replicate that report, as it was believed that the data corruption of the OCZs was related to power loss.

This report therefore covers the drives selected and the testing that was carried out. We follow up with a conclusion (summary: if you care about power loss don't buy anything other than Intel SSDs - end of story) and some interesting twists.

Picking drives for testing

The scenario for deployment is one where huge amounts of data simply are not required. An 8GB drive would be able to store one month's worth of sensor data, as well as have room for a 1.5GB OS deployment. A 16GB drive stores over two months. Bizarrely, except in the Industrial arena, the focus is on constant increases in data storage capacity rather than data reliability. The fact that shrinking geometries automatically result in higher susceptibility to data corruption is left for another time, however.

Additionally, due to the aforementioned paranoia and the assumption that the data loss was occurring due to loss of power, "Power Loss Protection" was made a mandatory requirement. Power Loss Protection is usually found in Industrial and Server Grade SSDs, which are typically more expensive.
So, finding a low-cost, low-capacity, reliable SSD reported to have "Power Loss Protection" proved... challenging. After an exhaustive search, the following candidates were found:

* Crucial M4 128GB
* The unpronounceable Toshiba THNSNH060GCS 60GB
* The new Intel S3500
* The Innodisk 3MP Sata Slim (8GB and 16GB)

The Innodisk units came in at around £30, whilst all the other drives came in at between £60 and £90. Also added to the testing were the original 32GB OCZ Vertex and the Intel 320.

Test procedure

The original report at the FAST conference was quite hard to replicate: the report is a summary rather than containing detailed procedures or source code. A best effort was made and then extended.

* OS-based test. The first test devised was to boot up a full OS and to power-cycle it using a mains timer. This test turned out to be completely lame, except for its negative results proving that simply switching power on and off was not the root cause of problems.
* OS-based huge parallel writes. The second test was to write huge numbers of files and subdirectories in parallel. Thousands of directories and millions of small files, as well as one large one, were copied, sync'd then deleted using 64 parallel processes. Power was not pulled during this test.
* Direct disk writing. This test was closer to the original FAST report, except simplified in some ways and extended in others.

Crucial M4

The Crucial M4 was tested with an early prototype version of the SSD torture program. It was power-cycled approximately 1,900 times over a 48 hour period. Data was randomly written, sync'd and then read back, whilst power was cut at a random point between 8 and 25 seconds into the write-sync-read cycle. Every 30 seconds the geometry was checked and a smartctl report obtained.

After approximately 1,600 power-cycles, the Crucial M4's SMART report showed over 20,000 CRC errors. Within 1,900 power-cycles, that number had jumped to 40,000 CRC errors and had been joined by serious LBA errors.
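
To make the write-sync-read cycle concrete, here is a minimal sketch in Python. It is not the SSD torture program used for this report: the function name, the 4096-byte block size and the use of an ordinary file instead of a raw device are all assumptions made purely for illustration.

```python
import hashlib
import os
import random

BLOCK_SIZE = 4096  # assumed block size; the report does not state one


def write_sync_verify(path, num_blocks, rounds, seed=0):
    """Write random blocks at random offsets, fsync, read back and compare.

    Hypothetical helper: returns the number of blocks whose read-back
    content did not match what was written.
    """
    rng = random.Random(seed)
    mismatches = 0
    # Pre-size the target file so that every block offset is valid.
    with open(path, "wb") as f:
        f.truncate(num_blocks * BLOCK_SIZE)
    fd = os.open(path, os.O_RDWR)
    try:
        for _ in range(rounds):
            block = rng.randrange(num_blocks)
            data = os.urandom(BLOCK_SIZE)
            expected = hashlib.sha256(data).digest()
            os.lseek(fd, block * BLOCK_SIZE, os.SEEK_SET)
            os.write(fd, data)
            os.fsync(fd)  # ask the OS to flush the write to the device
            os.lseek(fd, block * BLOCK_SIZE, os.SEEK_SET)
            got = os.read(fd, BLOCK_SIZE)
            if hashlib.sha256(got).digest() != expected:
                mismatches += 1
    finally:
        os.close(fd)
    return mismatches
```

Without a power cut between the write and the read, the read-back will normally be served from the operating system's page cache, so this loop on its own says little about the drive. It only shows the shape of the cycle that the power-cycling interrupts.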

Conclusion: epic fail. Not fit for purpose: returned under warranty.

Toshiba THNSNH060GCS 60GB

This drive turned out to be a little more interesting. It passed the OS-based parallel writes test with flying colours. Running for over 20 minutes, several million files and directories were created and deleted. In between each run no filesystem corruption was observed.

Then came the direct-disk writing. It turns out that if the write speed is kept below around 20 MB/sec, the Toshiba THNSNH060GCS is perfectly capable of retaining data integrity even when power is being pulled, even when there are 64 parallel threads all writing at the same time. However, when the write speed exceeds that threshold, all bets are off. At higher write speeds, data loss when power is pulled is only a matter of time (minutes).

We conclude from this that the Toshiba THNSNH060GCS does have power-loss protection circuitry and firmware, but that the internal power reservoir (presumably supercapacitors) simply isn't large enough to cover saving the entire outstanding cache of writes.

Conclusion: close, but no banana.

Innodisk 3MP Sata Slim

There were high hopes for these drives, based on the form-factor and low cost. Unfortunately, they turned out to have rather interesting firmware issues. The observed write-then-read speeds (a write followed by a verify step) turned out to be adversely affected by the number of parallel writes.

If there were no parallel writes (only one thread) then it was possible to write and then read at least 18 MB per second. The timer was started at the beginning of the write and stopped at the end of the read, so this figure corresponds to the data being written at probably 30 MB/sec and then read at probably 45 MB/sec: 1 / (1/30 + 1/45) = 18. This speed was sustained. However, if there were even just two parallel write-read threads, the speed was sustained for approximately 15 seconds and then dropped down to 1 (one!) MB/sec.
The more threads were introduced, the less time it took for the write-then-read speed to drop to a crawl. Paradoxically, if the torture program was suspended temporarily, even for just a few seconds, then when it was resumed the speed would shoot back up to 18 MB/sec and then once again plummet.

We conclude from this that either the CPU on the Innodisk SATA Slim or the algorithms being used are just too slow to deal with parallel writes. There is clearly a RAM cache which is being filled up: the speed of writing to the NAND itself is not an issue (because if it were, then single-threaded writes would be slow as well). So it really is a firmware / CPU issue: when the cache is full of random parallel data, the firmware / CPU goes into meltdown, cannot cope, and the write speed suffers as a result.

To Innodisk's credit, they actually responded and were given a copy of the SSD torture program and instructions on how to replicate the issue. It will be interesting to see how they solve this one: updates will be provided.

Conclusion: wait and see.

OCZ Vertex 32GB

This was also interesting. The OS-based test (which was ordered to be run, despite reservations that it would be ineffective) showed absolutely ZERO data corruption. Let's repeat that. One of the worst drives, with the worst smartctl report ever seen on a drive that was still functional, was picked from a batch with over 50% failure rates. An OS was installed on it and it was left to power-cycle over 100 times. There was ZERO data corruption.

What we can conclude from this is that power-loss had absolutely nothing to do with the data-loss. It was then necessary to devise a test which would show where the problem actually was. This test was the "OS-based huge parallel writes" test. Running this test for a mere 5 minutes (bear in mind that there was no power-cycling) resulted in immediate data corruption. Further investigation was therefore warranted.
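
The "OS-based huge parallel writes" test can be approximated along the following lines. This is a rough sketch under stated assumptions, not the script that was actually used: the real test copied existing trees with 64 parallel processes, whereas this version creates small files from scratch and uses threads to keep the example short. All function names and parameters here are hypothetical.

```python
import hashlib
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def worker(root, worker_id, dirs, files_per_dir):
    """Create a small tree, fsync each file, re-read it, then delete the tree.

    Returns the number of files that did not read back intact.
    """
    base = os.path.join(root, f"w{worker_id}")
    bad = 0
    for d in range(dirs):
        sub = os.path.join(base, f"d{d}")
        os.makedirs(sub)
        for n in range(files_per_dir):
            path = os.path.join(sub, f"f{n}")
            # 128 bytes of content derived from the path, so it can be re-checked.
            data = hashlib.sha256(path.encode()).digest() * 4
            with open(path, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            with open(path, "rb") as f:
                if f.read() != data:
                    bad += 1
    shutil.rmtree(base)
    return bad


def parallel_file_stress(root, workers=64, dirs=4, files_per_dir=16):
    """Run `workers` concurrent create/sync/verify/delete jobs under `root`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda i: worker(root, i, dirs, files_per_dir), range(workers)
        )
        return sum(results)
```

A non-zero return value, or filesystem errors reported by a check run between passes, would be the signal to look for. The read-back here goes through the page cache, so a filesystem check after unmounting is the more trustworthy indicator.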

OCZ (before they went into liquidation) had been advising - without explanation - to upgrade the firmware. After working out how this can be done on GNU/Linux systems, and after observing in passing that the firmware upgrade system was using syslinux and FreeDOS, the firmware was "downgraded" to Revision 1.6.

The exact same OCZ - with an incredible array of failures, CRC errors and lost sectors as reported by smartctl - when downgraded to firmware Revision 1.6 then showed ZERO data corruption when the exact same OS-based parallel write testing was carried out. Which is fascinating in itself.

Further investigation then dug up an interesting nugget: it turns out that OCZ apparently had been warned by SandForce not to enable a switch in the firmware which would result in "increased speed". OCZ, in their desperate attempt to remain "king of the speed wars", ignored the advice that doing so would result in data corruption.

The results correlate with this advice: at higher speeds, data corruption is guaranteed to occur. The hypothesis here is that at higher speeds there is a bug in the firmware which results in the data being written incorrectly. What was not determined was whether that data was simply... not written at all, or whether it was written in the wrong place. Given that a number of the 50% of drives that failed could not be seen on the SATA bus at all, it seems likely that at high speeds, OCZs with the faulty firmware are actually capable of overwriting their own firmware!

However, actually demonstrating this is beyond the scope of the tests carried out, not least because it would require wiping an entire drive, carrying out some parallel writes, then checking the entire drive to see where the writes actually ended up. This test may be added to the suite at a later date.
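
One possible shape for that wipe / write / check-everything test is sketched below in Python. It has not been run against any of the drives in this report, and every name in it (`MAGIC`, `make_block`, `wipe`, `write_tagged`, `scan`) is hypothetical. The idea is to tag each block with the location it was meant for, so that a full scan can tell a missing write from a misdirected one. It operates on an ordinary file for safety; pointing it at a real block device is left as an exercise for the brave.

```python
import os
import random
import struct

BLOCK_SIZE = 4096   # assumed block size
MAGIC = b"SSDT"     # arbitrary marker identifying blocks written by this test


def make_block(index):
    """Build a block whose content encodes the index it is intended for."""
    header = MAGIC + struct.pack("<Q", index)
    return (header * (BLOCK_SIZE // len(header) + 1))[:BLOCK_SIZE]


def wipe(path, num_blocks):
    """Fill the whole target with zeros and flush it."""
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            f.write(bytes(BLOCK_SIZE))
        f.flush()
        os.fsync(f.fileno())


def write_tagged(path, num_blocks, count, seed=0):
    """Write `count` tagged blocks at random locations; return the set written."""
    order = random.Random(seed).sample(range(num_blocks), count)
    with open(path, "r+b") as f:
        for index in order:
            f.seek(index * BLOCK_SIZE)
            f.write(make_block(index))
        f.flush()
        os.fsync(f.fileno())
    return set(order)


def scan(path, num_blocks, written):
    """Read every block and return (missing, misplaced, garbage) index lists."""
    missing, misplaced, garbage = [], [], []
    zero = bytes(BLOCK_SIZE)
    with open(path, "rb") as f:
        for index in range(num_blocks):
            data = f.read(BLOCK_SIZE)
            if data == zero:
                if index in written:
                    missing.append(index)      # write never landed here
            elif data == make_block(index) and index in written:
                continue                       # correct data, correct place
            elif data[:4] == MAGIC and struct.unpack("<Q", data[4:12])[0] != index:
                misplaced.append(index)        # a block meant for elsewhere
            else:
                garbage.append(index)          # anything else is corruption
    return missing, misplaced, garbage
```

On a healthy drive all three lists should come back empty. A non-empty `misplaced` list would be direct evidence of writes ending up in the wrong place.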

Once the firmware was downgraded to Revision 1.6, the drive-level testing was carried out (there was no point doing so when the drive's firmware could not maintain data integrity even when power was provided). Surprisingly, the drive fared pretty well. Sustained random speed levels were good, but data was lost intermittently when power was pulled, especially (like the Toshiba) at higher speeds.

Conclusion: buy cheap, flash firmware to 1.6 if power-loss not important

Intel 320 and S3500

As already hinted at, these drives simply could not be made to fail, no matter what was thrown at them. The S3500 was power-cycled some 6,500 times over several days: several terabytes of random data were written to and read from that drive. Not a single byte of data was lost. Despite even the reads being interrupted, there was not a single time - not even once - when the S3500 failed to verify the data that had been written.

The only strange behaviour observed was that the write-then-read cycle speeds tended to fluctuate, sustaining around 25 to 30 MB/sec of write-then-read speed continuously for several minutes, then dropping after 10 or so minutes to 20 or even 12 MB/sec for one (and only one) write-read cycle. The most likely explanation is some housekeeping going on in the firmware, which would take up CPU cycles for short durations.

Conclusion: don't buy anything other than Intel SSDs

Conclusion

Right now, there is only one reliable SSD manufacturer: Intel. That really is the end of the discussion. It would appear that Intel is the only manufacturer of SSDs that provides sufficiently large on-board temporary power (probably in the form of supercapacitors) to cover writing back the entire cache when power is pulled, even when the on-board cache is completely full.

The Toshiba drives have some power-loss protection, but it's not enough to cover an entire cache.

The Innodisk team have tried hard: their datasheet shows that they are also providing power-loss protection, as well as detecting when power and current drop below sustainable levels. Given how difficult it is to even find out if manufacturers provide this kind of capability at all, it is worth giving Innodisk credit for at least making that information publicly accessible.

The OCZ Management deserve everything that's happened to OCZ. They should have listened to SandForce: the history of SSDs would have been a radically different story. The sad thing is that when the firmware is downgraded, the drives are no worse than any other consumer-grade SSD.

The Crucial M4 is probably okay for general use, as are all the other drives (except the Innodisk, until they fix the firmware issues to get the sustained write speeds back). And so, if it's possible to buy them cheap, and power-loss is not an issue, getting hold of second-hand OCZ Vertex drives and downgrading the firmware would not be that bad an option.

However, if data integrity is really important, even when power could be pulled at any time, then there really is absolutely no question: get an Intel SSD. It's as simple as that.

Future

On the TODO list is to write that test which wipes the drive, carries out random writes, then checks the entire drive to see if the writes went to the correct places. On the face of it, writing data where it was asked to go seems such an obvious thing for a drive to do, but the OCZ Vertexes show that it's an assumption that cannot be made.

The Innodisk drives are ones to watch: the price and tiny size make it well worth continuing to work with Innodisk to see if they can solve the problem of parallel-write-cache overload.

Other drives may prove to be as good as the Intel S3500. However, they were not tested during this research, either because they were way outside of the budget, or because it was impossible to find out, even from exhaustive Internet searches and from speaking to suppliers, whether the potential candidates had any form of power-loss protection.

If anyone would like to find out if a particular make or model of drive is reliable under extreme torturing and power-interruption, contact lkcl@lkcl.net: a contract can be arranged and this report updated.

Lastly, it is worth noting that this testing was only carried out for a maximum of a few days of sustained writing. The long-term viability obviously has not been tested. However, given that over 500 Intel 320 SSDs have been deployed and only 3 failures observed over several years, it would be reasonable to conclude that Intel S3500s could be trusted long-term as well, bearing in mind - as a cautionary note - that smaller geometries mean more unreliability for the firmware to contend with.

TODO

Updated: 28th Dec 2013

Thank you to everyone who's recommended drives since this report was published. The initial investigation is basically over: the Intel S3500 was top of the list as it was the only one that passed. However, based on unit cost, it could well be the case that the investigation is reopened.

Recommended drives for consideration at a later date:

* Samsung 840
* Crucial M500 (first Crucial drive with power-loss capacitors)
* Intel 540 series (which are apparently made differently from S3500 and 320s)

Recommended tests:

* Use new Linux kernel 3.8 "cmd flush disable" option to check data integrity
* "Power brown-outs" (reducing current intermittently) as an advanced test