From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Sandeen
Subject: Re: Intel SSD data loss: Any possible way this is user / software error?
Date: Fri, 13 Aug 2010 07:57:14 -0400
Message-ID: <4C65331A.9050203@redhat.com>
References: <4C64615B.70308@mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: linux-ext4@vger.kernel.org
To: Evan Jones
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:8911 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761525Ab0HML5T (ORCPT ); Fri, 13 Aug 2010 07:57:19 -0400
In-Reply-To: <4C64615B.70308@mit.edu>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

Evan Jones wrote:
> I'm testing a few systems that attempt to log data to disk reliably. I
> bought a brand new Intel SSD (X25-M G2) for this purpose. It appears to
> me that this disk does *not* store data reliably when there are power
> failures, even with write barriers, even with the cache disabled. I'm
> surprised that this disk might be this broken (possible), but it may
> also mean I've made a mistake. Is there any possible way that I have a
> bug in the test described below? The test works as expected with a
> couple SATA magnetic disks.
>
>
> Configuration:
>
> * Linux 2.6.32 (as distributed with Ubuntu 10.04)
> * SATA SSD directly attached to the system's built-in controller (Intel
> N10/ICH7)
> * ext4 with default options (meaning barrier=1)
> * Disable the write cache (hdparm -W 0 /dev/sdb)

Just out of curiosity, what do you see when the write cache is on?

Seems counter-intuitive that it'd work better, but talking w/ Ric
Wheeler, he was curious... maybe Intel didn't test with the write
cache off?

Also, would you be willing to publish the test you're using?

Thanks,
-Eric

>
> The test:
>
> 1. Write a 64 MB file of zeros (first use fallocate, then zero fill)
> 2. fsync()
> 3. write() blocks of this file with a sequence number.
> 4. fdatasync()
> 5. Send UDP packet reporting the sequence number written.
> 6. Go to 3.
>
> While this test is running, I pull the power out of the drive to
> simulate a hard failure. On the magnetic disks I have, this works as
> expected: On reboot, the log file contains the complete record that was
> reported as last written (it may also contain part of the next record).
>
> On the X25-M, when I use large writes (128 kB), it loses data fairly
> frequently (every couple attempts): I either see the last log record as
> being before the reported one, or occasionally I get a media error when
> reading back the file.
>
> I'm surprised that this disk could be this broken, but I suppose it is
> possible. Any help is welcomed. Thanks,
>
> Evan Jones
>
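
For reference, the six quoted test steps could be sketched roughly as
below. This is not Evan's actual test program, which is not included in
this message; it is a minimal Python reconstruction under stated
assumptions. The function name run_log_test, the record layout (an
8-byte little-endian sequence number repeated to fill the block), the
sequential wrap-around placement of blocks, and the iterations parameter
are all hypothetical choices made for illustration. block_size is
assumed to be a multiple of 8.

```python
import os
import socket
import struct


def write_all(fd, data, offset):
    """pwrite() the whole buffer, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        n = os.pwrite(fd, view, offset)
        view = view[n:]
        offset += n


def run_log_test(path, udp_addr, file_size=64 * 1024 * 1024,
                 block_size=128 * 1024, iterations=None):
    """Write sequence-numbered blocks, fdatasync, then report each via UDP.

    Returns the last sequence number reported. With iterations=None the
    loop runs until the process (or the drive's power) is interrupted.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Step 1: preallocate, then zero fill so every block is written once.
        try:
            os.posix_fallocate(fd, 0, file_size)
        except OSError:
            pass  # assumption: fall back to zero fill alone if unsupported
        zeros = bytes(block_size)
        for offset in range(0, file_size, block_size):
            chunk = min(block_size, file_size - offset)
            write_all(fd, zeros[:chunk], offset)

        # Step 2: flush data and metadata before logging starts.
        os.fsync(fd)

        blocks = file_size // block_size
        seq = 0
        while iterations is None or seq < iterations:
            seq += 1
            # Step 3: overwrite one block with the current sequence number.
            record = struct.pack("<Q", seq) * (block_size // 8)
            write_all(fd, record, ((seq - 1) % blocks) * block_size)
            # Step 4: the record must be durable before it is reported.
            os.fdatasync(fd)
            # Step 5: report the sequence number to a remote observer.
            sock.sendto(struct.pack("<Q", seq), udp_addr)
            # Step 6: loop back to step 3.
        return seq
    finally:
        os.close(fd)
        sock.close()
```

After a power cut, the check would be to read the file back, find the
highest sequence number stored in a complete block, and compare it with
the last number the UDP receiver saw: the stored value should never be
lower than the reported one.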