From: Ric Wheeler <ric@emc.com>
To: "Rogério Brito" <rbrito@ime.usp.br>
Cc: linux-kernel@vger.kernel.org, Brett Russ <russb@emc.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: XFS corruption during power-blackout
Date: Fri, 01 Jul 2005 09:57:48 -0400
Message-ID: <42C54BDC.6000206@emc.com>
In-Reply-To: <20050701131950.GA15180@ime.usp.br>

Rogério Brito wrote:

>On Jul 01 2005, Jens Axboe wrote:
>>On Fri, Jul 01 2005, David Masover wrote:
>>>Not always possible.  Some disks lie and leave caching on anyway.
>>
>>And the same (and others) disks will not honor a flush anyways.
>>Moral of that story - avoid bad hardware.
>
>But how does the end-user know what hardware is "good hardware"? Which
>vendors don't lie (or, at least, lie less than others) regarding HDs?
>
>Thanks, Rogério Brito.

The only real way is to test the drive (and retest whenever you get a new 
version of the firmware) and the whole fsync -> write barrier code path.

We use a bus analyzer to make sure that when you fsync() a file, you 
will see a cache flush command coming across the bus. Of course, that is 
the easy step ;-)
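
As a minimal sketch of the user-space side of that check (the function 
name and path are made up for illustration, and this is not our test 
code), something like the following issues the write and the fsync() 
whose cache flush you would then look for on the analyzer:

```python
import os

def write_and_fsync(path, payload):
    """Write payload to path and return only after fsync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write() may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the data to stable storage. On a correctly
        # working stack this is the point where the flush should be issued.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(payload)
```

A return from this function only tells you that fsync() returned; whether 
the drive really committed the data is what the analyzer and the power 
failure test have to show.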

The second step is to test your system across power failures.  We have a 
test program, "wbtest", that we have used to catch bugs. The basic idea 
is to write a file to a disk with the write cache turned off, write the 
same file to the disk with the write barrier enabled (and a working cache 
flush command), and then randomly drop power to the box.  It is important 
to really drop power to the whole box, since pushing the reset button 
often does not drop power to the drives and will give you false passes.
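
To make the idea concrete, here is a rough Python sketch of that style of 
test. It is not our wbtest; the record layout, the function names and the 
`ack` callback are all invented for illustration. The writer appends 
checksummed, sequence-numbered records and calls fsync() after each one. 
`ack` stands in for whatever records the acknowledged sequence number 
somewhere that survives the power cut (the cache-disabled disk, or another 
machine). After reboot, the verifier reports the last intact record.

```python
import os
import struct
import zlib

RECORD = struct.Struct("<QI")   # sequence number, CRC32 of the payload
PAYLOAD_SIZE = 4096             # arbitrary size chosen for this sketch

def make_record(seq):
    """Build one record: header followed by a payload derived from seq."""
    payload = struct.pack("<Q", seq) * (PAYLOAD_SIZE // 8)
    return RECORD.pack(seq, zlib.crc32(payload)) + payload

def append_records(path, count, ack):
    """Append records 0..count-1, fsync after each, then acknowledge it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for seq in range(count):
            view = memoryview(make_record(seq))
            while view:
                view = view[os.write(fd, view):]
            os.fsync(fd)
            # Hypothetical hook: must reach storage that survives power loss.
            ack(seq)
    finally:
        os.close(fd)

def last_valid_seq(path):
    """Return the last sequence number whose record is intact, or -1."""
    last = -1
    with open(path, "rb") as f:
        while True:
            header = f.read(RECORD.size)
            if len(header) < RECORD.size:
                break
            seq, crc = RECORD.unpack(header)
            payload = f.read(PAYLOAD_SIZE)
            if len(payload) < PAYLOAD_SIZE:
                break               # torn record at the end of the file
            if zlib.crc32(payload) != crc or seq != last + 1:
                break               # corrupt or out-of-order record
            last = seq
    return last
```

If the value returned after reboot is lower than the last sequence number 
that was acknowledged, the drive lost data that it claimed to have 
flushed.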

Our wbtest used to be good at finding holes in the write barrier code 
using 2.4 kernels and PATA drives, but we have had no luck yet in 
catching known bugs with this test on 2.6 with S-ATA drives.

Ideas on how to get a more effective test are welcome. The window you 
need to hit to catch a misbehaving drive is very small: the write cache 
flush command has just returned, you drop power, and on reboot you 
validate that the platter contains that last IO correctly.  If you had 
enough NVRAM in a test system, you might be able to substitute an 
NVRAM-backed file system for the write-cache-disabled drive and get 
closer to catching the window.

The alternative is to either run with the write cache disabled (again, 
you will need to validate that the drive really disabled the cache) or 
to buy a mid-range or better storage array that provides a non-volatile 
(battery backed) write cache.
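
As a partial check on the cache setting, and only as a sketch: much newer 
kernels than the ones discussed here expose a `queue/write_cache` 
attribute under /sys/block that reads "write back" or "write through". 
The helper names below are invented for illustration, and the attribute 
reflects the kernel's view of the device, not what the firmware actually 
does, so it does not replace the power failure test.

```python
import os

def write_cache_mode(device, sysfs_root="/sys/block"):
    """Return the kernel's reported cache mode for a block device."""
    path = os.path.join(sysfs_root, device, "queue", "write_cache")
    with open(path) as f:
        return f.read().strip()

def cache_looks_disabled(device, sysfs_root="/sys/block"):
    """True if the kernel reports write-through, i.e. no volatile write cache."""
    return write_cache_mode(device, sysfs_root) == "write through"
```

For example, `cache_looks_disabled("sda")` would read 
/sys/block/sda/queue/write_cache, assuming a device named sda exists and 
the kernel provides that attribute.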




