From: Ric Wheeler <ric@emc.com>
To: Tejun Heo <htejun@gmail.com>
Cc: Mark Hahn <hahn@physics.mcmaster.ca>,
David.Ronis@McGill.CA, linux-ide@vger.kernel.org, neilb@suse.de
Subject: Re: Problem with disk
Date: Sun, 07 May 2006 09:21:09 -0400
Message-ID: <445DF445.6070803@emc.com>
In-Reply-To: <445D29A1.5000402@gmail.com>

Tejun Heo wrote:
>
>
> Unfortunately, this can result in *massive* destruction of the
> filesystem. I lost my RAID-1 array earlier this year this way. The FS
> code systematically destroyed the filesystem's metadata and, on the
> following reboot, fsck dealt the final blow, I think. I ended up with
> 100+ GB of unorganized data and had to recover data with grep + bvi.

Were you running with Neil's fixes that make MD devices handle write barrier
requests properly? Until fairly recently (I am not sure exactly when this was
fixed), MD devices more or less dropped barrier requests.

With properly working barriers, any journaling file system should get you back
to a consistent state after a power drop (although there are many less common
ways in which drives can potentially drop data).
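
To illustrate why the ordering enforced by barriers matters, here is a minimal
application-level sketch of a journal commit. It is an analogy only, not how
any particular file system or the MD layer implements barriers. The names
`journal_commit` and `replay`, the `CMIT` marker, and the record layout are all
hypothetical. `os.fsync` stands in for the barrier; whether it really reaches
the media depends on the drive and IO stack honoring the flush.

```python
import os
import struct
import zlib

COMMIT_MAGIC = b"CMIT"  # hypothetical commit marker for this sketch


def journal_commit(fd, payload: bytes) -> None:
    """Append one transaction: record first, commit marker only after a flush."""
    os.write(fd, struct.pack("<I", len(payload)) + payload)
    # Barrier 1: the record must be durable before the commit marker is issued.
    os.fsync(fd)
    os.write(fd, COMMIT_MAGIC + struct.pack("<I", zlib.crc32(payload)))
    # Barrier 2: the commit marker must be durable before success is reported.
    os.fsync(fd)


def replay(path):
    """Return payloads of fully committed transactions, stopping at a torn one."""
    with open(path, "rb") as f:
        data = f.read()
    committed, pos = [], 0
    while pos + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, pos)
        end = pos + 4 + length
        if end + 8 > len(data):
            break  # record or commit marker is incomplete
        payload = data[pos + 4:end]
        (crc,) = struct.unpack_from("<I", data, end + 4)
        if data[end:end + 4] != COMMIT_MAGIC or crc != zlib.crc32(payload):
            break  # commit marker missing or does not match the record
        committed.append(payload)
        pos = end + 8
    return committed
```

If the flushes are silently dropped, the commit marker can reach the media
before the record it covers, and replay after a power drop would then trust
data that was never written.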
>
> This is an extreme case but it shows that turning off the writeback
> cache has its advantages. After the initial stress & panic attack
> subsided, I tried to think about how to prevent such catastrophes, but
> there doesn't seem to be a good way. There's no way to tell (1) whether
> the hard drive actually lost the writeback cache content and (2) if so,
> how much it lost. So, unless the OS halts the system every time
> something seems weird with the disk, turning off the writeback cache
> seems to be the only solution.
>

Turning off the writeback cache is definitely the safe and conservative way to
go for mission-critical data unless you can be very certain that barriers are
working properly on the drive and through the IO stack. We validate the cache
flush commands with a SATA analyzer (making sure that we see them on
sync/transaction commits) and check that they take a reasonable amount of time
at the drive...
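
Without an analyzer, a rough software-only sanity check of the "reasonable
amount of time" part is to time `fsync` calls. The sketch below is not the
validation procedure described above, and it cannot prove that flushes work; it
can only flag flushes that return implausibly fast. The function names and the
2 ms default threshold are assumptions for illustration. The threshold only
makes sense for a rotating disk, where rewriting the same block cannot complete
faster than the platter turns (about 8.3 ms per revolution at 7200 rpm).

```python
import os
import statistics
import time


def fsync_latencies(path, samples=50, block=b"\0" * 4096):
    """Rewrite one block and fsync it repeatedly; return latencies in seconds."""
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(samples):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            start = time.perf_counter()
            os.fsync(fd)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def looks_suspicious(latencies, floor_s=0.002):
    """True if the median flush is faster than floor_s (an assumed threshold)."""
    return statistics.median(latencies) < floor_s
```

A median well under the rotational latency of the disk suggests the flush is
being absorbed by a cache somewhere in the stack rather than reaching the
platter. The test file must live on the device under test.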
ric