From: Theodore Tso <tytso@MIT.EDU>
To: Andi Kleen <andi@firstfloor.org>
Cc: Eric Sandeen <sandeen@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/4] (RESEND) ext3[34] barrier changes
Date: Sun, 18 May 2008 20:43:25 -0400
Message-ID: <20080519004325.GC8335@mit.edu>
In-Reply-To: <8763tbcrbo.fsf@basil.nowhere.org>
On Sun, May 18, 2008 at 10:03:55PM +0200, Andi Kleen wrote:
> Eric Sandeen <sandeen@redhat.com> writes:
> >
> > Right, that was the plan. I wasn't really going to stand there and pull
> > the plug. :) I'd like to get to "out of $NUMBER power-loss events
> > under this usage, I saw $THIS corruption $THISMANY times ..."
>
> I'm not sure how good such exact numbers would do. Surely power down
> behaviour that would depend on the exact disk/controller/system
> combination? Some might be better at getting data out at
> power less, some might be worse.
Given how rarely people have reported problems, I think it's a really
good idea to understand what exactly our exposure is for
$COMMON_HARDWARE. And I suspect the biggest question isn't the
hardware, but the workload. Here are the questions that I think are
worth asking:
* How often can we get corruption on a common desktop workload? Given
  that we're mostly kernel developers, and kernbench is probably the
  worst case for desktops, that's a useful benchmark.
* What is the performance hit on a common desktop workload? (Let's use
  kernbench for consistency.)
* How often can we get corruption on a hard-core enterprise
  application with lots of fsync()'s? (e.g., postmark et al.)
* What is the performance hit on an fsync()-heavy workload?
I have a feeling that the likelihood of corruption when running
kernbench is minimal, but the performance hit is probably minimal as
well. And that the potential for corruption is higher for an
fsync-heavy workload, but that's also where we are seeing the
(reported) 30% hit.
The other thing which we should consider is that I suspect we can do
much better for ext4 given that we have journal checksums. As Chris
pointed out, right now, with barriers turned on, we are doing this:
	write log blocks
	flush #1
	write commit block
	flush #2
	write metadata blocks
If we don't mind mixing bh and bio functions, we could change it to
this for ext4 (when journal checksumming is enabled):
	write log blocks
	write commit block
	flush (via submitting an empty barrier block I/O request)
	write metadata blocks
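The two orderings above can be put side by side in a toy model. This is only an illustrative sketch, not kernel code: the step strings and both helper functions are made up for this example and do not correspond to any jbd2 interface. It counts the flushes per commit and checks that a flush still separates the commit block from the metadata writes.

```python
# Toy model of the two commit orderings. The step names are illustrative
# labels taken from the lists above, not jbd2 functions.
CURRENT = [
    "write log blocks",
    "flush",
    "write commit block",
    "flush",
    "write metadata blocks",
]
PROPOSED = [
    "write log blocks",
    "write commit block",
    "flush",
    "write metadata blocks",
]


def count_flushes(seq):
    """Number of cache flushes issued for one transaction commit."""
    return sum(1 for step in seq if step == "flush")


def commit_flushed_before_metadata(seq):
    """True if a flush sits between the commit block and the metadata writes."""
    commit_at = seq.index("write commit block")
    metadata_at = seq.index("write metadata blocks")
    return "flush" in seq[commit_at + 1:metadata_at]


for name, seq in (("current", CURRENT), ("proposed", PROPOSED)):
    print(name, count_flushes(seq), commit_flushed_before_metadata(seq))
```

In this model the proposed ordering drops the flush between the log blocks and the commit block, while keeping the one that must complete before metadata is written in place.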
This should hopefully reduce the performance hit by half, since we're
eliminating one of the flushes. Even more interesting would be deferring
the flush until right before we attempt to write the metadata blocks,
while letting through data writes which don't require metadata updates.
That should be safe, even in data=ordered mode. The point is we
should think about ways that we can optimize barrier mode for ext4.
If we do this, then it may be that people will find it interesting to
mount ext3 filesystems using ext4, even without making any additional
changes, because of the better choices of speed/safety tradeoffs.
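The single-flush ordering discussed earlier depends on the commit block carrying a checksum over the transaction's log blocks, so that replay can detect a commit block that reached the disk ahead of some of its log blocks. A minimal sketch of that check follows, under stated assumptions: the dictionary layout, the function names, and the use of zlib's CRC32 are illustrative stand-ins, not the actual on-disk journal format.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for the example


def make_commit_block(log_blocks):
    """Build a hypothetical commit record: block count plus a CRC32
    computed over every log block in the transaction."""
    crc = 0
    for block in log_blocks:
        crc = zlib.crc32(block, crc)
    return {"nr_blocks": len(log_blocks), "crc32": crc}


def transaction_is_valid(blocks_on_disk, commit_block):
    """Replay-time check: recompute the checksum from what actually
    reached the disk and compare it with the commit record."""
    if len(blocks_on_disk) != commit_block["nr_blocks"]:
        return False
    return make_commit_block(blocks_on_disk)["crc32"] == commit_block["crc32"]


log = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"C" * BLOCK_SIZE]
commit = make_commit_block(log)

# Every log block reached the disk: the transaction can be replayed.
print(transaction_is_valid(log, commit))

# Power was lost after the commit block hit the platter but before the
# second log block did, so stale contents are still in its place.
torn = [log[0], b"\0" * BLOCK_SIZE, log[2]]
print(transaction_is_valid(torn, commit))
```

With such a check, a transaction whose log blocks did not all reach the disk is discarded at replay instead of being applied, which is what lets the commit block be submitted without a flush in front of it.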
- Ted