From: Chris Mason <mason@suse.com>
To: Andrew Morton <akpm@osdl.org>
Cc: B.Zolnierkiewicz@elka.pw.edu.pl, axboe@suse.de, edt@aei.ca,
    linux-kernel@vger.kernel.org
Subject: Re: ide errors in 7-rc1-mm1 and later
Date: Thu, 10 Jun 2004 11:14:07 -0400
Message-ID: <1086880447.10973.333.camel@watt.suse.com>
In-Reply-To: <20040609173856.4463e36f.akpm@osdl.org>

On Wed, 2004-06-09 at 20:38, Andrew Morton wrote:
> Chris Mason <mason@suse.com> wrote:
> >
> > On Wed, 2004-06-09 at 19:50, Andrew Morton wrote:
> > > Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> wrote:
> > > >
> > > > Does the journal have a checksum or some other protection against failure
> > > > while writing the journal to disk? If not, then it can still be screwed even
> > > > with ordered writes if we are unfortunate enough. ;-)
> > >
> > > A transaction is written to disk as two synchronous operations: write all
> > > the data, wait on it, write the single commit block, wait on that.
> > >
> > > If the commit block were to hit disk before the data then we have a window
> > > in which poweroff+recovery would replay garbage into the filesystem.
> > >
> > > So I think we have a bug in the current ext3 barrier implementation - we
> > > need a blk_issue_flush() before submitting the buffer_ordered commit block.
> >
> > The IDE barriers are both a pre and post flush. If the commit block is
> > ordered, before the commit block hits the disk we know all the blocks
> > previously submitted are also on disk.
> >
>
> Oh, OK. Will the same apply to (for example) scsi?
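
[A minimal sketch of the ordering argument quoted above, as a toy userspace
model rather than kernel code. It assumes a drive with a volatile write cache
that may destage cached blocks in any order. Every name here (toy_disk,
disk_write, disk_destage, disk_flush, journal_consistent and the two crash_*
scenarios) is hypothetical and exists only for illustration; none of it is the
ext3, JBD or IDE implementation.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BLOCKS  4
#define NR_DATA    3            /* journal data blocks 0..2 */
#define COMMIT_BLK 3            /* the single commit block */

/* Toy drive: a write lands in a volatile cache until it is destaged. */
struct toy_disk {
    bool cached[NR_BLOCKS];     /* acknowledged, but only in the write cache */
    bool durable[NR_BLOCKS];    /* actually on the platter */
};

static void disk_write(struct toy_disk *d, size_t blk)
{
    assert(blk < NR_BLOCKS);
    d->cached[blk] = true;
}

/* The drive moves one cached block to the platter, in an order it chooses. */
static void disk_destage(struct toy_disk *d, size_t blk)
{
    assert(blk < NR_BLOCKS);
    if (d->cached[blk]) {
        d->cached[blk] = false;
        d->durable[blk] = true;
    }
}

/* Cache flush: everything cached so far becomes durable. */
static void disk_flush(struct toy_disk *d)
{
    for (size_t blk = 0; blk < NR_BLOCKS; blk++)
        disk_destage(d, blk);
}

/* Recovery replays the transaction only if the commit block is durable. */
static bool journal_consistent(const struct toy_disk *d)
{
    if (!d->durable[COMMIT_BLK])
        return true;            /* no commit record, transaction is ignored */
    for (size_t blk = 0; blk < NR_DATA; blk++)
        if (!d->durable[blk])
            return false;       /* commit present, data missing: garbage */
    return true;
}

/* No barrier: the drive destages the commit block first, then power fails. */
static bool crash_after_plain_commit(void)
{
    struct toy_disk d = {0};

    for (size_t blk = 0; blk < NR_DATA; blk++)
        disk_write(&d, blk);
    disk_write(&d, COMMIT_BLK);
    disk_destage(&d, COMMIT_BLK);
    return journal_consistent(&d);
}

/* Pre-flush + commit + post-flush, checked at the same worst-case point. */
static bool crash_during_barrier_commit(void)
{
    struct toy_disk d = {0};
    bool ok;

    for (size_t blk = 0; blk < NR_DATA; blk++)
        disk_write(&d, blk);
    disk_flush(&d);                 /* pre-flush: data is durable first */
    disk_write(&d, COMMIT_BLK);
    disk_destage(&d, COMMIT_BLK);
    ok = journal_consistent(&d);    /* state if power failed right here */
    disk_flush(&d);                 /* post-flush: commit block is durable */
    return ok && journal_consistent(&d);
}
```

[In this model crash_after_plain_commit() reports an inconsistent journal and
crash_during_barrier_commit() reports a consistent one, because the pre-flush
closes the window in which the commit block can be durable while the data is
not. Calling the two functions from a small main() is enough to try it out.]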

For scsi the general expectation is that write cache will be off unless
it is battery backed. blkdev_issue_flush does go down to scsi, but I'm
not sure about the regular WRITE_BARRIER stuff. Jens?

It's true that we need an extra step for external journals in both ext3
and reiser. We need extra flushes for O_SYNC and O_DIRECT as well, but I
wanted to get the core basics working and the API fixed before we start
sprinkling flushes all over the kernel for complete coverage.

I just did some benchmarking of the two BH_Eopnotsupp patches I sent,
and for synctest -t 20 -f -n 1 dir, there's not enough difference
between barriers on and off for ext3 (1-2% at most). It doesn't look
like ext3_sync_file is triggering commits all the time, so I think we
need extra flushes there too.

Andrew, both O_SYNC and ext3 fsync rely on inode->i_state & I_DIRTY to
decide when to call write_inode(wait = 1). What happens when a
background writeout clears I_DIRTY without triggering the commit? Looks
like we won't wait on the transaction to complete in this case.
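
[A minimal sketch of the scenario asked about above, as a toy userspace model
under the assumption that the concern is right. TOY_I_DIRTY, toy_inode and
all toy_* functions are hypothetical stand-ins, and the flag value is not the
kernel's. This is not the ext3 or VFS code.]

```c
#include <stdbool.h>

#define TOY_I_DIRTY 0x1u    /* stand-in for I_DIRTY, not the real value */

struct toy_inode {
    unsigned int i_state;
    bool txn_committed;     /* has the transaction holding the update committed? */
};

/* Background writeout: clears the dirty bit but does not wait for a commit. */
static void toy_background_writeout(struct toy_inode *inode)
{
    inode->i_state &= ~TOY_I_DIRTY;
}

/* Models write_inode(wait = 1): forces the commit and waits for it. */
static void toy_write_inode_wait(struct toy_inode *inode)
{
    inode->i_state &= ~TOY_I_DIRTY;
    inode->txn_committed = true;
}

/* fsync that decides purely on the dirty bit; returns whether the update
 * is committed by the time fsync returns. */
static bool toy_fsync(struct toy_inode *inode)
{
    if (inode->i_state & TOY_I_DIRTY)
        toy_write_inode_wait(inode);
    return inode->txn_committed;
}

/* Dirty inode, no background writeout before fsync. */
static bool fsync_durable_without_writeout(void)
{
    struct toy_inode inode = { .i_state = TOY_I_DIRTY, .txn_committed = false };

    return toy_fsync(&inode);
}

/* Dirty inode, background writeout runs first and clears the dirty bit. */
static bool fsync_durable_after_writeout(void)
{
    struct toy_inode inode = { .i_state = TOY_I_DIRTY, .txn_committed = false };

    toy_background_writeout(&inode);
    return toy_fsync(&inode);
}
```

[In this model fsync_durable_without_writeout() returns true, while
fsync_durable_after_writeout() returns false: the dirty bit is already clear,
so toy_fsync() never waits on the commit. Whether the real code paths behave
this way is exactly the open question.]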

-chris