From: Chris Mason <mason@suse.com>
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>,
	Jens Axboe <axboe@suse.de>, Jeff Garzik <jgarzik@pobox.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] barrier patch set
Date: Tue, 30 Mar 2004 17:13:17 -0500
Message-ID: <1080684797.3546.85.camel@watt.suse.com>
In-Reply-To: <1080683417.1978.53.camel@sisko.scot.redhat.com>

On Tue, 2004-03-30 at 16:50, Stephen C. Tweedie wrote:
> Hi,
> 
> On Tue, 2004-03-30 at 20:19, Chris Mason wrote:
> 
> 
> > I think we're mixing a few concepts together.  submit_bh(WRITE_BARRIER,
> > bh) gives us an ordered write in whatever form the lower layers can
> > provide.  It also ensures that if you happen to call wait_on_buffer()
> > for the barrier buffer, the wait won't return until the data is on
> > media.
> 
> Right, but that's just how it works right now --- one doesn't _have_ to
> imply the other.  You could easily imagine an implementation that
> implements barriers and flushing separately, and which does not do
> automatic flushing on completion of WRITE_BARRIER IOs.  SCSI with
> writeback caching enabled might be one example of that.  NBD/DRBD would
> be another likely candidate --- if you've got network latencies in the
> way, then a flushing sync may be far more expensive than a barrier
> propagation.
> 
Yes, that's true, although the barrier doesn't really imply a flush; it
just implies that if you do use wait_on_* for flushing, it will report
things accurately.

> Unfortunately, a lot of the cases we care about really have to do the
> barrier via flushing, so the benefit of keeping them separate is
> limited.  For LVM/raid0, for example, we've got no way of preserving
> ordering between IOs on different drives, so a flush is necessary there
> unless we start journaling the low-level IOs to preserve order.
> 
Right.

> Yep.  It scares me to think what performance characteristics we'll start
> seeing once that gets used everywhere it's needed, though.  If every raw
> or O_DIRECT write needs a flush after it, databases are going to become
> very sensitive to flush performance.  I guess disabling the flushing and
> using disks which tell the truth about data hitting the platter is the
> sane answer there.

Most database benchmarks are done on SCSI, and the blkdev_flush should
be a no-op there.  For IDE-based database and mail server benchmarks, the
results won't be pretty.

The reiserfs fsync code tries hard to only flush once, so if a commit is
done then blkdev_flush isn't called.  We might have to do a few other
tricks to queue up multiple synchronous IOs and only flush once.

-chris
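
[Editorial sketch, not part of the original message.]  The "queue up
multiple synchronous IOs and only flush once" idea can be illustrated
with a toy userspace model of a disk with a volatile write cache.  This
is not the reiserfs or block layer code; toy_disk, toy_write, toy_flush,
sync_each and sync_batched are all hypothetical names invented for
illustration, and the model only counts flushes, assuming the flush is
the expensive step.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a disk with a volatile write cache (hypothetical). */
struct toy_disk {
    size_t cached;   /* writes sitting in the volatile cache */
    size_t durable;  /* writes that have reached the media */
    size_t flushes;  /* number of cache flushes issued */
};

/* A write completes as soon as it lands in the cache. */
void toy_write(struct toy_disk *d)
{
    d->cached++;
}

/* A flush moves everything currently in the cache to the media. */
void toy_flush(struct toy_disk *d)
{
    d->durable += d->cached;
    d->cached = 0;
    d->flushes++;
}

/* Naive approach: every synchronous write is followed by its own flush. */
void sync_each(struct toy_disk *d, size_t nr_writes)
{
    assert(d != NULL);
    for (size_t i = 0; i < nr_writes; i++) {
        toy_write(d);
        toy_flush(d);
    }
}

/* Batched approach: queue all the writes, then flush once at the end. */
void sync_batched(struct toy_disk *d, size_t nr_writes)
{
    assert(d != NULL);
    for (size_t i = 0; i < nr_writes; i++)
        toy_write(d);
    if (d->cached > 0)
        toy_flush(d);
}
```

In this model, four writes through sync_each() cost four flushes, while
sync_batched() makes the same four writes durable with a single flush.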
