From: Jamie Lokier <jamie@shareable.org>
To: Werner Almesberger <wa@almesberger.net>
Cc: Bryan Henderson <hbryan@us.ibm.com>, linux-fsdevel@vger.kernel.org
Subject: Re: barriers vs. reads - O_DIRECT
Date: Thu, 24 Jun 2004 19:50:59 +0100
Message-ID: <20040624185059.GA11175@mail.shareable.org>
In-Reply-To: <20040624144638.V1325@almesberger.net>

Werner Almesberger wrote:
> Bryan Henderson wrote:
> > It seems obvious to me that whatever ordering guarantees the user gets 
> > without the O_DIRECT flag, he should get with it as well.
> 
> Yes, it would be nice if we could obtain such behaviour without
> unacceptable performance sacrifices. It seems to me that, if we
> can find an efficient way for serializing all write-write and
> read-write overlaps, plus have explicit barriers for serializing
> non-overlapping writes, this should yield pretty much what
> everyone wants (*). Now, that "if" needs a bit of work ... :-)

Note that what filesystems and databases want is write-write *partial
dependencies*.  The per-device I/O barrier is just a crude
approximation.

1. Think about this: two filesystems on different partitions of the same
device.  The writes of each filesystem are independent, yet the
barriers will force the writes of one filesystem to come before
later-queued writes of the other.

2. Or, two database back-ends doing direct I/O to two separate files.

It's probably not a big performance penalty, but it illustrates that
the barriers are "bigger" than they need to be.  Worth taking into
account when deciding what minimal ordering everyone _really_ wants.
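
To make cases 1 and 2 concrete, here is a toy model, not kernel code. The
queue layout, the stream labels "fsA" and "fsB", and both function names are
made up for illustration. It lists the write-write orderings a device-wide
barrier imposes and compares them with the orderings the issuing filesystem
actually asked for.

```python
# Toy model: a queue is a list of (stream, kind) tuples in submission order.
# "stream" is a hypothetical label for whoever issued the request.

def device_wide_constraints(queue):
    """Pairs (i, j): write i must reach the device before write j,
    when every barrier orders all writes on the device."""
    constraints = set()
    for b, (_stream, kind) in enumerate(queue):
        if kind != "barrier":
            continue
        before = [i for i in range(b) if queue[i][1] == "write"]
        after = [j for j in range(b + 1, len(queue)) if queue[j][1] == "write"]
        constraints.update((i, j) for i in before for j in after)
    return constraints

def per_stream_constraints(queue):
    """Same, but a barrier only orders writes from the stream that issued it."""
    constraints = set()
    for b, (stream, kind) in enumerate(queue):
        if kind != "barrier":
            continue
        before = [i for i in range(b) if queue[i] == (stream, "write")]
        after = [j for j in range(b + 1, len(queue)) if queue[j] == (stream, "write")]
        constraints.update((i, j) for i in before for j in after)
    return constraints

queue = [
    ("fsA", "write"),    # 0
    ("fsB", "write"),    # 1
    ("fsA", "barrier"),  # 2: only fsA cares about this one
    ("fsA", "write"),    # 3
    ("fsB", "write"),    # 4
]

print(sorted(device_wide_constraints(queue)))  # [(0, 3), (0, 4), (1, 3), (1, 4)]
print(sorted(per_stream_constraints(queue)))   # [(0, 3)]
```

Three of the four orderings in the device-wide case involve fsB, which never
asked for a barrier. That is the sense in which the barrier is "bigger" than
needed.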

If you do implement overlap detection logic, then would giving
barriers an I/O range be helpful?  E.g. corresponding to partitions.
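
A rough sketch of what a barrier with an I/O range could mean, assuming
overlap detection is already available. This is only a model of the idea, not
an existing interface: the tuple layout, the half-open sector ranges, and the
partition boundaries are all invented for the example.

```python
# Toy model: each request is (kind, (start, end)) with a half-open sector range.
# For a barrier, the range is the region it is meant to order (e.g. a partition).

def overlaps(a, b):
    """True if half-open ranges a and b share at least one sector."""
    return a[0] < b[1] and b[0] < a[1]

def ranged_constraints(queue):
    """Pairs (i, j): write i must reach the device before write j.
    A barrier only orders writes whose range overlaps the barrier's range."""
    constraints = set()
    for b, (kind, scope) in enumerate(queue):
        if kind != "barrier":
            continue
        before = [i for i in range(b)
                  if queue[i][0] == "write" and overlaps(queue[i][1], scope)]
        after = [j for j in range(b + 1, len(queue))
                 if queue[j][0] == "write" and overlaps(queue[j][1], scope)]
        constraints.update((i, j) for i in before for j in after)
    return constraints

PART1 = (0, 1000)      # hypothetical partition 1
WHOLE = (0, 2000)      # the whole (hypothetical) device

def make_queue(barrier_scope):
    return [
        ("write", (10, 18)),        # 0: partition 1
        ("write", (1500, 1508)),    # 1: partition 2
        ("barrier", barrier_scope), # 2
        ("write", (20, 28)),        # 3: partition 1
        ("write", (1600, 1608)),    # 4: partition 2
    ]

print(sorted(ranged_constraints(make_queue(WHOLE))))  # [(0, 3), (0, 4), (1, 3), (1, 4)]
print(sorted(ranged_constraints(make_queue(PART1))))  # [(0, 3)]
```

With the barrier scoped to partition 1, writes to partition 2 stay free to be
reordered across it, while a whole-device range reproduces the current
behaviour.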

Here are a few more cases, which may not be handled quite right even now:

3. What if a journal is on a different device to its filesystem?
Ideally, write barriers between the different device queues would be
appropriate.

4. A journalling filesystem mounted on a loopback device.  Is this
reliable now?

5. A journalling filesystem mounted on two loopback devices -- one for
the fs, one for the journal.

> (*) The only difference being that a completing read doesn't
>     tell you whether the elevator has already passed a barrier.
>     Currently, one could be lured into depending on this.

Isn't the barrier itself an I/O operation which can be waited on?
I agree something could depend on the reads at the moment.

-- Jamie
