From: Jamie Lokier <jamie@shareable.org>
To: Werner Almesberger <wa@almesberger.net>
Cc: Bryan Henderson <hbryan@us.ibm.com>, linux-fsdevel@vger.kernel.org
Subject: Re: barriers vs. reads - O_DIRECT
Date: Thu, 24 Jun 2004 19:50:59 +0100
Message-ID: <20040624185059.GA11175@mail.shareable.org>
In-Reply-To: <20040624144638.V1325@almesberger.net>
Werner Almesberger wrote:
> Bryan Henderson wrote:
> > It seems obvious to me that whatever ordering guarantees the user gets
> > without the O_DIRECT flag, he should get with it as well.
>
> Yes, it would be nice if we could obtain such behaviour without
> unacceptable performance sacrifices. It seems to me that, if we
> can find an efficient way for serializing all write-write and
> read-write overlaps, plus have explicit barriers for serializing
> non-overlapping writes, this should yield pretty much what
> everyone wants (*). Now, that "if" needs a bit of work ... :-)
Note that what filesystems and databases want is write-write *partial
dependencies*. The per-device I/O barrier is just a crude
approximation.

1. Think about this: two filesystems on different partitions of the same
   device. The writes of each filesystem are independent, yet the
   barriers will force the writes of one filesystem to come before
   later-queued writes of the other.

2. Or, two database back-ends doing direct I/O to two separate files.

It's probably not a big performance penalty, but it illustrates that
the barriers are "bigger" than they need to be. Worth taking into
account when deciding what minimal ordering everyone _really_ wants.
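To make the "bigger than they need to be" point concrete, here is a
toy model, not kernel code. The `Req` type, the `stream` tag and the
`may_pass` function are all made up for illustration; the sketch
ignores overlap between requests and only asks whether a barrier sits
between two queued requests.

```python
from typing import NamedTuple, Sequence


class Req(NamedTuple):
    stream: str  # hypothetical tag: which filesystem or file issued the request
    kind: str    # "write" or "barrier"


def may_pass(queue: Sequence[Req], early: int, late: int, per_stream: bool) -> bool:
    """Return True if queue[late] may be dispatched before queue[early].

    per_stream=False models a device-wide barrier: nothing crosses it.
    per_stream=True models a partial dependency: a barrier only orders
    requests that belong to the stream which issued it.
    """
    if not 0 <= early < late < len(queue):
        raise ValueError("need 0 <= early < late < len(queue)")
    a, b = queue[early], queue[late]
    for r in queue[early:late + 1]:
        if r.kind != "barrier":
            continue
        if not per_stream:
            return False  # device-wide barrier blocks every reordering
        if r.stream == a.stream == b.stream:
            return False  # barrier belongs to the stream of both requests
    return True


# Two filesystems sharing one device queue (case 1 above).
queue = [
    Req("fsA", "write"),
    Req("fsA", "barrier"),
    Req("fsA", "write"),
    Req("fsB", "write"),
]
```

With this queue, `may_pass(queue, 0, 3, per_stream=False)` is False:
fsB's write is stuck behind fsA's barrier. With `per_stream=True` the
same call returns True, while fsA's own later write (index 2) still
cannot pass fsA's earlier one.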
If you do implement overlap detection logic, then would giving
barriers an I/O range be helpful? E.g. corresponding to partitions.
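A rough sketch of what a ranged barrier could look like, assuming a
barrier carries a sector range and only orders writes that intersect
it. The names `Range`, `overlaps` and `ordered_by` are hypothetical,
as are the partition sizes; nothing like this exists in the block
layer as far as this thread goes.

```python
from typing import NamedTuple


class Range(NamedTuple):
    start: int   # first sector
    length: int  # number of sectors, assumed > 0


def overlaps(a: Range, b: Range) -> bool:
    """True if the half-open sector ranges [start, start + length) intersect."""
    return a.start < b.start + b.length and b.start < a.start + a.length


def ordered_by(barrier: Range, write: Range) -> bool:
    """A ranged barrier constrains only writes that touch its range."""
    return overlaps(barrier, write)


# Hypothetical layout: two 1000-sector partitions on one device.
PART1 = Range(0, 1000)
PART2 = Range(1000, 1000)
```

A barrier issued with range `PART1` orders a write at `Range(992, 8)`
but leaves a write at `Range(1500, 8)`, which lies in the second
partition, free to be scheduled on either side of it.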
Here are a few more cases, which may not be quite right even now:

3. What if a journal is on a different device to its filesystem?
   Ideally, write barriers between the different device queues would be
   appropriate.

4. A journalling filesystem mounted on a loopback device. Is this
   reliable now?

5. A journalling filesystem mounted on two loopback devices -- one for
   the fs, one for the journal.
> (*) The only difference being that a completing read doesn't
> tell you whether the elevator has already passed a barrier.
> Currently, one could be lured into depending on this.
Isn't the barrier itself an I/O operation which can be waited on?
I agree something could depend on the reads at the moment.
-- Jamie