From: Josef Bacik <jbacik@fusionio.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Josef Bacik <JBacik@fusionio.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"viro@ZenIV.linux.org.uk" <viro@ZenIV.linux.org.uk>,
"jmoyer@redhat.com" <jmoyer@redhat.com>,
"zab@redhat.com" <zab@redhat.com>
Subject: Re: [PATCH] direct-io: allow file systems to do their own waiting for io
Date: Mon, 3 Dec 2012 11:14:03 -0500
Message-ID: <20121203161403.GB4780@localhost.localdomain>
In-Reply-To: <20121203154125.GA13344@infradead.org>
On Mon, Dec 03, 2012 at 08:41:25AM -0700, Christoph Hellwig wrote:
> On Mon, Dec 03, 2012 at 08:37:20AM -0500, Josef Bacik wrote:
> > Btrfs is terrible with O_DIRECT|O_SYNC, mostly because of the constant
> > waiting. The thing is we have a handy way of waiting for IO that we can
> > delay to the very last second so we do all of the O_SYNC work and then wait
> > for a bunch of IO to complete. So introduce a flag to allow the generic
> > direct io stuff to forgo waiting and leave that up to the file system.
> > Thanks,
>
> I don't really like passing another flag for this. If we are going to
> do something like this it should be in a way where:
>
> - the actual waiting code is a helper that btrfs would also use
> - the main dio code is structured in a way that we have a lower level
> entry point that skips the waiting, and a higher level one that also
> calls it.
>
> That being said I'm not imaginative enough to see how you're actually
> going to use it. Posting the btrfs side would help with that.
>
Hrm, so I can do that, but it may not make much sense. Here are the two
relevant patches (older versions, but they get the idea across):
http://git.kernel.org/?p=linux/kernel/git/josef/btrfs-next.git;a=commit;h=78b40072c556d82fac5e58793a3178887ac057ec
http://git.kernel.org/?p=linux/kernel/git/josef/btrfs-next.git;a=commit;h=b7728f1b19eeb2041e3d4da22fd3d5a5c11abd3c
Basically, what happens in btrfs now for O_SYNC/fsync(), with or without
O_DIRECT, is this:
write()
fsync()/O_SYNC
	start and wait on all io to complete
	log changed metadata into special tree
	write and wait on our new log
	sync super which points at our new log
What I'm trying to accomplish is this:
write()
fsync()/O_SYNC
	start io
	log changed metadata into special tree
	write log and then wait on log and data
	sync super
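To make the difference between the two orderings concrete, here is a rough
userspace model of them. It is not btrfs code: struct sync_stats, start_io(),
wait_for_io() and the two fsync_*() functions are names made up for this
sketch, and each wait_for_io() call simply stands for one point where the
caller blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: counts how often the caller blocks. Not btrfs code. */
struct sync_stats {
    int started; /* IOs submitted */
    int waits;   /* blocking waits performed */
};

/* Submit an IO without blocking (hypothetical helper). */
void start_io(struct sync_stats *s)
{
    assert(s != NULL);
    s->started++;
}

/* Block until everything submitted so far has completed (hypothetical). */
void wait_for_io(struct sync_stats *s)
{
    assert(s != NULL);
    s->waits++;
}

/* Current ordering: wait on the data before the log is even written. */
int fsync_current(void)
{
    struct sync_stats s = {0, 0};

    start_io(&s);    /* data */
    wait_for_io(&s); /* wait on all data io */
    /* log changed metadata into the special tree (no IO wait here) */
    start_io(&s);    /* log */
    wait_for_io(&s); /* wait on the new log */
    start_io(&s);    /* super */
    wait_for_io(&s); /* sync super */
    return s.waits;
}

/* Proposed ordering: data and log are waited on together. */
int fsync_proposed(void)
{
    struct sync_stats s = {0, 0};

    start_io(&s);    /* data, not waited on yet */
    /* log changed metadata into the special tree (no IO wait here) */
    start_io(&s);    /* log */
    wait_for_io(&s); /* one wait covers both log and data */
    start_io(&s);    /* super */
    wait_for_io(&s); /* sync super */
    return s.waits;
}
```

In this model fsync_current() blocks three times and fsync_proposed() blocks
twice, i.e. once for log plus data and once for the super.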
The new ordering gives us a pretty great performance boost, since we only have
to wait once (well, twice if you include the super). But in the O_DIRECT case
the generic code always waits for the writes to complete before it returns to
the file system. For normal O_DIRECT we want that, and that is all the first
patch does: it waits for the IO like we normally would. But for fsync()/O_SYNC
we want to forgo the waiting until the last possible second, so we start the
IO, gather up the ordered extents (what we use to track pending IO), and then,
when we're ready, wait to make sure those ordered extents have completed. We
already have our own helpers and such to keep track of when IO finishes for a
given range, so all we really need is a flag to tell O_DIRECT not to do what it
normally does, since we will take care of it. I'm open to other ways to do
this, but I'd rather not go to all the trouble of creating new helpers and such
that btrfs will just never need to use. Thanks,
Josef
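
For readers who want to see the shape of the idea, the following is a minimal
sketch of a submit path that either waits itself or leaves the waiting to the
caller. Everything here is hypothetical and simulated in userspace: the flag
SKETCH_DIO_CALLER_WAITS, struct sketch_extent (a stand-in for an ordered
extent), and both functions are illustration only, not the kernel's direct-io
interface and not the API from the patches linked above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag for this sketch only; not a real kernel flag. */
#define SKETCH_DIO_CALLER_WAITS 0x1
#define SKETCH_MAX_PENDING 8

/* Stand-in for an ordered extent: one in-flight range of a file. */
struct sketch_extent {
    long start;
    long len;
    int completed;
};

struct sketch_tracker {
    struct sketch_extent pending[SKETCH_MAX_PENDING];
    size_t nr_pending;
    int waits; /* how many times the caller blocked */
};

/* Wait (simulated) until every tracked extent has completed. */
void sketch_wait_pending(struct sketch_tracker *t)
{
    assert(t != NULL);
    if (t->nr_pending == 0)
        return;
    for (size_t i = 0; i < t->nr_pending; i++)
        t->pending[i].completed = 1;
    t->nr_pending = 0;
    t->waits++;
}

/*
 * Submit one direct write. Without the flag this behaves like the
 * generic path and waits before returning; with the flag the range is
 * only recorded and the caller waits later. Returns 0 on success and
 * -1 if the tracker is full.
 */
int sketch_dio_write(struct sketch_tracker *t, long start, long len, int flags)
{
    assert(t != NULL);
    if (t->nr_pending == SKETCH_MAX_PENDING)
        return -1;
    t->pending[t->nr_pending].start = start;
    t->pending[t->nr_pending].len = len;
    t->pending[t->nr_pending].completed = 0;
    t->nr_pending++;
    if (!(flags & SKETCH_DIO_CALLER_WAITS))
        sketch_wait_pending(t);
    return 0;
}
```

With the flag clear, three writes cost three waits. With the flag set, the
caller submits all three and then calls sketch_wait_pending() once, which is
the "wait at the last possible second" behaviour described above.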