From: Christoph Hellwig <hch@infradead.org>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>, Ted Ts'o <tytso@mit.edu>,
Dave Chinner <david@fromorbit.com>,
linux-ext4@vger.kernel.org
Subject: Re: Query about DIO/AIO WRITE throttling and ext4 serialization
Date: Thu, 9 Jun 2011 09:09:08 -0400
Message-ID: <20110609130908.GA22360@infradead.org>
In-Reply-To: <20110603013345.GD27129@redhat.com>

On Thu, Jun 02, 2011 at 09:33:45PM -0400, Vivek Goyal wrote:
> > Yes this patch helps. I have already laid out the file and doing
> > overwrites.
> >
> > I throttled aio-stress in one cgroup to 1 byte/sec and edited another
> > file from other cgroup and did a "sync" and it completed.
>
> Even other test where I am running aio-stress in one window and edited
> a file in another window and typed "sync" worked. "sync" does not hang
> waiting for aio-stress to finish.

I've been thinking about the patch a bit more, and I think it's simply
incorrect. i_iocount is the only thing that actually tracks in-flight
DIO/AIO requests, so we can't actually skip incrementing it as that
means we can't wait for pending AIO in fsync/sync/inode reclaim or
remount r/o.

We could simply declare AIO off limits for sync and skip it there,
which is easily doable, but we'd still need a special-case version of
sync for remount r/o, as that absolutely needs to stop all pending I/O.

Of the other filesystems, ext4 also has the counter, but only waits for
it during inode teardown. It uses a slightly different, but also
effective, scheme for fsync, and completely ignores sync and remount.
I couldn't find a similar scheme in other filesystems supporting AIO,
but it might be hidden a bit better.

I suspect we could optimize things by using the dual count and list
approach that ext4 uses. There is a counter for in-flight direct I/O,
which we only check for inode teardown and remount, as those need to
stop pending I/O; sync and fsync can skip it, as they only need to
flush pending I/O. There is also a list of pending unwritten extent
conversions, which only gets appended to once the actual I/O is done
and the unwritten extent conversion is queued up.
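
To make the split between the counter and the list concrete, here is a
rough single-threaded userspace model of the idea. It is only a sketch:
the names (inode_model, dio_submit, dio_complete, fsync_model,
must_wait_for_dio) are made up for illustration and are not kernel
interfaces, and real code would need atomics or locking plus a wait
queue instead of a plain int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; none of these names exist in the kernel. */
struct pending_conv {
	long offset;
	long len;
	struct pending_conv *next;
};

struct inode_model {
	int dio_inflight;               /* direct I/O submitted, not yet completed */
	struct pending_conv *conv_head; /* completed I/O awaiting extent conversion */
	struct pending_conv *conv_tail;
};

/* Submission only bumps the in-flight counter. */
static void dio_submit(struct inode_model *ino)
{
	ino->dio_inflight++;
}

/*
 * Completion drops the counter and, if the range was unwritten,
 * queues a conversion. The list is never touched before this point.
 */
static int dio_complete(struct inode_model *ino, long offset, long len,
			bool unwritten)
{
	assert(ino->dio_inflight > 0);
	ino->dio_inflight--;

	if (!unwritten)
		return 0;

	struct pending_conv *c = malloc(sizeof(*c));
	if (!c)
		return -1;
	c->offset = offset;
	c->len = len;
	c->next = NULL;
	if (ino->conv_tail)
		ino->conv_tail->next = c;
	else
		ino->conv_head = c;
	ino->conv_tail = c;
	return 0;
}

/*
 * fsync/sync only flush what is already on the list and never look at
 * dio_inflight. Returns the number of conversions processed.
 */
static int fsync_model(struct inode_model *ino)
{
	int converted = 0;

	while (ino->conv_head) {
		struct pending_conv *c = ino->conv_head;

		ino->conv_head = c->next;
		free(c); /* stand-in for the actual extent conversion */
		converted++;
	}
	ino->conv_tail = NULL;
	return converted;
}

/* Inode teardown and remount r/o are the only callers that check the counter. */
static bool must_wait_for_dio(const struct inode_model *ino)
{
	return ino->dio_inflight > 0;
}
```

With two requests submitted and none completed, fsync_model() returns
immediately with nothing to do, while must_wait_for_dio() stays true
until both completions have run. That is the behaviour that would keep
"sync" from hanging behind a throttled aio-stress.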

I'll see if I can come up with a good scheme for that, preferably
sitting directly in the direct I/O code, so that everyone gets it
without additional work.