From: Theodore Ts'o <tytso@mit.edu>
To: Rajat Sharma <fs.rajat@gmail.com>
Cc: Ankit Jain <jankit@suse.de>, Al Viro <viro@zeniv.linux.org.uk>,
bcrl@kvack.org, linux-fsdevel@vger.kernel.org,
linux-aio@kvack.org, linux-kernel@vger.kernel.org,
Jan Kara <jack@suse.cz>
Subject: Re: [RFC][PATCH] Make io_submit non-blocking
Date: Tue, 24 Jul 2012 16:27:38 -0400
Message-ID: <20120724202738.GC29239@thunk.org>
In-Reply-To: <CAAYFAvp7xDG8et_dyqSgEVOAr7ykNBkgBhs1m5gGjoOdQFkxGQ@mail.gmail.com>
On Tue, Jul 24, 2012 at 06:04:23PM +0530, Rajat Sharma wrote:
> >
> > Currently, io_submit tries to execute the io requests on the
> > same thread, which could block because of various reasons (e.g.
> > allocation of disk blocks). So, essentially, io_submit ends
> > up being a blocking call.
>
> Ideally the filesystem should take care of it, e.g. by deferring such
> time-consuming allocations and returning -EIOCBQUEUED immediately. But
> have you seen such cases?
Oh, it happens all the time if you are using AIO. If the file system
needs to read or write any metadata block, AIO can become distinctly
non-"A". The workaround that I've chosen is to cache the information
needed for the bmap() operation, triggered via an ioctl() issued at
open time, so that this is not an issue. But that only works if the
file is pre-allocated and there is no need to do any block
allocations.
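
As a rough userspace illustration of the pre-allocation condition above: the sketch below reserves a file's space up front so that later writes inside that range do not extend the file. It is not the ext4 ioctl described in this message (those patches are unpublished), and the helper name `preallocate` is hypothetical. Whether pre-allocation removes every source of blocking depends on the file system, so treat this as a sketch under those assumptions.

```python
import os
import tempfile


def preallocate(path: str, size: int) -> int:
    """Create `path`, reserve `size` bytes of space, and return the open fd.

    Hypothetical helper: os.posix_fallocate() asks the file system to
    allocate the range now instead of at write time.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
    except BaseException:
        os.close(fd)
        raise
    return fd


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        fd = preallocate(os.path.join(d, "data.bin"), 1 << 20)
        try:
            # This write lands inside the reserved range, so the file
            # does not grow as a result of it.
            os.pwrite(fd, b"x" * 4096, 8192)
            print(os.fstat(fd).st_size)
        finally:
            os.close(fd)
```
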
It's all very well and good to say, "the file system should handle
it", but that just pushes the problem onto the file system. And since
you need to potentially issue block I/O requests, which you can't do
from an interrupt context (i.e., a block I/O completion handler), you
really do need to create a workqueue in order to make things work.
If you do it in the fs/direct-io.c layer, at least that way you can
solve the problem once for all file systems....
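
The hand-off described above can be sketched in userspace terms: a handler that must not block only enqueues work, and a separate worker thread performs the step that may sleep. This is an analogy in Python, not kernel code. `WorkQueue`, `completion_handler`, and `issue_followup_io` are hypothetical names, and the single worker thread is an assumption made to keep the sketch short.

```python
import queue
import threading


class WorkQueue:
    """One worker thread draining a FIFO of deferred callables."""

    def __init__(self):
        self._q = queue.Queue()  # unbounded, so put() never waits
        self._t = threading.Thread(target=self._run, daemon=True)
        self._t.start()

    def defer(self, fn, *args):
        # Safe to call from a context that must not block.
        self._q.put((fn, args))

    def _run(self):
        while True:
            item = self._q.get()
            try:
                if item is None:  # sentinel pushed by stop()
                    return
                fn, args = item
                # In this sketch an exception here ends the worker.
                fn(*args)
            finally:
                self._q.task_done()

    def flush(self):
        """Wait until everything deferred so far has run."""
        self._q.join()

    def stop(self):
        self._q.put(None)
        self._t.join()


completed = []


def issue_followup_io(block):
    # Stand-in for work that may sleep, such as issuing more block I/O.
    completed.append(block)


def completion_handler(wq, block):
    # Stand-in for interrupt context: only hand the work off.
    wq.defer(issue_followup_io, block)


if __name__ == "__main__":
    wq = WorkQueue()
    for b in (1, 2, 3):
        completion_handler(wq, b)
    wq.flush()
    wq.stop()
    print(completed)
```
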
> With lots of application threads firing continuous IOs, workqueue
> threads might become a bottleneck and you might eventually have to
> develop priority scheduling. This workqueue was originally designed
> for IO retries, which is an error path; now consumers of the workqueue
> might easily increase by 100x.
Yes, you definitely need to throttle how many AIOs are allowed to be
outstanding, either globally or on a
per-superblock/process/user/cgroup basis, and return EAGAIN if there
are too many outstanding requests.
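
A minimal sketch of that throttling policy, assuming a fixed per-key limit where the key stands for whatever granularity is chosen (superblock, process, user, or cgroup). The class `AioThrottle` and its methods are hypothetical and only model the accounting; they are not an existing kernel or library interface.

```python
import errno
import threading
from collections import defaultdict


class AioThrottle:
    """Count outstanding requests per key and refuse new ones past a limit."""

    def __init__(self, limit: int):
        self._limit = limit
        self._outstanding = defaultdict(int)
        self._lock = threading.Lock()

    def submit(self, key) -> int:
        """Return 0 if the request is accepted, -EAGAIN if `key` is at its limit."""
        with self._lock:
            if self._outstanding[key] >= self._limit:
                return -errno.EAGAIN
            self._outstanding[key] += 1
            return 0

    def complete(self, key) -> None:
        """Release one slot for `key` when a request finishes."""
        with self._lock:
            if self._outstanding[key] == 0:
                raise RuntimeError("complete() without a matching submit()")
            self._outstanding[key] -= 1


if __name__ == "__main__":
    throttle = AioThrottle(limit=2)
    for _ in range(3):
        print(throttle.submit("cgroup-a"))
```

Using a single shared key for every caller gives the global variant; the caller is expected to retry later when it sees -EAGAIN.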
Speaking of cgroups, one of the other challenges with running the AIO
out of a workqueue is trying to respect cgroup restrictions. In
particular, that means the io-throttle cgroup (which is needed to
provide Proportional I/O support), but also the memory cgroup.
All of these complications are why I decided to simply go with the "pin
metadata" approach, since I didn't need to worry (at least initially)
about the allocating write case. (These patches to ext4 haven't yet
been published upstream, mainly because they need a lot of cleanup
work and I haven't had time to do that cleanup; my intention is to get
the "big extents" patchset upstream, though.)
- Ted