From: Suparna Bhattacharya <suparna@in.ibm.com>
To: Zach Brown <zach.brown@oracle.com>
Cc: linux-aio@kvack.org, akpm@osdl.org, drepper@redhat.com,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
jakub@redhat.com, mingo@elte.hu
Subject: Re: [RFC] Heads up on a series of AIO patchsets
Date: Wed, 3 Jan 2007 10:33:23 +0530 [thread overview]
Message-ID: <20070103050323.GA3201@in.ibm.com> (raw)
In-Reply-To: <5A322D46-A73A-43DD-8667-CE218DDA48B0@oracle.com>
On Tue, Jan 02, 2007 at 03:56:09PM -0800, Zach Brown wrote:
> Sorry for the delay, I'm finally back from the holiday break :)
Welcome back !
>
> >(1) The filesystem AIO patchset, attempts to address one part of
> >the problem
> > which is to make regular file IO, (without O_DIRECT)
> >asynchronous (mainly
> > the case of reads of uncached or partially cached files, and
> >O_SYNC writes).
>
> One of the properties of the currently implemented EIOCBRETRY aio
> path is that ->mm is the only field in current which matches the
> submitting task_struct while inside the retry path.
Yes and that as I guess you know is to enable the aio worker thread to
operate on the caller's address space for copy_from/to_user.
The actual io setup and associated checks are expected to have been
handled at submission time.
>
> It looks like a retry-based aio write path would be broken because of
> this. generic_write_checks() could run in the aio thread and get its
> task_struct instead of that of the submitter. The wrong rlimit will
> be tested and SIGXFSZ won't be raised. remove_suid() could check the
> capabilities of the aio thread instead of those of the submitter.
generic_write_checks() are done in the submission path, not repeated during
retries, so such types of checks are not intended to run in the aio thread.
Did I miss something here ?
>
> I don't think EIOCBRETRY is the way to go because of this increased
> (and subtle!) complexity. What are the chances that we would have
> ever found those bugs outside code review? How do we make sure that
> current references don't sneak back in after having initially audited
> the paths?
The EIOCBRETRY route is not something that is intended to be used blindly,
It is just one alternative to implement an aio operation by splitting up
responsibility between the submitter and aio threads, where aio threads
can run in the caller's address space.
>
> Take the io_cmd_epoll_wait patch..
>
> > issues). The IO_CMD_EPOLL_WAIT patch (originally from Zach
> >Brown with
> > modifications from Jeff Moyer and me) addresses this problem
> >for native
> > linux aio in a simple manner.
>
> It's simple looking, sure. This current flipping didn't even occur
> to me while throwing the patch together!
>
> But that patch ends up calling ->poll (and poll_table->qproc) and
> writing to userspace (so potentially calling ->nopage) from the aio
Yes of course, but why is that a problem ?
The copy_from/to_user/put_user constructs are designed to handle soft failures,
and we are already using the caller's ->mm. Do you see a need for any
additional asserts() ?
If there is something that is needed by ->nopage etc which is not abstracted
out within the ->mm, then we would need to fix that instead, for correctness
anyway, isn't that so ?
Now it is possible that there are minor blocking points in the code and the
effect of these would be to hold up / delay subsequent queued aio operations;
which is an efficiency issue, but not a correctness concern.
> threads. Are we sure that none of them will behave surprisingly
> because current changed under them?
My take is that we should fix the problems that we see. It is likely that
what manifests relatively more easily with AIO is also a subtle problem
in other cases.
>
> It might be safe now, but that isn't really the point. I'd rather we
> didn't have yet one more subtle invariant to audit and maintain.
>
> At the risk of making myself vulnerable to the charge of mentioning
> vapourware, I will admit that I've been working on a (slightly mad)
> implementation of async syscalls. I've been quiet about it because I
> don't want to whip up complicated discussion without being able to
> show code that works, even if barely. I mention it now only to make
> it clear that I want to be constructive, not just critical :).
That is great and I look forward to it :) I am, however assuming that
whatever implementation you come up will have a different interface
from current linux aio -- i.e. a next generation aio model, that will be
easily integratable with kevents etc.
Which takes me back to Ingo's point - lets have the new evolve parallely
with the old, if we can, and not hold up the patches for POSIX AIO to
start using kernel AIO, or for epoll to integrate with AIO.
OK, I just took a quick look at your blog and I see that you
are basically implementing Linus' microthreads scheduling approach -
one year since we had that discussion. Glad to see that you found a way
to make it workable ... (I'm guessing that you are copying over the part
of the stack in use at the time of every switch, is that correct ? At what
point do you do the allocation of the saved stacks ? Sorry I should hold
off all these questions till your patch comes out)
Regards
Suparna
>
> - z
--
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India
next prev parent reply other threads:[~2007-01-03 4:58 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-12-27 15:38 [RFC] Heads up on a series of AIO patchsets Suparna Bhattacharya
2006-12-27 16:25 ` Christoph Hellwig
2006-12-27 16:55 ` Ingo Molnar
2006-12-27 17:18 ` Ingo Molnar
2006-12-28 11:41 ` Evgeniy Polyakov
2007-01-02 21:38 ` Dan Williams
2007-01-03 13:35 ` Evgeniy Polyakov
2006-12-28 8:23 ` [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write Suparna Bhattacharya
2006-12-28 8:34 ` [FSAIO][PATCH 1/6] Add a wait queue parameter to the wait_bit action routine Suparna Bhattacharya
2006-12-28 8:46 ` Suparna Bhattacharya
2006-12-28 8:36 ` [FSAIO][PATCH 2/8] Rename __lock_page to lock_page_slow Suparna Bhattacharya
2006-12-28 8:39 ` [FSAIO][PATCH 3/8] Routines to initialize and test a wait bit key Suparna Bhattacharya
2006-12-28 22:42 ` Andrew Morton
2006-12-28 8:39 ` [FSAIO][PATCH 4/8] Add a default io wait bit field in task struct Suparna Bhattacharya
2006-12-28 8:40 ` [FSAIO][PATCH 5/8] Enable wait bit based filtered wakeups to work for AIO Suparna Bhattacharya
2006-12-28 8:41 ` [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page Suparna Bhattacharya
2006-12-28 11:55 ` Christoph Hellwig
2006-12-28 14:47 ` Suparna Bhattacharya
2007-01-02 14:26 ` Christoph Hellwig
2007-01-04 6:50 ` Nick Piggin
2006-12-28 8:42 ` [FSAIO][PATCH 7/8] Filesystem AIO read Suparna Bhattacharya
2006-12-28 11:57 ` Christoph Hellwig
2006-12-28 14:15 ` Christoph Hellwig
2006-12-28 15:18 ` Suparna Bhattacharya
2007-01-02 14:29 ` Christoph Hellwig
2006-12-28 16:22 ` Jan Engelhardt
2006-12-28 16:56 ` Randy Dunlap
2006-12-28 8:44 ` [FSAIO][PATCH 8/8] AIO O_SYNC filesystem write Suparna Bhattacharya
2006-12-28 9:52 ` [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write Ingo Molnar
2006-12-28 22:53 ` Andrew Morton
2007-01-03 22:15 ` Andrew Morton
2007-01-04 4:56 ` Suparna Bhattacharya
2007-01-04 5:51 ` Nick Piggin
2007-01-04 6:26 ` Suparna Bhattacharya
2007-01-04 6:50 ` Nick Piggin
2007-01-04 11:24 ` Suparna Bhattacharya
2007-01-05 4:56 ` Nick Piggin
2007-01-04 17:02 ` Andrew Morton
2007-01-04 17:49 ` Jens Axboe
2007-01-05 6:28 ` Suparna Bhattacharya
2007-01-05 7:02 ` Jens Axboe
2007-01-05 8:08 ` Suparna Bhattacharya
2007-01-05 8:32 ` Jens Axboe
2007-01-10 5:44 ` Suparna Bhattacharya
2007-01-11 1:08 ` Andrew Morton
2007-01-11 3:13 ` Suparna Bhattacharya
2007-01-11 4:52 ` Andrew Morton
2007-01-02 23:56 ` [RFC] Heads up on a series of AIO patchsets Zach Brown
[not found] ` <6f703f960701021640y444bc537w549fd6d74f3e9529@mail.gmail.com>
[not found] ` <A85B8249-FC4E-4612-8B28-02BC680DC812@oracle.com>
2007-01-03 1:18 ` Kent Overstreet
2007-01-04 20:33 ` Pavel Machek
2007-01-03 5:03 ` Suparna Bhattacharya [this message]
2007-01-05 0:36 ` Zach Brown
2007-01-03 7:23 ` [PATCHSET 2][PATCH 1/1] Combining epoll and disk file AIO Suparna Bhattacharya
2007-01-04 9:27 ` [PATCHSET 3][PATCH 0/5][AIO] - AIO completion signal notification v4 Bharata B Rao
2007-01-04 9:30 ` [PATCHSET 3][PATCH 1/5][AIO] - Rework compat_sys_io_submit Bharata B Rao
2007-01-04 9:32 ` [PATCHSET 3][PATCH 2/5][AIO] - fix aio.h includes Bharata B Rao
2007-01-04 9:34 ` [PATCHSET 3][PATCH 3/5][AIO] - Make good_sigevent non-static Bharata B Rao
2007-01-04 9:38 ` [PATCHSET 3][PATCH 4/5][AIO] - AIO completion signal notification Bharata B Rao
2007-01-04 9:40 ` [PATCHSET 3][PATCH 5/5][AIO] - Add listio support Bharata B Rao
2007-01-05 5:32 ` [PATCHSET 4][PATCH 1/1] AIO fallback for pipes, sockets and pollable fds Suparna Bhattacharya
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070103050323.GA3201@in.ibm.com \
--to=suparna@in.ibm.com \
--cc=akpm@osdl.org \
--cc=drepper@redhat.com \
--cc=jakub@redhat.com \
--cc=linux-aio@kvack.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=zach.brown@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).