From: Kent Overstreet <koverstreet@google.com>
To: Zach Brown <zab@redhat.com>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, tytso@mit.edu
Subject: Re: [PATCH 5/5] aio: Refactor aio_read_evt, use cmpxchg(), fix bug
Date: Tue, 9 Oct 2012 17:47:46 -0700
Message-ID: <20121010004746.GF26835@google.com>
In-Reply-To: <20121010002634.GX26187@lenny.home.zabbo.net>

On Tue, Oct 09, 2012 at 05:26:34PM -0700, Zach Brown wrote:
> > The AIO ringbuffer stuff just annoys me more than most
> 
> Not more than everyone, though, I can personally promise you that :).
> 
> > (it wasn't until
> > the other day that I realized it was actually exported to userspace...
> > what led to figuring that out was noticing aio_context_t was a ulong,
> > and got truncated to 32 bits with a 32 bit program running on a 64 bit
> > kernel. I'd been horribly misled by the code comments and the lack of
> > documentation.) 
> 
> Yeah.  It's the userspace address of the mmaped ring.  This has annoyed
> the process migration people who can't recreate the context in a new
> kernel because there's no userspace interface to specify creation of a
> context at a specific address.

Yeah I did finally figure that out - and a file descriptor that
userspace then mmap()ed would solve that problem...
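
To illustrate the truncation mentioned above, here is a minimal sketch. It
assumes the handle is nothing more than the address of the mapped ring
stored in an unsigned long. The helper names are hypothetical and no AIO
syscalls are involved; it only shows which addresses survive being stored
in a 32 bit wide handle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the value a 32 bit program ends up holding if a
 * 64 bit ring address is stored in its 32 bit wide unsigned long. */
uint32_t handle_as_seen_by_32bit(uint64_t ring_addr)
{
	return (uint32_t)ring_addr; /* the upper 32 bits are discarded */
}

/* True if the address survives the round trip, i.e. the ring happens to
 * be mapped below 4 GiB. */
bool handle_round_trips(uint64_t ring_addr)
{
	return (uint64_t)handle_as_seen_by_32bit(ring_addr) == ring_addr;
}
```

A ring mapped at 0x7f000000 round trips unchanged, while one mapped at
0x7f1234560000 comes back as 0x34560000 and no longer names the context.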

> 
> > But if we do have an explicit handle, I don't see why it shouldn't be a
> > file descriptor.
> 
> Because they're expensive to create and destroy when compared to a
> single system call.  Imagine that we're using waiting for a single
> completion to implement a cheap one-off sync call.  Imagine it's a
> buffered op which happens to hit the cache and is really quick.

True. But that could be solved with a separate interface that either
doesn't use a context to submit a call synchronously, or uses an
implicit per thread context.

> (And they're annoying to manage: libraries and O_CLOEXEC, running into
> fd/file limit tunables, bleh.)

I don't have a _strong_ opinion there, but my intuition is that we
shouldn't be creating new types of handles without a good reason. I
don't think the annoyances are, for the most part, particular to file
descriptors; I think they tend to apply to handles in general, and
at least with file descriptors they're known and solved.

Also, with a file descriptor it naturally works with an epoll event
loop. (eventfd for aio is a hack).
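
As a rough sketch of the epoll point: any readable file descriptor can be
handed to epoll as-is. The example below uses an eventfd purely as a
stand-in for "a descriptor that becomes readable when a completion is
available"; it does not submit any AIO, and the function names
wait_readable() and signal_completion() are made up for illustration.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
	struct epoll_event out;
	int ep, n;

	ep = epoll_create1(EPOLL_CLOEXEC);
	if (ep < 0)
		return -1;
	if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(ep);
		return -1;
	}
	n = epoll_wait(ep, &out, 1, timeout_ms);
	close(ep);
	return n < 0 ? -1 : (n > 0);
}

/* Stand-in for "a completion happened": bump the eventfd counter,
 * which makes the descriptor readable. */
int signal_completion(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}
```

With an eventfd created by eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK),
wait_readable() times out until signal_completion() has been called on it.
A real event loop would keep one epoll instance around rather than
creating one per wait.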

> If the 'completion context' is no more than a structure in userspace
> memory then a lot of stuff just works.  Tasks can share it amongst
> themselves as they see fit.  A trivial one-off sync call can just dump
> it on the stack and point to it.  It doesn't have to be specifically
> torn down on task exit.

That would be awesome, though for it to be worthwhile there couldn't be
any kernel notion of a context at all and I'm not sure if that's
practical. But the idea hadn't occurred to me before and I'm sure you've
thought about it more than I have... hrm.

Oh hey, that's what acall does :P

For completions though you really want the ringbuffer pinned... what do
you do about that?
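
For concreteness, a sketch of what "a structure in userspace memory" could
look like, with events claimed by compare-and-swap on the head index. This
is not the layout used by acall or by the existing AIO ring; the struct,
the ring size and the function names are all assumptions, and the sketch
says nothing about the pinning question. In the design being discussed the
kernel would be the producer; here both sides are plain userspace code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical; a power of two so indices wrap cleanly */

struct completion {
	uint64_t cookie; /* caller-chosen tag for the operation */
	int64_t  result; /* return value of the completed operation */
};

struct completion_ring {
	_Atomic uint32_t head; /* next slot to consume */
	_Atomic uint32_t tail; /* next slot to fill */
	struct completion events[RING_SIZE];
};

/* Producer side. A single producer is assumed. */
bool ring_post(struct completion_ring *r, struct completion c)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == RING_SIZE)
		return false; /* ring is full */
	r->events[tail % RING_SIZE] = c;
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}

/* Consumer side: claim one event by advancing head with compare-and-swap,
 * so several readers can race without taking a lock. */
bool ring_reap(struct completion_ring *r, struct completion *out)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	for (;;) {
		uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
		struct completion c;

		if (head == tail)
			return false; /* ring is empty */
		/* Speculative copy. If the slot gets recycled under us the CAS
		 * below fails and the copy is thrown away; strict C11 would
		 * want these slot accesses to be atomic as well. */
		c = r->events[head % RING_SIZE];
		if (atomic_compare_exchange_weak_explicit(&r->head, &head, head + 1,
				memory_order_acq_rel, memory_order_relaxed)) {
			*out = c;
			return true;
		}
		/* A failed CAS reloaded head; retry with the new value. */
	}
}
```

A one-off caller can declare a zero-initialized struct completion_ring on
its stack, hand its address to whoever posts the completion, and call
ring_reap() until it returns true. Nothing has to be torn down afterwards.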

> > > And perhaps obviously, I'd start with the acall stuff :).  It was a lot
> > > lighter.  We could talk about how to make it extensible without going
> > > all the way to the generic packed variable size duplicating or not and
> > > returning or not or.. attributes :).
> > 
> > Link? I haven't heard of acall before.
> 
> I linked to it after that giant silly comment earlier in the thread,
> here it is again:
> 
>   http://lwn.net/Articles/316806/

Oh whoops, hadn't started reading yet - looking at it now :)

> There's a mostly embarrassing video of a jetlagged me giving that talk at
> LCA kicking around.. ah, here:
> 
>  http://mirror.linux.org.au/pub/linux.conf.au/2009/Thursday/131.ogg
> 
> - z
