netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Sébastien Dugué" <sebastien.dugue@bull.net>
To: suparna@in.ibm.com
Cc: "Ulrich Drepper" <drepper@redhat.com>,
	=?iso-8859-1?Q?S=E9bastien_Dugu=E9_=3Csebastien=2Edugue=40bull=2Enet?=.=?iso-8859-1?Q?=3E?=@qubit.in.ibm.com,
	"Badari Pulavarty" <pbadari@us.ibm.com>,
	"Zach Brown" <zach.brown@oracle.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Evgeniy Polyakov" <johnpol@2ka.mipt.ru>,
	lkml <linux-kernel@vger.kernel.org>,
	"David Miller" <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	linux-aio@kvack.org, "Benjamin LaHaise" <bcrl@linux.intel.com>
Subject: Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
Date: Mon, 04 Sep 2006 16:28:55 +0200	[thread overview]
Message-ID: <1157380136.3895.44.camel@frecb000686> (raw)
In-Reply-To: <20060812182928.GA1989@in.ibm.com>

  Hi,

  just came back from vacation, sorry for the delay.

On Sat, 2006-08-12 at 23:59 +0530, Suparna Bhattacharya wrote:
> BTW, if anyone would like to be dropped off this growing cc list, please
> let us know.
> 
> On Fri, Aug 11, 2006 at 12:45:55PM -0700, Ulrich Drepper wrote:
> > Sébastien Dugué wrote:
> > > 		     aio completion notification
> > 
> > I looked over this now but I don't think I understand everything.  Or I
> > don't see how it all is integrated.  And no, I'm not looking at the
> > proposed glibc code since would mean being tainted.
> 
> Oh, I didn't realise that. 
> I'll make an attempt to clarify parts that I understand based on what I
> have gleaned from my reading of the code and intent, but hopefully Sebastien,
> Ben, Zach et al will be able to pitch in for a more accurate and complete
> picture.
> 
> > 
> > 
> > > Details:
> > > -------
> > > 
> > >   A struct sigevent *aio_sigeventp is added to struct iocb in
> > > include/linux/aio_abi.h
> > > 
> > >   An enum {IO_NOTIFY_SIGNAL = 0, IO_NOTIFY_THREAD_ID = 1} is added in
> > > include/linux/aio.h:
> > > 
> > > 	- IO_NOTIFY_SIGNAL means that the signal is to be sent to the
> > > 	  requesting thread 
> > > 
> > > 	- IO_NOTIFY_THREAD_ID means that the signal is to be sent to a
> > > 	  specifi thread.
> > 
> > This has been proved to be sufficient in the timer code which basically
> > has the same problem.  But why do you need separate constants?  We have
> > the various SIGEV_* constants, among them SIGEV_THREAD_ID.  Just use
> > these constants for the values of ki_notify.
> > 
> 
> I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not
> part of the ABI, but only internal to the kernel implementation. I think
> Zach had suggested inferring THREAD_ID notification if the pid specified
> is not zero. But, I don't see why ->sigev_notify couldn't used directly
> (just like the POSIX timers code does) thus doing away with the 
> new constants altogether. Sebestian/Laurent, do you recall?

  As I see it, those IO_NOTIFY_* constants are uneeded and we could use
->sigev_notify directly. I will change this so that we use the same
mechanism as the POSIX timers code.

> 
> > 
> > >   The following fields are added to struct kiocb in include/linux/aio.h:
> > > 
> > > 	- pid_t ki_pid: target of the signal
> > > 
> > > 	- __u16 ki_signo: signal number
> > > 
> > > 	- __u16 ki_notify: kind of notification, IO_NOTIFY_SIGNAL or
> > > 			   IO_NOTIFY_THREAD_ID
> > > 
> > > 	- uid_t ki_uid, ki_euid: filled with the submitter credentials
> > 
> > These two fields aren't needed for the POSIX interfaces.  Where does the
> > requirement come from?  I don't say they should be removed, they might
> > be useful, but if the costs are non-negligible then they could go away.
> 
> I'm guessing they are being used for validation of permissions at the time
> of sending the signal, but maybe saving the task pointer in the iocb instead
> of the pid would suffice ?

  IIRC, Ben added these for that exact reason. Is this really needed?
Ben?

> 
> > 
> > 
> > > 	- check whether the submitting thread wants to be notified directly
> > > 	  (sigevent->sigev_notify_thread_id is 0) or wants the signal to be sent
> > > 	  to another thread.
> > > 	  In the latter case a check is made to assert that the target thread
> > > 	  is in the same thread group
> > 
> > Is this really how it's implemented?  This is not how it should be.
> > Either a signal is sent to a specific thread in the same process (this
> > is what SIGEV_THREAD_ID is for) or the signal is sent to a calling
> > process.  Sending a signal to the process means that from the kernel's
> > POV any thread which doesn't have the signal blocked can receive it.
> > The final decision is made by the kernel.  There is no mechanism to send
> > the signal to another process.
> 
> The code seems to be set up to call specific_send_sig_info() in the case
> of *_THREAD_ID , and __group_send_sig_info() otherwise. So I think the
> intended behaviour is as you describe it should be (__group_send_sig_info
> does the equivalent of sending a signal to the process and so any thread
> which doesn't have signals blocked can receive it, while specific_send_sig_info
> sends it to a particular thread). 
> 
> But, I should really leave it to Sebestian to confirm that.

  That's right, but I think that part needs to be reworked to follow
the same logic as the POSIX timers.


> > > 			    listio support
> > > 
> > 
> > I really don't understand the kernel interface for this feature.
> 
> I'm sorry this is confusing. This probably means that we need to
> separate the external interface description more clearly and completely
> from the internals.
> 
> > 
> > 
> > > Details:
> > > -------
> > > 
> > >   An IOCB_CMD_GROUP is added to the IOCB_CMD enum in include/linux/aio_abi.h
> > > 
> > >   A struct lio_event is added in include/linux/aio.h
> > > 
> > >   A struct lio_event *ki_lio is added to struct iocb in include/linux/aio.h
> > 
> > So you have a pointer in the structure for the individual requests.  I
> > assume you use the atomic counter to trigger the final delivery.  I
> > further assume that if lio_wait is set the calling thread is suspended
> > until all requests are handled and that the final notification in this
> > case means that thread gets woken.
> > 
> > This is all fine.
> > 
> > But how do you pass the requests to the kernel?  If you have a new
> > lio_listio-like syscall it'll be easy.  But I haven't seen anything like
> > this mentioned.
> > 
> > The alternative is to pass the requests one-by-one in which case I don't
> > see how you create the reference to the lio_listio control block.  This
> > approach seems to be slower.
> 
> The way it works (and better ideas are welcome) is that, since the io_submit()
> syscall already accepts an array of iocbs[], no new syscall was introduced.
> To implement lio_listio, one has to set up such an array, with the first iocb
> in the array having the special (new) grouping opcode of IOCB_CMD_GROUP which
> specifies the sigev notification to be associated with group completion
> (a NULL value of the sigev notification pointer would imply equivalent of
> LIO_WAIT). The following iocbs in the array should correspond to the set of
> listio aiocbs. Whenever it encounters an IOCB_CMD_GROUP iocb opcode, the
> kernel would interpret all subsequent iocbs[] submitted in the same
> io_submit() call to be associated with the same lio control block. 
> 
> Does that clarify ?
> 
> Would an example help ?
> 
> > 
> > If all requests are passed at once, do you have the equivalent of
> > LIO_NOP entries?

  So far, LIO_NOP entries are pruned by the support library 
(libposix-aio) and never sent to the kernel.
> > 
> 
> Good question - we do have an IOCB_CMD_NOOP defined, and I seem to even
> recall a patch that implemented it, but am wondering if it ever got merged.
> Ben/Zach ?
> 
> > 
> > How can we support the extension where we wait for a number of requests
> > which need not be all of them.  I.e., I submit N requests and want to be
> > notified when at least M (M <= N) notified.  I am not yet clear about
> > the actual semantics we should implement (e.g., do we send another
> > notification after the first one?) but it's something which IMO should
> > be taken into account in the design.
> > 
> 
> My thought here was that it should be possible to include M as a parameter
> to the IOCB_CMD_GROUP opcode iocb, and thus incorporated in the lio control
> block ... then whatever semantics are agreed upon can be implemented.
> 
> > 
> > Finally, and this is very important, does you code send out the
> > individual requests notification and then in the end the lio_listio
> > completion?  I think Suparna wrote this is the case but I want to make sure.
> 
> Sebestian, could you confirm ?

  If (and only if) the user did setup a sigevent for one or more
individual requests then those requests completion will trigger a
notification and in the end the list completion notification is sent. 
Otherwise, only the list completion notification is sent.


-- 
-----------------------------------------------------

  Sébastien Dugué                BULL/FREC:B1-247
  phone: (+33) 476 29 77 70      Bullcom: 229-7770

  mailto:sebastien.dugue@bull.net

  Linux POSIX AIO: http://www.bullopensource.org/posix
                   http://sourceforge.net/projects/paiol

-----------------------------------------------------


  parent reply	other threads:[~2006-09-04 14:29 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <44C66FC9.3050402@redhat.com>
2006-07-25 22:01 ` async network I/O, event channels, etc David Miller
2006-07-25 22:55   ` Nicholas Miell
2006-07-26  6:28   ` Evgeniy Polyakov
2006-07-26  9:18     ` [0/4] kevent: generic event processing subsystem Evgeniy Polyakov
2006-07-26  9:18       ` [1/4] kevent: core files Evgeniy Polyakov
2006-07-26  9:18         ` [2/4] kevent: network AIO, socket notifications Evgeniy Polyakov
2006-07-26  9:18           ` [3/4] kevent: AIO, aio_sendfile() implementation Evgeniy Polyakov
2006-07-26  9:18             ` [4/4] kevent: poll/select() notifications. Timer notifications Evgeniy Polyakov
2006-07-26 10:00             ` [3/4] kevent: AIO, aio_sendfile() implementation Christoph Hellwig
2006-07-26 10:08               ` Evgeniy Polyakov
2006-07-26 10:13                 ` Christoph Hellwig
2006-07-26 10:25                   ` Evgeniy Polyakov
2006-07-26 10:04             ` Christoph Hellwig
2006-07-26 10:12               ` David Miller
2006-07-26 10:15                 ` Christoph Hellwig
2006-07-26 20:21                   ` Phillip Susi
2006-07-26 14:14                 ` Avi Kivity
2006-07-26 10:19               ` Evgeniy Polyakov
2006-07-26 10:30                 ` Christoph Hellwig
2006-07-26 14:28                   ` Ulrich Drepper
2006-07-26 16:22                     ` Badari Pulavarty
2006-07-27  6:49                       ` Sébastien Dugué
2006-07-27 15:28                         ` Badari Pulavarty
2006-07-27 18:14                           ` Zach Brown
2006-07-27 18:29                             ` Badari Pulavarty
2006-07-27 18:44                               ` Ulrich Drepper
2006-07-27 21:02                                 ` Badari Pulavarty
2006-07-28  7:31                                   ` Sébastien Dugué
2006-07-28 12:58                                   ` Sébastien Dugué
2006-08-11 19:45                                     ` Ulrich Drepper
2006-08-12 18:29                                       ` Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile) Suparna Bhattacharya
2006-08-12 19:10                                         ` Ulrich Drepper
2006-08-12 19:28                                           ` Jakub Jelinek
2006-09-04 14:37                                             ` Sébastien Dugué
2006-08-14  7:02                                           ` Suparna Bhattacharya
2006-08-14 16:38                                             ` Ulrich Drepper
2006-08-15  2:06                                               ` Nicholas Miell
2006-09-04 14:36                                           ` Sébastien Dugué
2006-09-04 14:28                                         ` Sébastien Dugué [this message]
2006-07-28  7:29                                 ` [3/4] kevent: AIO, aio_sendfile() implementation Sébastien Dugué
2006-07-31 10:11                                 ` Suparna Bhattacharya
2006-07-28  7:26                           ` Sébastien Dugué
2006-07-26 10:31         ` [1/4] kevent: core files Andrew Morton
2006-07-26 10:37           ` Evgeniy Polyakov
2006-07-26 10:44         ` Evgeniy Polyakov
2006-07-27  6:10     ` async network I/O, event channels, etc David Miller
2006-07-27  7:49       ` Evgeniy Polyakov
2006-07-27  8:02         ` David Miller
2006-07-27  8:09           ` Jens Axboe
2006-07-27  8:11             ` Jens Axboe
2006-07-27  8:20               ` David Miller
2006-07-27  8:29                 ` Jens Axboe
2006-07-27  8:37                   ` David Miller
2006-07-27  8:39                     ` Jens Axboe
2006-07-27  8:58           ` Evgeniy Polyakov
2006-07-27  9:31             ` David Miller
2006-07-27  9:37               ` Evgeniy Polyakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1157380136.3895.44.camel@frecb000686 \
    --to=sebastien.dugue@bull.net \
    --cc==?iso-8859-1?Q?S=E9bastien_Dugu=E9_=3Csebastien=2Edugue=40bull=2Enet?=.=?iso-8859-1?Q?=3E?=@qubit.in.ibm.com \
    --cc=bcrl@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=drepper@redhat.com \
    --cc=hch@infradead.org \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    --cc=suparna@in.ibm.com \
    --cc=zach.brown@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).