From: David Miller <davem@davemloft.net>
To: johnpol@2ka.mipt.ru
Cc: drepper@redhat.com, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: async network I/O, event channels, etc
Date: Thu, 27 Jul 2006 01:02:55 -0700 (PDT)
Message-ID: <20060727.010255.87351515.davem@davemloft.net>
In-Reply-To: <20060727074902.GC5490@2ka.mipt.ru>
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Thu, 27 Jul 2006 11:49:02 +0400
> I.e. map the skb's data to userspace? Not a good idea, especially
> with its tricky lifetime and the inability of userspace to inform
> the kernel when it has finished and the skb can be freed (without
> an additional syscall).
Hmmm...
If it is page based, I do not see the problem. Events and calls to
the AIO I/O routines transfer buffer ownership. The fact that the
user can have a valid mapping to the buffer while the kernel (and
thus the networking stack) "owns" it for an AIO call is an
unimportant detail.
If the user scrambles a piece of data that is in flight to or from
the network card, it is his problem.
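To make the ownership hand-off concrete, here is a minimal userspace
sketch of that discipline; aio_submit_send() and aio_get_event() are
hypothetical stand-ins (backed by a blocking write() just so the
sketch compiles), not an existing kernel or libc interface:

#include <unistd.h>
#include <stddef.h>

struct aio_completion {
	void   *buf;   /* buffer whose ownership returns to the user */
	size_t  len;   /* bytes transferred */
	int     error; /* 0 on success */
};

/*
 * Stand-in "submission" and "completion" primitives so the sketch
 * compiles; a real implementation would queue the buffer and signal
 * completion through an event channel instead of blocking here.
 */
static struct aio_completion pending;

static int aio_submit_send(int sock, void *buf, size_t len)
{
	ssize_t n = write(sock, buf, len);	/* synchronous stand-in */

	pending.buf   = buf;
	pending.len   = n < 0 ? 0 : (size_t)n;
	pending.error = n < 0 ? -1 : 0;
	return n < 0 ? -1 : 0;
}

static int aio_get_event(struct aio_completion *ev)
{
	*ev = pending;		/* ownership returns to the user */
	return 0;
}

int send_one(int sock, void *buf, size_t len)
{
	struct aio_completion ev;

	if (aio_submit_send(sock, buf, len) < 0)
		return -1;
	/*
	 * Between submission and completion the pages may still be
	 * mapped in this process, but the stack "owns" them; writing
	 * to them only corrupts this sender's own in-flight data.
	 */
	if (aio_get_event(&ev) < 0)
		return -1;
	return ev.error;	/* ev.buf == buf: safe to reuse it now */
}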
If we are using a primitive network card that does not support
scatter-gather I/O, and thus cannot use page-based SKBs, we will
make copies. But this is transparent to the user.
The idea is that DMA mappings have page granularity.
At least on transmit it should work well. The receive side is more
difficult, and an initial implementation will need to copy.
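As a rough illustration of that transparent copy fallback (the types
and the dev_has_sg flag here are made up for the sketch, not actual
stack internals):

#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for one user page handed to the stack. */
struct tx_frag {
	void   *page;	/* user page, assumed pinned/mappable for DMA */
	size_t  len;
	int     copied;	/* non-zero if we own a private copy */
};

/*
 * A scatter-gather capable device can DMA straight out of the user's
 * pages; a primitive device gets a private linear copy, and the user
 * never sees the difference.
 */
static int tx_attach(struct tx_frag *f, void *user_page, size_t len,
		     int dev_has_sg)
{
	if (dev_has_sg) {
		f->page   = user_page;	/* reference the page, no copy */
		f->copied = 0;
	} else {
		f->page = malloc(len);	/* copy fallback, non-SG hardware */
		if (!f->page)
			return -1;
		memcpy(f->page, user_page, len);
		f->copied = 1;
	}
	f->len = len;
	return 0;
}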
> I did that with the af_tlb zero-copy sniffer (but I substituted
> mapped pages with physical skb->data pages), and it was not very
> good.
Trying to be too clever with skb->data has always been catastrophic. :)
> Well, I think preallocating some buffers and using them in the AIO
> setup is a plus, since in that case the user does not need to care
> about when it is possible to reuse the same buffer - when the
> appropriate kevent is completed, that means the provided buffer is
> no longer in use by the kernel, and the user can reuse it.
We now enter the most interesting topic of AIO buffer pool management
and where it belongs. :-) Up to this point we have been assuming that
the user manages this stuff with explicit DMA calls for allocation,
then passes key-based references to those buffers as arguments to the
AIO I/O calls.
But I want to suggest another possibility. What if the kernel managed
the AIO buffer pool for a task? It could grow this dynamically based
upon need. The only implementation roadblock is how large we allow
this to grow, but I think normal VM mechanisms can take care of it.
On transmit this is not straightforward, but for receive it has really
nice possibilities. :)
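As a userspace model of such a kernel-managed, grow-on-demand pool
(every name here is illustrative, and the fixed cap merely stands in
for whatever limit the VM would enforce):

#include <stdlib.h>

#define POOL_BUF_SIZE	4096
#define POOL_MAX_BUFS	1024	/* stand-in for a VM-enforced limit */

struct buf_pool {
	void   *free[POOL_MAX_BUFS];
	size_t  nfree;
	size_t  nalloc;		/* total buffers handed out so far */
};

/* Hand out a buffer, growing the pool on demand up to the cap. */
static void *pool_get(struct buf_pool *p)
{
	if (p->nfree)
		return p->free[--p->nfree];	/* reuse a returned buffer */
	if (p->nalloc >= POOL_MAX_BUFS)
		return NULL;			/* growth limit reached */
	p->nalloc++;
	return malloc(POOL_BUF_SIZE);
}

/* A completed kevent would return the buffer to the pool like this. */
static void pool_put(struct buf_pool *p, void *buf)
{
	p->free[p->nfree++] = buf;
}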