From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: linux-scsi@vger.kernel.org,
James Bottomley <James.Bottomley@HansenPartnership.com>,
Andrew Morton <akpm@linux-foundation.org>,
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
Mike Christie <michaelc@cs.wisc.edu>,
Jeff Garzik <jeff@garzik.org>,
Boaz Harrosh <bharrosh@panasas.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, scst-devel@lists.sourceforge.net,
Bart Van Assche <bart.vanassche@gmail.com>,
"Nicholas A. Bellinger" <nab@linux-iscsi.org>,
netdev@vger.kernel.org, Rusty Russell <rusty@rustcorp.com.au>,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH][RFC 23/23]: Support for zero-copy TCP transmit of user space data
Date: Fri, 19 Dec 2008 12:21:41 -0800 [thread overview]
Message-ID: <494C0255.8010208@goop.org> (raw)
In-Reply-To: <494012C4.7090304@vlnb.net>
Vladislav Bolkhovitin wrote:
> This patch implements support for zero-copy TCP transmit of user space
> data. It is necessary in iSCSI-SCST target driver for transmitting data
> from user space buffers, supplied by user space backend handlers. In
> this case SCST core needs to know when TCP finished transmitting the
> data, so the corresponding buffers can be reused or freed. Without this
> patch it isn't possible, so iSCSI-SCST has to use data copying to TCP
> send buffers function sock_sendpage(). ISCSI-SCST also works without
> this patch, but that this patch gives a nice performance improvement.
>
In Xen networking it looks like we're going to need to solve a very
similar problem.
When a guest (non-privileged, with no direct hardware access) wants to
send a network packet, it passes it over to the privileged (host)
domain, who then puts it into the network stack for transmission.
The packet gets passed over in a page granted (read "borrowed") from the
guest domain. We can't return it to the guest while its tangled up in
the host's network stack, so we need notification of when the stack has
finished with the page.
The out of tree Xen patches do this by marking a page as having been
allocated by a foreign allocator, and overloads the private memory of
struct page with a destructor function pointer, which put_page calls as
appropriate. We can do this because the page is definitely "owned" by
the Xen subsystem, so most of the fields are available for recycling;
the main problem is that we need to grab another page flag. Your case
sounds more complex because the source page can be mapped by userspace
and/or be in the pagecache, so everything is already claimed.
As with your case, we can simply copy the page data if this mechanism
isn't available. But it would be nice if it were.
> 1. Add net_priv analog in struct sk_buff, not in struct page. But then
> it would be required that all the pages in each skb must be from the
> same originator, i.e. with the same net_priv. It is unpractical to
> change all the operations with skb's to forbid merging them, if they
> have different net_priv. I tried, but quickly gave up. There are too
> many such places in very not obvious code pieces.
>
I think Rusty has a patch to put some kind of put notifier in struct
skb_shared_info, but I'm not sure of the details.
> 2. Have in iSCSI-SCST a hashed list to translate page to iSCSI cmd by a
> simple search function. This approach was rejected, because to copy a
> page a modern CPU needs using MMX about 1500 ticks.
Is that the cold cache timing?
> It was observed,
> that each page can be referenced by TCP during transmit about 20 times
> or even more. So, if each search needs, say, 20 ticks, the overall
> search time will be 20*20*2 (to get() and put()) = 800 ticks. So, this
> approach would considerably worse performance-wise to the chosen
> approach and provide not too much benefit.
>
Wouldn't you only need to do the lookup on the last put?
An external lookup table might well for for us, if the net_put_page()
change is acceptable to the network folk.
J
next prev parent reply other threads:[~2008-12-19 20:22 UTC|newest]
Thread overview: 106+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-10 18:26 [PATCH][RFC 0/23] New SCSI target framework (SCST) and 4 target drivers Vladislav Bolkhovitin
2008-12-10 18:28 ` [PATCH][RFC 1/23]: SCST public headers Vladislav Bolkhovitin
2008-12-10 18:30 ` [PATCH][RFC 2/23]: SCST core Vladislav Bolkhovitin
2008-12-10 19:12 ` Sam Ravnborg
2008-12-11 17:28 ` Vladislav Bolkhovitin
2008-12-11 21:09 ` Sam Ravnborg
2008-12-12 19:24 ` Vladislav Bolkhovitin
2008-12-12 21:50 ` Steven Rostedt
[not found] ` <20081212230523.GB4775@ghostprotocols.net>
2008-12-13 1:25 ` Frédéric Weisbecker
2008-12-13 1:27 ` Frédéric Weisbecker
2008-12-13 14:46 ` Vladislav Bolkhovitin
2008-12-14 0:35 ` Frédéric Weisbecker
2008-12-16 21:49 ` Ingo Molnar
2008-12-16 22:13 ` Frédéric Weisbecker
2008-12-16 22:22 ` Ingo Molnar
2008-12-16 23:46 ` Frédéric Weisbecker
2008-12-18 11:45 ` Vladislav Bolkhovitin
2008-12-20 13:06 ` Frédéric Weisbecker
2008-12-23 19:11 ` Vladislav Bolkhovitin
2008-12-27 11:20 ` Ingo Molnar
2008-12-30 17:13 ` Vladislav Bolkhovitin
2008-12-30 21:03 ` Frederic Weisbecker
2008-12-30 21:35 ` Steven Rostedt
2008-12-10 18:34 ` [PATCH][RFC 3/23]: SCST core docs Vladislav Bolkhovitin
2008-12-10 18:36 ` [PATCH][RFC 4/23]: SCST debug support Vladislav Bolkhovitin
2008-12-10 18:37 ` [PATCH][RFC 5/23]: SCST /proc interface Vladislav Bolkhovitin
2008-12-11 20:23 ` Nicholas A. Bellinger
2008-12-12 19:23 ` Vladislav Bolkhovitin
2008-12-10 18:39 ` [PATCH][RFC 6/23]: SCST SGV cache Vladislav Bolkhovitin
2008-12-10 18:40 ` [PATCH][RFC 7/23]: SCST integration into the kernel Vladislav Bolkhovitin
2008-12-10 18:42 ` [PATCH][RFC 8/23]: SCST pass-through backend handlers Vladislav Bolkhovitin
2008-12-10 18:43 ` [PATCH][RFC 9/23]: SCST virtual disk backend handler Vladislav Bolkhovitin
2008-12-10 18:44 ` [PATCH][RFC 10/23]: SCST user space " Vladislav Bolkhovitin
2008-12-10 18:46 ` [PATCH][RFC 11/23]: Makefile for SCST backend handlers Vladislav Bolkhovitin
2008-12-10 18:47 ` [PATCH][RFC 12/23]: Patch to add necessary support for SCST pass-through Vladislav Bolkhovitin
2008-12-10 18:49 ` [PATCH][RFC 13/23]: Export of alloc_io_context() function Vladislav Bolkhovitin
2008-12-11 13:34 ` Jens Axboe
2008-12-11 18:17 ` Vladislav Bolkhovitin
2008-12-11 18:41 ` Jens Axboe
2008-12-11 19:00 ` Vladislav Bolkhovitin
2008-12-11 19:06 ` Jens Axboe
2008-12-12 19:16 ` Vladislav Bolkhovitin
2008-12-10 18:50 ` [PATCH][RFC 14/23]: Necessary functionality in qla2xxx driver to support target mode Vladislav Bolkhovitin
2008-12-10 18:51 ` [PATCH][RFC 15/23]: QLogic target driver Vladislav Bolkhovitin
2008-12-10 18:54 ` [PATCH][RFC 16/23]: Documentation for " Vladislav Bolkhovitin
2008-12-10 18:55 ` [PATCH][RFC 17/23]: InfiniBand SRP " Vladislav Bolkhovitin
2008-12-10 18:57 ` [PATCH][RFC 18/23]: Documentation for " Vladislav Bolkhovitin
2008-12-10 18:58 ` [PATCH][RFC 19/23]: scst_local " Vladislav Bolkhovitin
2008-12-10 19:00 ` [PATCH][RFC 20/23]: Documentation for scst_local driver Vladislav Bolkhovitin
2008-12-10 19:01 ` [PATCH][RFC 21/23]: iSCSI target driver Vladislav Bolkhovitin
2008-12-11 22:55 ` Nicholas A. Bellinger
2008-12-11 22:59 ` Nicholas A. Bellinger
2008-12-12 19:26 ` Vladislav Bolkhovitin
2008-12-13 10:03 ` Nicholas A. Bellinger
2008-12-13 10:11 ` Bart Van Assche
2008-12-13 10:16 ` Nicholas A. Bellinger
2008-12-13 10:27 ` Bart Van Assche
2008-12-13 15:01 ` Vladislav Bolkhovitin
2008-12-13 14:57 ` Vladislav Bolkhovitin
2008-12-10 19:02 ` [PATCH][RFC 22/23]: Documentation for iSCSI-SCST Vladislav Bolkhovitin
2008-12-10 19:04 ` [PATCH][RFC 23/23]: Support for zero-copy TCP transmit of user space data Vladislav Bolkhovitin
2008-12-10 21:45 ` Evgeniy Polyakov
2008-12-11 18:16 ` Vladislav Bolkhovitin
2008-12-11 19:12 ` James Bottomley
2008-12-12 19:25 ` Vladislav Bolkhovitin
2008-12-12 19:37 ` James Bottomley
2008-12-15 17:58 ` Vladislav Bolkhovitin
2008-12-15 23:18 ` Christoph Hellwig
2008-12-16 18:57 ` Vladislav Bolkhovitin
2008-12-18 18:35 ` [RFC]: " Vladislav Bolkhovitin
2008-12-18 18:43 ` David M. Lloyd
2008-12-19 17:37 ` Vladislav Bolkhovitin
2008-12-19 19:07 ` Jens Axboe
2008-12-19 19:17 ` Vladislav Bolkhovitin
2008-12-19 19:27 ` Jens Axboe
2008-12-19 21:58 ` Evgeniy Polyakov
2008-12-23 19:11 ` Vladislav Bolkhovitin
2008-12-19 11:27 ` Andi Kleen
2008-12-19 17:38 ` Vladislav Bolkhovitin
2008-12-19 18:00 ` Andi Kleen
2008-12-19 17:57 ` Vladislav Bolkhovitin
2008-12-16 16:00 ` [PATCH][RFC 23/23]: " Bart Van Assche
2008-12-16 17:41 ` Evgeniy Polyakov
2008-12-19 20:21 ` Jeremy Fitzhardinge [this message]
2008-12-19 22:04 ` Evgeniy Polyakov
2008-12-19 22:21 ` Jeremy Fitzhardinge
2008-12-19 22:33 ` Evgeniy Polyakov
2008-12-20 1:56 ` Jeremy Fitzhardinge
2008-12-20 2:02 ` Herbert Xu
2008-12-20 6:14 ` Jeremy Fitzhardinge
2008-12-20 6:51 ` Herbert Xu
2008-12-20 7:43 ` Jeremy Fitzhardinge
2008-12-20 8:10 ` Herbert Xu
2008-12-20 10:32 ` Evgeniy Polyakov
2008-12-20 19:39 ` Jeremy Fitzhardinge
2008-12-22 0:43 ` Rusty Russell
2008-12-23 19:14 ` Vladislav Bolkhovitin
2008-12-23 19:16 ` Vladislav Bolkhovitin
2008-12-23 21:38 ` Evgeniy Polyakov
2008-12-24 14:37 ` Vladislav Bolkhovitin
2008-12-24 14:44 ` Evgeniy Polyakov
2008-12-24 17:46 ` Vladislav Bolkhovitin
2008-12-24 18:08 ` Evgeniy Polyakov
2008-12-30 17:37 ` Vladislav Bolkhovitin
2008-12-30 21:35 ` Evgeniy Polyakov
2008-12-23 19:13 ` Vladislav Bolkhovitin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=494C0255.8010208@goop.org \
--to=jeremy@goop.org \
--cc=James.Bottomley@HansenPartnership.com \
--cc=akpm@linux-foundation.org \
--cc=bart.vanassche@gmail.com \
--cc=bharrosh@panasas.com \
--cc=fujita.tomonori@lab.ntt.co.jp \
--cc=herbert@gondor.apana.org.au \
--cc=jeff@garzik.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=michaelc@cs.wisc.edu \
--cc=nab@linux-iscsi.org \
--cc=netdev@vger.kernel.org \
--cc=rusty@rustcorp.com.au \
--cc=scst-devel@lists.sourceforge.net \
--cc=torvalds@linux-foundation.org \
--cc=vst@vlnb.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox