From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Xin, Xiaohui" <xiaohui.xin@intel.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@elte.hu" <mingo@elte.hu>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"jdike@linux.intel.com" <jdike@linux.intel.com>
Subject: Re: [RFC][PATCH v4 00/18] Provide a zero-copy method on KVM virtio-net.
Date: Sun, 9 May 2010 12:26:14 +0300
Message-ID: <20100509092614.GG16775@redhat.com>
In-Reply-To: <F2E9EB7348B8264F86B6AB8151CE2D790AB4A3F137@shsmsx502.ccr.corp.intel.com>

On Sat, May 08, 2010 at 03:55:48PM +0800, Xin, Xiaohui wrote:
> Michael,
> Sorry, somehow I missed this mail. :-(
> 
> >> Here, we have considered two ways to use the page constructor
> >> API to dispense the user buffers.
> >> 
> >> One:	Modify __alloc_skb() a bit so that it allocates only the
> >> 	sk_buff structure, with the data pointer pointing to a
> >> 	user buffer that comes from the page constructor API.
> >> 	The shinfo of the skb then also comes from the guest.
> >> 	When a packet is received from hardware, skb->data is filled
> >> 	directly by h/w. This is what we have implemented.
> >> 
> >> 	Pros:	We can avoid any copy here.
> >> 	Cons:	The guest virtio-net driver needs to allocate skbs in almost
> >> 		the same way as the host NIC drivers, i.e. with the size
> >> 		used by netdev_alloc_skb() and the same reserved space at
> >> 		the head of the skb. Many NIC drivers match the guest here
> >> 		and are fine. But some of the latest NIC drivers reserve
> >> 		special room in the skb head. To deal with this, we suggest
> >> 		providing a method in the guest virtio-net driver to ask the
> >> 		NIC driver for the parameters we are interested in once we
> >> 		know which device we have bound to for zero-copy. Then we
> >> 		ask the guest to do the same. Is that reasonable?
> 
> >Do you still do this?
> 
> Currently, we still use the first way. But we now ignore the room that the
> host skb_reserve() requires when the device is doing zero-copy. This way we
> no longer taint the guest virtio-net driver with a new method.
> 
> >> Two:	Modify the driver to get user buffers from the page constructor
> >> 	API (as a substitute for alloc_page()); the user buffers are used as
> >> 	payload buffers and filled by h/w directly when a packet is received.
> >> 	The driver should associate the pages with the skb
> >> 	(skb_shinfo(skb)->frags). For the head buffer, the host allocates
> >> 	the skb and h/w fills it. After that, the data in the host skb
> >> 	header is copied into the guest header buffer, which is submitted
> >> 	together with the payload buffer.
> >> 
> >> 	Pros:	We need to care less about how the guest or the host
> >> 		allocates its buffers.
> >> 	Cons:	We still need a small copy here for the skb header.
> >> 
> >> We are not sure which way is better. This is the first thing we want
> >> comments on from the community. We hope the modification to the network
> >> part will be generic, so that it is used not only by the vhost-net
> >> backend; a user application may use it as well when the zero-copy
> >> device later provides async read/write operations.
> 
> >I commented on this in the past. Do you still want comments?
> 
> Now we continue with the first way and try to push it. But any comments about the two methods are still welcome.
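
For readers following the comparison, method Two above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `guest_rx_buf`, `host_pkt`, `attach_guest_pages()` and `copy_header_to_guest()` are hypothetical names invented for illustration, not code from the patch series or the kernel. It shows the split the cover letter describes: payload pages are referenced without copying, and only the header is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAGS 4

/* Hypothetical model of a guest-posted receive buffer. */
struct guest_rx_buf {
    unsigned char *hdr;              /* guest header buffer */
    size_t hdr_cap;                  /* its capacity in bytes */
    unsigned char *pages[MAX_FRAGS]; /* guest payload pages (DMA targets) */
    size_t nr_pages;
};

/* Hypothetical model of a host skb under method Two. */
struct host_pkt {
    unsigned char head[128];         /* host-allocated header area */
    size_t head_len;                 /* bytes the NIC wrote into head */
    unsigned char *frags[MAX_FRAGS]; /* payload: pointers to guest pages */
    size_t nr_frags;
};

/* Point the packet's fragments at the guest pages; nothing is copied. */
int attach_guest_pages(struct host_pkt *pkt, const struct guest_rx_buf *buf)
{
    assert(pkt && buf);
    if (buf->nr_pages > MAX_FRAGS)
        return -1;
    for (size_t i = 0; i < buf->nr_pages; i++)
        pkt->frags[i] = buf->pages[i];
    pkt->nr_frags = buf->nr_pages;
    return 0;
}

/* Copy only the header into the guest header buffer.
 * Returns the number of bytes copied, or -1 if it does not fit. */
long copy_header_to_guest(const struct host_pkt *pkt, struct guest_rx_buf *buf)
{
    assert(pkt && buf);
    if (pkt->head_len > sizeof(pkt->head) || pkt->head_len > buf->hdr_cap)
        return -1;
    memcpy(buf->hdr, pkt->head, pkt->head_len);
    return (long)pkt->head_len;
}
```

In this model the only memcpy() is the header copy, which corresponds to the "Cons" listed for method Two; method One would avoid even that by having the data pointer itself refer to the guest buffer.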
> 
> >That's nice. The thing to do is probably to enable GSO/TSO
> >and see what we get this way. Also, mergeable buffer support
> >was recently posted and I hope to merge it for 2.6.35.
> >You might want to take a look.
> 
> I'm looking at the mergeable buffer support. I think GSO/GRO support with zero-copy also needs it.
> Is GSO/TSO currently still not supported by vhost-net?

GSO/TSO are currently supported with tap and macvtap;
the AF_PACKET socket backend still needs some work to
enable GSO.
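
As an illustration of the tap case only, here is a minimal sketch of how a userspace program can request checksum and TSO offloads on a tap device through the existing TUNSETIFF and TUNSETOFFLOAD ioctls. The helper names `tap_offload_flags()` and `tap_open_with_offloads()` are hypothetical, the interface name is supplied by the caller, and opening /dev/net/tun normally needs CAP_NET_ADMIN; this is not taken from vhost-net or from the patch series.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Build the TUNSETOFFLOAD argument. TUN_F_CSUM is always set because
 * the tun driver only honours the TSO flags together with it. */
unsigned int tap_offload_flags(int want_tso4, int want_tso6)
{
    unsigned int flags = TUN_F_CSUM;

    if (want_tso4)
        flags |= TUN_F_TSO4;
    if (want_tso6)
        flags |= TUN_F_TSO6;
    return flags;
}

/* Open a tap device with a virtio-net header on each packet and enable
 * the requested offloads. Returns the fd, or -1 on any failure. */
int tap_open_with_offloads(const char *name, unsigned int offloads)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    /* IFF_VNET_HDR: packets carry the header that describes GSO state. */
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR;
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0 ||
        ioctl(fd, TUNSETOFFLOAD, offloads) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use it as `tap_open_with_offloads("tap0", tap_offload_flags(1, 1))`, where "tap0" is just an example name.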

> -- 
> MST
