From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Michał Mirosław" <mirqus@gmail.com>
Cc: Shirley Ma <mashirle@us.ibm.com>,
Ben Hutchings <bhutchings@solarflare.com>,
David Miller <davem@davemloft.net>,
Eric Dumazet <eric.dumazet@gmail.com>,
Avi Kivity <avi@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
netdev@vger.kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH V5 2/6 net-next] netdevice.h: Add zero-copy flag in netdevice
Date: Wed, 18 May 2011 14:17:34 +0300
Message-ID: <20110518111734.GO7589@redhat.com>
In-Reply-To: <BANLkTimc1RbC3uQsra1HUvS_Trg63iMWGA@mail.gmail.com>
On Wed, May 18, 2011 at 01:10:50PM +0200, Michał Mirosław wrote:
> 2011/5/18 Michael S. Tsirkin <mst@redhat.com>:
> > On Tue, May 17, 2011 at 03:28:38PM -0700, Shirley Ma wrote:
> >> On Tue, 2011-05-17 at 23:48 +0200, Michał Mirosław wrote:
> >> > 2011/5/17 Shirley Ma <mashirle@us.ibm.com>:
> >> > > Hello Michael,
> >> > >
> >> > > Looks like to use a new flag requires more time/work. I am thinking
> >> > > whether we can just use HIGHDMA flag to enable zero-copy in macvtap
> >> > > to avoid the new flag for now since macvtap uses real NICs as lower
> >> > > device?
> >> >
> >> > Is there any other restriction besides requiring driver to not recycle
> >> > the skb? Are there any drivers that recycle TX skbs?
> >
> > Not just recycling skbs, keeping reference to any of the pages in the
> > skb. Another requirement is to invoke the callback
> > in a timely fashion. For example virtio-net doesn't limit the time until
> > that happens (skbs are only freed when some other packet is
> > transmitted), so we need to avoid zcopy for such (nested-virt)
> > scenarios, right?
>
> Hmm. But every hardware driver supporting SG will keep reference to
> the pages until the packet is sent (or DMA'd to the device). This can
> take a long time if hardware queue happens to stall for some reason.
That's a fundamental property of zero copy transmit.
You can't let the application/guest reuse the memory until
no one looks at it anymore.
> Is it that you mean keeping a reference after all skbs pointing to the
> pages are released?
No one should reference the pages after the callback is invoked, yes.
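
To make the rule concrete, here is a minimal userspace sketch, not kernel code. All names (zc_buf, zc_get, zc_put, zc_mark_done) are hypothetical and invented for illustration. It assumes the completion callback is tied to a reference count, so the owner of the memory is only notified once the last holder has dropped its reference:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a zero-copy buffer; not a kernel structure. */
struct zc_buf {
	int refs;                           /* holders that may still read the pages */
	bool done;                          /* set once the owner has been notified */
	void (*complete)(struct zc_buf *);  /* tells the owner the memory is reusable */
};

/* Stand-in for the completion callback that notifies the application/guest. */
static void zc_mark_done(struct zc_buf *b)
{
	b->done = true;
}

/* The submitter starts out holding the only reference. */
static void zc_init(struct zc_buf *b)
{
	b->refs = 1;
	b->done = false;
	b->complete = zc_mark_done;
}

/* Anyone who keeps looking at the pages must hold a reference. */
static void zc_get(struct zc_buf *b)
{
	assert(b->refs > 0);
	b->refs++;
}

/* The callback fires only when the last reference goes away. */
static void zc_put(struct zc_buf *b)
{
	assert(b->refs > 0);
	if (--b->refs == 0)
		b->complete(b);
}
```

In this model, a buffer that has seen zc_init(), one zc_get() and one zc_put() still has done == false; only the second zc_put() sets it. A stalled hardware queue simply delays that last put, which is the cost of zero copy rather than a violation of the rule.
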
> >> No other restrictions; skb clone is OK. pskb_expand_head() looks
> >> OK to me from code review.
> > Hmm. pskb_expand_head calls skb_release_data while keeping
> > references to pages. How is that ok? What do I miss?
>
> It's making copy of the skb_shinfo earlier, so the pages refcount
> stays the same.
>
> Best Regards,
> Michał Mirosław
Exactly. But the callback is invoked, so the guest thinks it's ok to
change this memory. If it does, a corrupted packet will be sent out.
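
The sequence can be modelled in a few lines of userspace C. This is a simplified sketch of the ordering discussed above, not the real pskb_expand_head() or skb_release_data(); the types and functions (page_model, shinfo_model, release_data_model, expand_head_model) are hypothetical, and it assumes that releasing the old data area is what triggers the callback:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model types; these are not kernel structures. */
struct page_model {
	int refs;       /* page reference count */
	char data[8];   /* payload that belongs to the guest */
};

struct shinfo_model {
	struct page_model *frag;  /* a single fragment, for brevity */
	bool *guest_notified;     /* set by the completion callback */
};

/* Releasing a data area drops its page reference and fires the callback. */
static void release_data_model(struct shinfo_model *sh)
{
	sh->frag->refs--;
	if (sh->guest_notified)
		*sh->guest_notified = true;
}

/* Copy the shared info into a new header, then release the old one. */
static struct shinfo_model expand_head_model(struct shinfo_model *old)
{
	struct shinfo_model fresh = *old;  /* the copy is made before the release */

	fresh.frag->refs++;                /* new header holds its own page reference */
	release_data_model(old);           /* callback fires here, while fresh still points at the page */
	return fresh;
}
```

After expand_head_model() returns in this model, the page refcount is unchanged and the new header still points at the page, yet the guest has already been told the memory is free. Anything the guest writes to the page from then on is what the new header would transmit.
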
--
MST