From: "Michael S. Tsirkin" <mst@redhat.com>
To: Shirley Ma <mashirle@us.ibm.com>
Cc: "Michał Mirosław" <mirqus@gmail.com>,
	"Ben Hutchings" <bhutchings@solarflare.com>,
	"David Miller" <davem@davemloft.net>,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	"Avi Kivity" <avi@redhat.com>, "Arnd Bergmann" <arnd@arndb.de>,
	netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH V5 2/6 net-next] netdevice.h: Add zero-copy flag in netdevice
Date: Wed, 18 May 2011 19:23:12 +0300
Message-ID: <20110518162311.GA22001@redhat.com>
In-Reply-To: <1305734543.32080.50.camel@localhost.localdomain>

On Wed, May 18, 2011 at 09:02:23AM -0700, Shirley Ma wrote:
> On Wed, 2011-05-18 at 07:38 -0700, Shirley Ma wrote:
> > On Wed, 2011-05-18 at 13:40 +0200, Michał Mirosław wrote:
> > > >> >> Not more other restrictions, skb clone is OK. pskb_expand_head() looks
> > > >> >> OK to me from code review.
> > > >> > Hmm. pskb_expand_head calls skb_release_data while keeping
> > > >> > references to pages. How is that ok? What do I miss?
> > > >> It's making copy of the skb_shinfo earlier, so the pages refcount
> > > >> stays the same.
> > > > Exactly. But the callback is invoked so the guest thinks it's ok to
> > > > change this memory. If it does a corrupted packet will be sent out.
> > > 
> > > > Hmm. I took a quick look at skb_clone(), and it looks like this
> > > sequence will break this scheme:
> > > 
> > > skb2 = skb_clone(skb...);
> > > kfree_skb(skb) or pskb_expand_head(skb);  /* callback called */
> > > [use skb2, pages still referenced]
> > > kfree_skb(skb); /* callback called again */
> > > 
> > > This sequence is common in bridge, might be in other places.
> > > 
> > > > Maybe this ubuf thing should just track clones? This will make it work
> > > > on all devices then.
> > 
> > The callback was only invoked when last reference of skb was gone.
> > skb_clone does increase skb refcnt. I tested tcpdump on lower device, it
> > worked.
> > 
> > For the sequence of:
> > 
> > skb_clone  -> last refcnt + 1
> > kfree_skb() or pskb_expand_head -> callback not called
> > kfree_skb() -> callback called
> > 
> > I will check page refcount to see whether it's balanced. 
> 
> The page refcounts are balanced too. 
> 
> In macvtap/vhost Real NIC zerocopy case, it always goes to fastpath in
> pskb_expand_head, so I didn't hit any issue.
> 
> But rethinking about pskb_expand_head(), it calls skb_release_data() to
> free old skb head when it's not in the fastpath (pskb_expand_head is not
> the last reference of this skb); And it's impossible to track which skb
> head (old one or new one) will be the last one to free. So better to
> return error for zero-copy skbs when not using fastpath. Does it make
> sense? 

I'm not sure it does. Look e.g. at tg3: if pskb_expand_head fails, the
packet gets dropped. No crash, but unlikely to perform well :).
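
To make the concern concrete, here is a minimal userspace sketch of the
proposed rule, not kernel code. Every toy_* name is hypothetical, and the
single dataref counter is a simplifying assumption that stands in for the
real skb shared-info accounting. It models a completion callback that fires
only when the last reference to the data goes away, an expand_head that
returns an error for a shared zero-copy buffer, and a transmit path that
drops the packet on that error, as a driver like tg3 would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names are kernel APIs. */
struct toy_data {
	int dataref;      /* number of toy_skb heads sharing this data */
	bool zerocopy;    /* data pages are owned by the guest */
	int *done_count;  /* incremented when the completion callback runs */
};

struct toy_skb {
	struct toy_data *data;
};

static struct toy_skb *toy_alloc(bool zerocopy, int *done_count)
{
	struct toy_skb *skb = malloc(sizeof(*skb));
	struct toy_data *d = malloc(sizeof(*d));

	if (!skb || !d) {
		free(skb);
		free(d);
		return NULL;
	}
	d->dataref = 1;
	d->zerocopy = zerocopy;
	d->done_count = done_count;
	skb->data = d;
	return skb;
}

/* A clone gets a new head but shares the data, so dataref goes up. */
static struct toy_skb *toy_clone(struct toy_skb *skb)
{
	struct toy_skb *c;

	assert(skb != NULL);
	c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->data = skb->data;
	c->data->dataref++;
	return c;
}

/* The callback runs only when the last reference to the data is dropped. */
static void toy_free(struct toy_skb *skb)
{
	struct toy_data *d = skb->data;

	if (--d->dataref == 0) {
		if (d->zerocopy)
			(*d->done_count)++;
		free(d);
	}
	free(skb);
}

/* Proposed rule: refuse to reallocate the head of a shared zero-copy skb. */
static int toy_expand_head(struct toy_skb *skb)
{
	if (skb->data->zerocopy && skb->data->dataref > 1)
		return -1;
	return 0; /* sole owner: nothing to copy in this model */
}

/* Transmit path that drops on failure. Returns true if the packet was sent. */
static bool toy_xmit(struct toy_skb *skb)
{
	bool sent = toy_expand_head(skb) == 0;

	toy_free(skb); /* consumed either way, as in a real xmit path */
	return sent;
}
```

In this model a cloned zero-copy packet never triggers the callback early,
but it is also never sent: toy_xmit() returns false for it, which is the
performance problem described above.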

> Besides this, any other issue?
> 
> 
> Thanks
> Shirley
