All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: netdev@vger.kernel.org, Herbert Xu <herbert@gondor.hengli.com.au>
Subject: Re: skb_linearize
Date: Wed, 31 Oct 2012 21:15:30 +0200	[thread overview]
Message-ID: <20121031191530.GA3674@redhat.com> (raw)
In-Reply-To: <1347808032.13258.275.camel@deadeye.wl.decadent.org.uk>

On Sun, Sep 16, 2012 at 04:07:12PM +0100, Ben Hutchings wrote:
> On Sun, 2012-09-16 at 12:17 +0300, Michael S. Tsirkin wrote:
> > I notice that dev_hard_start_xmit might invoke
> > __skb_linearize e.g. if device does not support NETIF_F_SG.
> > 
> > This in turn onvokes __pskb_pull_tail, and
> > documentation of __pskb_pull_tail says:
> >   &sk_buff MUST have reference count of 1.
> > 
> > I am guessing 'reference count' means users in this context, right?
> > IIUC this is because it modifies skb in a way that
> > isn't safe if anyone else is looking at the skb.
> > 
> > 
> > However, I don't see what guarantees that reference
> > count is 1 when dev_hard_start_xmit invokes
> > linearize. In particular it calls dev_queue_xmit_nit
> > which could queue packets on a network tap.
> > 
> > Could someone help me understand please?
> 
> Reference count here means references to struct sk_buff itself.  The
> header area and data fragments are allowed to be shared.
> 
> dev_queue_xmit_nit() clones the skb for each tap, so the reference count
> on the original skb remains 1.
> 
> Ben.

Interesting. But don't skb clones share the fragment list?
Maybe I misunderstand? If they do it looks like the following race
would be possible:

- skb is cloned and queued e.g. at socket receive queue.
  dataref becomes 2.
- On CPU 1, skb_copy_datagram_iovec is called on clone 1, is reads nr_frags and sees
  value > 1.
- On CPU 2, __skb_linearize is now called on clone 2, it modified the
  skb so nr_frags is now 0, and does put_page for all frags > 1.
- On CPU 1, skb_copy_datagram_iovec will now use the previously read
  nr_frags > 1 and access a fragment page that was already freed.

What did I miss?

> -- 
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.

  reply	other threads:[~2012-10-31 19:13 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-16  9:17 skb_linearize Michael S. Tsirkin
2012-09-16 15:07 ` skb_linearize Ben Hutchings
2012-10-31 19:15   ` Michael S. Tsirkin [this message]
2012-10-31 21:03     ` skb_linearize Ben Hutchings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121031191530.GA3674@redhat.com \
    --to=mst@redhat.com \
    --cc=bhutchings@solarflare.com \
    --cc=herbert@gondor.hengli.com.au \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.