public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Andrea Shepard <andrea@persephoneslair.org>,
	David Miller <davem@davemloft.net>
Cc: netdev <netdev@vger.kernel.org>
Subject: [PATCH net-next-2.6] net: pskb_expand_head() optimization
Date: Fri, 23 Jul 2010 07:09:08 +0200	[thread overview]
Message-ID: <1279861748.2482.13.camel@edumazet-laptop> (raw)
In-Reply-To: <20100722191234.GA832@cronus.persephoneslair.org>

Le jeudi 22 juillet 2010 à 12:12 -0700, Andrea Shepard a écrit :
> Make pskb_expand_head() check ip_summed to make sure csum_start is really
> csum_start and not csum before adjusting it.
> 
> This fixes a bug I encountered using a Sun Quad-Fast Ethernet card and VLANs.
> On my configuration, the sunhme driver produces skbs with differing amounts
> of headroom on receive depending on the packet size.  See line 2030 of
> drivers/net/sunhme.c; packets smaller than RX_COPY_THRESHOLD have 52 bytes
> of headroom but packets larger than that cutoff have only 20 bytes.
> 
> When these packets reach the VLAN driver, vlan_check_reorder_header()
> calls skb_cow(), which, if the packet has less than NET_SKB_PAD (== 32) bytes
> of headroom, uses pskb_expand_head() to make more.
> 
> Then, pskb_expand_head() needs to adjust a lot of offsets into the skb,
> including csum_start.  Since csum_start is a union with csum, if the packet
> has a valid csum value this will corrupt it, which was the effect I observed.
> The sunhme hardware computes receive checksums, so the skbs would be created
> by the driver with ip_summed == CHECKSUM_COMPLETE and a valid csum field, and
> then pskb_expand_head() would corrupt the csum field, leading to an "hw csum
> error" message later on, for example in icmp_rcv() for pings larger than the
> sunhme RX_COPY_THRESHOLD.
> 
> On the basis of the comment at the beginning of include/linux/skbuff.h,
> I believe that the csum_start skb field is only meaningful if ip_csummed is
> CSUM_PARTIAL, so this patch makes pskb_expand_head() adjust it only in that
> case to avoid corrupting a valid csum value.
> 
> Please see my more in-depth disucssion of tracking down this bug for
> more details if you like:
> 
> http://puellavulnerata.livejournal.com/112186.html
> http://puellavulnerata.livejournal.com/112567.html
> http://puellavulnerata.livejournal.com/112891.html
> http://puellavulnerata.livejournal.com/113096.html
> http://puellavulnerata.livejournal.com/113591.html
> 
> I am not subscribed to this list, so please CC me on replies.
> 

Excellent Changelog Andrea, thanks !

Reviewing pskb_expand_head(), I see it copy the whole struct
skb_shared_info, even if source contains garbage (uninitialized data).

I wonder why it is not triggering kmemcheck alarms

[PATCH net-next-2.6] net: pskb_expand_head() optimization

Move frags[] at the end of struct skb_shared_info, and make
pskb_expand_head() copy only the used part of it instead of whole array.

This should avoid kmemcheck warnings and speedup pskb_expand_head() as
well, avoiding a lot of cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/skbuff.h |    3 ++-
 net/core/skbuff.c      |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f5aa87e..d89876b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -202,10 +202,11 @@ struct skb_shared_info {
 	 */
 	atomic_t	dataref;
 
-	skb_frag_t	frags[MAX_SKB_FRAGS];
 	/* Intermediate layers must ensure that destructor_arg
 	 * remains valid until skb destructor */
 	void *		destructor_arg;
+	/* must be last field, see pskb_expand_head() */
+	skb_frag_t	frags[MAX_SKB_FRAGS];
 };
 
 /* We divide dataref into two halves.  The higher 16 bits hold references
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 76d33ca..7da58a2 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -817,7 +817,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	memcpy(data + nhead, skb->head, skb->tail - skb->head);
 #endif
 	memcpy(data + size, skb_end_pointer(skb),
-	       sizeof(struct skb_shared_info));
+	       offsetof(struct skb_shared_info, frags[skb_shinfo(skb)->nr_frags]));
 
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
 		get_page(skb_shinfo(skb)->frags[i].page);



  parent reply	other threads:[~2010-07-23  5:09 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-22 19:12 [PATCH] Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c Andrea Shepard
2010-07-22 20:28 ` David Miller
2010-07-23  5:09 ` Eric Dumazet [this message]
2010-07-25  4:06   ` [PATCH net-next-2.6] net: pskb_expand_head() optimization David Miller
  -- strict thread matches above, loose matches on Subject: below --
2010-09-03  3:43 [PATCH] net: Frag list lost on head expansion David Miller
2010-09-03  5:48 ` Eric Dumazet
2010-09-03  9:09   ` [PATCH net-next-2.6] net: pskb_expand_head() optimization Eric Dumazet
2010-09-03 13:46     ` David Miller
2010-09-07  2:20       ` David Miller
2010-09-07  5:02         ` Eric Dumazet
2010-09-07  5:05           ` David Miller
2010-09-07  9:16           ` Jarek Poplawski
2010-09-07  9:37             ` Eric Dumazet
2010-09-10 19:54               ` David Miller
2010-09-11 12:31                 ` Jarek Poplawski
2010-09-12  3:30                   ` David Miller
2010-09-12 10:45                     ` Jarek Poplawski
2010-09-12 10:58                       ` Jarek Poplawski
2010-09-12 15:58                       ` David Miller
2010-09-12 16:13                         ` David Miller
2010-09-12 20:57                           ` Jarek Poplawski
2010-09-12 22:08                             ` David Miller
2010-09-13  7:49                               ` Jarek Poplawski
2010-09-12 19:55                         ` Ben Pfaff
2010-09-12 20:24                           ` David Miller
2010-09-12 20:45                         ` Jarek Poplawski
2010-09-20  0:17                   ` David Miller
2010-09-20  7:21                     ` Jarek Poplawski
2010-09-20  9:02                       ` Eric Dumazet
2010-09-20  9:14                         ` Jarek Poplawski
2010-09-20 12:12                           ` Jarek Poplawski
2010-09-20 12:40                             ` Eric Dumazet
2010-09-20 16:59                       ` David Miller
2010-09-07  1:25     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1279861748.2482.13.camel@edumazet-laptop \
    --to=eric.dumazet@gmail.com \
    --cc=andrea@persephoneslair.org \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox