netdev.vger.kernel.org archive mirror
* [PATCH net-next v5 0/2] net: permit skb_segment on head_frag frag_list skb
@ 2018-03-21 20:36 Yonghong Song
  2018-03-21 20:36 ` [PATCH net-next v5 1/2] " Yonghong Song
  2018-03-21 20:36 ` [PATCH net-next v5 2/2] net: bpf: add a test for skb_segment in test_bpf module Yonghong Song
  0 siblings, 2 replies; 5+ messages in thread
From: Yonghong Song @ 2018-03-21 20:36 UTC (permalink / raw)
  To: edumazet, ast, daniel, diptanu, netdev; +Cc: kernel-team

One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
function skb_segment(), line 3667. The bpf program attaches to
clsact ingress, calls bpf_skb_change_proto to change protocol
from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
to send the changed packet out.
 ...
    3665                 while (pos < offset + len) {
    3666                         if (i >= nfrags) {
    3667                                 BUG_ON(skb_headlen(list_skb));
 ...

The triggering input skb has the following properties:
    list_skb = skb->frag_list;
    skb->nfrags != NULL && skb_headlen(list_skb) != 0
and skb_segment() is not able to handle a frag_list skb
if its headlen (list_skb->len - list_skb->data_len) is not 0.
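
For illustration only, the condition above can be modeled in userspace C.
This is a sketch, not kernel code: struct model_skb and both helpers are
hypothetical stand-ins that keep only the fields named in this description,
and the predicate looks only at the first frag_list member, whereas
skb_segment() checks whichever list_skb it is currently consuming.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the few sk_buff fields discussed here. */
struct model_skb {
	unsigned int len;	/* total bytes: linear area plus paged frags */
	unsigned int data_len;	/* bytes held outside the linear area */
	unsigned int nr_frags;	/* number of paged frags */
	bool head_frag;		/* linear area is page-backed */
	struct model_skb *frag_list;
};

/* Mirrors the skb_headlen() arithmetic: bytes in the linear (head) area. */
static unsigned int model_headlen(const struct model_skb *skb)
{
	assert(skb->len >= skb->data_len);
	return skb->len - skb->data_len;
}

/* True when the first frag_list member carries linear data, which is
 * the shape the old BUG_ON(skb_headlen(list_skb)) rejected. */
static bool model_list_skb_has_headlen(const struct model_skb *skb)
{
	const struct model_skb *list_skb = skb->frag_list;

	return list_skb != NULL && model_headlen(list_skb) != 0;
}
```

With a list member of len 72 and data_len 64, the model reports a headlen
of 8, so a parent skb pointing at it matches the triggering shape.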

Patch #1 provides a simple solution to avoid BUG_ON. If
list_skb->head_frag is true, its page-backed frag will
be processed before the list_skb->frags.
Patch #2 provides a test case in test_bpf module which
constructs a skb and calls skb_segment() directly. The test
case is able to trigger the BUG_ON without Patch #1.
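
The ordering that Patch #1 relies on (head area first, then the regular
frags) can be sketched with a small userspace model. The names model_frag
and model_walk are hypothetical. The patch rewinds both the index and the
frag pointer; the sketch rewinds only the index and indexes the array, since
forming a pointer before the start of an array is not valid in portable C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: just a byte count, standing in for skb_frag_t. */
struct model_frag {
	unsigned int size;
};

/*
 * Record the sizes of the pieces in the order a rewound walk visits them:
 * index -1 is the head area (if any), indexes 0..nfrags-1 are the frags.
 * Returns the number of entries written to out[].
 */
static size_t model_walk(unsigned int headlen,
			 const struct model_frag *frags, int nfrags,
			 unsigned int *out, size_t out_cap)
{
	int i = 0;
	size_t n = 0;

	assert(nfrags >= 0);
	if (headlen)
		i--;	/* rewind so the head area is visited first */

	for (; i < nfrags; i++) {
		assert(n < out_cap);
		out[n++] = (i < 0) ? headlen : frags[i].size;
	}
	return n;
}
```

A headlen of 8 with frags of 64 and 32 bytes yields the order 8, 64, 32;
with a headlen of 0 the walk starts at index 0 and yields 64, 32.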

The patch has been tested in the following setup:
  ipv6_host <-> nat_server <-> ipv4_host
where nat_server runs a bpf program on the clsact hook that
does ipv4<->ipv6 translation with bpf_skb_change_proto and
forwards the result with bpf_redirect.

Changelog:
v4 -> v5:
  . Replace local variable head_frag with
    a static inline function skb_head_frag_to_page_desc
    which gets the head_frag on-demand. This makes
    code more readable and also does not increase
    the stack size, from Alexander.
  . Remove the "if(nfrags)" guard for skb_orphan_frags
    and skb_zerocopy_clone as I found that they can
    handle zero-frag skb (with non-zero skb_headlen(skb))
    properly.
  . Properly release segment list from skb_segment()
    in the test, from Eric.
v3 -> v4:
  . Remove dynamic memory allocation and use rewinding
    for both index and frag to remove one branch in fast path,
    from Alexander.
  . Fix a bunch of issues in test_bpf skb_segment() test,
    including proper way to allocate skb, proper function
    argument for skb_add_rx_frag and not freeing skb, etc.,
    from Eric.
v2 -> v3:
  . Use starting frag index -1 (instead of 0) to
    special process head_frag before other frags in the skb,
    from Alexander Duyck.
v1 -> v2:
  . Removed never-hit BUG_ON, spotted by Linyu Yuan.

Yonghong Song (2):
  net: permit skb_segment on head_frag frag_list skb
  net: bpf: add a test for skb_segment in test_bpf module

 lib/test_bpf.c    | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 net/core/skbuff.c | 26 ++++++++++++----
 2 files changed, 111 insertions(+), 8 deletions(-)

-- 
2.9.5


* [PATCH net-next v5 1/2] net: permit skb_segment on head_frag frag_list skb
  2018-03-21 20:36 [PATCH net-next v5 0/2] net: permit skb_segment on head_frag frag_list skb Yonghong Song
@ 2018-03-21 20:36 ` Yonghong Song
  2018-03-21 21:51   ` Alexander Duyck
  2018-03-21 20:36 ` [PATCH net-next v5 2/2] net: bpf: add a test for skb_segment in test_bpf module Yonghong Song
  1 sibling, 1 reply; 5+ messages in thread
From: Yonghong Song @ 2018-03-21 20:36 UTC (permalink / raw)
  To: edumazet, ast, daniel, diptanu, netdev; +Cc: kernel-team

One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
function skb_segment(), line 3667. The bpf program attaches to
clsact ingress, calls bpf_skb_change_proto to change protocol
from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
to send the changed packet out.

3472 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3473                             netdev_features_t features)
3474 {
3475         struct sk_buff *segs = NULL;
3476         struct sk_buff *tail = NULL;
...
3665                 while (pos < offset + len) {
3666                         if (i >= nfrags) {
3667                                 BUG_ON(skb_headlen(list_skb));
3668
3669                                 i = 0;
3670                                 nfrags = skb_shinfo(list_skb)->nr_frags;
3671                                 frag = skb_shinfo(list_skb)->frags;
3672                                 frag_skb = list_skb;
...

call stack:
...
 #1 [ffff883ffef03558] __crash_kexec at ffffffff8110c525
 #2 [ffff883ffef03620] crash_kexec at ffffffff8110d5cc
 #3 [ffff883ffef03640] oops_end at ffffffff8101d7e7
 #4 [ffff883ffef03668] die at ffffffff8101deb2
 #5 [ffff883ffef03698] do_trap at ffffffff8101a700
 #6 [ffff883ffef036e8] do_error_trap at ffffffff8101abfe
 #7 [ffff883ffef037a0] do_invalid_op at ffffffff8101acd0
 #8 [ffff883ffef037b0] invalid_op at ffffffff81a00bab
    [exception RIP: skb_segment+3044]
    RIP: ffffffff817e4dd4  RSP: ffff883ffef03860  RFLAGS: 00010216
    RAX: 0000000000002bf6  RBX: ffff883feb7aaa00  RCX: 0000000000000011
    RDX: ffff883fb87910c0  RSI: 0000000000000011  RDI: ffff883feb7ab500
    RBP: ffff883ffef03928   R8: 0000000000002ce2   R9: 00000000000027da
    R10: 000001ea00000000  R11: 0000000000002d82  R12: ffff883f90a1ee80
    R13: ffff883fb8791120  R14: ffff883feb7abc00  R15: 0000000000002ce2
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff883ffef03930] tcp_gso_segment at ffffffff818713e7
--- <IRQ stack> ---
...

The triggering input skb has the following properties:
    list_skb = skb->frag_list;
    skb->nfrags != NULL && skb_headlen(list_skb) != 0
and skb_segment() is not able to handle a frag_list skb
if its headlen (list_skb->len - list_skb->data_len) is not 0.

This patch addresses the issue by handling the skb_headlen(list_skb) != 0
case when list_skb->head_frag is true, which is expected in
most cases. The head frag is processed before list_skb->frags
are processed.
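
As a rough userspace illustration of how a page-backed head area can be
described like a frag (page, offset into the page, size), consider the
sketch below. All names are hypothetical, and address masking stands in for
virt_to_head_page()/page_address(); that substitution is an assumption that
only holds for 4 KiB order-0 pages, not for compound pages.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: order-0 pages of 4 KiB */

/* Hypothetical stand-in for a frag descriptor. */
struct model_page_desc {
	uintptr_t page_base;		/* stands in for the struct page pointer */
	unsigned int page_offset;	/* offset of the data within that page */
	unsigned int size;		/* bytes of linear data (the headlen) */
};

/* Describe the linear area [data, data + headlen) relative to the page
 * that contains the buffer start (head). */
static struct model_page_desc model_head_to_desc(uintptr_t head,
						 uintptr_t data,
						 unsigned int headlen)
{
	struct model_page_desc d;

	assert(data >= head);
	d.page_base = head & ~(uintptr_t)(MODEL_PAGE_SIZE - 1);
	d.page_offset = (unsigned int)(data - d.page_base);
	d.size = headlen;
	return d;
}
```

The offset is taken from data rather than head, so any headroom between
the two is skipped, as in the helper in the diff below.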

Reported-by: Diptanu Gon Choudhury <diptanu@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 net/core/skbuff.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 715c134..23b317a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3460,6 +3460,19 @@ void *skb_pull_rcsum(struct sk_buff *skb, unsigned int len)
 }
 EXPORT_SYMBOL_GPL(skb_pull_rcsum);
 
+static inline skb_frag_t skb_head_frag_to_page_desc(struct sk_buff *frag_skb)
+{
+	skb_frag_t head_frag;
+	struct page *page;
+
+	page = virt_to_head_page(frag_skb->head);
+	head_frag.page.p = page;
+	head_frag.page_offset = frag_skb->data -
+		(unsigned char *)page_address(page);
+	head_frag.size = skb_headlen(frag_skb);
+	return head_frag;
+}
+
 /**
  *	skb_segment - Perform protocol segmentation on skb.
  *	@head_skb: buffer to segment
@@ -3664,15 +3677,16 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 
 		while (pos < offset + len) {
 			if (i >= nfrags) {
-				BUG_ON(skb_headlen(list_skb));
-
 				i = 0;
 				nfrags = skb_shinfo(list_skb)->nr_frags;
 				frag = skb_shinfo(list_skb)->frags;
-				frag_skb = list_skb;
-
-				BUG_ON(!nfrags);
+				if (skb_headlen(list_skb)) {
+					BUG_ON(!list_skb->head_frag);
 
+					/* to make room for head_frag. */
+					i--; frag--;
+				}
+				frag_skb = list_skb;
 				if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
 				    skb_zerocopy_clone(nskb, frag_skb,
 						       GFP_ATOMIC))
@@ -3689,7 +3703,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 				goto err;
 			}
 
-			*nskb_frag = *frag;
+			*nskb_frag = (i < 0) ? skb_head_frag_to_page_desc(frag_skb) : *frag;
 			__skb_frag_ref(nskb_frag);
 			size = skb_frag_size(nskb_frag);
 
-- 
2.9.5


* [PATCH net-next v5 2/2] net: bpf: add a test for skb_segment in test_bpf module
  2018-03-21 20:36 [PATCH net-next v5 0/2] net: permit skb_segment on head_frag frag_list skb Yonghong Song
  2018-03-21 20:36 ` [PATCH net-next v5 1/2] " Yonghong Song
@ 2018-03-21 20:36 ` Yonghong Song
  1 sibling, 0 replies; 5+ messages in thread
From: Yonghong Song @ 2018-03-21 20:36 UTC (permalink / raw)
  To: edumazet, ast, daniel, diptanu, netdev; +Cc: kernel-team

Without the previous commit,
"modprobe test_bpf" will have the following errors:
...
[   98.149165] ------------[ cut here ]------------
[   98.159362] kernel BUG at net/core/skbuff.c:3667!
[   98.169756] invalid opcode: 0000 [#1] SMP PTI
[   98.179370] Modules linked in:
[   98.179371]  test_bpf(+)
...
which triggers the bug the previous commit intends to fix.

The skbs are constructed to mimic what mlx5 may generate.
The packet size/header may not mimic real cases in production. But
the processing flow is similar.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 lib/test_bpf.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 2efb213..a468b5c 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -6574,6 +6574,93 @@ static bool exclude_test(int test_id)
 	return test_id < test_range[0] || test_id > test_range[1];
 }
 
+static __init struct sk_buff *build_test_skb(void)
+{
+	u32 headroom = NET_SKB_PAD + NET_IP_ALIGN + ETH_HLEN;
+	struct sk_buff *skb[2];
+	struct page *page[2];
+	int i, data_size = 8;
+
+	for (i = 0; i < 2; i++) {
+		page[i] = alloc_page(GFP_KERNEL);
+		if (!page[i]) {
+			if (i == 0)
+				goto err_page0;
+			else
+				goto err_page1;
+		}
+
+		/* this will set skb[i]->head_frag */
+		skb[i] = dev_alloc_skb(headroom + data_size);
+		if (!skb[i]) {
+			if (i == 0)
+				goto err_skb0;
+			else
+				goto err_skb1;
+		}
+
+		skb_reserve(skb[i], headroom);
+		skb_put(skb[i], data_size);
+		skb[i]->protocol = htons(ETH_P_IP);
+		skb_reset_network_header(skb[i]);
+		skb_set_mac_header(skb[i], -ETH_HLEN);
+
+		skb_add_rx_frag(skb[i], 0, page[i], 0, 64, 64);
+		// skb_headlen(skb[i]): 8, skb[i]->head_frag = 1
+	}
+
+	/* setup shinfo */
+	skb_shinfo(skb[0])->gso_size = 1448;
+	skb_shinfo(skb[0])->gso_type = SKB_GSO_TCPV4;
+	skb_shinfo(skb[0])->gso_type |= SKB_GSO_DODGY;
+	skb_shinfo(skb[0])->gso_segs = 0;
+	skb_shinfo(skb[0])->frag_list = skb[1];
+
+	/* adjust skb[0]'s len */
+	skb[0]->len += skb[1]->len;
+	skb[0]->data_len += skb[1]->data_len;
+	skb[0]->truesize += skb[1]->truesize;
+
+	return skb[0];
+
+err_skb1:
+	__free_page(page[1]);
+err_page1:
+	kfree_skb(skb[0]);
+err_skb0:
+	__free_page(page[0]);
+err_page0:
+	return NULL;
+}
+
+static __init int test_skb_segment(void)
+{
+	netdev_features_t features;
+	struct sk_buff *skb, *segs;
+	int ret = -1;
+
+	features = NETIF_F_SG | NETIF_F_GSO_PARTIAL | NETIF_F_IP_CSUM |
+		   NETIF_F_IPV6_CSUM;
+	features |= NETIF_F_RXCSUM;
+	skb = build_test_skb();
+	if (!skb) {
+		pr_info("%s: failed to build_test_skb", __func__);
+		goto done;
+	}
+
+	segs = skb_segment(skb, features);
+	if (segs) {
+		kfree_skb_list(segs);
+		ret = 0;
+		pr_info("%s: success in skb_segment!", __func__);
+	} else {
+		pr_info("%s: failed in skb_segment!", __func__);
+	}
+	kfree_skb(skb);
+done:
+	return ret;
+}
+
 static __init int test_bpf(void)
 {
 	int i, err_cnt = 0, pass_cnt = 0;
@@ -6632,9 +6719,11 @@ static int __init test_bpf_init(void)
 		return ret;
 
 	ret = test_bpf();
-
 	destroy_bpf_tests();
-	return ret;
+	if (ret)
+		return ret;
+
+	return test_skb_segment();
 }
 
 static void __exit test_bpf_exit(void)
-- 
2.9.5


* Re: [PATCH net-next v5 1/2] net: permit skb_segment on head_frag frag_list skb
  2018-03-21 20:36 ` [PATCH net-next v5 1/2] " Yonghong Song
@ 2018-03-21 21:51   ` Alexander Duyck
  2018-03-21 23:02     ` Yonghong Song
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Duyck @ 2018-03-21 21:51 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Eric Dumazet, ast, Daniel Borkmann, diptanu, Netdev, Kernel Team

On Wed, Mar 21, 2018 at 1:36 PM, Yonghong Song <yhs@fb.com> wrote:
> [...]
> @@ -3664,15 +3677,16 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>
>                 while (pos < offset + len) {
>                         if (i >= nfrags) {
> -                               BUG_ON(skb_headlen(list_skb));
> -
>                                 i = 0;
>                                 nfrags = skb_shinfo(list_skb)->nr_frags;
>                                 frag = skb_shinfo(list_skb)->frags;
> -                               frag_skb = list_skb;

You could probably leave this line in place. No point in moving it.

> -
> -                               BUG_ON(!nfrags);
> +                               if (skb_headlen(list_skb)) {
> +                                       BUG_ON(!list_skb->head_frag);
>
> +                                       /* to make room for head_frag. */
> +                                       i--; frag--;

Normally these should be two separate lines one for "i--;" and one for
"frag--;".

> +                               }

You could probably place the BUG_ON(!nfrags) in an else statement here
to handle the case where we have a potentially empty skb which would
be a bug.

> +                               frag_skb = list_skb;
>                                 if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
>                                     skb_zerocopy_clone(nskb, frag_skb,
>                                                        GFP_ATOMIC))
> @@ -3689,7 +3703,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>                                 goto err;
>                         }
>
> -                       *nskb_frag = *frag;
> +                       *nskb_frag = (i < 0) ? skb_head_frag_to_page_desc(frag_skb) : *frag;
>                         __skb_frag_ref(nskb_frag);
>                         size = skb_frag_size(nskb_frag);
>
> --
> 2.9.5
>


* Re: [PATCH net-next v5 1/2] net: permit skb_segment on head_frag frag_list skb
  2018-03-21 21:51   ` Alexander Duyck
@ 2018-03-21 23:02     ` Yonghong Song
  0 siblings, 0 replies; 5+ messages in thread
From: Yonghong Song @ 2018-03-21 23:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Eric Dumazet, ast, Daniel Borkmann, diptanu, Netdev, Kernel Team



On 3/21/18 2:51 PM, Alexander Duyck wrote:
> On Wed, Mar 21, 2018 at 1:36 PM, Yonghong Song <yhs@fb.com> wrote:
>> [...]
>> @@ -3664,15 +3677,16 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>>
>>                  while (pos < offset + len) {
>>                          if (i >= nfrags) {
>> -                               BUG_ON(skb_headlen(list_skb));
>> -
>>                                  i = 0;
>>                                  nfrags = skb_shinfo(list_skb)->nr_frags;
>>                                  frag = skb_shinfo(list_skb)->frags;
>> -                               frag_skb = list_skb;
> 
> You could probably leave this line in place. No point in moving it.

The only reason I moved it was to put the assignment closer to its use.
But I am totally fine with leaving it as is.

> 
>> -
>> -                               BUG_ON(!nfrags);
>> +                               if (skb_headlen(list_skb)) {
>> +                                       BUG_ON(!list_skb->head_frag);
>>
>> +                                       /* to make room for head_frag. */
>> +                                       i--; frag--;
> 
> Normally these should be two separate lines one for "i--;" and one for
> "frag--;".

Will change. Surprised that checkpatch.pl did not complain about this.

> 
>> +                               }
> 
> You could probably place the BUG_ON(!nfrags) in an else statement here
> to handle the case where we have a potentially empty skb which would
> be a bug.

Yes, this makes sense. Will add this BUG_ON.
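
A userspace sketch of how the reset branch might be structured with the
three review comments applied. This is not the follow-up patch: the names
are hypothetical, only the index is rewound (not the frag pointer), and
where the kernel code would BUG_ON() the model returns an error code so
the checks can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_reset_result {
	MODEL_RESET_OK,
	MODEL_RESET_NO_HEAD_FRAG,	/* linear data that is not page-backed */
	MODEL_RESET_EMPTY,		/* neither linear data nor frags */
};

/* Reset the frag index for a new list_skb and validate its shape. */
static enum model_reset_result model_reset(unsigned int headlen,
					   bool head_frag, int nfrags,
					   int *i)
{
	assert(i != NULL);
	*i = 0;
	if (headlen) {
		if (!head_frag)
			return MODEL_RESET_NO_HEAD_FRAG;
		/* make room for the head area at index -1 */
		(*i)--;
	} else if (nfrags == 0) {
		return MODEL_RESET_EMPTY;
	}
	return MODEL_RESET_OK;
}
```

The else branch only rejects an skb that has neither linear data nor
frags, which is the empty case mentioned in the review.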
> 
>> +                               frag_skb = list_skb;
>>                                  if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
>>                                      skb_zerocopy_clone(nskb, frag_skb,
>>                                                         GFP_ATOMIC))
>> @@ -3689,7 +3703,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>>                                  goto err;
>>                          }
>>
>> -                       *nskb_frag = *frag;
>> +                       *nskb_frag = (i < 0) ? skb_head_frag_to_page_desc(frag_skb) : *frag;
>>                          __skb_frag_ref(nskb_frag);
>>                          size = skb_frag_size(nskb_frag);
>>
>> --
>> 2.9.5
>>

