From: Jakub Kicinski <kuba@kernel.org>
To: Jiri Benc <jbenc@redhat.com>
Cc: netdev@vger.kernel.org,
Shmulik Ladkani <shmulik@metanetworks.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
Tomas Hruby <tomas@tigera.io>,
Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>,
alexanderduyck@meta.com, willemb@google.com,
Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH net] net: gso: fix panic on frag_list with mixed head alloc types
Date: Fri, 28 Oct 2022 21:41:23 -0700
Message-ID: <20221028214123.1ac0fc87@kernel.org>
In-Reply-To: <559cea869928e169240d74c386735f3f95beca32.1666858629.git.jbenc@redhat.com>

On Thu, 27 Oct 2022 10:20:56 +0200 Jiri Benc wrote:
> Since commit 3dcbdb134f32 ("net: gso: Fix skb_segment splat when
> splitting gso_size mangled skb having linear-headed frag_list"), it is
> allowed to change gso_size of a GRO packet. However, that commit assumes
> that "checking the first list_skb member suffices; i.e if either of the
> list_skb members have non head_frag head, then the first one has too".
>
> It turns out this assumption does not hold. We've seen BUG_ON being hit
> in skb_segment when skbs on the frag_list had differing head_frag. That
> particular case was with vmxnet3; looking at the driver, it indeed uses
> different skb allocation strategies based on the packet size.

Where are you looking? I'm not seeing it TBH.

I don't think the driver is that important, tho, __napi_alloc_skb()
will select page backing or kmalloc, all by itself.

The patch LGTM, adding more CCs in case I'm missing something.

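To illustrate how a size-based choice in the allocator alone can produce a
frag_list with mixed head types, here is a minimal user-space sketch. It is
not the kernel's __napi_alloc_skb(): the cutoff MODEL_SMALL_HEAD_CUTOFF and
both function names are hypothetical, and the only assumption carried over
is that sufficiently small heads end up kmalloced while larger ones are
page backed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder cutoff for illustration only; NOT the kernel's constant. */
#define MODEL_SMALL_HEAD_CUTOFF 1024u

/* Hypothetical model: heads above the cutoff are page backed,
 * everything else is treated as kmalloced.
 */
static bool model_head_is_page_backed(unsigned int len)
{
	return len > MODEL_SMALL_HEAD_CUTOFF;
}

/* Return true if the given packet lengths would produce both
 * page-backed and kmalloced heads under the model above.
 */
static bool model_mixed_heads(const unsigned int *lens, size_t n)
{
	bool seen_page = false, seen_kmalloc = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (model_head_is_page_backed(lens[i]))
			seen_page = true;
		else
			seen_kmalloc = true;
	}
	return seen_page && seen_kmalloc;
}
```

Under this model, a burst of full-sized segments followed by a short tail
segment yields mixed heads, regardless of which driver received them.
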
> The last packet in frag_list can thus be kmalloced if it is
> sufficiently small. And there's nothing preventing drivers from
> mixing things even more freely.
>
> There are three different locations where this can be fixed:
>
> (1) We could check head_frag in GRO and not allow GROing skbs with
> different head_frag. However, that would lead to performance
> regression (at least on vmxnet3) on normal forward paths with
> unmodified gso_size, where mixed head_frag is not a problem.
>
> (2) Set a flag in bpf_skb_net_grow and bpf_skb_net_shrink indicating
> that NETIF_F_SG is undesirable. That would need to eat a bit in
> sk_buff. Furthermore, that flag can be unset when all skbs on the
> frag_list are page backed. To retain good performance,
> bpf_skb_net_grow/shrink would have to walk the frag_list.
>
> (3) Walk the frag_list in skb_segment when determining whether
> NETIF_F_SG should be cleared. This of course slows things down.
>
> This patch implements (3). To limit the performance impact in
> skb_segment, the list is walked only for skbs with SKB_GSO_DODGY set
> that have gso_size changed. Normal paths thus will not hit it.
>
> Fixes: 3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 1d9719e72f9d..bbf3acff44c6 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4134,23 +4134,25 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  	int i = 0;
>  	int pos;
>  
> -	if (list_skb && !list_skb->head_frag && skb_headlen(list_skb) &&
> -	    (skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY)) {
> -		/* gso_size is untrusted, and we have a frag_list with a linear
> -		 * non head_frag head.
> -		 *
> -		 * (we assume checking the first list_skb member suffices;
> -		 * i.e if either of the list_skb members have non head_frag
> -		 * head, then the first one has too).
> -		 *
> -		 * If head_skb's headlen does not fit requested gso_size, it
> -		 * means that the frag_list members do NOT terminate on exact
> -		 * gso_size boundaries. Hence we cannot perform skb_frag_t page
> -		 * sharing. Therefore we must fallback to copying the frag_list
> -		 * skbs; we do so by disabling SG.
> -		 */
> -		if (mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb))
> -			features &= ~NETIF_F_SG;
> +	if ((skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY) &&
> +	    mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb)) {
> +		struct sk_buff *check_skb;
> +
> +		for (check_skb = list_skb; check_skb; check_skb = check_skb->next) {
> +			if (skb_headlen(check_skb) && !check_skb->head_frag) {
> +				/* gso_size is untrusted, and we have a frag_list with
> +				 * a linear non head_frag item.
> +				 *
> +				 * If head_skb's headlen does not fit requested gso_size,
> +				 * it means that the frag_list members do NOT terminate
> +				 * on exact gso_size boundaries. Hence we cannot perform
> +				 * skb_frag_t page sharing. Therefore we must fallback to
> +				 * copying the frag_list skbs; we do so by disabling SG.
> +				 */
> +				features &= ~NETIF_F_SG;
> +				break;
> +			}
> +		}
>  	}
>  
>  	__skb_push(head_skb, doffset);
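
For readers who want to poke at the new condition outside the kernel, the
check in the hunk above can be modelled in user space roughly as follows.
This is only a sketch: struct model_skb, model_must_clear_sg() and
MODEL_GSO_BY_FRAGS are hypothetical stand-ins for struct sk_buff, the
inline logic in skb_segment() and the GSO_BY_FRAGS sentinel, and the
"dodgy" flag stands in for the SKB_GSO_DODGY test on gso_type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model value standing in for the kernel's GSO_BY_FRAGS sentinel. */
#define MODEL_GSO_BY_FRAGS 0xFFFFu

/* Hypothetical, stripped-down stand-in for struct sk_buff. */
struct model_skb {
	struct model_skb *next;	/* next skb on the frag_list */
	unsigned int headlen;	/* bytes in the linear area */
	bool head_frag;		/* true if the head is page backed */
};

/* Return true when scatter-gather would be cleared: gso_size is
 * untrusted (dodgy), mss no longer matches the head's linear length,
 * and ANY frag_list member has a non-empty, non page-backed head.
 */
static bool model_must_clear_sg(const struct model_skb *head,
				const struct model_skb *frag_list,
				unsigned int mss, bool dodgy)
{
	const struct model_skb *check;

	if (!dodgy || mss == MODEL_GSO_BY_FRAGS || mss == head->headlen)
		return false;

	for (check = frag_list; check; check = check->next) {
		if (check->headlen && !check->head_frag)
			return true;
	}
	return false;
}
```

With a frag_list whose first member is page backed and whose last member
is a small kmalloced skb, this returns true once mss differs from the
head's linear length. A check that looks only at the first member would
return false for the same input, which is the case the patch addresses.
The walk is skipped entirely when dodgy is false or mss is unchanged, so
unmodified paths do not pay for it.
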
Thread overview: 7+ messages
2022-10-27 8:20 [PATCH net] net: gso: fix panic on frag_list with mixed head alloc types Jiri Benc
2022-10-29 4:41 ` Jakub Kicinski [this message]
2022-10-31 15:54 ` Jiri Benc
2022-10-29 7:41 ` Shmulik Ladkani
2022-10-29 14:10 ` Willem de Bruijn
2022-10-31 16:52 ` Jiri Benc
2022-10-31 21:16 ` Willem de Bruijn