From: John <john.phillips5@hpe.com>
To: Thomas Graf <tgraf@suug.ch>, Jesse Gross <jesse@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Tom Herbert <tom@herbertland.com>,
david.roth@hpe.com, Pravin B Shelar <pshelar@nicira.com>
Subject: Re: Kernel memory leak in bnx2x driver with vxlan tunnel
Date: Wed, 20 Jan 2016 16:43:59 -0700
Message-ID: <56A01BBF.1010301@hpe.com>
In-Reply-To: <20160120013128.GA6326@pox.localdomain>

On 01/19/2016 06:31 PM, Thomas Graf wrote:
> On 01/19/16 at 04:51pm, Jesse Gross wrote:
>> On Tue, Jan 19, 2016 at 4:17 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> So what is the purpose of having a dst if we need to drop it ?
>>>
>>> Adding code in GRO would be fine if someone explains to me the purpose of
>>> doing apparently useless work.
>>>
>>> (refcounting on dst is not exactly free)
>> In the GRO case, the dst is only dropped on the packets which have
>> been merged and therefore need to be freed (the GRO_MERGED_FREE case).
>> It's not being thrown away for the overall frame, just metadata that
>> has been duplicated on each individual frame, similar to the metadata
>> in struct sk_buff itself. And while it is not used by the IP stack
>> there are other consumers (eBPF/OVS/etc.). This entire process is
>> controlled by the COLLECT_METADATA flag on tunnels, so there is no
>> cost in situations where it is not actually used.
> Right. There were thoughts around leveraging a per-CPU scratch
> buffer without a refcount and turning it into a full reference when
> the packet gets enqueued somewhere, but the need hasn't really come
> up yet.
>
> Jesse, is this what you have in mind:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cc9e365..3a5e96d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4548,9 +4548,10 @@ static gro_result_t napi_skb_finish(gro_result_t ret, struct sk_buff *skb)
>  		break;
>
>  	case GRO_MERGED_FREE:
> -		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
> +		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD) {
> +			skb_release_head_state(skb);
>  			kmem_cache_free(skbuff_head_cache, skb);
> -		else
> +		} else
>  			__kfree_skb(skb);
>  		break;

I've tested the patch below (the same as the one above, with minor
modifications to make it compile) and it worked: no memory leak.
Should I submit this or...?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4355129..a8fac63 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2829,6 +2829,7 @@ int skb_zerocopy(struct sk_buff *to, struct sk_buff *from,
 void skb_split(struct sk_buff *skb, struct sk_buff *skb1, const u32 len);
 int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen);
 void skb_scrub_packet(struct sk_buff *skb, bool xnet);
+void skb_release_head_state(struct sk_buff *skb);
 unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index ae00b89..76e3623 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4337,9 +4337,10 @@ static gro_result_t napi_skb_finish(gro_result_t ret, struct sk_buff *skb)
 		break;

 	case GRO_MERGED_FREE:
-		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
+		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD) {
+			skb_release_head_state(skb);
 			kmem_cache_free(skbuff_head_cache, skb);
-		else
+		} else
 			__kfree_skb(skb);
 		break;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b2df375..45f6f50 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -633,7 +633,7 @@ fastpath:
 	kmem_cache_free(skbuff_fclone_cache, fclones);
 }

-static void skb_release_head_state(struct sk_buff *skb)
+void skb_release_head_state(struct sk_buff *skb)
 {
 	skb_dst_drop(skb);
 #ifdef CONFIG_XFRM

Thread overview: 15+ messages
2016-01-14 17:17 Kernel memory leak in bnx2x driver with vxlan tunnel John
2016-01-19 21:07 ` Jesse Gross
2016-01-19 22:47 ` Eric Dumazet
2016-01-19 23:34 ` Jesse Gross
2016-01-19 23:51 ` Eric Dumazet
2016-01-20 0:00 ` Jesse Gross
2016-01-20 0:17 ` Eric Dumazet
2016-01-20 0:51 ` Jesse Gross
2016-01-20 1:31 ` Thomas Graf
2016-01-20 1:40 ` Jesse Gross
2016-01-20 23:43 ` John [this message]
2016-01-21 0:07 ` Eric Dumazet
2016-01-21 0:19 ` Jesse Gross
2016-01-21 0:34 ` Eric Dumazet
2016-01-21 2:27 ` Thomas Graf