From: John <john.phillips5@hpe.com>
To: Thomas Graf <tgraf@suug.ch>, Jesse Gross <jesse@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Tom Herbert <tom@herbertland.com>,
david.roth@hpe.com, Pravin B Shelar <pshelar@nicira.com>
Subject: Re: Kernel memory leak in bnx2x driver with vxlan tunnel
Date: Wed, 20 Jan 2016 16:43:59 -0700 [thread overview]
Message-ID: <56A01BBF.1010301@hpe.com> (raw)
In-Reply-To: <20160120013128.GA6326@pox.localdomain>
On 01/19/2016 06:31 PM, Thomas Graf wrote:
> On 01/19/16 at 04:51pm, Jesse Gross wrote:
>> On Tue, Jan 19, 2016 at 4:17 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> So what is the purpose of having a dst if we need to drop it ?
>>>
>>> Adding code in GRO would be fine if someone explains me the purpose of
>>> doing apparently useless work.
>>>
>>> (refcounting on dst is not exactly free)
>> In the GRO case, the dst is only dropped on the packets which have
>> been merged and therefore need to be freed (the GRO_MERGED_FREE case).
>> It's not being thrown away for the overall frame, just metadata that
>> has been duplicated on each individual frame, similar to the metadata
>> in struct sk_buff itself. And while it is not used by the IP stack
>> there are other consumers (eBPF/OVS/etc.). This entire process is
>> controlled by the COLLECT_METADATA flag on tunnels, so there is no
>> cost in situations where it is not actually used.
> Right. There were thoughts around leveraging a per CPU scratch
> buffer without a refcount and turn it into a full reference when
> the packet gets enqueued somewhere but the need hasn't really come
> up yet.
>
> Jesse, is this what you have in mind:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cc9e365..3a5e96d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4548,9 +4548,10 @@ static gro_result_t napi_skb_finish(gro_result_t ret, struct sk_buff *skb)
> break;
>
> case GRO_MERGED_FREE:
> - if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
> + if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD) {
> + skb_release_head_state(skb);
> kmem_cache_free(skbuff_head_cache, skb);
> - else
> + } else
> __kfree_skb(skb);
> break;
So I've tested the below patch (same as one above with minor
modifications made to make it compile) and it worked - no memory leak.
Should I submit this or...?
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4355129..a8fac63 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2829,6 +2829,7 @@ int skb_zerocopy(struct sk_buff *to, struct
sk_buff *from,
void skb_split(struct sk_buff *skb, struct sk_buff *skb1, const u32 len);
int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen);
void skb_scrub_packet(struct sk_buff *skb, bool xnet);
+void skb_release_head_state(struct sk_buff *skb);
unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t
features);
struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index ae00b89..76e3623 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4337,9 +4337,10 @@ static gro_result_t napi_skb_finish(gro_result_t
ret, struct sk_buff *skb)
break;
case GRO_MERGED_FREE:
- if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
+ if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD) {
+ skb_release_head_state(skb);
kmem_cache_free(skbuff_head_cache, skb);
- else
+ } else
__kfree_skb(skb);
break;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b2df375..45f6f50 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -633,7 +633,7 @@ fastpath:
kmem_cache_free(skbuff_fclone_cache, fclones);
}
-static void skb_release_head_state(struct sk_buff *skb)
+void skb_release_head_state(struct sk_buff *skb)
{
skb_dst_drop(skb);
#ifdef CONFIG_XFRM
next prev parent reply other threads:[~2016-01-20 23:49 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-14 17:17 Kernel memory leak in bnx2x driver with vxlan tunnel John
2016-01-19 21:07 ` Jesse Gross
2016-01-19 22:47 ` Eric Dumazet
2016-01-19 23:34 ` Jesse Gross
2016-01-19 23:51 ` Eric Dumazet
2016-01-20 0:00 ` Jesse Gross
2016-01-20 0:17 ` Eric Dumazet
2016-01-20 0:51 ` Jesse Gross
2016-01-20 1:31 ` Thomas Graf
2016-01-20 1:40 ` Jesse Gross
2016-01-20 23:43 ` John [this message]
2016-01-21 0:07 ` Eric Dumazet
2016-01-21 0:19 ` Jesse Gross
2016-01-21 0:34 ` Eric Dumazet
2016-01-21 2:27 ` Thomas Graf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56A01BBF.1010301@hpe.com \
--to=john.phillips5@hpe.com \
--cc=david.roth@hpe.com \
--cc=eric.dumazet@gmail.com \
--cc=jesse@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pshelar@nicira.com \
--cc=tgraf@suug.ch \
--cc=tom@herbertland.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.