* [PATCH net-next 1/3] net: skb: carve the allocation out of skb_ext_add()
2023-02-15 3:43 [PATCH net-next 0/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
@ 2023-02-15 3:43 ` Jakub Kicinski
2023-02-15 3:43 ` [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
2023-02-15 3:43 ` [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc() Jakub Kicinski
2 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 3:43 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, pabeni, willemb, fw, Jakub Kicinski
Subsequent patches will try to add different skb_ext allocation
methods. Refactor skb_ext_add() so that most of its code moves
into a tail-called helper, and allocation can be easily swapped
out.
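For illustration, a caller with its own allocation scheme then only
differs in how the ext is obtained before it is handed to the helper.
Rough sketch only - foo_ext_alloc() is a made-up placeholder, not part
of this series:

/* Hypothetical variant built on top of skb_ext_add_finalize().
 * foo_ext_alloc() stands in for whatever allocator a later patch
 * provides; like __skb_ext_alloc() it must hand back an initialized
 * skb_ext (refcnt set to 1, offsets cleared).
 */
void *skb_ext_add_foo(struct sk_buff *skb, enum skb_ext_id id)
{
	struct skb_ext *new = NULL;

	if (!skb->active_extensions) {
		new = foo_ext_alloc();
		if (!new)
			return NULL;
	}

	return skb_ext_add_finalize(skb, id, new);
}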
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
net/core/skbuff.c | 52 ++++++++++++++++++++++++++++-------------------
1 file changed, 31 insertions(+), 21 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 13ea10cf8544..6f0fc1f09536 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6672,26 +6672,13 @@ void *__skb_ext_set(struct sk_buff *skb, enum skb_ext_id id,
return skb_ext_get_ptr(ext, id);
}
-/**
- * skb_ext_add - allocate space for given extension, COW if needed
- * @skb: buffer
- * @id: extension to allocate space for
- *
- * Allocates enough space for the given extension.
- * If the extension is already present, a pointer to that extension
- * is returned.
- *
- * If the skb was cloned, COW applies and the returned memory can be
- * modified without changing the extension space of clones buffers.
- *
- * Returns pointer to the extension or NULL on allocation failure.
- */
-void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
+static void *skb_ext_add_finalize(struct sk_buff *skb, enum skb_ext_id id,
+ struct skb_ext *new)
{
- struct skb_ext *new, *old = NULL;
unsigned int newlen, newoff;
+ struct skb_ext *old;
- if (skb->active_extensions) {
+ if (!new) {
old = skb->extensions;
new = skb_ext_maybe_cow(old, skb->active_extensions);
@@ -6704,10 +6691,6 @@ void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
newoff = new->chunks;
} else {
newoff = SKB_EXT_CHUNKSIZEOF(*new);
-
- new = __skb_ext_alloc(GFP_ATOMIC);
- if (!new)
- return NULL;
}
newlen = newoff + skb_ext_type_len[id];
@@ -6719,6 +6702,33 @@ void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
skb->active_extensions |= 1 << id;
return skb_ext_get_ptr(new, id);
}
+
+/**
+ * skb_ext_add - allocate space for given extension, COW if needed
+ * @skb: buffer
+ * @id: extension to allocate space for
+ *
+ * Allocates enough space for the given extension.
+ * If the extension is already present, a pointer to that extension
+ * is returned.
+ *
+ * If the skb was cloned, COW applies and the returned memory can be
+ * modified without changing the extension space of clones buffers.
+ *
+ * Returns pointer to the extension or NULL on allocation failure.
+ */
+void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
+{
+ struct skb_ext *new = NULL;
+
+ if (!skb->active_extensions) {
+ new = __skb_ext_alloc(GFP_ATOMIC);
+ if (!new)
+ return NULL;
+ }
+
+ return skb_ext_add_finalize(skb, id, new);
+}
EXPORT_SYMBOL(skb_ext_add);
#ifdef CONFIG_XFRM
--
2.39.1
* [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 3:43 [PATCH net-next 0/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
2023-02-15 3:43 ` [PATCH net-next 1/3] net: skb: carve the allocation out of skb_ext_add() Jakub Kicinski
@ 2023-02-15 3:43 ` Jakub Kicinski
2023-02-15 8:41 ` Paolo Abeni
2023-02-15 15:37 ` Edward Cree
2023-02-15 3:43 ` [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc() Jakub Kicinski
2 siblings, 2 replies; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 3:43 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, pabeni, willemb, fw, Jakub Kicinski
On the driver -> GRO path we can avoid thrashing the kmemcache
by holding onto one skb_ext.
Drivers usually report static data, so don't bother trying to
hold onto the skb_ext if the ext has contents which require
a destructor.
With a single flow and SW GRO adding a tc_skb_ext to every
frame costs around 16.6% of performance (21.2Gbps -> 17.6Gbps,
yes it's a relatively slow CPU). Using the cache reduces
the loss to 9.3%, (-> 19.2Gbps) although obviously in real
life the recycling will be less effective.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
include/linux/skbuff.h | 1 +
net/core/skbuff.c | 79 +++++++++++++++++++++++++++++++++++++++---
2 files changed, 75 insertions(+), 5 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d5602b15c714..e68cb0a777b9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4622,6 +4622,7 @@ struct skb_ext *__skb_ext_alloc(gfp_t flags);
void *__skb_ext_set(struct sk_buff *skb, enum skb_ext_id id,
struct skb_ext *ext);
void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id);
+void *napi_skb_ext_add(struct sk_buff *skb, enum skb_ext_id id);
void __skb_ext_del(struct sk_buff *skb, enum skb_ext_id id);
void __skb_ext_put(struct skb_ext *ext);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6f0fc1f09536..feb5034b13ad 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -224,6 +224,9 @@ static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp_mask)
struct napi_alloc_cache {
struct page_frag_cache page;
struct page_frag_1k page_small;
+#ifdef CONFIG_SKB_EXTENSIONS
+ struct skb_ext *ext;
+#endif
unsigned int skb_count;
void *skb_cache[NAPI_SKB_CACHE_SIZE];
};
@@ -1228,6 +1231,43 @@ static void napi_skb_cache_put(struct sk_buff *skb)
}
}
+static bool skb_ext_needs_destruct(const struct skb_ext *ext)
+{
+ bool needs_destruct = false;
+
+#ifdef CONFIG_XFRM
+ needs_destruct |= __skb_ext_exist(ext, SKB_EXT_SEC_PATH);
+#endif
+#ifdef CONFIG_MCTP_FLOWS
+ needs_destruct |= __skb_ext_exist(ext, SKB_EXT_MCTP);
+#endif
+
+ return needs_destruct;
+}
+
+static void napi_skb_ext_put(struct sk_buff *skb)
+{
+#ifdef CONFIG_SKB_EXTENSIONS
+ struct skb_ext *ext;
+
+ if (!skb->active_extensions)
+ return;
+
+ ext = skb->extensions;
+ if (!skb_ext_needs_destruct(ext)) {
+ struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+
+ if (refcount_read(&ext->refcnt) == 1 && !nc->ext) {
+ kasan_poison_object_data(skbuff_ext_cache, ext);
+ nc->ext = ext;
+ return;
+ }
+ }
+
+ __skb_ext_put(ext);
+#endif
+}
+
void __kfree_skb_defer(struct sk_buff *skb)
{
skb_release_all(skb, SKB_DROP_REASON_NOT_SPECIFIED);
@@ -1239,7 +1279,7 @@ void napi_skb_free_stolen_head(struct sk_buff *skb)
if (unlikely(skb->slow_gro)) {
nf_reset_ct(skb);
skb_dst_drop(skb);
- skb_ext_put(skb);
+ napi_skb_ext_put(skb);
skb_orphan(skb);
skb->slow_gro = 0;
}
@@ -6599,6 +6639,12 @@ static void *skb_ext_get_ptr(struct skb_ext *ext, enum skb_ext_id id)
return (void *)ext + (ext->offset[id] * SKB_EXT_ALIGN_VALUE);
}
+static void skb_ext_init(struct skb_ext *new)
+{
+ memset(new->offset, 0, sizeof(new->offset));
+ refcount_set(&new->refcnt, 1);
+}
+
/**
* __skb_ext_alloc - allocate a new skb extensions storage
*
@@ -6612,10 +6658,8 @@ struct skb_ext *__skb_ext_alloc(gfp_t flags)
{
struct skb_ext *new = kmem_cache_alloc(skbuff_ext_cache, flags);
- if (new) {
- memset(new->offset, 0, sizeof(new->offset));
- refcount_set(&new->refcnt, 1);
- }
+ if (new)
+ skb_ext_init(new);
return new;
}
@@ -6731,6 +6775,31 @@ void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
}
EXPORT_SYMBOL(skb_ext_add);
+void *napi_skb_ext_add(struct sk_buff *skb, enum skb_ext_id id)
+{
+ struct skb_ext *new = NULL;
+
+ if (!skb->active_extensions) {
+ struct napi_alloc_cache *nc;
+
+ nc = this_cpu_ptr(&napi_alloc_cache);
+ new = nc->ext;
+ if (new) {
+ kasan_unpoison_object_data(skbuff_ext_cache, new);
+ nc->ext = NULL;
+ } else {
+ new = kmem_cache_alloc(skbuff_ext_cache, GFP_ATOMIC);
+ if (!new)
+ return NULL;
+ }
+
+ skb_ext_init(new);
+ }
+
+ return skb_ext_add_finalize(skb, id, new);
+}
+EXPORT_SYMBOL(napi_skb_ext_add);
+
#ifdef CONFIG_XFRM
static void skb_ext_put_sp(struct sec_path *sp)
{
--
2.39.1
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 3:43 ` [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
@ 2023-02-15 8:41 ` Paolo Abeni
2023-02-15 17:45 ` Jakub Kicinski
2023-02-15 15:37 ` Edward Cree
1 sibling, 1 reply; 19+ messages in thread
From: Paolo Abeni @ 2023-02-15 8:41 UTC (permalink / raw)
To: Jakub Kicinski, davem; +Cc: netdev, edumazet, willemb, fw
On Tue, 2023-02-14 at 19:43 -0800, Jakub Kicinski wrote:
> On the driver -> GRO path we can avoid thrashing the kmemcache
> by holding onto one skb_ext.
>
> Drivers usually report static data, so don't bother trying to
> hold onto the skb_ext if the ext has contents which require
> a destructor.
>
> With a single flow and SW GRO adding a tc_skb_ext to every
> frame costs around 16.6% of performance (21.2Gbps -> 17.6Gbps,
> yes it's a relatively slow CPU). Using the cache reduces
> the loss to 9.3%, (-> 19.2Gbps) although obviously in real
> life the recycling will be less effective.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> include/linux/skbuff.h | 1 +
> net/core/skbuff.c | 79 +++++++++++++++++++++++++++++++++++++++---
> 2 files changed, 75 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index d5602b15c714..e68cb0a777b9 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -4622,6 +4622,7 @@ struct skb_ext *__skb_ext_alloc(gfp_t flags);
> void *__skb_ext_set(struct sk_buff *skb, enum skb_ext_id id,
> struct skb_ext *ext);
> void *skb_ext_add(struct sk_buff *skb, enum skb_ext_id id);
> +void *napi_skb_ext_add(struct sk_buff *skb, enum skb_ext_id id);
> void __skb_ext_del(struct sk_buff *skb, enum skb_ext_id id);
> void __skb_ext_put(struct skb_ext *ext);
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6f0fc1f09536..feb5034b13ad 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -224,6 +224,9 @@ static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp_mask)
> struct napi_alloc_cache {
> struct page_frag_cache page;
> struct page_frag_1k page_small;
> +#ifdef CONFIG_SKB_EXTENSIONS
> + struct skb_ext *ext;
> +#endif
> unsigned int skb_count;
> void *skb_cache[NAPI_SKB_CACHE_SIZE];
> };
> @@ -1228,6 +1231,43 @@ static void napi_skb_cache_put(struct sk_buff *skb)
> }
> }
>
> +static bool skb_ext_needs_destruct(const struct skb_ext *ext)
> +{
> + bool needs_destruct = false;
> +
> +#ifdef CONFIG_XFRM
> + needs_destruct |= __skb_ext_exist(ext, SKB_EXT_SEC_PATH);
> +#endif
> +#ifdef CONFIG_MCTP_FLOWS
> + needs_destruct |= __skb_ext_exist(ext, SKB_EXT_MCTP);
> +#endif
> +
> + return needs_destruct;
> +}
> +
> +static void napi_skb_ext_put(struct sk_buff *skb)
> +{
> +#ifdef CONFIG_SKB_EXTENSIONS
> + struct skb_ext *ext;
> +
> + if (!skb->active_extensions)
> + return;
> +
> + ext = skb->extensions;
> + if (!skb_ext_needs_destruct(ext)) {
> + struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
> +
> + if (refcount_read(&ext->refcnt) == 1 && !nc->ext) {
> + kasan_poison_object_data(skbuff_ext_cache, ext);
> + nc->ext = ext;
> + return;
> + }
> + }
> +
> + __skb_ext_put(ext);
> +#endif
> +}
> +
> void __kfree_skb_defer(struct sk_buff *skb)
I'm wondering if napi_reuse_skb() should be touched, too? Even though it's not
directly used by the following patch...
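Something along these lines, completely untested and modulo the
CONFIG_SKB_EXTENSIONS ifdeffery - the helper name is made up, and
napi_skb_ext_put() would need to be made visible to gro.c one way or
another:

/* napi-aware flavour of skb_ext_reset(), in net/core/skbuff.c */
void napi_skb_ext_reset(struct sk_buff *skb)
{
	if (unlikely(skb->active_extensions)) {
		/* may recycle the ext into the napi_alloc_cache */
		napi_skb_ext_put(skb);
		skb->active_extensions = 0;
	}
}

and then napi_reuse_skb() would call it from its slow_gro branch:

	if (unlikely(skb->slow_gro)) {
		skb_orphan(skb);
		napi_skb_ext_reset(skb);	/* instead of skb_ext_reset(skb) */
		nf_reset_ct(skb);
		skb->slow_gro = 0;
	}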
Cheers,
Paolo
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 8:41 ` Paolo Abeni
@ 2023-02-15 17:45 ` Jakub Kicinski
2023-02-15 18:08 ` Alexander Lobakin
0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 17:45 UTC (permalink / raw)
To: Paolo Abeni; +Cc: davem, netdev, edumazet, willemb, fw
On Wed, 15 Feb 2023 09:41:13 +0100 Paolo Abeni wrote:
> I'm wondering if napi_reuse_skb() should be touched, too? Even it's not
> directly used by the following patch...
I didn't touch it because I (sadly) don't have access to any driver
using GRO frags to test :( But I certainly can.
What about __kfree_skb_defer() and napi_consume_skb() (basically
the other two napi_skb_cache_put() callers) ?
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 17:45 ` Jakub Kicinski
@ 2023-02-15 18:08 ` Alexander Lobakin
2023-02-15 19:08 ` Paolo Abeni
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2023-02-15 18:08 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: Paolo Abeni, davem, netdev, edumazet, willemb, fw
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 15 Feb 2023 09:45:42 -0800
> On Wed, 15 Feb 2023 09:41:13 +0100 Paolo Abeni wrote:
>> I'm wondering if napi_reuse_skb() should be touched, too? Even it's not
>> directly used by the following patch...
>
> I didn't touch it because I (sadly) don't have access to any driver
> using GRO frags to test :( But I certainly can.
>
> What about __kfree_skb_defer() and napi_consume_skb() (basically
> the other two napi_skb_cache_put() callers) ?
>
Sounds good. Basically any caller of napi_skb_cache_put() can be
switched to recycle extensions.
But you certainly need to have a pool instead of just one pointer then,
since napi_consume_skb() will return a lot if exts are actively used :)
Having just one pointer will only make freeing extensions 1 step longer
(`caller -> skb cache -> kmem_cache_free()` 63 times per 1 Tx completion
poll for example to save you 1 pointer to be used on Rx later).
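Roughly what I have in mind, mirroring the existing skb_cache (field
names invented, completely untested):

struct napi_alloc_cache {
	struct page_frag_cache page;
	struct page_frag_1k page_small;
#ifdef CONFIG_SKB_EXTENSIONS
	unsigned int ext_count;
	struct skb_ext *ext_cache[NAPI_SKB_CACHE_SIZE];
#endif
	unsigned int skb_count;
	void *skb_cache[NAPI_SKB_CACHE_SIZE];
};

with the put path recycling until the pool is full instead of checking
a single pointer:

	if (refcount_read(&ext->refcnt) == 1 &&
	    nc->ext_count < NAPI_SKB_CACHE_SIZE) {
		kasan_poison_object_data(skbuff_ext_cache, ext);
		nc->ext_cache[nc->ext_count++] = ext;
		return;
	}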
Thanks,
Olek
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 18:08 ` Alexander Lobakin
@ 2023-02-15 19:08 ` Paolo Abeni
0 siblings, 0 replies; 19+ messages in thread
From: Paolo Abeni @ 2023-02-15 19:08 UTC (permalink / raw)
To: Alexander Lobakin, Jakub Kicinski; +Cc: davem, netdev, edumazet, willemb, fw
On Wed, 2023-02-15 at 19:08 +0100, Alexander Lobakin wrote:
> From: Jakub Kicinski <kuba@kernel.org>
> Date: Wed, 15 Feb 2023 09:45:42 -0800
>
> > On Wed, 15 Feb 2023 09:41:13 +0100 Paolo Abeni wrote:
> > > I'm wondering if napi_reuse_skb() should be touched, too? Even it's not
> > > directly used by the following patch...
> >
> > I didn't touch it because I (sadly) don't have access to any driver
> > using GRO frags to test :( But I certainly can.
> >
> > What about __kfree_skb_defer() and napi_consume_skb() (basically
> > the other two napi_skb_cache_put() callers) ?
> >
>
> Sounds good. Basically any caller of napi_skb_cache_put() can be
> switched to recycle extensions.
> But you certainly need to have a pool instead of just one pointer then,
> since napi_consume_skb() will return a lot if exts are actively used :)
This could also be a point in favour of (initially) excluding
napi_consume_skb() and keeping the (initial) implementation as simple as
possible.
Is the expected use-case more related to forwarded traffic, to local
traffic, or is it independent from such considerations?
Cheers,
Paolo
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 3:43 ` [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
2023-02-15 8:41 ` Paolo Abeni
@ 2023-02-15 15:37 ` Edward Cree
2023-02-15 16:17 ` Alexander Lobakin
1 sibling, 1 reply; 19+ messages in thread
From: Edward Cree @ 2023-02-15 15:37 UTC (permalink / raw)
To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni, willemb, fw
On 15/02/2023 03:43, Jakub Kicinski wrote:
> On the driver -> GRO path we can avoid thrashing the kmemcache
> by holding onto one skb_ext.
Hmm, will one be enough if we're doing GRO_NORMAL batching?
As for e.g. UDP traffic, up to 8 skbs (by default) can have
overlapping lifetimes.
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 15:37 ` Edward Cree
@ 2023-02-15 16:17 ` Alexander Lobakin
2023-02-15 17:52 ` Jakub Kicinski
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2023-02-15 16:17 UTC (permalink / raw)
To: Edward Cree, Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, willemb, fw
From: Edward Cree <ecree.xilinx@gmail.com>
Date: Wed, 15 Feb 2023 15:37:44 +0000
> On 15/02/2023 03:43, Jakub Kicinski wrote:
>> On the driver -> GRO path we can avoid thrashing the kmemcache
>> by holding onto one skb_ext.
>
> Hmm, will one be enough if we're doing GRO_NORMAL batching?
> As for e.g. UDP traffic up to 8 skbs (by default) can have
> overlapping lifetimes.
>
I thought of an array of %NAPI_SKB_CACHE_SIZE to be honest. From what
I've ever tested, no cache (for any netstack-related object) is enough
if it can't serve one full NAPI poll :D
+ agree with Paolo re napi_reuse_skb(), it's used only in the NAPI
context and recycles a lot o'stuff already, we can speed it up safely here.
Thanks,
Olek
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 16:17 ` Alexander Lobakin
@ 2023-02-15 17:52 ` Jakub Kicinski
2023-02-15 18:01 ` Alexander Lobakin
0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 17:52 UTC (permalink / raw)
To: Alexander Lobakin
Cc: Edward Cree, davem, netdev, edumazet, pabeni, willemb, fw
On Wed, 15 Feb 2023 17:17:53 +0100 Alexander Lobakin wrote:
> > On 15/02/2023 03:43, Jakub Kicinski wrote:
> >> On the driver -> GRO path we can avoid thrashing the kmemcache
> >> by holding onto one skb_ext.
> >
> > Hmm, will one be enough if we're doing GRO_NORMAL batching?
> > As for e.g. UDP traffic up to 8 skbs (by default) can have
> > overlapping lifetimes.
> >
> I thought of an array of %NAPI_SKB_CACHE_SIZE to be honest. From what
> I've ever tested, no cache (for any netstack-related object) is enough
> if it can't serve one full NAPI poll :D
I was hoping to leave sizing of the cache until we have some data from
a production network (or at least representative packet traces).
NAPI_SKB_CACHE_SIZE kinda assumes we're not doing much GRO, right?
And the current patch feeds the cache exclusively from GRO...
> + agree with Paolo re napi_reuse_skb(), it's used only in the NAPI
> context and recycles a lot o'stuff already, we can speed it up safely here.
LMK what's your opinion on touching the other potential spots, too.
(in Paolo's subthread).
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 17:52 ` Jakub Kicinski
@ 2023-02-15 18:01 ` Alexander Lobakin
2023-02-15 18:20 ` Jakub Kicinski
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2023-02-15 18:01 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: Edward Cree, davem, netdev, edumazet, pabeni, willemb, fw
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 15 Feb 2023 09:52:00 -0800
> On Wed, 15 Feb 2023 17:17:53 +0100 Alexander Lobakin wrote:
>>> On 15/02/2023 03:43, Jakub Kicinski wrote:
>>>> On the driver -> GRO path we can avoid thrashing the kmemcache
>>>> by holding onto one skb_ext.
>>>
>>> Hmm, will one be enough if we're doing GRO_NORMAL batching?
>>> As for e.g. UDP traffic up to 8 skbs (by default) can have
>>> overlapping lifetimes.
>>>
>> I thought of an array of %NAPI_SKB_CACHE_SIZE to be honest. From what
>> I've ever tested, no cache (for any netstack-related object) is enough
>> if it can't serve one full NAPI poll :D
>
> I was hoping to leave sizing of the cache until we have some data from
> a production network (or at least representative packet traces).
>
> NAPI_SKB_CACHE_SIZE kinda assumes we're not doing much GRO, right?
It assumes we GRO a lot :D
Imagine that you have 64 frames during one poll and the GRO layer
decides to coalesce them by batches of 16. Then only 4 skbs will be
used, the rest will go as frags (with "stolen heads") -> 60 of 64 skbs
will return to that skb cache and will then be reused by napi_build_skb().
> And the current patch feeds the cache exclusively from GRO...
>
>> + agree with Paolo re napi_reuse_skb(), it's used only in the NAPI
>> context and recycles a lot o'stuff already, we can speed it up safely here.
>
> LMK what's your opinion on touching the other potential spots, too.
> (in Paolo's subthread).
<went to take a look already>
Thanks,
Olek
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 18:01 ` Alexander Lobakin
@ 2023-02-15 18:20 ` Jakub Kicinski
2023-02-16 12:04 ` Alexander Lobakin
0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 18:20 UTC (permalink / raw)
To: Alexander Lobakin
Cc: Edward Cree, davem, netdev, edumazet, pabeni, willemb, fw
On Wed, 15 Feb 2023 19:01:19 +0100 Alexander Lobakin wrote:
> > I was hoping to leave sizing of the cache until we have some data from
> > a production network (or at least representative packet traces).
> >
> > NAPI_SKB_CACHE_SIZE kinda assumes we're not doing much GRO, right?
>
> It assumes we GRO a lot :D
>
> Imagine that you have 64 frames during one poll and the GRO layer
> decides to coalesce them by batches of 16. Then only 4 skbs will be
> used, the rest will go as frags (with "stolen heads") -> 60 of 64 skbs
> will return to that skb cache and will then be reused by napi_build_skb().
Let's say 5 - for 4 resulting skbs GRO will need the 4 resulting and
one extra to shuttle between the driver and GRO (worst case).
With a cache of 1 I'm guaranteed to save 59 alloc calls, 92%, right?
That's why I'm saying - the larger cache would help workloads which
don't GRO as much. Am I missing the point or how GRO works?
* Re: [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO
2023-02-15 18:20 ` Jakub Kicinski
@ 2023-02-16 12:04 ` Alexander Lobakin
0 siblings, 0 replies; 19+ messages in thread
From: Alexander Lobakin @ 2023-02-16 12:04 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: Edward Cree, davem, netdev, edumazet, pabeni, willemb, fw
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 15 Feb 2023 10:20:15 -0800
> On Wed, 15 Feb 2023 19:01:19 +0100 Alexander Lobakin wrote:
>>> I was hoping to leave sizing of the cache until we have some data from
>>> a production network (or at least representative packet traces).
>>>
>>> NAPI_SKB_CACHE_SIZE kinda assumes we're not doing much GRO, right?
>>
>> It assumes we GRO a lot :D
>>
>> Imagine that you have 64 frames during one poll and the GRO layer
>> decides to coalesce them by batches of 16. Then only 4 skbs will be
>> used, the rest will go as frags (with "stolen heads") -> 60 of 64 skbs
>> will return to that skb cache and will then be reused by napi_build_skb().
>
> Let's say 5 - for 4 resulting skbs GRO will need the 4 resulting and
> one extra to shuttle between the driver and GRO (worst case).
> With a cache of 1 I'm guaranteed to save 59 alloc calls, 92%, right?
>
> That's why I'm saying - the larger cache would help workloads which
> don't GRO as much. Am I missing the point or how GRO works?
Maybe I'm missing something now :D
The driver receives 5 frames, so it allocates 5 skbs. GRO coalesces them
into one big skb, so the first one remains as an skb, the following 4 get
their data added as frags and then are moved to the NAPI cache
(%NAPI_GRO_FREE_STOLEN_HEAD).
After GRO decides it's enough for this skb, it gets moved to the pending
list to be flushed soon. @gro_normal_batch is usually 8, so it means
there can be up to 8....
Oh wait, Eric changed this to count segments, not skbs :D
...there can be up to 2* such skbs waiting for a flush (the first one
sets the counter to 5, the second adds 5 more => flush happens). So you
anyway would need at least 2* skb extensions cached, otherwise there
will be new allocations.
This is not counting fraglists, when GRO decides to fraglist an skb, it
requires at least 1 skb more. UDP fraglisted GRO (I know almost nobody
uses it, still it does exist) doesn't use frags at all and requires 1
skb per each segment.
You're right that the cache size of %NAPI_POLL_WEIGHT is needed only for
corner cases like big @gro_normal_batch, fraglists, UDP fraglisted GRO
and so on, still think we shouldn't ignore them :) Also this cache can
then be reused later to bulk-free extensions on Tx completion, just like
it's done for skbs.
* or less/more if customized by user, for example I set 16 on MIPS,
x86_64 works better with 8.
Thanks,
Olek
* [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 3:43 [PATCH net-next 0/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
2023-02-15 3:43 ` [PATCH net-next 1/3] net: skb: carve the allocation out of skb_ext_add() Jakub Kicinski
2023-02-15 3:43 ` [PATCH net-next 2/3] net: skbuff: cache one skb_ext for use by GRO Jakub Kicinski
@ 2023-02-15 3:43 ` Jakub Kicinski
2023-02-15 16:50 ` Jamal Hadi Salim
2 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 3:43 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, willemb, fw, Jakub Kicinski, saeedm,
leon, jhs, xiyou.wangcong, jiri, roid, ozsh, paulb
Try to use the cached skb_ext in the drivers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: saeedm@nvidia.com
CC: leon@kernel.org
CC: jhs@mojatatu.com
CC: xiyou.wangcong@gmail.com
CC: jiri@resnulli.us
CC: roid@nvidia.com
CC: ozsh@nvidia.com
CC: paulb@nvidia.com
---
drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
include/net/pkt_cls.h | 9 +++++++++
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index 3b590cfe33b8..ffbed5a92eab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -770,7 +770,7 @@ static bool mlx5e_restore_skb_chain(struct sk_buff *skb, u32 chain, u32 reg_c1,
struct mlx5_eswitch *esw;
u32 zone_restore_id;
- tc_skb_ext = tc_skb_ext_alloc(skb);
+ tc_skb_ext = tc_skb_ext_alloc_napi(skb);
if (!tc_skb_ext) {
WARN_ON(1);
return false;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 2d06b4412762..3d9da4ccaf5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5643,7 +5643,7 @@ bool mlx5e_tc_update_skb(struct mlx5_cqe64 *cqe,
if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
chain = mapped_obj.chain;
- tc_skb_ext = tc_skb_ext_alloc(skb);
+ tc_skb_ext = tc_skb_ext_alloc_napi(skb);
if (WARN_ON(!tc_skb_ext))
return false;
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index ace437c6754b..82821a3f8a8b 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -764,6 +764,15 @@ static inline struct tc_skb_ext *tc_skb_ext_alloc(struct sk_buff *skb)
memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
return tc_skb_ext;
}
+
+static inline struct tc_skb_ext *tc_skb_ext_alloc_napi(struct sk_buff *skb)
+{
+ struct tc_skb_ext *tc_skb_ext = napi_skb_ext_add(skb, TC_SKB_EXT);
+
+ if (tc_skb_ext)
+ memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
+ return tc_skb_ext;
+}
#endif
enum tc_matchall_command {
--
2.39.1
* Re: [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 3:43 ` [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc() Jakub Kicinski
@ 2023-02-15 16:50 ` Jamal Hadi Salim
2023-02-15 17:03 ` Jiri Pirko
2023-02-15 17:35 ` Jakub Kicinski
0 siblings, 2 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2023-02-15 16:50 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, willemb, fw, saeedm, leon,
xiyou.wangcong, jiri, roid, ozsh, paulb
On Tue, Feb 14, 2023 at 10:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Try to use the cached skb_ext in the drivers.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: saeedm@nvidia.com
> CC: leon@kernel.org
> CC: jhs@mojatatu.com
> CC: xiyou.wangcong@gmail.com
> CC: jiri@resnulli.us
> CC: roid@nvidia.com
> CC: ozsh@nvidia.com
> CC: paulb@nvidia.com
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
> include/net/pkt_cls.h | 9 +++++++++
> 3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> index 3b590cfe33b8..ffbed5a92eab 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> @@ -770,7 +770,7 @@ static bool mlx5e_restore_skb_chain(struct sk_buff *skb, u32 chain, u32 reg_c1,
> struct mlx5_eswitch *esw;
> u32 zone_restore_id;
>
> - tc_skb_ext = tc_skb_ext_alloc(skb);
> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
> if (!tc_skb_ext) {
> WARN_ON(1);
> return false;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index 2d06b4412762..3d9da4ccaf5d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -5643,7 +5643,7 @@ bool mlx5e_tc_update_skb(struct mlx5_cqe64 *cqe,
>
> if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
> chain = mapped_obj.chain;
> - tc_skb_ext = tc_skb_ext_alloc(skb);
> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
> if (WARN_ON(!tc_skb_ext))
> return false;
>
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index ace437c6754b..82821a3f8a8b 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -764,6 +764,15 @@ static inline struct tc_skb_ext *tc_skb_ext_alloc(struct sk_buff *skb)
> memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
> return tc_skb_ext;
> }
> +
> +static inline struct tc_skb_ext *tc_skb_ext_alloc_napi(struct sk_buff *skb)
> +{
> + struct tc_skb_ext *tc_skb_ext = napi_skb_ext_add(skb, TC_SKB_EXT);
> +
> + if (tc_skb_ext)
> + memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
> + return tc_skb_ext;
> +}
> #endif
>
Dumb question: Would this work with a consumer of the metadata past
the RPS stage? I didn't look closely but assumed a per-cpu skb_ext because of
the napi context - which will require a per cpu pointer to fetch it
later.
In P4TC we make heavy use of skb_ext and found it to be very pricey,
so we ended up making it per cpu - but that's post RPS so we are safe.
cheers,
jamal
* Re: [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 16:50 ` Jamal Hadi Salim
@ 2023-02-15 17:03 ` Jiri Pirko
2023-02-15 18:36 ` Jamal Hadi Salim
2023-02-15 17:35 ` Jakub Kicinski
1 sibling, 1 reply; 19+ messages in thread
From: Jiri Pirko @ 2023-02-15 17:03 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, willemb, fw,
saeedm, leon, xiyou.wangcong, roid, ozsh, paulb
Wed, Feb 15, 2023 at 05:50:55PM CET, jhs@mojatatu.com wrote:
>On Tue, Feb 14, 2023 at 10:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> Try to use the cached skb_ext in the drivers.
>>
>> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>> ---
>> CC: saeedm@nvidia.com
>> CC: leon@kernel.org
>> CC: jhs@mojatatu.com
>> CC: xiyou.wangcong@gmail.com
>> CC: jiri@resnulli.us
>> CC: roid@nvidia.com
>> CC: ozsh@nvidia.com
>> CC: paulb@nvidia.com
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
>> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
>> include/net/pkt_cls.h | 9 +++++++++
>> 3 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
>> index 3b590cfe33b8..ffbed5a92eab 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
>> @@ -770,7 +770,7 @@ static bool mlx5e_restore_skb_chain(struct sk_buff *skb, u32 chain, u32 reg_c1,
>> struct mlx5_eswitch *esw;
>> u32 zone_restore_id;
>>
>> - tc_skb_ext = tc_skb_ext_alloc(skb);
>> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
>> if (!tc_skb_ext) {
>> WARN_ON(1);
>> return false;
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> index 2d06b4412762..3d9da4ccaf5d 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> @@ -5643,7 +5643,7 @@ bool mlx5e_tc_update_skb(struct mlx5_cqe64 *cqe,
>>
>> if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
>> chain = mapped_obj.chain;
>> - tc_skb_ext = tc_skb_ext_alloc(skb);
>> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
>> if (WARN_ON(!tc_skb_ext))
>> return false;
>>
>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>> index ace437c6754b..82821a3f8a8b 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -764,6 +764,15 @@ static inline struct tc_skb_ext *tc_skb_ext_alloc(struct sk_buff *skb)
>> memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
>> return tc_skb_ext;
>> }
>> +
>> +static inline struct tc_skb_ext *tc_skb_ext_alloc_napi(struct sk_buff *skb)
>> +{
>> + struct tc_skb_ext *tc_skb_ext = napi_skb_ext_add(skb, TC_SKB_EXT);
>> +
>> + if (tc_skb_ext)
>> + memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
>> + return tc_skb_ext;
>> +}
>> #endif
>>
>
>Dumb question: Would this work with a consumer of the metadata past
>RPS stage? didnt look closely but assumed per cpu skb_ext because of
>the napi context - which will require a per cpu pointer to fetch it
>later.
>In P4TC we make heavy use of skb_ext and found it to be very pricey,
Why don't you use skb->cb internally in p4tc? Or does the skb leave p4tc
and arrive back again? When?
>so we ended making it per cpu - but that's post RPS so we are safe.
>
>cheers,
>jamal
* Re: [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 17:03 ` Jiri Pirko
@ 2023-02-15 18:36 ` Jamal Hadi Salim
0 siblings, 0 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2023-02-15 18:36 UTC (permalink / raw)
To: Jiri Pirko
Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, willemb, fw,
saeedm, leon, xiyou.wangcong, roid, ozsh, paulb
On Wed, Feb 15, 2023 at 12:03 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Feb 15, 2023 at 05:50:55PM CET, jhs@mojatatu.com wrote:
> >On Tue, Feb 14, 2023 at 10:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >>
> >> Try to use the cached skb_ext in the drivers.
> >>
> >> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> >> ---
> >> CC: saeedm@nvidia.com
> >> CC: leon@kernel.org
> >> CC: jhs@mojatatu.com
> >> CC: xiyou.wangcong@gmail.com
> >> CC: jiri@resnulli.us
> >> CC: roid@nvidia.com
> >> CC: ozsh@nvidia.com
> >> CC: paulb@nvidia.com
> >> ---
> >> drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
> >> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
> >> include/net/pkt_cls.h | 9 +++++++++
> >> 3 files changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> >> index 3b590cfe33b8..ffbed5a92eab 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> >> @@ -770,7 +770,7 @@ static bool mlx5e_restore_skb_chain(struct sk_buff *skb, u32 chain, u32 reg_c1,
> >> struct mlx5_eswitch *esw;
> >> u32 zone_restore_id;
> >>
> >> - tc_skb_ext = tc_skb_ext_alloc(skb);
> >> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
> >> if (!tc_skb_ext) {
> >> WARN_ON(1);
> >> return false;
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> >> index 2d06b4412762..3d9da4ccaf5d 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> >> @@ -5643,7 +5643,7 @@ bool mlx5e_tc_update_skb(struct mlx5_cqe64 *cqe,
> >>
> >> if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
> >> chain = mapped_obj.chain;
> >> - tc_skb_ext = tc_skb_ext_alloc(skb);
> >> + tc_skb_ext = tc_skb_ext_alloc_napi(skb);
> >> if (WARN_ON(!tc_skb_ext))
> >> return false;
> >>
> >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> >> index ace437c6754b..82821a3f8a8b 100644
> >> --- a/include/net/pkt_cls.h
> >> +++ b/include/net/pkt_cls.h
> >> @@ -764,6 +764,15 @@ static inline struct tc_skb_ext *tc_skb_ext_alloc(struct sk_buff *skb)
> >> memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
> >> return tc_skb_ext;
> >> }
> >> +
> >> +static inline struct tc_skb_ext *tc_skb_ext_alloc_napi(struct sk_buff *skb)
> >> +{
> >> + struct tc_skb_ext *tc_skb_ext = napi_skb_ext_add(skb, TC_SKB_EXT);
> >> +
> >> + if (tc_skb_ext)
> >> + memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
> >> + return tc_skb_ext;
> >> +}
> >> #endif
> >>
> >
> >Dumb question: Would this work with a consumer of the metadata past
> >RPS stage? didnt look closely but assumed per cpu skb_ext because of
> >the napi context - which will require a per cpu pointer to fetch it
> >later.
> >In P4TC we make heavy use of skb_ext and found it to be very pricey,
>
> Why don't you use skb->cb internally in p4tc?
Just for space reasons since we have a lot of stuff we carry around
(headers, metadata, and table keys, each of which could be up to an
imposed 512b in size).
> Or does the skb leave p4tc and arrive back again? When?
The only time it leaves is if we do a "recirculate" - which is essentially
a mirred redirect to ingress/egress of another port.
It still causes memory pressure in alloc/free - we are thinking of
getting rid of it altogether and replacing it with a statically
allocated per-cpu scratchpad; if it gets recirculated we start from
scratch.
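Roughly something like this (all names invented, just to illustrate the
idea):

struct p4tc_pad {
	/* parsed headers, metadata, table keys, ... */
	u8 key[512];
	/* ... */
};

static DEFINE_PER_CPU(struct p4tc_pad, p4tc_scratch);

/* in the softirq datapath, no per-packet allocation: */
	struct p4tc_pad *pad = this_cpu_ptr(&p4tc_scratch);

	/* recirculation simply starts from scratch */
	memset(pad, 0, sizeof(*pad));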
cheers,
jamal
* Re: [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 16:50 ` Jamal Hadi Salim
2023-02-15 17:03 ` Jiri Pirko
@ 2023-02-15 17:35 ` Jakub Kicinski
2023-02-15 18:38 ` Jamal Hadi Salim
1 sibling, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2023-02-15 17:35 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, netdev, edumazet, pabeni, willemb, fw, saeedm, leon,
xiyou.wangcong, jiri, roid, ozsh, paulb
On Wed, 15 Feb 2023 11:50:55 -0500 Jamal Hadi Salim wrote:
> Dumb question: Would this work with a consumer of the metadata past
> RPS stage? didnt look closely but assumed per cpu skb_ext because of
> the napi context - which will require a per cpu pointer to fetch it
> later.
The cache is just for allocation, specifically for the case where
the driver allocates the skb and it's quickly coalesced by GRO.
The lifetime of the allocated object is not changed.
* Re: [PATCH net-next 3/3] net: create and use NAPI version of tc_skb_ext_alloc()
2023-02-15 17:35 ` Jakub Kicinski
@ 2023-02-15 18:38 ` Jamal Hadi Salim
0 siblings, 0 replies; 19+ messages in thread
From: Jamal Hadi Salim @ 2023-02-15 18:38 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, willemb, fw, saeedm, leon,
xiyou.wangcong, jiri, roid, ozsh, paulb
On Wed, Feb 15, 2023 at 12:35 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 15 Feb 2023 11:50:55 -0500 Jamal Hadi Salim wrote:
> > Dumb question: Would this work with a consumer of the metadata past
> > RPS stage? didnt look closely but assumed per cpu skb_ext because of
> > the napi context - which will require a per cpu pointer to fetch it
> > later.
>
> The cache is just for allocation, specifically for the case where
> driver allocates the skb and it's quickly coalesced by GRO.
> The lifetime of the allocated object is not changed.
ah. That certainly works.
cheers,
jamal