* [net-next 1/9] net/mlx5e: Switch to using napi_build_skb()
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 14:55 ` Maciej Fijalkowski
2023-02-16 17:26 ` Alexander Lobakin
2023-02-16 0:09 ` [net-next 2/9] net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff() Saeed Mahameed
` (7 subsequent siblings)
8 siblings, 2 replies; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Tariq Toukan <tariqt@nvidia.com>
Use napi_build_skb() which uses NAPI percpu caches to obtain
skbuff_head instead of inplace allocation.
napi_build_skb() calls napi_skb_cache_get(), which returns a cached
skb, or allocates a bulk of NAPI_SKB_CACHE_BULK (16) if cache is empty.
Performance test:
TCP single stream, single ring, single core, default MTU (1500B).
Before: 26.5 Gbits/sec
After: 30.1 Gbits/sec (+13.6%)
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index a9473a51edc1..9ac2c7778b5b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1556,7 +1556,7 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va,
u32 frag_size, u16 headroom,
u32 cqe_bcnt, u32 metasize)
{
- struct sk_buff *skb = build_skb(va, frag_size);
+ struct sk_buff *skb = napi_build_skb(va, frag_size);
if (unlikely(!skb)) {
rq->stats->buff_alloc_err++;
--
2.39.1
* Re: [net-next 1/9] net/mlx5e: Switch to using napi_build_skb()
2023-02-16 0:09 ` [net-next 1/9] net/mlx5e: Switch to using napi_build_skb() Saeed Mahameed
@ 2023-02-16 14:55 ` Maciej Fijalkowski
2023-02-16 17:26 ` Alexander Lobakin
1 sibling, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 14:55 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
On Wed, Feb 15, 2023 at 04:09:10PM -0800, Saeed Mahameed wrote:
> From: Tariq Toukan <tariqt@nvidia.com>
>
> Use napi_build_skb() which uses NAPI percpu caches to obtain
> skbuff_head instead of inplace allocation.
>
> napi_build_skb() calls napi_skb_cache_get(), which returns a cached
> skb, or allocates a bulk of NAPI_SKB_CACHE_BULK (16) if cache is empty.
>
> Performance test:
> TCP single stream, single ring, single core, default MTU (1500B).
>
> Before: 26.5 Gbits/sec
> After: 30.1 Gbits/sec (+13.6%)
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index a9473a51edc1..9ac2c7778b5b 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1556,7 +1556,7 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va,
> u32 frag_size, u16 headroom,
> u32 cqe_bcnt, u32 metasize)
> {
> - struct sk_buff *skb = build_skb(va, frag_size);
> + struct sk_buff *skb = napi_build_skb(va, frag_size);
>
> if (unlikely(!skb)) {
> rq->stats->buff_alloc_err++;
> --
> 2.39.1
>
* Re: [net-next 1/9] net/mlx5e: Switch to using napi_build_skb()
2023-02-16 0:09 ` [net-next 1/9] net/mlx5e: Switch to using napi_build_skb() Saeed Mahameed
2023-02-16 14:55 ` Maciej Fijalkowski
@ 2023-02-16 17:26 ` Alexander Lobakin
2023-02-16 17:53 ` Jakub Kicinski
1 sibling, 1 reply; 21+ messages in thread
From: Alexander Lobakin @ 2023-02-16 17:26 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Saeed Mahameed <saeed@kernel.org>
Date: Wed, 15 Feb 2023 16:09:10 -0800
> From: Tariq Toukan <tariqt@nvidia.com>
>
> Use napi_build_skb() which uses NAPI percpu caches to obtain
> skbuff_head instead of inplace allocation.
>
> napi_build_skb() calls napi_skb_cache_get(), which returns a cached
> skb, or allocates a bulk of NAPI_SKB_CACHE_BULK (16) if cache is empty.
>
> Performance test:
> TCP single stream, single ring, single core, default MTU (1500B).
>
> Before: 26.5 Gbits/sec
> After: 30.1 Gbits/sec (+13.6%)
+14%, gosh! Happy to see more and more vendors switching to it. Someone
told me back then that RAM is so fast nowadays that directly recycling
kmem-cached objects wouldn't make any sense. Maybe it's fast, but it
seems it's not *that* fast :D
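For anyone who hasn't looked at it yet, the core of the trick is roughly
this -- a simplified paraphrase of napi_skb_cache_get() from
net/core/skbuff.c, not the exact upstream code (details differ between
kernel versions):

static struct sk_buff *napi_skb_cache_get(void)
{
        struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
        struct sk_buff *skb;

        /* refill the per-CPU cache in bulk only when it runs dry,
         * instead of hitting the kmem cache for every single skb
         */
        if (unlikely(!nc->skb_count)) {
                nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
                                                      GFP_ATOMIC,
                                                      NAPI_SKB_CACHE_BULK,
                                                      nc->skb_cache);
                if (unlikely(!nc->skb_count))
                        return NULL;
        }

        /* pop the most recently recycled head -- likely still cache-hot */
        skb = nc->skb_cache[--nc->skb_count];

        return skb;
}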
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index a9473a51edc1..9ac2c7778b5b 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1556,7 +1556,7 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va,
> u32 frag_size, u16 headroom,
> u32 cqe_bcnt, u32 metasize)
> {
> - struct sk_buff *skb = build_skb(va, frag_size);
> + struct sk_buff *skb = napi_build_skb(va, frag_size);
>
> if (unlikely(!skb)) {
> rq->stats->buff_alloc_err++;
Thanks,
Olek
* Re: [net-next 1/9] net/mlx5e: Switch to using napi_build_skb()
2023-02-16 17:26 ` Alexander Lobakin
@ 2023-02-16 17:53 ` Jakub Kicinski
2023-02-16 17:59 ` Alexander Lobakin
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-16 17:53 UTC (permalink / raw)
To: Alexander Lobakin
Cc: Saeed Mahameed, David S. Miller, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
On Thu, 16 Feb 2023 18:26:19 +0100 Alexander Lobakin wrote:
> > Before: 26.5 Gbits/sec
> > After: 30.1 Gbits/sec (+13.6%)
>
> +14%, gosh! Happy to see more and more vendors switching to it. Someone
> told me back then that RAM is so fast nowadays that directly recycling
> kmem-cached objects wouldn't make any sense. Maybe it's fast, but it
> seems it's not *that* fast :D
Interestingly, I had a similar patch in my tree when testing the skb_ext
cache, and enabling slow_gro kills this gain.
IOW, without adding an skb_ext, using napi_build_skb() gives me a ~12%
boost. If I start adding skb_ext (with the cache and perfect reuse)
I'm back to the baseline (26.5 Gbps in this case).
But without using napi_build_skb(), adding skb_ext (with the cache)
doesn't change anything; skb_ext or not, I'll get 26.5 Gbps.
Very finicky. Not sure why this happens. Perhaps napi_build_skb()
lets us fit under some CPU resource constraint and the additional
functionality knocks us back over the line?
* Re: [net-next 1/9] net/mlx5e: Switch to using napi_build_skb()
2023-02-16 17:53 ` Jakub Kicinski
@ 2023-02-16 17:59 ` Alexander Lobakin
0 siblings, 0 replies; 21+ messages in thread
From: Alexander Lobakin @ 2023-02-16 17:59 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Saeed Mahameed, David S. Miller, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 16 Feb 2023 09:53:24 -0800
> On Thu, 16 Feb 2023 18:26:19 +0100 Alexander Lobakin wrote:
>>> Before: 26.5 Gbits/sec
>>> After: 30.1 Gbits/sec (+13.6%)
>>
>> +14%, gosh! Happy to see more and more vendors switching to it. Someone
>> told me back then that RAM is so fast nowadays that directly recycling
>> kmem-cached objects wouldn't make any sense. Maybe it's fast, but it
>> seems it's not *that* fast :D
>
> Interestingly, I had a similar patch in my tree when testing the skb_ext
> cache, and enabling slow_gro kills this gain.
>
> IOW, without adding an skb_ext, using napi_build_skb() gives me a ~12%
> boost. If I start adding skb_ext (with the cache and perfect reuse)
> I'm back to the baseline (26.5 Gbps in this case).
>
> But without using napi_build_skb(), adding skb_ext (with the cache)
> doesn't change anything; skb_ext or not, I'll get 26.5 Gbps.
>
> Very finicky. Not sure why this happens. Perhaps napi_build_skb()
> lets us fit under some CPU resource constraint and the additional
> functionality knocks us back over the line?
Both skbs and skb exts come from kmem caches; maybe calling kmem-cache
functions like alloc/free touches some global objects (or even locks)
that we'd like to avoid accessing on the hotpath? I'm not deep into the
kmem cache internals, so I might be saying something perfectly stupid here :D
Nevertheless, it's always fun to see performance make weird and
counter-intuitive moves sometimes (not speaking of why
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B exists).
Thanks,
Olek
* [net-next 2/9] net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff()
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
2023-02-16 0:09 ` [net-next 1/9] net/mlx5e: Switch to using napi_build_skb() Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 15:04 ` Maciej Fijalkowski
2023-02-16 0:09 ` [net-next 3/9] net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle() Saeed Mahameed
` (6 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Dragos Tatulea
From: Tariq Toukan <tariqt@nvidia.com>
Remove the page parameter, it can be derived from the xdp_buff.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index f7d52b1d293b..4b9cd8ef8d28 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -57,8 +57,9 @@ int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk)
static inline bool
mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
- struct page *page, struct xdp_buff *xdp)
+ struct xdp_buff *xdp)
{
+ struct page *page = virt_to_page(xdp->data);
struct skb_shared_info *sinfo = NULL;
struct mlx5e_xmit_data xdptxd;
struct mlx5e_xdp_info xdpi;
@@ -197,7 +198,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
case XDP_PASS:
return false;
case XDP_TX:
- if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, page, xdp)))
+ if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, xdp)))
goto xdp_abort;
__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */
return true;
--
2.39.1
* Re: [net-next 2/9] net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff()
2023-02-16 0:09 ` [net-next 2/9] net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff() Saeed Mahameed
@ 2023-02-16 15:04 ` Maciej Fijalkowski
0 siblings, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 15:04 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Dragos Tatulea
On Wed, Feb 15, 2023 at 04:09:11PM -0800, Saeed Mahameed wrote:
> From: Tariq Toukan <tariqt@nvidia.com>
>
> Remove the page parameter, it can be derived from the xdp_buff.
>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> index f7d52b1d293b..4b9cd8ef8d28 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> @@ -57,8 +57,9 @@ int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk)
>
> static inline bool
> mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
> - struct page *page, struct xdp_buff *xdp)
> + struct xdp_buff *xdp)
> {
> + struct page *page = virt_to_page(xdp->data);
> struct skb_shared_info *sinfo = NULL;
> struct mlx5e_xmit_data xdptxd;
> struct mlx5e_xdp_info xdpi;
> @@ -197,7 +198,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> case XDP_PASS:
> return false;
> case XDP_TX:
> - if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, page, xdp)))
> + if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, xdp)))
Is there any value behind this patch? mlx5e_xmit_xdp_buff() doesn't take
an excessive number of args to begin with.
Maybe don't initialize this right at the start but rather just before
getting the dma_addr? That way the XSK XDP_TX path won't be doing this.
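Something like the sketch below is what I have in mind -- this is not the
actual mlx5 code, the rest of the function is elided and the branch layout
is simplified, it's only meant to show where the page would be derived:

static inline bool
mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
                    struct xdp_buff *xdp)
{
        /* ... xdp_frame conversion etc. unchanged ... */

        if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) {
                /* XSK XDP_TX copies the frame, the page is never needed,
                 * so don't bother computing it on this path at all
                 */
        } else {
                /* derive the page only where the DMA address is needed */
                struct page *page = virt_to_page(xdp->data);
                dma_addr_t dma_addr;

                dma_addr = page_pool_get_dma_addr(page); /* + data offset */
                /* ... dma_sync_single_for_device() and xdptxd setup ... */
        }

        /* ... */
}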
> goto xdp_abort;
> __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */
> return true;
> --
> 2.39.1
>
* [net-next 3/9] net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle()
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
2023-02-16 0:09 ` [net-next 1/9] net/mlx5e: Switch to using napi_build_skb() Saeed Mahameed
2023-02-16 0:09 ` [net-next 2/9] net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff() Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 15:13 ` Maciej Fijalkowski
2023-02-16 0:09 ` [net-next 4/9] net/mlx5: Simplify eq list traversal Saeed Mahameed
` (5 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Dragos Tatulea
From: Tariq Toukan <tariqt@nvidia.com>
Remove the page parameter, it can be derived from the xdp_buff member
of mlx5e_xdp_buff.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 10 ++++------
4 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 4b9cd8ef8d28..bcd6370de440 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -186,7 +186,7 @@ const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
};
/* returns true if packet was consumed by xdp */
-bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
+bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf)
{
struct xdp_buff *xdp = &mxbuf->xdp;
@@ -210,7 +210,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL)
- mlx5e_page_dma_unmap(rq, page);
+ mlx5e_page_dma_unmap(rq, virt_to_page(xdp->data));
rq->stats->xdp_redirect++;
return true;
default:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index 69f338bc0633..10bcfa6f88c1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -52,7 +52,7 @@ struct mlx5e_xdp_buff {
struct mlx5e_xsk_param;
int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
-bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
+bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
struct bpf_prog *prog, struct mlx5e_xdp_buff *mlctx);
void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index b7c84ebe8418..fab787600459 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -289,7 +289,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
*/
prog = rcu_dereference(rq->xdp_prog);
- if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf))) {
+ if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf))) {
if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)))
__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
return NULL; /* page/packet was consumed by XDP */
@@ -323,7 +323,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
net_prefetch(mxbuf->xdp.data);
prog = rcu_dereference(rq->xdp_prog);
- if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf)))
+ if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf)))
return NULL; /* page/packet was consumed by XDP */
/* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 9ac2c7778b5b..ac570945d5d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1610,7 +1610,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
net_prefetchw(va); /* xdp_frame data area */
mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf);
- if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf))
+ if (mlx5e_xdp_handle(rq, prog, &mxbuf))
return NULL; /* page/packet was consumed by XDP */
rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start;
@@ -1698,10 +1698,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
wi++;
}
- au = head_wi->au;
-
prog = rcu_dereference(rq->xdp_prog);
- if (prog && mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) {
+ if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) {
if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
int i;
@@ -1718,7 +1716,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
if (unlikely(!skb))
return NULL;
- page_ref_inc(au->page);
+ page_ref_inc(head_wi->au->page);
if (unlikely(xdp_buff_has_frags(&mxbuf.xdp))) {
int i;
@@ -2013,7 +2011,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
net_prefetchw(va); /* xdp_frame data area */
mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf);
- if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) {
+ if (mlx5e_xdp_handle(rq, prog, &mxbuf)) {
if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
return NULL; /* page/packet was consumed by XDP */
--
2.39.1
* Re: [net-next 3/9] net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle()
2023-02-16 0:09 ` [net-next 3/9] net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle() Saeed Mahameed
@ 2023-02-16 15:13 ` Maciej Fijalkowski
0 siblings, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 15:13 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Dragos Tatulea
On Wed, Feb 15, 2023 at 04:09:12PM -0800, Saeed Mahameed wrote:
> From: Tariq Toukan <tariqt@nvidia.com>
>
> Remove the page parameter, it can be derived from the xdp_buff member
> of mlx5e_xdp_buff.
Okay, that's a nice cleanup. How about squashing this with the previous
patch, which does the same thing for the xmit xdp_buff routine?
>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 4 ++--
> drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h | 2 +-
> drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 4 ++--
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 10 ++++------
> 4 files changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> index 4b9cd8ef8d28..bcd6370de440 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> @@ -186,7 +186,7 @@ const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
> };
>
> /* returns true if packet was consumed by xdp */
> -bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> +bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
> struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf)
> {
> struct xdp_buff *xdp = &mxbuf->xdp;
> @@ -210,7 +210,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
> __set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
> if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL)
> - mlx5e_page_dma_unmap(rq, page);
> + mlx5e_page_dma_unmap(rq, virt_to_page(xdp->data));
> rq->stats->xdp_redirect++;
> return true;
> default:
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> index 69f338bc0633..10bcfa6f88c1 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> @@ -52,7 +52,7 @@ struct mlx5e_xdp_buff {
>
> struct mlx5e_xsk_param;
> int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
> -bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> +bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
> struct bpf_prog *prog, struct mlx5e_xdp_buff *mlctx);
> void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
> bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index b7c84ebe8418..fab787600459 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -289,7 +289,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
> */
>
> prog = rcu_dereference(rq->xdp_prog);
> - if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf))) {
> + if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf))) {
> if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)))
> __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
> return NULL; /* page/packet was consumed by XDP */
> @@ -323,7 +323,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
> net_prefetch(mxbuf->xdp.data);
>
> prog = rcu_dereference(rq->xdp_prog);
> - if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf)))
> + if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf)))
> return NULL; /* page/packet was consumed by XDP */
>
> /* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 9ac2c7778b5b..ac570945d5d2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1610,7 +1610,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
>
> net_prefetchw(va); /* xdp_frame data area */
> mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf);
> - if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf))
> + if (mlx5e_xdp_handle(rq, prog, &mxbuf))
> return NULL; /* page/packet was consumed by XDP */
>
> rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start;
> @@ -1698,10 +1698,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
> wi++;
> }
>
> - au = head_wi->au;
> -
> prog = rcu_dereference(rq->xdp_prog);
> - if (prog && mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) {
> + if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) {
> if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
> int i;
>
> @@ -1718,7 +1716,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
> if (unlikely(!skb))
> return NULL;
>
> - page_ref_inc(au->page);
> + page_ref_inc(head_wi->au->page);
>
> if (unlikely(xdp_buff_has_frags(&mxbuf.xdp))) {
> int i;
> @@ -2013,7 +2011,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
>
> net_prefetchw(va); /* xdp_frame data area */
> mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf);
> - if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) {
> + if (mlx5e_xdp_handle(rq, prog, &mxbuf)) {
> if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
> __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
> return NULL; /* page/packet was consumed by XDP */
> --
> 2.39.1
>
* [net-next 4/9] net/mlx5: Simplify eq list traversal
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (2 preceding siblings ...)
2023-02-16 0:09 ` [net-next 3/9] net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle() Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 15:17 ` Maciej Fijalkowski
2023-02-16 0:09 ` [net-next 5/9] net/mlx5e: Implement CT entry update Saeed Mahameed
` (4 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Parav Pandit, Shay Drory
From: Parav Pandit <parav@nvidia.com>
EQ list is read only while finding the matching EQ.
Hence, avoid *_safe() version.
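To illustrate the difference (a generic example, not code from this patch;
the removal branch is hypothetical): the _safe variant only matters when
entries may be freed while walking the list, since it caches the next node.
A read-only lookup does not need it.

        struct mlx5_eq_comp *eq, *tmp;

        /* removal during the walk needs the _safe variant: 'tmp' keeps
         * the iteration valid after 'eq' is freed
         */
        list_for_each_entry_safe(eq, tmp, &table->comp_eqs_list, list) {
                list_del(&eq->list);
                kfree(eq);
        }

        /* a pure lookup, as in this patch, does not: */
        list_for_each_entry(eq, &table->comp_eqs_list, list)
                if (i++ == vector)
                        break;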
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 66ec7932f008..38b32e98f3bd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -960,11 +960,11 @@ static int vector2eqnirqn(struct mlx5_core_dev *dev, int vector, int *eqn,
unsigned int *irqn)
{
struct mlx5_eq_table *table = dev->priv.eq_table;
- struct mlx5_eq_comp *eq, *n;
+ struct mlx5_eq_comp *eq;
int err = -ENOENT;
int i = 0;
- list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
+ list_for_each_entry(eq, &table->comp_eqs_list, list) {
if (i++ == vector) {
if (irqn)
*irqn = eq->core.irqn;
@@ -999,10 +999,10 @@ struct cpumask *
mlx5_comp_irq_get_affinity_mask(struct mlx5_core_dev *dev, int vector)
{
struct mlx5_eq_table *table = dev->priv.eq_table;
- struct mlx5_eq_comp *eq, *n;
+ struct mlx5_eq_comp *eq;
int i = 0;
- list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
+ list_for_each_entry(eq, &table->comp_eqs_list, list) {
if (i++ == vector)
break;
}
--
2.39.1
* Re: [net-next 4/9] net/mlx5: Simplify eq list traversal
2023-02-16 0:09 ` [net-next 4/9] net/mlx5: Simplify eq list traversal Saeed Mahameed
@ 2023-02-16 15:17 ` Maciej Fijalkowski
0 siblings, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 15:17 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Parav Pandit, Shay Drory
On Wed, Feb 15, 2023 at 04:09:13PM -0800, Saeed Mahameed wrote:
> From: Parav Pandit <parav@nvidia.com>
>
> EQ list is read only while finding the matching EQ.
> Hence, avoid *_safe() version.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/eq.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index 66ec7932f008..38b32e98f3bd 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -960,11 +960,11 @@ static int vector2eqnirqn(struct mlx5_core_dev *dev, int vector, int *eqn,
> unsigned int *irqn)
> {
> struct mlx5_eq_table *table = dev->priv.eq_table;
> - struct mlx5_eq_comp *eq, *n;
> + struct mlx5_eq_comp *eq;
> int err = -ENOENT;
> int i = 0;
>
> - list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
> + list_for_each_entry(eq, &table->comp_eqs_list, list) {
> if (i++ == vector) {
> if (irqn)
> *irqn = eq->core.irqn;
> @@ -999,10 +999,10 @@ struct cpumask *
> mlx5_comp_irq_get_affinity_mask(struct mlx5_core_dev *dev, int vector)
> {
> struct mlx5_eq_table *table = dev->priv.eq_table;
> - struct mlx5_eq_comp *eq, *n;
> + struct mlx5_eq_comp *eq;
> int i = 0;
>
> - list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
> + list_for_each_entry(eq, &table->comp_eqs_list, list) {
> if (i++ == vector)
> break;
> }
> --
> 2.39.1
>
* [net-next 5/9] net/mlx5e: Implement CT entry update
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (3 preceding siblings ...)
2023-02-16 0:09 ` [net-next 4/9] net/mlx5: Simplify eq list traversal Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 15:51 ` Maciej Fijalkowski
2023-02-16 0:09 ` [net-next 6/9] net/mlx5e: Allow offloading of ct 'new' match Saeed Mahameed
` (3 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Vlad Buslov, Oz Shlomo,
Paul Blakey
From: Vlad Buslov <vladbu@nvidia.com>
With support for UDP NEW offload the flow_table may now send updates for
existing flows. Support properly replacing existing entries by updating
flow restore_cookie and replacing the rule with new one with the same match
but new mod_hdr action that sets updated ctinfo.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en/tc_ct.c | 118 +++++++++++++++++-
1 file changed, 117 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index 193562c14c44..76e86f83b6ac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -871,6 +871,68 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
return err;
}
+static int
+mlx5_tc_ct_entry_replace_rule(struct mlx5_tc_ct_priv *ct_priv,
+ struct flow_rule *flow_rule,
+ struct mlx5_ct_entry *entry,
+ bool nat, u8 zone_restore_id)
+{
+ struct mlx5_ct_zone_rule *zone_rule = &entry->zone_rules[nat];
+ struct mlx5_flow_attr *attr = zone_rule->attr, *old_attr;
+ struct mlx5e_mod_hdr_handle *mh;
+ struct mlx5_ct_fs_rule *rule;
+ struct mlx5_flow_spec *spec;
+ int err;
+
+ spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+ if (!spec)
+ return -ENOMEM;
+
+ old_attr = mlx5_alloc_flow_attr(ct_priv->ns_type);
+ if (!attr) {
+ err = -ENOMEM;
+ goto err_attr;
+ }
+ *old_attr = *attr;
+
+ err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule, &mh, zone_restore_id,
+ nat, mlx5_tc_ct_entry_has_nat(entry));
+ if (err) {
+ ct_dbg("Failed to create ct entry mod hdr");
+ goto err_mod_hdr;
+ }
+
+ mlx5_tc_ct_set_tuple_match(ct_priv, spec, flow_rule);
+ mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG, entry->tuple.zone, MLX5_CT_ZONE_MASK);
+
+ rule = ct_priv->fs_ops->ct_rule_add(ct_priv->fs, spec, attr, flow_rule);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
+ ct_dbg("Failed to add replacement ct entry rule, nat: %d", nat);
+ goto err_rule;
+ }
+
+ ct_priv->fs_ops->ct_rule_del(ct_priv->fs, zone_rule->rule);
+ zone_rule->rule = rule;
+ mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, old_attr, zone_rule->mh);
+ zone_rule->mh = mh;
+
+ kfree(old_attr);
+ kvfree(spec);
+ ct_dbg("Replaced ct entry rule in zone %d", entry->tuple.zone);
+
+ return 0;
+
+err_rule:
+ mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, mh);
+ mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id);
+err_mod_hdr:
+ kfree(old_attr);
+err_attr:
+ kvfree(spec);
+ return err;
+}
+
static bool
mlx5_tc_ct_entry_valid(struct mlx5_ct_entry *entry)
{
@@ -1065,6 +1127,52 @@ mlx5_tc_ct_entry_add_rules(struct mlx5_tc_ct_priv *ct_priv,
return err;
}
+static int
+mlx5_tc_ct_entry_replace_rules(struct mlx5_tc_ct_priv *ct_priv,
+ struct flow_rule *flow_rule,
+ struct mlx5_ct_entry *entry,
+ u8 zone_restore_id)
+{
+ int err;
+
+ err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, false,
+ zone_restore_id);
+ if (err)
+ return err;
+
+ err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, true,
+ zone_restore_id);
+ if (err)
+ mlx5_tc_ct_entry_del_rule(ct_priv, entry, false);
+ return err;
+}
+
+static int
+mlx5_tc_ct_block_flow_offload_replace(struct mlx5_ct_ft *ft, struct flow_rule *flow_rule,
+ struct mlx5_ct_entry *entry, unsigned long cookie)
+{
+ struct mlx5_tc_ct_priv *ct_priv = ft->ct_priv;
+ int err;
+
+ err = mlx5_tc_ct_entry_replace_rules(ct_priv, flow_rule, entry, ft->zone_restore_id);
+ if (!err)
+ return 0;
+
+ /* If failed to update the entry, then look it up again under ht_lock
+ * protection and properly delete it.
+ */
+ spin_lock_bh(&ct_priv->ht_lock);
+ entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
+ if (entry) {
+ rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
+ spin_unlock_bh(&ct_priv->ht_lock);
+ mlx5_tc_ct_entry_put(entry);
+ } else {
+ spin_unlock_bh(&ct_priv->ht_lock);
+ }
+ return err;
+}
+
static int
mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
struct flow_cls_offload *flow)
@@ -1087,9 +1195,17 @@ mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
spin_lock_bh(&ct_priv->ht_lock);
entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
if (entry && refcount_inc_not_zero(&entry->refcnt)) {
+ if (entry->restore_cookie == meta_action->ct_metadata.cookie) {
+ spin_unlock_bh(&ct_priv->ht_lock);
+ mlx5_tc_ct_entry_put(entry);
+ return -EEXIST;
+ }
+ entry->restore_cookie = meta_action->ct_metadata.cookie;
spin_unlock_bh(&ct_priv->ht_lock);
+
+ err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry, cookie);
mlx5_tc_ct_entry_put(entry);
- return -EEXIST;
+ return err;
}
spin_unlock_bh(&ct_priv->ht_lock);
--
2.39.1
* Re: [net-next 5/9] net/mlx5e: Implement CT entry update
2023-02-16 0:09 ` [net-next 5/9] net/mlx5e: Implement CT entry update Saeed Mahameed
@ 2023-02-16 15:51 ` Maciej Fijalkowski
2023-02-16 17:15 ` Vlad Buslov
0 siblings, 1 reply; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 15:51 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Vlad Buslov, Oz Shlomo,
Paul Blakey
On Wed, Feb 15, 2023 at 04:09:14PM -0800, Saeed Mahameed wrote:
> From: Vlad Buslov <vladbu@nvidia.com>
>
> With support for UDP NEW offload the flow_table may now send updates for
> existing flows. Support properly replacing existing entries by updating
> flow restore_cookie and replacing the rule with new one with the same match
> but new mod_hdr action that sets updated ctinfo.
>
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
> Reviewed-by: Paul Blakey <paulb@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 118 +++++++++++++++++-
> 1 file changed, 117 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> index 193562c14c44..76e86f83b6ac 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> @@ -871,6 +871,68 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
> return err;
> }
>
> +static int
> +mlx5_tc_ct_entry_replace_rule(struct mlx5_tc_ct_priv *ct_priv,
> + struct flow_rule *flow_rule,
> + struct mlx5_ct_entry *entry,
> + bool nat, u8 zone_restore_id)
> +{
> + struct mlx5_ct_zone_rule *zone_rule = &entry->zone_rules[nat];
> + struct mlx5_flow_attr *attr = zone_rule->attr, *old_attr;
> + struct mlx5e_mod_hdr_handle *mh;
> + struct mlx5_ct_fs_rule *rule;
> + struct mlx5_flow_spec *spec;
> + int err;
> +
> + spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
> + if (!spec)
> + return -ENOMEM;
> +
> + old_attr = mlx5_alloc_flow_attr(ct_priv->ns_type);
> + if (!attr) {
when can attr == NULL? maybe check it in the first place before allocing
spec above?
> + err = -ENOMEM;
> + goto err_attr;
> + }
> + *old_attr = *attr;
> +
> + err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule, &mh, zone_restore_id,
> + nat, mlx5_tc_ct_entry_has_nat(entry));
> + if (err) {
> + ct_dbg("Failed to create ct entry mod hdr");
> + goto err_mod_hdr;
> + }
> +
> + mlx5_tc_ct_set_tuple_match(ct_priv, spec, flow_rule);
> + mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG, entry->tuple.zone, MLX5_CT_ZONE_MASK);
> +
> + rule = ct_priv->fs_ops->ct_rule_add(ct_priv->fs, spec, attr, flow_rule);
> + if (IS_ERR(rule)) {
> + err = PTR_ERR(rule);
> + ct_dbg("Failed to add replacement ct entry rule, nat: %d", nat);
> + goto err_rule;
> + }
> +
> + ct_priv->fs_ops->ct_rule_del(ct_priv->fs, zone_rule->rule);
> + zone_rule->rule = rule;
> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, old_attr, zone_rule->mh);
> + zone_rule->mh = mh;
> +
> + kfree(old_attr);
> + kvfree(spec);
not a big deal but you could make a common goto below with a different
name
> + ct_dbg("Replaced ct entry rule in zone %d", entry->tuple.zone);
> +
> + return 0;
> +
> +err_rule:
> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, mh);
> + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id);
> +err_mod_hdr:
> + kfree(old_attr);
> +err_attr:
> + kvfree(spec);
> + return err;
> +}
> +
> static bool
> mlx5_tc_ct_entry_valid(struct mlx5_ct_entry *entry)
> {
> @@ -1065,6 +1127,52 @@ mlx5_tc_ct_entry_add_rules(struct mlx5_tc_ct_priv *ct_priv,
> return err;
> }
>
> +static int
> +mlx5_tc_ct_entry_replace_rules(struct mlx5_tc_ct_priv *ct_priv,
> + struct flow_rule *flow_rule,
> + struct mlx5_ct_entry *entry,
> + u8 zone_restore_id)
> +{
> + int err;
> +
> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, false,
would it make sense to replace the bool nat in here with some kind of
enum?
> + zone_restore_id);
> + if (err)
> + return err;
> +
> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, true,
> + zone_restore_id);
> + if (err)
> + mlx5_tc_ct_entry_del_rule(ct_priv, entry, false);
> + return err;
> +}
> +
> +static int
> +mlx5_tc_ct_block_flow_offload_replace(struct mlx5_ct_ft *ft, struct flow_rule *flow_rule,
> + struct mlx5_ct_entry *entry, unsigned long cookie)
> +{
> + struct mlx5_tc_ct_priv *ct_priv = ft->ct_priv;
> + int err;
> +
> + err = mlx5_tc_ct_entry_replace_rules(ct_priv, flow_rule, entry, ft->zone_restore_id);
> + if (!err)
> + return 0;
> +
> + /* If failed to update the entry, then look it up again under ht_lock
> + * protection and properly delete it.
> + */
> + spin_lock_bh(&ct_priv->ht_lock);
> + entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
> + if (entry) {
> + rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
> + spin_unlock_bh(&ct_priv->ht_lock);
> + mlx5_tc_ct_entry_put(entry);
> + } else {
> + spin_unlock_bh(&ct_priv->ht_lock);
> + }
> + return err;
> +}
> +
> static int
> mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
> struct flow_cls_offload *flow)
> @@ -1087,9 +1195,17 @@ mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
> spin_lock_bh(&ct_priv->ht_lock);
> entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
> if (entry && refcount_inc_not_zero(&entry->refcnt)) {
> + if (entry->restore_cookie == meta_action->ct_metadata.cookie) {
> + spin_unlock_bh(&ct_priv->ht_lock);
> + mlx5_tc_ct_entry_put(entry);
> + return -EEXIST;
> + }
> + entry->restore_cookie = meta_action->ct_metadata.cookie;
> spin_unlock_bh(&ct_priv->ht_lock);
> +
> + err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry, cookie);
> mlx5_tc_ct_entry_put(entry);
in case of err != 0, haven't you already put the entry inside
mlx5_tc_ct_block_flow_offload_replace() ?
> - return -EEXIST;
> + return err;
> }
> spin_unlock_bh(&ct_priv->ht_lock);
>
> --
> 2.39.1
>
* Re: [net-next 5/9] net/mlx5e: Implement CT entry update
2023-02-16 15:51 ` Maciej Fijalkowski
@ 2023-02-16 17:15 ` Vlad Buslov
2023-02-17 11:35 ` Maciej Fijalkowski
0 siblings, 1 reply; 21+ messages in thread
From: Vlad Buslov @ 2023-02-16 17:15 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet, Saeed Mahameed, netdev, Tariq Toukan, Oz Shlomo,
Paul Blakey
On Thu 16 Feb 2023 at 16:51, Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote:
> On Wed, Feb 15, 2023 at 04:09:14PM -0800, Saeed Mahameed wrote:
>> From: Vlad Buslov <vladbu@nvidia.com>
>>
>> With support for UDP NEW offload the flow_table may now send updates for
>> existing flows. Support properly replacing existing entries by updating
>> flow restore_cookie and replacing the rule with new one with the same match
>> but new mod_hdr action that sets updated ctinfo.
>>
>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>> Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
>> Reviewed-by: Paul Blakey <paulb@nvidia.com>
>> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
>> ---
>> .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 118 +++++++++++++++++-
>> 1 file changed, 117 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
>> index 193562c14c44..76e86f83b6ac 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
>> @@ -871,6 +871,68 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
>> return err;
>> }
>>
>> +static int
>> +mlx5_tc_ct_entry_replace_rule(struct mlx5_tc_ct_priv *ct_priv,
>> + struct flow_rule *flow_rule,
>> + struct mlx5_ct_entry *entry,
>> + bool nat, u8 zone_restore_id)
>> +{
>> + struct mlx5_ct_zone_rule *zone_rule = &entry->zone_rules[nat];
>> + struct mlx5_flow_attr *attr = zone_rule->attr, *old_attr;
>> + struct mlx5e_mod_hdr_handle *mh;
>> + struct mlx5_ct_fs_rule *rule;
>> + struct mlx5_flow_spec *spec;
>> + int err;
>> +
>> + spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
>> + if (!spec)
>> + return -ENOMEM;
>> +
>> + old_attr = mlx5_alloc_flow_attr(ct_priv->ns_type);
>> + if (!attr) {
>
> when can attr == NULL? maybe check it in the first place before allocing
> spec above?
Should verify 'old_attr', not 'attr'. Thanks for catching this!
>
>> + err = -ENOMEM;
>> + goto err_attr;
>> + }
>> + *old_attr = *attr;
>> +
>> + err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule, &mh, zone_restore_id,
>> + nat, mlx5_tc_ct_entry_has_nat(entry));
>> + if (err) {
>> + ct_dbg("Failed to create ct entry mod hdr");
>> + goto err_mod_hdr;
>> + }
>> +
>> + mlx5_tc_ct_set_tuple_match(ct_priv, spec, flow_rule);
>> + mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG, entry->tuple.zone, MLX5_CT_ZONE_MASK);
>> +
>> + rule = ct_priv->fs_ops->ct_rule_add(ct_priv->fs, spec, attr, flow_rule);
>> + if (IS_ERR(rule)) {
>> + err = PTR_ERR(rule);
>> + ct_dbg("Failed to add replacement ct entry rule, nat: %d", nat);
>> + goto err_rule;
>> + }
>> +
>> + ct_priv->fs_ops->ct_rule_del(ct_priv->fs, zone_rule->rule);
>> + zone_rule->rule = rule;
>> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, old_attr, zone_rule->mh);
>> + zone_rule->mh = mh;
>> +
>> + kfree(old_attr);
>> + kvfree(spec);
>
> not a big deal but you could make a common goto below with a different
> name
You mean jump from here into the middle of the error handling code below
in order not to duplicate these two calls to *free()? Honestly, I would
much prefer _not_ to do that: goto is a necessary evil for error handling
in C, but I don't believe we should handle common non-error code paths
with it, and it is not typically done in mlx5.
>
>> + ct_dbg("Replaced ct entry rule in zone %d", entry->tuple.zone);
>> +
>> + return 0;
>> +
>> +err_rule:
>> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, mh);
>> + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id);
>> +err_mod_hdr:
>> + kfree(old_attr);
>> +err_attr:
>> + kvfree(spec);
>> + return err;
>> +}
>> +
>> static bool
>> mlx5_tc_ct_entry_valid(struct mlx5_ct_entry *entry)
>> {
>> @@ -1065,6 +1127,52 @@ mlx5_tc_ct_entry_add_rules(struct mlx5_tc_ct_priv *ct_priv,
>> return err;
>> }
>>
>> +static int
>> +mlx5_tc_ct_entry_replace_rules(struct mlx5_tc_ct_priv *ct_priv,
>> + struct flow_rule *flow_rule,
>> + struct mlx5_ct_entry *entry,
>> + u8 zone_restore_id)
>> +{
>> + int err;
>> +
>> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, false,
>
> would it make sense to replace the bool nat in here with some kind of
> enum?
It is either a nat rule or a non-nat rule. Why would we invent an enum here?
Moreover, the boolean 'nat' is already used in multiple places in this file,
so this patch just re-uses the existing convention.
>
>> + zone_restore_id);
>> + if (err)
>> + return err;
>> +
>> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, true,
>> + zone_restore_id);
>> + if (err)
>> + mlx5_tc_ct_entry_del_rule(ct_priv, entry, false);
>> + return err;
>> +}
>> +
>> +static int
>> +mlx5_tc_ct_block_flow_offload_replace(struct mlx5_ct_ft *ft, struct flow_rule *flow_rule,
>> + struct mlx5_ct_entry *entry, unsigned long cookie)
>> +{
>> + struct mlx5_tc_ct_priv *ct_priv = ft->ct_priv;
>> + int err;
>> +
>> + err = mlx5_tc_ct_entry_replace_rules(ct_priv, flow_rule, entry, ft->zone_restore_id);
>> + if (!err)
>> + return 0;
>> +
>> + /* If failed to update the entry, then look it up again under ht_lock
>> + * protection and properly delete it.
>> + */
>> + spin_lock_bh(&ct_priv->ht_lock);
>> + entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
>> + if (entry) {
>> + rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
>> + spin_unlock_bh(&ct_priv->ht_lock);
>> + mlx5_tc_ct_entry_put(entry);
>> + } else {
>> + spin_unlock_bh(&ct_priv->ht_lock);
>> + }
>> + return err;
>> +}
>> +
>> static int
>> mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
>> struct flow_cls_offload *flow)
>> @@ -1087,9 +1195,17 @@ mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
>> spin_lock_bh(&ct_priv->ht_lock);
>> entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
>> if (entry && refcount_inc_not_zero(&entry->refcnt)) {
>> + if (entry->restore_cookie == meta_action->ct_metadata.cookie) {
>> + spin_unlock_bh(&ct_priv->ht_lock);
>> + mlx5_tc_ct_entry_put(entry);
>> + return -EEXIST;
>> + }
>> + entry->restore_cookie = meta_action->ct_metadata.cookie;
>> spin_unlock_bh(&ct_priv->ht_lock);
>> +
>> + err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry, cookie);
>> mlx5_tc_ct_entry_put(entry);
>
> in case of err != 0, haven't you already put the entry inside
> mlx5_tc_ct_block_flow_offload_replace() ?
No. Here we release the reference that was obtained 10 lines up, while
mlx5_tc_ct_block_flow_offload_replace() releases the reference held by
the ct_entries_ht table.
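Spelling it out against the hunk above (trimmed, annotations mine):

        entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
        if (entry && refcount_inc_not_zero(&entry->refcnt)) {  /* ref taken here */
                /* ... */
                err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry, cookie);
                /* on error, _replace() already removed the entry from
                 * ct_entries_ht and dropped the reference the table was holding
                 */
                mlx5_tc_ct_entry_put(entry);    /* drops the ref taken above */
                return err;
        }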
>
>> - return -EEXIST;
>> + return err;
>> }
>> spin_unlock_bh(&ct_priv->ht_lock);
>>
>> --
>> 2.39.1
>>
* Re: [net-next 5/9] net/mlx5e: Implement CT entry update
2023-02-16 17:15 ` Vlad Buslov
@ 2023-02-17 11:35 ` Maciej Fijalkowski
0 siblings, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-17 11:35 UTC (permalink / raw)
To: Vlad Buslov
Cc: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet, Saeed Mahameed, netdev, Tariq Toukan, Oz Shlomo,
Paul Blakey
On Thu, Feb 16, 2023 at 07:15:13PM +0200, Vlad Buslov wrote:
> On Thu 16 Feb 2023 at 16:51, Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote:
> > On Wed, Feb 15, 2023 at 04:09:14PM -0800, Saeed Mahameed wrote:
> >> From: Vlad Buslov <vladbu@nvidia.com>
> >>
> >> With support for UDP NEW offload the flow_table may now send updates for
> >> existing flows. Support properly replacing existing entries by updating
> >> flow restore_cookie and replacing the rule with new one with the same match
> >> but new mod_hdr action that sets updated ctinfo.
> >>
> >> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> >> Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
> >> Reviewed-by: Paul Blakey <paulb@nvidia.com>
> >> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> >> ---
> >> .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 118 +++++++++++++++++-
> >> 1 file changed, 117 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> >> index 193562c14c44..76e86f83b6ac 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
> >> @@ -871,6 +871,68 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
> >> return err;
> >> }
> >>
> >> +static int
> >> +mlx5_tc_ct_entry_replace_rule(struct mlx5_tc_ct_priv *ct_priv,
> >> + struct flow_rule *flow_rule,
> >> + struct mlx5_ct_entry *entry,
> >> + bool nat, u8 zone_restore_id)
> >> +{
> >> + struct mlx5_ct_zone_rule *zone_rule = &entry->zone_rules[nat];
> >> + struct mlx5_flow_attr *attr = zone_rule->attr, *old_attr;
> >> + struct mlx5e_mod_hdr_handle *mh;
> >> + struct mlx5_ct_fs_rule *rule;
> >> + struct mlx5_flow_spec *spec;
> >> + int err;
> >> +
> >> + spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
> >> + if (!spec)
> >> + return -ENOMEM;
> >> +
> >> + old_attr = mlx5_alloc_flow_attr(ct_priv->ns_type);
> >> + if (!attr) {
> >
> > when can attr == NULL? maybe check it in the first place before allocing
> > spec above?
>
> Should verify 'old_attr', not 'attr'. Thanks for catching this!
>
> >
> >> + err = -ENOMEM;
> >> + goto err_attr;
> >> + }
> >> + *old_attr = *attr;
> >> +
> >> + err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule, &mh, zone_restore_id,
> >> + nat, mlx5_tc_ct_entry_has_nat(entry));
> >> + if (err) {
> >> + ct_dbg("Failed to create ct entry mod hdr");
> >> + goto err_mod_hdr;
> >> + }
> >> +
> >> + mlx5_tc_ct_set_tuple_match(ct_priv, spec, flow_rule);
> >> + mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG, entry->tuple.zone, MLX5_CT_ZONE_MASK);
> >> +
> >> + rule = ct_priv->fs_ops->ct_rule_add(ct_priv->fs, spec, attr, flow_rule);
> >> + if (IS_ERR(rule)) {
> >> + err = PTR_ERR(rule);
> >> + ct_dbg("Failed to add replacement ct entry rule, nat: %d", nat);
> >> + goto err_rule;
> >> + }
> >> +
> >> + ct_priv->fs_ops->ct_rule_del(ct_priv->fs, zone_rule->rule);
> >> + zone_rule->rule = rule;
> >> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, old_attr, zone_rule->mh);
> >> + zone_rule->mh = mh;
> >> +
> >> + kfree(old_attr);
> >> + kvfree(spec);
> >
> > not a big deal but you could make a common goto below with a different
> > name
>
> You mean jump from here into the middle of the error handling code below
> in order not to duplicate these two calls to *free()? Honestly, I would
> much prefer _not_ to do that: goto is a necessary evil for error handling
> in C, but I don't believe we should handle common non-error code paths
> with it, and it is not typically done in mlx5.
ok it's fine as-is
>
> >
> >> + ct_dbg("Replaced ct entry rule in zone %d", entry->tuple.zone);
> >> +
> >> + return 0;
> >> +
> >> +err_rule:
> >> + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, mh);
> >> + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id);
> >> +err_mod_hdr:
> >> + kfree(old_attr);
> >> +err_attr:
> >> + kvfree(spec);
> >> + return err;
> >> +}
> >> +
> >> static bool
> >> mlx5_tc_ct_entry_valid(struct mlx5_ct_entry *entry)
> >> {
> >> @@ -1065,6 +1127,52 @@ mlx5_tc_ct_entry_add_rules(struct mlx5_tc_ct_priv *ct_priv,
> >> return err;
> >> }
> >>
> >> +static int
> >> +mlx5_tc_ct_entry_replace_rules(struct mlx5_tc_ct_priv *ct_priv,
> >> + struct flow_rule *flow_rule,
> >> + struct mlx5_ct_entry *entry,
> >> + u8 zone_restore_id)
> >> +{
> >> + int err;
> >> +
> >> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, false,
> >
> > would it make sense to replace the bool nat in here with some kind of
> > enum?
>
> It is either a nat rule or a non-nat rule. Why would we invent an enum here?
> Moreover, the boolean 'nat' is already used in multiple places in this file,
> so this patch just re-uses the existing convention.
This was just a suggestion to improve readability at the call sites; using
a bool as an array index was a bit odd to me, but that's probably down to
personal taste.
err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry,
NAT_RULE, zone_restore_id);
if (err)
return err;
err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry,
NON_NAT_RULE, zone_restore_id);
if (err)
return err;
The above sheds more light on what is going on compared to an opaque
true/false. I didn't realize it's the pattern you're using throughout the
whole file, so this change would be odd on its own; maybe consider such a
refactoring when you have some free cycles... or just ignore it :)
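(The enum itself would be trivial -- names made up, just matching the
existing zone_rules[] indexing:)

        enum mlx5_ct_rule_type {
                NON_NAT_RULE = 0,       /* entry->zone_rules[0] */
                NAT_RULE = 1,           /* entry->zone_rules[1] */
        };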
> >> + if (err)
> >> + return err;
> >> +
> >> + err = mlx5_tc_ct_entry_replace_rule(ct_priv, flow_rule, entry, true,
> >> + zone_restore_id);
> >> + if (err)
> >> + mlx5_tc_ct_entry_del_rule(ct_priv, entry, false);
> >> + return err;
> >> +}
> >> +
> >> +static int
> >> +mlx5_tc_ct_block_flow_offload_replace(struct mlx5_ct_ft *ft, struct flow_rule *flow_rule,
> >> + struct mlx5_ct_entry *entry, unsigned long cookie)
> >> +{
> >> + struct mlx5_tc_ct_priv *ct_priv = ft->ct_priv;
> >> + int err;
> >> +
> >> + err = mlx5_tc_ct_entry_replace_rules(ct_priv, flow_rule, entry, ft->zone_restore_id);
> >> + if (!err)
> >> + return 0;
> >> +
> >> + /* If failed to update the entry, then look it up again under ht_lock
> >> + * protection and properly delete it.
> >> + */
> >> + spin_lock_bh(&ct_priv->ht_lock);
> >> + entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
> >> + if (entry) {
> >> + rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
> >> + spin_unlock_bh(&ct_priv->ht_lock);
> >> + mlx5_tc_ct_entry_put(entry);
> >> + } else {
> >> + spin_unlock_bh(&ct_priv->ht_lock);
> >> + }
> >> + return err;
> >> +}
> >> +
> >> static int
> >> mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
> >> struct flow_cls_offload *flow)
> >> @@ -1087,9 +1195,17 @@ mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
> >> spin_lock_bh(&ct_priv->ht_lock);
> >> entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
> >> if (entry && refcount_inc_not_zero(&entry->refcnt)) {
> >> + if (entry->restore_cookie == meta_action->ct_metadata.cookie) {
> >> + spin_unlock_bh(&ct_priv->ht_lock);
> >> + mlx5_tc_ct_entry_put(entry);
> >> + return -EEXIST;
> >> + }
> >> + entry->restore_cookie = meta_action->ct_metadata.cookie;
> >> spin_unlock_bh(&ct_priv->ht_lock);
> >> +
> >> + err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry, cookie);
> >> mlx5_tc_ct_entry_put(entry);
> >
> > in case of err != 0, haven't you already put the entry inside
> > mlx5_tc_ct_block_flow_offload_replace() ?
>
> No. Here we release the reference that was obtained 10 lines up and
> mlx5_tc_ct_block_flow_offload_replace() releases the reference held by
> ct_entries_ht table.
Thanks for the explanation.
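Just to write down my understanding (a rough sketch with comments, not the
exact driver code):

        /* Two distinct references are in play in the add path. */
        entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie,
                                       cts_ht_params);
        /* ct_entries_ht itself holds reference B on the entry. */
        refcount_inc_not_zero(&entry->refcnt);
        /* Reference A: temporary, taken by this call path. */

        err = mlx5_tc_ct_block_flow_offload_replace(ft, flow_rule, entry,
                                                    cookie);
        /* On error, replace() removes the entry from ct_entries_ht and
         * puts reference B (the one owned by the hashtable).
         */
        mlx5_tc_ct_entry_put(entry);
        /* Always puts reference A, the temporary one taken above. */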
>
> >
> >> - return -EEXIST;
> >> + return err;
> >> }
> >> spin_unlock_bh(&ct_priv->ht_lock);
> >>
> >> --
> >> 2.39.1
> >>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [net-next 6/9] net/mlx5e: Allow offloading of ct 'new' match
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (4 preceding siblings ...)
2023-02-16 0:09 ` [net-next 5/9] net/mlx5e: Implement CT entry update Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 0:09 ` [net-next 7/9] net/mlx5e: Remove unused function mlx5e_sq_xmit_simple Saeed Mahameed
` (2 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Vlad Buslov, Paul Blakey,
Oz Shlomo
From: Vlad Buslov <vladbu@nvidia.com>
Allow offloading filters that match on conntrack 'new' state in order to
enable UDP NEW offload in the following patch.
Stop hardcoding the ct 'established' state in the ct modify header
infrastructure code and determine the correct ct state bit according to
the metadata action 'cookie' field.
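For illustration only (interface names, priorities and chain numbers are
made up), a rule along these lines should become offloadable instead of
being rejected:

        tc filter add dev enp8s0f0_0 ingress protocol ip prio 1 chain 1 \
                flower ct_state +trk+new \
                action ct commit action mirred egress redirect dev enp8s0f0_1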
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en/tc_ct.c | 21 ++++++++-----------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index 76e86f83b6ac..58bbd0780260 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -35,6 +35,7 @@
#define MLX5_CT_STATE_REPLY_BIT BIT(4)
#define MLX5_CT_STATE_RELATED_BIT BIT(5)
#define MLX5_CT_STATE_INVALID_BIT BIT(6)
+#define MLX5_CT_STATE_NEW_BIT BIT(7)
#define MLX5_CT_LABELS_BITS MLX5_REG_MAPPING_MBITS(LABELS_TO_REG)
#define MLX5_CT_LABELS_MASK MLX5_REG_MAPPING_MASK(LABELS_TO_REG)
@@ -721,12 +722,14 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
DECLARE_MOD_HDR_ACTS_ACTIONS(actions_arr, MLX5_CT_MIN_MOD_ACTS);
DECLARE_MOD_HDR_ACTS(mod_acts, actions_arr);
struct flow_action_entry *meta;
+ enum ip_conntrack_info ctinfo;
u16 ct_state = 0;
int err;
meta = mlx5_tc_ct_get_ct_metadata_action(flow_rule);
if (!meta)
return -EOPNOTSUPP;
+ ctinfo = meta->ct_metadata.cookie & NFCT_INFOMASK;
err = mlx5_get_label_mapping(ct_priv, meta->ct_metadata.labels,
&attr->ct_attr.ct_labels_id);
@@ -742,7 +745,8 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
ct_state |= MLX5_CT_STATE_NAT_BIT;
}
- ct_state |= MLX5_CT_STATE_ESTABLISHED_BIT | MLX5_CT_STATE_TRK_BIT;
+ ct_state |= MLX5_CT_STATE_TRK_BIT;
+ ct_state |= ctinfo == IP_CT_NEW ? MLX5_CT_STATE_NEW_BIT : MLX5_CT_STATE_ESTABLISHED_BIT;
ct_state |= meta->ct_metadata.orig_dir ? 0 : MLX5_CT_STATE_REPLY_BIT;
err = mlx5_tc_ct_entry_set_registers(ct_priv, &mod_acts,
ct_state,
@@ -1181,16 +1185,12 @@ mlx5_tc_ct_block_flow_offload_add(struct mlx5_ct_ft *ft,
struct mlx5_tc_ct_priv *ct_priv = ft->ct_priv;
struct flow_action_entry *meta_action;
unsigned long cookie = flow->cookie;
- enum ip_conntrack_info ctinfo;
struct mlx5_ct_entry *entry;
int err;
meta_action = mlx5_tc_ct_get_ct_metadata_action(flow_rule);
if (!meta_action)
return -EOPNOTSUPP;
- ctinfo = meta_action->ct_metadata.cookie & NFCT_INFOMASK;
- if (ctinfo == IP_CT_NEW)
- return -EOPNOTSUPP;
spin_lock_bh(&ct_priv->ht_lock);
entry = rhashtable_lookup_fast(&ft->ct_entries_ht, &cookie, cts_ht_params);
@@ -1443,7 +1443,7 @@ mlx5_tc_ct_match_add(struct mlx5_tc_ct_priv *priv,
struct mlx5_ct_attr *ct_attr,
struct netlink_ext_ack *extack)
{
- bool trk, est, untrk, unest, new, rpl, unrpl, rel, unrel, inv, uninv;
+ bool trk, est, untrk, unnew, unest, new, rpl, unrpl, rel, unrel, inv, uninv;
struct flow_rule *rule = flow_cls_offload_flow_rule(f);
struct flow_dissector_key_ct *mask, *key;
u32 ctstate = 0, ctstate_mask = 0;
@@ -1489,15 +1489,18 @@ mlx5_tc_ct_match_add(struct mlx5_tc_ct_priv *priv,
rel = ct_state_on & TCA_FLOWER_KEY_CT_FLAGS_RELATED;
inv = ct_state_on & TCA_FLOWER_KEY_CT_FLAGS_INVALID;
untrk = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+ unnew = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_NEW;
unest = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED;
unrpl = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_REPLY;
unrel = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_RELATED;
uninv = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_INVALID;
ctstate |= trk ? MLX5_CT_STATE_TRK_BIT : 0;
+ ctstate |= new ? MLX5_CT_STATE_NEW_BIT : 0;
ctstate |= est ? MLX5_CT_STATE_ESTABLISHED_BIT : 0;
ctstate |= rpl ? MLX5_CT_STATE_REPLY_BIT : 0;
ctstate_mask |= (untrk || trk) ? MLX5_CT_STATE_TRK_BIT : 0;
+ ctstate_mask |= (unnew || new) ? MLX5_CT_STATE_NEW_BIT : 0;
ctstate_mask |= (unest || est) ? MLX5_CT_STATE_ESTABLISHED_BIT : 0;
ctstate_mask |= (unrpl || rpl) ? MLX5_CT_STATE_REPLY_BIT : 0;
ctstate_mask |= unrel ? MLX5_CT_STATE_RELATED_BIT : 0;
@@ -1515,12 +1518,6 @@ mlx5_tc_ct_match_add(struct mlx5_tc_ct_priv *priv,
return -EOPNOTSUPP;
}
- if (new) {
- NL_SET_ERR_MSG_MOD(extack,
- "matching on ct_state +new isn't supported");
- return -EOPNOTSUPP;
- }
-
if (mask->ct_zone)
mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG,
key->ct_zone, MLX5_CT_ZONE_MASK);
--
2.39.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [net-next 7/9] net/mlx5e: Remove unused function mlx5e_sq_xmit_simple
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (5 preceding siblings ...)
2023-02-16 0:09 ` [net-next 6/9] net/mlx5e: Allow offloading of ct 'new' match Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 0:09 ` [net-next 8/9] net/mlx5e: Fix outdated TLS comment Saeed Mahameed
2023-02-16 0:09 ` [net-next 9/9] net/mlx5e: RX, Remove doubtful unlikely call Saeed Mahameed
8 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Tariq Toukan <tariqt@nvidia.com>
The last usage was removed as part of
commit 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support").
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 15 ---------------
2 files changed, 16 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index c10c6ab2e7bc..c067d2efab51 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -320,7 +320,6 @@ mlx5e_tx_dma_unmap(struct device *pdev, struct mlx5e_sq_dma *dma)
}
}
-void mlx5e_sq_xmit_simple(struct mlx5e_txqsq *sq, struct sk_buff *skb, bool xmit_more);
void mlx5e_tx_mpwqe_ensure_complete(struct mlx5e_txqsq *sq);
static inline bool mlx5e_tx_mpwqe_is_full(struct mlx5e_tx_mpwqe *session, u8 max_sq_mpw_wqebbs)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f7897ddb29c5..df5e780e8e6a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -720,21 +720,6 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
return NETDEV_TX_OK;
}
-void mlx5e_sq_xmit_simple(struct mlx5e_txqsq *sq, struct sk_buff *skb, bool xmit_more)
-{
- struct mlx5e_tx_wqe_attr wqe_attr;
- struct mlx5e_tx_attr attr;
- struct mlx5e_tx_wqe *wqe;
- u16 pi;
-
- mlx5e_sq_xmit_prepare(sq, skb, NULL, &attr);
- mlx5e_sq_calc_wqe_attr(skb, &attr, &wqe_attr);
- pi = mlx5e_txqsq_get_next_pi(sq, wqe_attr.num_wqebbs);
- wqe = MLX5E_TX_FETCH_WQE(sq, pi);
- mlx5e_txwqe_build_eseg_csum(sq, skb, NULL, &wqe->eth);
- mlx5e_sq_xmit_wqe(sq, skb, &attr, &wqe_attr, wqe, pi, xmit_more);
-}
-
static void mlx5e_tx_wi_dma_unmap(struct mlx5e_txqsq *sq, struct mlx5e_tx_wqe_info *wi,
u32 *dma_fifo_cc)
{
--
2.39.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [net-next 8/9] net/mlx5e: Fix outdated TLS comment
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (6 preceding siblings ...)
2023-02-16 0:09 ` [net-next 7/9] net/mlx5e: Remove unused function mlx5e_sq_xmit_simple Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 0:09 ` [net-next 9/9] net/mlx5e: RX, Remove doubtful unlikely call Saeed Mahameed
8 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Tariq Toukan <tariqt@nvidia.com>
The comment has been outdated since
commit 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support").
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
index 07187028f0d3..c964644ee866 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
@@ -124,7 +124,7 @@ static inline bool mlx5e_accel_tx_begin(struct net_device *dev,
mlx5e_udp_gso_handle_tx_skb(skb);
#ifdef CONFIG_MLX5_EN_TLS
- /* May send SKBs and WQEs. */
+ /* May send WQEs. */
if (mlx5e_ktls_skb_offloaded(skb))
if (unlikely(!mlx5e_ktls_handle_tx_skb(dev, sq, skb,
&state->tls)))
--
2.39.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [net-next 9/9] net/mlx5e: RX, Remove doubtful unlikely call
2023-02-16 0:09 [pull request][net-next 0/9] mlx5 updates 2023-02-15 Saeed Mahameed
` (7 preceding siblings ...)
2023-02-16 0:09 ` [net-next 8/9] net/mlx5e: Fix outdated TLS comment Saeed Mahameed
@ 2023-02-16 0:09 ` Saeed Mahameed
2023-02-16 16:05 ` Maciej Fijalkowski
8 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2023-02-16 0:09 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
From: Gal Pressman <gal@nvidia.com>
When building an skb in non-linear mode, it is neither likely nor
unlikely that the xdp buff has fragments; it depends on the size of the
received packet.
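As a general illustration (a sketch, not taken from this patch), unlikely()
fits branches that are rarely taken on the hot path, such as allocation
failures, while a branch whose frequency tracks the traffic mix is better
left unannotated:

        /* Rare failure path: a static branch hint makes sense here. */
        if (unlikely(!skb)) {
                rq->stats->buff_alloc_err++;
                return NULL;
        }

        /* Whether the xdp buff carries fragments depends on the received
         * packet size, so no static hint is given.
         */
        if (xdp_buff_has_frags(&mxbuf.xdp)) {
                /* ... build the skb frags ... */
        }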
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index ac570945d5d2..e79dcc2a6007 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1718,7 +1718,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
page_ref_inc(head_wi->au->page);
- if (unlikely(xdp_buff_has_frags(&mxbuf.xdp))) {
+ if (xdp_buff_has_frags(&mxbuf.xdp)) {
int i;
/* sinfo->nr_frags is reset by build_skb, calculate again. */
--
2.39.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [net-next 9/9] net/mlx5e: RX, Remove doubtful unlikely call
2023-02-16 0:09 ` [net-next 9/9] net/mlx5e: RX, Remove doubtful unlikely call Saeed Mahameed
@ 2023-02-16 16:05 ` Maciej Fijalkowski
0 siblings, 0 replies; 21+ messages in thread
From: Maciej Fijalkowski @ 2023-02-16 16:05 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman
On Wed, Feb 15, 2023 at 04:09:18PM -0800, Saeed Mahameed wrote:
> From: Gal Pressman <gal@nvidia.com>
>
> When building an skb in non-linear mode, it is neither likely nor
> unlikely that the xdp buff has fragments; it depends on the size of the
> received packet.
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>
> Signed-off-by: Gal Pressman <gal@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index ac570945d5d2..e79dcc2a6007 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1718,7 +1718,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
>
> page_ref_inc(head_wi->au->page);
>
> - if (unlikely(xdp_buff_has_frags(&mxbuf.xdp))) {
> + if (xdp_buff_has_frags(&mxbuf.xdp)) {
> int i;
>
> /* sinfo->nr_frags is reset by build_skb, calculate again. */
> --
> 2.39.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread