Re: [PATCH net-next v2 2/2] xsk: use a smaller new lock for shared pool case

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: <davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <bjorn@kernel.org>,
	<magnus.karlsson@intel.com>, <jonathan.lemon@gmail.com>,
	<sdf@fomichev.me>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<hawk@kernel.org>, <john.fastabend@gmail.com>, <horms@kernel.org>,
	<andrew+netdev@lunn.ch>, <bpf@vger.kernel.org>,
	<netdev@vger.kernel.org>, Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v2 2/2] xsk: use a smaller new lock for shared pool case
Date: Mon, 3 Nov 2025 15:58:20 +0100
Message-ID: <aQjDDJsGIAI5YHBL@boxer>
In-Reply-To: <20251030000646.18859-3-kerneljasonxing@gmail.com>

On Thu, Oct 30, 2025 at 08:06:46AM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> - Split cq_lock into two smaller locks: cq_prod_lock and
>   cq_cached_prod_lock
> - Avoid disabling/enabling interrupts in the hot xmit path
> 
> In either xsk_cq_cancel_locked() or xsk_cq_reserve_locked() function,
> the race condition is only between multiple xsks sharing the same
> pool. They are all in the process context rather than interrupt context,
> so now the small lock named cq_cached_prod_lock can be used without
> handling interrupts.
> 
> While cq_cached_prod_lock ensures the exclusive modification of
> @cached_prod, cq_prod_lock in xsk_cq_submit_addr_locked() only cares
> about @producer and the corresponding @desc. Neither of them needs to
> be consistent with @cached_prod protected by cq_cached_prod_lock.
> That's the reason why the previous big lock can be split into two
> smaller ones. Please note that the SPSC rule is all about the global
> state of producer and consumer that can affect both layers instead of
> local or cached ones.
> 
> Frequently disabling and enabling interrupts is very time consuming
> in some cases, especially at per-descriptor granularity. This can now
> be avoided after this optimization, even when the pool is shared by
> multiple xsks.
> 
> With this patch, the performance number[1] could go from 1,872,565 pps
> to 1,961,009 pps. It's a minor rise of around 5%.
> 
> [1]: taskset -c 1 ./xdpsock -i enp2s0f1 -q 0 -t -S -s 64
> 
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  include/net/xsk_buff_pool.h | 13 +++++++++----
>  net/xdp/xsk.c               | 15 ++++++---------
>  net/xdp/xsk_buff_pool.c     |  3 ++-
>  3 files changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index cac56e6b0869..92a2358c6ce3 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -85,11 +85,16 @@ struct xsk_buff_pool {
>  	bool unaligned;
>  	bool tx_sw_csum;
>  	void *addrs;
> -	/* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect:
> -	 * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when
> -	 * sockets share a single cq when the same netdev and queue id is shared.
> +	/* Mutual exclusion of the completion ring in the SKB mode.
> +	 * Protect: NAPI TX thread and sendmsg error paths in the SKB
> +	 * destructor callback.
>  	 */
> -	spinlock_t cq_lock;
> +	spinlock_t cq_prod_lock;
> +	/* Mutual exclusion of the completion ring in the SKB mode.
> +	 * Protect: when sockets share a single cq when the same netdev
> +	 * and queue id is shared.
> +	 */
> +	spinlock_t cq_cached_prod_lock;

Nice that existing hole is utilized here.

Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
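
For readers unfamiliar with struct holes, the remark above can be
illustrated with a small userspace sketch. It is not the real
struct xsk_buff_pool layout: the toy_* names are hypothetical, the lock
is assumed to be 4 bytes (the real spinlock_t size depends on the kernel
config), and a 64-bit LP64 ABI is assumed. Under those assumptions a
lone 4-byte lock in front of a pointer-aligned flexible array leaves
4 bytes of padding, and a second 4-byte lock fits there without growing
the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 4-byte lock, standing in for spinlock_t. */
typedef uint32_t toy_spinlock_t;

/* Hypothetical tail of a pool struct before the split. */
struct toy_pool_before {
	void *addrs;
	toy_spinlock_t cq_lock;
	/* On LP64 the compiler pads here so free_heads is pointer-aligned. */
	void *free_heads[];
};

/* Hypothetical tail after the split: the second lock fills the padding. */
struct toy_pool_after {
	void *addrs;
	toy_spinlock_t cq_prod_lock;
	toy_spinlock_t cq_cached_prod_lock;
	void *free_heads[];
};

/* Bytes of padding between the last lock and free_heads, before the split. */
static size_t toy_hole_before(void)
{
	return offsetof(struct toy_pool_before, free_heads) -
	       (offsetof(struct toy_pool_before, cq_lock) +
		sizeof(toy_spinlock_t));
}

/* Same measurement after the split. */
static size_t toy_hole_after(void)
{
	return offsetof(struct toy_pool_after, free_heads) -
	       (offsetof(struct toy_pool_after, cq_cached_prod_lock) +
		sizeof(toy_spinlock_t));
}
```

On a real kernel build, a tool such as pahole run against the object
file is the usual way to check the actual layout.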

>  	struct xdp_buff_xsk *free_heads[];
>  };
>  
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 7b0c68a70888..2f26c918d448 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -548,12 +548,11 @@ static int xsk_wakeup(struct xdp_sock *xs, u8 flags)
>  
>  static int xsk_cq_reserve_locked(struct xsk_buff_pool *pool)
>  {
> -	unsigned long flags;
>  	int ret;
>  
> -	spin_lock_irqsave(&pool->cq_lock, flags);
> +	spin_lock(&pool->cq_cached_prod_lock);
>  	ret = xskq_prod_reserve(pool->cq);
> -	spin_unlock_irqrestore(&pool->cq_lock, flags);
> +	spin_unlock(&pool->cq_cached_prod_lock);
>  
>  	return ret;
>  }
> @@ -566,7 +565,7 @@ static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool,
>  	unsigned long flags;
>  	u32 idx;
>  
> -	spin_lock_irqsave(&pool->cq_lock, flags);
> +	spin_lock_irqsave(&pool->cq_prod_lock, flags);
>  	idx = xskq_get_prod(pool->cq);
>  
>  	xskq_prod_write_addr(pool->cq, idx,
> @@ -583,16 +582,14 @@ static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool,
>  		}
>  	}
>  	xskq_prod_submit_n(pool->cq, descs_processed);
> -	spin_unlock_irqrestore(&pool->cq_lock, flags);
> +	spin_unlock_irqrestore(&pool->cq_prod_lock, flags);
>  }
>  
>  static void xsk_cq_cancel_locked(struct xsk_buff_pool *pool, u32 n)
>  {
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&pool->cq_lock, flags);
> +	spin_lock(&pool->cq_cached_prod_lock);
>  	xskq_prod_cancel_n(pool->cq, n);
> -	spin_unlock_irqrestore(&pool->cq_lock, flags);
> +	spin_unlock(&pool->cq_cached_prod_lock);
>  }
>  
>  static void xsk_inc_num_desc(struct sk_buff *skb)
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index 309075050b2a..00a4eddaa0cd 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -90,7 +90,8 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
>  	INIT_LIST_HEAD(&pool->xskb_list);
>  	INIT_LIST_HEAD(&pool->xsk_tx_list);
>  	spin_lock_init(&pool->xsk_tx_list_lock);
> -	spin_lock_init(&pool->cq_lock);
> +	spin_lock_init(&pool->cq_prod_lock);
> +	spin_lock_init(&pool->cq_cached_prod_lock);
>  	refcount_set(&pool->users, 1);
>  
>  	pool->fq = xs->fq_tmp;
> -- 
> 2.41.3
>
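
The split described in the commit message can be modelled in userspace
as a rough sketch. This is not the kernel code: the toy_* names and the
atomic_flag spinlock are hypothetical stand-ins, the consumer index is
kept as a plain field, and userspace cannot mask interrupts, so the
irqsave/irqrestore part appears only as a comment. The point it shows is
that reserve/cancel touch only cached_prod under one lock, while submit
touches only producer under the other.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Toy spinlock; stands in for the kernel's spinlock_t. */
struct toy_lock {
	atomic_flag held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Simplified completion queue: only the producer-side state is modelled. */
struct toy_cq {
	uint32_t nentries;
	uint32_t consumer;                /* stand-in for the global consumer index */
	uint32_t cached_prod;             /* guarded by cached_prod_lock */
	_Atomic uint32_t producer;        /* guarded by prod_lock */
	struct toy_lock cached_prod_lock; /* process context only in the patch */
	struct toy_lock prod_lock;        /* taken with irqsave in the patch */
};

static void toy_cq_init(struct toy_cq *q, uint32_t nentries)
{
	q->nentries = nentries;
	q->consumer = 0;
	q->cached_prod = 0;
	atomic_init(&q->producer, 0);
	atomic_flag_clear(&q->cached_prod_lock.held);
	atomic_flag_clear(&q->prod_lock.held);
}

/* Models xsk_cq_reserve_locked(): claim one slot via cached_prod. */
static int toy_cq_reserve(struct toy_cq *q)
{
	int ret = 0;

	toy_lock_acquire(&q->cached_prod_lock);
	if (q->cached_prod - q->consumer >= q->nentries)
		ret = -1; /* ring full */
	else
		q->cached_prod++;
	toy_lock_release(&q->cached_prod_lock);
	return ret;
}

/* Models xsk_cq_cancel_locked(): give back n reserved slots. */
static void toy_cq_cancel(struct toy_cq *q, uint32_t n)
{
	toy_lock_acquire(&q->cached_prod_lock);
	q->cached_prod -= n;
	toy_lock_release(&q->cached_prod_lock);
}

/* Models the tail of xsk_cq_submit_addr_locked(): publish n entries. */
static void toy_cq_submit(struct toy_cq *q, uint32_t n)
{
	uint32_t prod;

	toy_lock_acquire(&q->prod_lock); /* kernel: spin_lock_irqsave() */
	prod = atomic_load_explicit(&q->producer, memory_order_relaxed);
	atomic_store_explicit(&q->producer, prod + n, memory_order_release);
	toy_lock_release(&q->prod_lock);
}
```

Since neither path reads the other's field, holding both locks at once
is never needed in this model, which is the property the patch relies on.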

Thread overview: 7+ messages
2025-10-30  0:06 [PATCH net-next v2 0/2] xsk: minor optimizations around locks Jason Xing
2025-10-30  0:06 ` [PATCH net-next v2 1/2] xsk: do not enable/disable irq when grabbing/releasing xsk_tx_list_lock Jason Xing
2025-11-03 14:42   ` Maciej Fijalkowski
2025-10-30  0:06 ` [PATCH net-next v2 2/2] xsk: use a smaller new lock for shared pool case Jason Xing
2025-11-03 14:58   ` Maciej Fijalkowski [this message]
2025-11-03 23:26     ` Jason Xing
2025-11-04 15:20 ` [PATCH net-next v2 0/2] xsk: minor optimizations around locks patchwork-bot+netdevbpf
