From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: netdev@vger.kernel.org, linux-rt-devel@lists.linux.dev,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>
Subject: Re: [PATCH net-next v3 05/18] xdp: Use nested-BH locking for system_page_pool
Date: Fri, 02 May 2025 16:33:10 +0200
Message-ID: <87ikmj5bh5.fsf@toke.dk>
In-Reply-To: <20250502133231.lS281-FN@linutronix.de>

Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2025-05-01 12:13:24 [+0200], Toke Høiland-Jørgensen wrote:
>> > --- a/net/core/dev.c
>> > +++ b/net/core/dev.c
>> > @@ -462,7 +462,9 @@ EXPORT_PER_CPU_SYMBOL(softnet_data);
>> >   * PP consumers must pay attention to run APIs in the appropriate context
>> >   * (e.g. NAPI context).
>> >   */
>> > -DEFINE_PER_CPU(struct page_pool *, system_page_pool);
>> > +DEFINE_PER_CPU(struct page_pool_bh, system_page_pool) = {
>> > +	.bh_lock = INIT_LOCAL_LOCK(bh_lock),
>> > +};
>> 
>> I'm a little fuzzy on how DEFINE_PER_CPU() works, but does this
>> initialisation automatically do the right thing with the multiple
>> per-CPU instances?
>
> It sets the "first" per-CPU data which is then copied to all
> "possible-CPUs" during early boot when the per-CPU data is made
> available. You can initialize almost everything like that. Pointer based
> structures (such as LIST_HEAD_INIT()) is something that obviously won't
> work.

Right, I see. Cool, thanks for explaining :)
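
A rough userspace analogy of that explanation, as a sketch only: it is not
the kernel's per-CPU setup code, and template_sketch, copies[],
setup_copies() and the *_sketch types are made-up names for illustration.
It assumes the per-CPU areas are filled by a plain byte copy of the
initialised template, which is why plain values carry over but a
self-referencing pointer (as in LIST_HEAD_INIT()) still points at the
template afterwards:

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

/* Minimal stand-in for a list head that points at itself when empty. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

struct pcpu_sketch {
	int lock_state;			/* plain value: copies fine */
	struct list_head_sketch list;	/* self-pointer: does not */
};

/* The "first" instance, initialised statically like DEFINE_PER_CPU() = {}. */
static struct pcpu_sketch template_sketch = {
	.lock_state = 42,
	.list = { &template_sketch.list, &template_sketch.list },
};

/* One instance per (pretend) possible CPU. */
static struct pcpu_sketch copies[NR_CPUS_SKETCH];

/* Assumed model of early boot: byte-copy the template into each area. */
static void setup_copies(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		memcpy(&copies[cpu], &template_sketch, sizeof(template_sketch));
}
```

After setup_copies(), every copy has lock_state == 42, but each copy's
list.next still holds the address of template_sketch.list rather than its
own list, so a pointer-based initialiser would need a per-CPU fixup at
runtime.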

>> >  #ifdef CONFIG_LOCKDEP
>> >  /*
>> > --- a/net/core/xdp.c
>> > +++ b/net/core/xdp.c
>> > @@ -737,10 +737,10 @@ static noinline bool xdp_copy_frags_from_zc(struct sk_buff *skb,
>> >   */
>> >  struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>> >  {
>> > -	struct page_pool *pp = this_cpu_read(system_page_pool);
>> >  	const struct xdp_rxq_info *rxq = xdp->rxq;
>> >  	u32 len = xdp->data_end - xdp->data_meta;
>> >  	u32 truesize = xdp->frame_sz;
>> > +	struct page_pool *pp;
>> >  	struct sk_buff *skb;
>> >  	int metalen;
>> >  	void *data;
>> > @@ -748,13 +748,18 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>> >  	if (!IS_ENABLED(CONFIG_PAGE_POOL))
>> >  		return NULL;
>> >  
>> > +	local_lock_nested_bh(&system_page_pool.bh_lock);
>> > +	pp = this_cpu_read(system_page_pool.pool);
>> >  	data = page_pool_dev_alloc_va(pp, &truesize);
>> > -	if (unlikely(!data))
>> > +	if (unlikely(!data)) {
>> > +		local_unlock_nested_bh(&system_page_pool.bh_lock);
>> >  		return NULL;
>> > +	}
>> >  
>> >  	skb = napi_build_skb(data, truesize);
>> >  	if (unlikely(!skb)) {
>> >  		page_pool_free_va(pp, data, true);
>> > +		local_unlock_nested_bh(&system_page_pool.bh_lock);
>> >  		return NULL;
>> >  	}
>> >  
>> > @@ -773,9 +778,11 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>> >  
>> >  	if (unlikely(xdp_buff_has_frags(xdp)) &&
>> >  	    unlikely(!xdp_copy_frags_from_zc(skb, xdp, pp))) {
>> > +		local_unlock_nested_bh(&system_page_pool.bh_lock);
>> >  		napi_consume_skb(skb, true);
>> >  		return NULL;
>> >  	}
>> > +	local_unlock_nested_bh(&system_page_pool.bh_lock);
>> 
>> Hmm, instead of having four separate unlock calls in this function, how
>> about initialising skb = NULL, and having the unlock call just above
>> 'return skb' with an out: label?
>> 
>> Then the three topmost 'return NULL' can just straight-forwardly be
>> replaced with 'goto out', while the last one becomes 'skb = NULL; goto
>> out;'. I think that would be more readable than this repetition.
>
> Something like the following maybe? We would keep the lock during
> napi_consume_skb() which should work.
>
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index b2a5c934fe7b7..1ff0bc328305d 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -740,8 +740,8 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>  	const struct xdp_rxq_info *rxq = xdp->rxq;
>  	u32 len = xdp->data_end - xdp->data_meta;
>  	u32 truesize = xdp->frame_sz;
> +	struct sk_buff *skb = NULL;
>  	struct page_pool *pp;
> -	struct sk_buff *skb;
>  	int metalen;
>  	void *data;
>  
> @@ -751,16 +751,13 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>  	local_lock_nested_bh(&system_page_pool.bh_lock);
>  	pp = this_cpu_read(system_page_pool.pool);
>  	data = page_pool_dev_alloc_va(pp, &truesize);
> -	if (unlikely(!data)) {
> -		local_unlock_nested_bh(&system_page_pool.bh_lock);
> -		return NULL;
> -	}
> +	if (unlikely(!data))
> +		goto out;
>  
>  	skb = napi_build_skb(data, truesize);
>  	if (unlikely(!skb)) {
>  		page_pool_free_va(pp, data, true);
> -		local_unlock_nested_bh(&system_page_pool.bh_lock);
> -		return NULL;
> +		goto out;
>  	}
>  
>  	skb_mark_for_recycle(skb);
> @@ -778,15 +775,16 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>  
>  	if (unlikely(xdp_buff_has_frags(xdp)) &&
>  	    unlikely(!xdp_copy_frags_from_zc(skb, xdp, pp))) {
> -		local_unlock_nested_bh(&system_page_pool.bh_lock);
>  		napi_consume_skb(skb, true);
> -		return NULL;
> +		skb = NULL;
>  	}
> +
> +out:
>  	local_unlock_nested_bh(&system_page_pool.bh_lock);
> -
> -	xsk_buff_free(xdp);
> -
> -	skb->protocol = eth_type_trans(skb, rxq->dev);
> +	if (skb) {
> +		xsk_buff_free(xdp);
> +		skb->protocol = eth_type_trans(skb, rxq->dev);
> +	}

I had in mind moving the out: label (and the unlock) below the
skb->protocol assignment, which would save the if (skb) check. Is there
any reason we can't call xsk_buff_free() while holding the lock?
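
Roughly this control flow, as an untested userspace sketch rather than
kernel code. The mock_* helpers, the counters and build_skb_sketch() are
hypothetical stand-ins that only count calls, and the three bool
parameters stand for the allocation, the skb build and the frag copy
succeeding:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int lock_depth;		/* > 0 while the mock lock is held */
static int buffers_freed;	/* number of mock_xsk_buff_free() calls */

struct mock_skb {
	int protocol;
};

static void mock_lock(void)
{
	lock_depth++;
}

static void mock_unlock(void)
{
	assert(lock_depth > 0);	/* catches an unbalanced unlock */
	lock_depth--;
}

static void mock_xsk_buff_free(void)
{
	buffers_freed++;
}

/*
 * Every failure path jumps to out:, which sits below the success-only
 * work, so no if (skb) check is needed before the single unlock.
 */
static struct mock_skb *build_skb_sketch(bool alloc_ok, bool build_ok,
					 bool frags_ok)
{
	static struct mock_skb storage;
	struct mock_skb *skb = NULL;

	mock_lock();
	if (!alloc_ok)
		goto out;

	if (!build_ok)
		goto out;	/* the real code frees the data first */
	skb = &storage;

	if (!frags_ok) {
		skb = NULL;	/* the real code consumes the skb first */
		goto out;
	}

	/* Success-only work, done while the lock is still held. */
	mock_xsk_buff_free();
	skb->protocol = 1;
out:
	mock_unlock();
	return skb;
}
```

The cost is that the free and the protocol assignment run with the lock
held, which is the part the question above is about.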

-Toke

