From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
netdev@vger.kernel.org, linux-rt-devel@lists.linux.dev
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>
Subject: Re: [PATCH net-next v3 05/18] xdp: Use nested-BH locking for system_page_pool
Date: Thu, 01 May 2025 12:13:24 +0200
Message-ID: <878qng7i63.fsf@toke.dk>
In-Reply-To: <20250430124758.1159480-6-bigeasy@linutronix.de>
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> system_page_pool is a per-CPU variable and relies on disabled BH for its
> locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
> this data structure requires explicit locking.
>
> Make a struct with a page_pool member (original system_page_pool) and a
> local_lock_t and use local_lock_nested_bh() for locking. This change
> adds only lockdep coverage and does not alter the functional behaviour
> for !PREEMPT_RT.
>
> Cc: Andrew Lunn <andrew+netdev@lunn.ch>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jesper Dangaard Brouer <hawk@kernel.org>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> include/linux/netdevice.h | 7 ++++++-
> net/core/dev.c | 15 ++++++++++-----
> net/core/xdp.c | 11 +++++++++--
> 3 files changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 2d11d013cabed..2018e2432cb56 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3502,7 +3502,12 @@ struct softnet_data {
> };
>
> DECLARE_PER_CPU_ALIGNED(struct softnet_data, softnet_data);
> -DECLARE_PER_CPU(struct page_pool *, system_page_pool);
> +
> +struct page_pool_bh {
> + struct page_pool *pool;
> + local_lock_t bh_lock;
> +};
> +DECLARE_PER_CPU(struct page_pool_bh, system_page_pool);
>
> #ifndef CONFIG_PREEMPT_RT
> static inline int dev_recursion_level(void)
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 1be7cb73a6024..b56becd070bc7 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -462,7 +462,9 @@ EXPORT_PER_CPU_SYMBOL(softnet_data);
> * PP consumers must pay attention to run APIs in the appropriate context
> * (e.g. NAPI context).
> */
> -DEFINE_PER_CPU(struct page_pool *, system_page_pool);
> +DEFINE_PER_CPU(struct page_pool_bh, system_page_pool) = {
> + .bh_lock = INIT_LOCAL_LOCK(bh_lock),
> +};

I'm a little fuzzy on how DEFINE_PER_CPU() works, but does this
initialisation automatically do the right thing with the multiple
per-CPU instances?

> #ifdef CONFIG_LOCKDEP
> /*
> @@ -5238,7 +5240,10 @@ netif_skb_check_for_xdp(struct sk_buff **pskb, const struct bpf_prog *prog)
> struct sk_buff *skb = *pskb;
> int err, hroom, troom;
>
> - if (!skb_cow_data_for_xdp(this_cpu_read(system_page_pool), pskb, prog))
> + local_lock_nested_bh(&system_page_pool.bh_lock);
> + err = skb_cow_data_for_xdp(this_cpu_read(system_page_pool.pool), pskb, prog);
> + local_unlock_nested_bh(&system_page_pool.bh_lock);
> + if (!err)
> return 0;
>
> /* In case we have to go down the path and also linearize,
> @@ -12629,7 +12634,7 @@ static int net_page_pool_create(int cpuid)
> return err;
> }
>
> - per_cpu(system_page_pool, cpuid) = pp_ptr;
> + per_cpu(system_page_pool.pool, cpuid) = pp_ptr;
> #endif
> return 0;
> }
> @@ -12759,13 +12764,13 @@ static int __init net_dev_init(void)
> for_each_possible_cpu(i) {
> struct page_pool *pp_ptr;
>
> - pp_ptr = per_cpu(system_page_pool, i);
> + pp_ptr = per_cpu(system_page_pool.pool, i);
> if (!pp_ptr)
> continue;
>
> xdp_unreg_page_pool(pp_ptr);
> page_pool_destroy(pp_ptr);
> - per_cpu(system_page_pool, i) = NULL;
> + per_cpu(system_page_pool.pool, i) = NULL;
> }
> }
>
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index f86eedad586a7..b2a5c934fe7b7 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -737,10 +737,10 @@ static noinline bool xdp_copy_frags_from_zc(struct sk_buff *skb,
> */
> struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
> {
> - struct page_pool *pp = this_cpu_read(system_page_pool);
> const struct xdp_rxq_info *rxq = xdp->rxq;
> u32 len = xdp->data_end - xdp->data_meta;
> u32 truesize = xdp->frame_sz;
> + struct page_pool *pp;
> struct sk_buff *skb;
> int metalen;
> void *data;
> @@ -748,13 +748,18 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
> if (!IS_ENABLED(CONFIG_PAGE_POOL))
> return NULL;
>
> + local_lock_nested_bh(&system_page_pool.bh_lock);
> + pp = this_cpu_read(system_page_pool.pool);
> data = page_pool_dev_alloc_va(pp, &truesize);
> - if (unlikely(!data))
> + if (unlikely(!data)) {
> + local_unlock_nested_bh(&system_page_pool.bh_lock);
> return NULL;
> + }
>
> skb = napi_build_skb(data, truesize);
> if (unlikely(!skb)) {
> page_pool_free_va(pp, data, true);
> + local_unlock_nested_bh(&system_page_pool.bh_lock);
> return NULL;
> }
>
> @@ -773,9 +778,11 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
>
> if (unlikely(xdp_buff_has_frags(xdp)) &&
> unlikely(!xdp_copy_frags_from_zc(skb, xdp, pp))) {
> + local_unlock_nested_bh(&system_page_pool.bh_lock);
> napi_consume_skb(skb, true);
> return NULL;
> }
> + local_unlock_nested_bh(&system_page_pool.bh_lock);

Hmm, instead of having four separate unlock calls in this function, how
about initialising skb = NULL, and having the unlock call just above
'return skb' with an out: label?

Then the three topmost 'return NULL' can just straight-forwardly be
replaced with 'goto out', while the last one becomes 'skb = NULL; goto
out;'. I think that would be more readable than this repetition.

-Toke