* [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable Amery Hung
` (14 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
A later bpf_local_storage refactor will acquire all locks before
performing any update. To reduce the number of locks that need to be
taken in bpf_local_storage_map_update(), determine the bucket based on
the local_storage a selem belongs to instead of the selem pointer.
Currently, when a new selem needs to be created to replace the old selem
in bpf_local_storage_map_update(), the locks of both buckets need to be
acquired to prevent racing. This can be simplified if the two selems
belong to the same bucket, so that only one bucket needs to be locked.
Therefore, instead of hashing the selem pointer, hash the pointer of the
local_storage the selem belongs to.
This is safe since a selem is always linked to local_storage before
linked to map and unlinked from local_storage after unlinked from map.
Performance-wise, this is slightly better as an update now requires
locking only one bucket. It should not change the level of contention on
a bucket, as the pointers to the local storages of selems in a map are
just as unique as the pointers to the selems themselves.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/bpf_local_storage.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index e2fe6c32822b..6615091dd0e5 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -19,9 +19,9 @@
static struct bpf_local_storage_map_bucket *
select_bucket(struct bpf_local_storage_map *smap,
- struct bpf_local_storage_elem *selem)
+ struct bpf_local_storage *local_storage)
{
- return &smap->buckets[hash_ptr(selem, smap->bucket_log)];
+ return &smap->buckets[hash_ptr(local_storage, smap->bucket_log)];
}
static int mem_charge(struct bpf_local_storage_map *smap, void *owner, u32 size)
@@ -349,6 +349,7 @@ void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
{
+ struct bpf_local_storage *local_storage;
struct bpf_local_storage_map *smap;
struct bpf_local_storage_map_bucket *b;
unsigned long flags;
@@ -357,8 +358,10 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
/* selem has already be unlinked from smap */
return;
+ local_storage = rcu_dereference_check(selem->local_storage,
+ bpf_rcu_lock_held());
smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
- b = select_bucket(smap, selem);
+ b = select_bucket(smap, local_storage);
raw_spin_lock_irqsave(&b->lock, flags);
if (likely(selem_linked_to_map(selem)))
hlist_del_init_rcu(&selem->map_node);
@@ -368,9 +371,13 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
void bpf_selem_link_map(struct bpf_local_storage_map *smap,
struct bpf_local_storage_elem *selem)
{
- struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);
+ struct bpf_local_storage *local_storage;
+ struct bpf_local_storage_map_bucket *b;
unsigned long flags;
+ local_storage = rcu_dereference_check(selem->local_storage,
+ bpf_rcu_lock_held());
+ b = select_bucket(smap, local_storage);
raw_spin_lock_irqsave(&b->lock, flags);
hlist_add_head_rcu(&selem->map_node, &b->list);
raw_spin_unlock_irqrestore(&b->lock, flags);
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
2026-01-31 5:09 ` [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:40 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 3087 bytes --]
> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> index e2fe6c328..6615091dd 100644
> --- a/kernel/bpf/bpf_local_storage.c
> +++ b/kernel/bpf/bpf_local_storage.c
[ ... ]
> @@ -368,9 +371,13 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
> void bpf_selem_link_map(struct bpf_local_storage_map *smap,
> struct bpf_local_storage_elem *selem)
> {
> - struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);
> + struct bpf_local_storage *local_storage;
> + struct bpf_local_storage_map_bucket *b;
> unsigned long flags;
>
> + local_storage = rcu_dereference_check(selem->local_storage,
> + bpf_rcu_lock_held());
> + b = select_bucket(smap, local_storage);
The commit message states: "This is safe since a selem is always linked to
local_storage before linked to map."
Is this claim accurate? Looking at bpf_local_storage_update(), the order is:
/* First, link the new selem to the map */
bpf_selem_link_map(smap, selem);
/* Second, link (and publish) the new selem to local_storage */
bpf_selem_link_storage_nolock(local_storage, selem);
The comment explicitly says "First, link the new selem to the map" which is
the opposite order from what the commit message claims.
Similarly, in bpf_sk_storage_clone() when new_sk_storage is non-NULL:
bpf_selem_link_map(smap, copy_selem);
bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
Since bpf_selem_alloc() does not initialize selem->local_storage (it only
sets SDATA(selem)->smap), and bpf_selem_link_storage_nolock() is what sets
selem->local_storage, wouldn't this mean bpf_selem_link_map() reads NULL
from selem->local_storage?
With local_storage being NULL, select_bucket() would compute hash_ptr(NULL)
which returns 0, placing all newly linked selems into bucket 0. Later when
bpf_selem_unlink_map() runs, selem->local_storage has been set, so it
computes a different bucket and tries to unlink from the wrong bucket.
The next commit in the series (82de4fd08 "bpf: Convert bpf_selem_unlink_map
to failable") introduces bpf_selem_link_map_nolock() to fix this for
bpf_local_storage_alloc() and bpf_local_storage_update(), but
bpf_sk_storage_clone() still uses bpf_selem_link_map() and appears to
remain affected throughout the series.
> raw_spin_lock_irqsave(&b->lock, flags);
> hlist_add_head_rcu(&selem->map_node, &b->list);
> raw_spin_unlock_irqrestore(&b->lock, flags);
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The commit has typical human characteristics: domain-specific BPF knowledge, concise explanations, and follows kernel conventions.
issues-found: 1
issue-severity-score: high
issue-severity-explanation: Hash bucket mismatch causes selems to be inserted into wrong bucket and never properly unlinked, leading to hash list corruption.
^ permalink raw reply [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:40 ` Amery Hung
2026-01-31 19:13 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:40 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> > index e2fe6c328..6615091dd 100644
> > --- a/kernel/bpf/bpf_local_storage.c
> > +++ b/kernel/bpf/bpf_local_storage.c
>
> [ ... ]
>
> > @@ -368,9 +371,13 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
> > void bpf_selem_link_map(struct bpf_local_storage_map *smap,
> > struct bpf_local_storage_elem *selem)
> > {
> > - struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);
> > + struct bpf_local_storage *local_storage;
> > + struct bpf_local_storage_map_bucket *b;
> > unsigned long flags;
> >
> > + local_storage = rcu_dereference_check(selem->local_storage,
> > + bpf_rcu_lock_held());
> > + b = select_bucket(smap, local_storage);
>
> The commit message states: "This is safe since a selem is always linked to
> local_storage before linked to map."
>
> Is this claim accurate? Looking at bpf_local_storage_update(), the order is:
>
> /* First, link the new selem to the map */
> bpf_selem_link_map(smap, selem);
>
> /* Second, link (and publish) the new selem to local_storage */
> bpf_selem_link_storage_nolock(local_storage, selem);
>
> The comment explicitly says "First, link the new selem to the map" which is
> the opposite order from what the commit message claims.
>
> Similarly, in bpf_sk_storage_clone() when new_sk_storage is non-NULL:
>
> bpf_selem_link_map(smap, copy_selem);
> bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
>
> Since bpf_selem_alloc() does not initialize selem->local_storage (it only
> sets SDATA(selem)->smap), and bpf_selem_link_storage_nolock() is what sets
> selem->local_storage, wouldn't this mean bpf_selem_link_map() reads NULL
> from selem->local_storage?
>
> With local_storage being NULL, select_bucket() would compute hash_ptr(NULL)
> which returns 0, placing all newly linked selems into bucket 0. Later when
> bpf_selem_unlink_map() runs, selem->local_storage has been set, so it
> computes a different bucket and tries to unlink from the wrong bucket.
>
> The next commit in the series (82de4fd08 "bpf: Convert bpf_selem_unlink_map
> to failable") introduces bpf_selem_link_map_nolock() to fix this for
> bpf_local_storage_alloc() and bpf_local_storage_update(), but
> bpf_sk_storage_clone() still uses bpf_selem_link_map() and appears to
> remain affected throughout the series.
AI is right here. I will fix the bug. We should follow the
local_storage then map linking order in these three places.
>
> > raw_spin_lock_irqsave(&b->lock, flags);
> > hlist_add_head_rcu(&selem->map_node, &b->list);
> > raw_spin_unlock_irqrestore(&b->lock, flags);
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit has typical human characteristics: domain-specific BPF knowledge, concise explanations, and follows kernel conventions.
> issues-found: 1
> issue-severity-score: high
> issue-severity-explanation: Hash bucket mismatch causes selems to be inserted into wrong bucket and never properly unlinked, leading to hash list corruption.
^ permalink raw reply [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
2026-01-31 18:40 ` Amery Hung
@ 2026-01-31 19:13 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 19:13 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Sat, Jan 31, 2026 at 10:40 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
> >
> > > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> > > index e2fe6c328..6615091dd 100644
> > > --- a/kernel/bpf/bpf_local_storage.c
> > > +++ b/kernel/bpf/bpf_local_storage.c
> >
> > [ ... ]
> >
> > > @@ -368,9 +371,13 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
> > > void bpf_selem_link_map(struct bpf_local_storage_map *smap,
> > > struct bpf_local_storage_elem *selem)
> > > {
> > > - struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);
> > > + struct bpf_local_storage *local_storage;
> > > + struct bpf_local_storage_map_bucket *b;
> > > unsigned long flags;
> > >
> > > + local_storage = rcu_dereference_check(selem->local_storage,
> > > + bpf_rcu_lock_held());
> > > + b = select_bucket(smap, local_storage);
> >
> > The commit message states: "This is safe since a selem is always linked to
> > local_storage before linked to map."
> >
> > Is this claim accurate? Looking at bpf_local_storage_update(), the order is:
> >
> > /* First, link the new selem to the map */
> > bpf_selem_link_map(smap, selem);
> >
> > /* Second, link (and publish) the new selem to local_storage */
> > bpf_selem_link_storage_nolock(local_storage, selem);
> >
> > The comment explicitly says "First, link the new selem to the map" which is
> > the opposite order from what the commit message claims.
> >
> > Similarly, in bpf_sk_storage_clone() when new_sk_storage is non-NULL:
> >
> > bpf_selem_link_map(smap, copy_selem);
> > bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
> >
> > Since bpf_selem_alloc() does not initialize selem->local_storage (it only
> > sets SDATA(selem)->smap), and bpf_selem_link_storage_nolock() is what sets
> > selem->local_storage, wouldn't this mean bpf_selem_link_map() reads NULL
> > from selem->local_storage?
> >
> > With local_storage being NULL, select_bucket() would compute hash_ptr(NULL)
> > which returns 0, placing all newly linked selems into bucket 0. Later when
> > bpf_selem_unlink_map() runs, selem->local_storage has been set, so it
> > computes a different bucket and tries to unlink from the wrong bucket.
> >
> > The next commit in the series (82de4fd08 "bpf: Convert bpf_selem_unlink_map
> > to failable") introduces bpf_selem_link_map_nolock() to fix this for
> > bpf_local_storage_alloc() and bpf_local_storage_update(), but
> > bpf_sk_storage_clone() still uses bpf_selem_link_map() and appears to
> > remain affected throughout the series.
>
> AI is right here. I will fix the bug. We should follow the
> local_storage then map linking order in these three places.
Actually, we should link map first. I will pass local_storage to
bpf_selem_link_map() to get the bucket.
>
> >
> > > raw_spin_lock_irqsave(&b->lock, flags);
> > > hlist_add_head_rcu(&selem->map_node, &b->list);
> > > raw_spin_unlock_irqrestore(&b->lock, flags);
> >
> >
> > ---
> > AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> > See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> >
> > CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
> >
> > AI-authorship-score: low
> > AI-authorship-explanation: The commit has typical human characteristics: domain-specific BPF knowledge, concise explanations, and follows kernel conventions.
> > issues-found: 1
> > issue-severity-score: high
> > issue-severity-explanation: Hash bucket mismatch causes selems to be inserted into wrong bucket and never properly unlinked, leading to hash list corruption.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map " Amery Hung
` (13 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
To prepare for changing bpf_local_storage_map_bucket::lock to rqspinlock,
convert bpf_selem_unlink_map() to failable. It still always succeeds and
returns 0 for now.
Since some operations updating local storage cannot fail in the middle,
open-code bpf_selem_unlink_map() to take the b->lock before the
operation. There are two such locations:
- bpf_local_storage_alloc()
The first selem will be unlinked from smap if the cmpxchg of
owner_storage_ptr fails, and that unlink must not fail. Therefore, hold
b->lock from linking until the allocation completes. Helpers that
assume b->lock is held by the caller are introduced:
bpf_selem_link_map_nolock() and bpf_selem_unlink_map_nolock().
- bpf_local_storage_update()
The three-step update process, link_map(new_selem),
link_storage(new_selem), then unlink_map(old_selem), must not fail in
the middle.
In bpf_selem_unlink(), bpf_selem_unlink_map() and
bpf_selem_unlink_storage() should either both succeed or fail as a
whole instead of failing in the middle. So, return early if
bpf_selem_unlink_map() fails.
Remove the selem_linked_to_map_lockless() check as an selem in the
commom paths (not bpf_local_storage_map_free() or
bpf_local_storage_destroy()), will be unlinked under b->lock and
local_storage->lock and therefore no other threads can unlink the selem
from map at the same time.
In bpf_local_storage_destroy(), ignore the return of
bpf_selem_unlink_map() for now. A later patch will allow
bpf_local_storage_destroy() to unlink selems even when failing to
acquire locks.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/bpf_local_storage.c | 58 +++++++++++++++++++++++-----------
1 file changed, 40 insertions(+), 18 deletions(-)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 6615091dd0e5..1916ea2aee4b 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -61,11 +61,6 @@ static bool selem_linked_to_storage(const struct bpf_local_storage_elem *selem)
return !hlist_unhashed(&selem->snode);
}
-static bool selem_linked_to_map_lockless(const struct bpf_local_storage_elem *selem)
-{
- return !hlist_unhashed_lockless(&selem->map_node);
-}
-
static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem)
{
return !hlist_unhashed(&selem->map_node);
@@ -347,25 +342,28 @@ void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
hlist_add_head_rcu(&selem->snode, &local_storage->list);
}
-static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
+/* Only called in common paths */
+static int bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
{
struct bpf_local_storage *local_storage;
struct bpf_local_storage_map *smap;
struct bpf_local_storage_map_bucket *b;
unsigned long flags;
- if (unlikely(!selem_linked_to_map_lockless(selem)))
- /* selem has already be unlinked from smap */
- return;
-
local_storage = rcu_dereference_check(selem->local_storage,
bpf_rcu_lock_held());
smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
b = select_bucket(smap, local_storage);
raw_spin_lock_irqsave(&b->lock, flags);
- if (likely(selem_linked_to_map(selem)))
- hlist_del_init_rcu(&selem->map_node);
+ hlist_del_init_rcu(&selem->map_node);
raw_spin_unlock_irqrestore(&b->lock, flags);
+
+ return 0;
+}
+
+static void bpf_selem_unlink_map_nolock(struct bpf_local_storage_elem *selem)
+{
+ hlist_del_init_rcu(&selem->map_node);
}
void bpf_selem_link_map(struct bpf_local_storage_map *smap,
@@ -383,13 +381,24 @@ void bpf_selem_link_map(struct bpf_local_storage_map *smap,
raw_spin_unlock_irqrestore(&b->lock, flags);
}
+static void bpf_selem_link_map_nolock(struct bpf_local_storage_map_bucket *b,
+ struct bpf_local_storage_elem *selem)
+{
+ hlist_add_head_rcu(&selem->map_node, &b->list);
+}
+
void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
{
+ int err;
+
/* Always unlink from map before unlinking from local_storage
* because selem will be freed after successfully unlinked from
* the local_storage.
*/
- bpf_selem_unlink_map(selem);
+ err = bpf_selem_unlink_map(selem);
+ if (err)
+ return;
+
bpf_selem_unlink_storage(selem, reuse_now);
}
@@ -431,6 +440,8 @@ int bpf_local_storage_alloc(void *owner,
{
struct bpf_local_storage *prev_storage, *storage;
struct bpf_local_storage **owner_storage_ptr;
+ struct bpf_local_storage_map_bucket *b;
+ unsigned long flags;
int err;
err = mem_charge(smap, owner, sizeof(*storage));
@@ -455,7 +466,10 @@ int bpf_local_storage_alloc(void *owner,
storage->use_kmalloc_nolock = smap->use_kmalloc_nolock;
bpf_selem_link_storage_nolock(storage, first_selem);
- bpf_selem_link_map(smap, first_selem);
+
+ b = select_bucket(smap, storage);
+ raw_spin_lock_irqsave(&b->lock, flags);
+ bpf_selem_link_map_nolock(b, first_selem);
owner_storage_ptr =
(struct bpf_local_storage **)owner_storage(smap, owner);
@@ -471,10 +485,12 @@ int bpf_local_storage_alloc(void *owner,
*/
prev_storage = cmpxchg(owner_storage_ptr, NULL, storage);
if (unlikely(prev_storage)) {
- bpf_selem_unlink_map(first_selem);
+ bpf_selem_unlink_map_nolock(first_selem);
+ raw_spin_unlock_irqrestore(&b->lock, flags);
err = -EAGAIN;
goto uncharge;
}
+ raw_spin_unlock_irqrestore(&b->lock, flags);
return 0;
@@ -496,8 +512,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
struct bpf_local_storage_data *old_sdata = NULL;
struct bpf_local_storage_elem *alloc_selem, *selem = NULL;
struct bpf_local_storage *local_storage;
+ struct bpf_local_storage_map_bucket *b;
HLIST_HEAD(old_selem_free_list);
- unsigned long flags;
+ unsigned long flags, b_flags;
int err;
/* BPF_EXIST and BPF_NOEXIST cannot be both set */
@@ -581,20 +598,25 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
goto unlock;
}
+ b = select_bucket(smap, local_storage);
+
+ raw_spin_lock_irqsave(&b->lock, b_flags);
+
alloc_selem = NULL;
/* First, link the new selem to the map */
- bpf_selem_link_map(smap, selem);
+ bpf_selem_link_map_nolock(b, selem);
/* Second, link (and publish) the new selem to local_storage */
bpf_selem_link_storage_nolock(local_storage, selem);
/* Third, remove old selem, SELEM(old_sdata) */
if (old_sdata) {
- bpf_selem_unlink_map(SELEM(old_sdata));
+ bpf_selem_unlink_map_nolock(SELEM(old_sdata));
bpf_selem_unlink_storage_nolock(local_storage, SELEM(old_sdata),
&old_selem_free_list);
}
+ raw_spin_unlock_irqrestore(&b->lock, b_flags);
unlock:
raw_spin_unlock_irqrestore(&local_storage->lock, flags);
bpf_selem_free_list(&old_selem_free_list, false);
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable
2026-01-31 5:09 ` [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:40 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1206 bytes --]
commit 82de4fd08a55cc944c23c9ae889ed792870964d9
Author: Amery Hung <ameryhung@gmail.com>
bpf: Convert bpf_selem_unlink_map to failable
This commit converts bpf_selem_unlink_map() to return an int (always 0 for
now) in preparation for changing bpf_local_storage_map_bucket::lock to
rqspinlock. It introduces _nolock helper variants and extends critical
sections to ensure atomic link/unlink operations.
> Remove the selem_linked_to_map_lockless() check as an selem in the
> commom paths (not bpf_local_storage_map_free() or
This isn't a bug, but there's a typo here: "commom" should be "common".
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The commit demonstrates deep BPF subsystem knowledge, contains a human-typical typo, and is part of a coherent patch series - all indicators of human authorship.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: Minor typo in commit message (commom should be common) - no functional impact.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:40 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:40 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
>
> commit 82de4fd08a55cc944c23c9ae889ed792870964d9
> Author: Amery Hung <ameryhung@gmail.com>
>
> bpf: Convert bpf_selem_unlink_map to failable
>
> This commit converts bpf_selem_unlink_map() to return an int (always 0 for
> now) in preparation for changing bpf_local_storage_map_bucket::lock to
> rqspinlock. It introduces _nolock helper variants and extends critical
> sections to ensure atomic link/unlink operations.
>
> > Remove the selem_linked_to_map_lockless() check as an selem in the
> > commom paths (not bpf_local_storage_map_free() or
>
> This isn't a bug, but there's a typo here: "commom" should be "common".
>
Will fix the typo
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit demonstrates deep BPF subsystem knowledge, contains a human-typical typo, and is part of a coherent patch series - all indicators of human authorship.
> issues-found: 1
> issue-severity-score: low
> issue-severity-explanation: Minor typo in commit message (commom should be common) - no functional impact.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map to failable
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 02/16] bpf: Convert bpf_selem_unlink_map to failable Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 04/16] bpf: Convert bpf_selem_unlink " Amery Hung
` (12 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
To prepare for changing bpf_local_storage_map_bucket::lock to rqspinlock,
convert bpf_selem_link_map() to failable. It still always succeeds and
returns 0 until the change happens. No functional change.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 4 ++--
kernel/bpf/bpf_local_storage.c | 6 ++++--
net/core/bpf_sk_storage.c | 4 +++-
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 66432248cd81..6cabf5154cf6 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -178,8 +178,8 @@ void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now);
-void bpf_selem_link_map(struct bpf_local_storage_map *smap,
- struct bpf_local_storage_elem *selem);
+int bpf_selem_link_map(struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem);
struct bpf_local_storage_elem *
bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value,
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 1916ea2aee4b..939102a5f4f7 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -366,8 +366,8 @@ static void bpf_selem_unlink_map_nolock(struct bpf_local_storage_elem *selem)
hlist_del_init_rcu(&selem->map_node);
}
-void bpf_selem_link_map(struct bpf_local_storage_map *smap,
- struct bpf_local_storage_elem *selem)
+int bpf_selem_link_map(struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem)
{
struct bpf_local_storage *local_storage;
struct bpf_local_storage_map_bucket *b;
@@ -379,6 +379,8 @@ void bpf_selem_link_map(struct bpf_local_storage_map *smap,
raw_spin_lock_irqsave(&b->lock, flags);
hlist_add_head_rcu(&selem->map_node, &b->list);
raw_spin_unlock_irqrestore(&b->lock, flags);
+
+ return 0;
}
static void bpf_selem_link_map_nolock(struct bpf_local_storage_map_bucket *b,
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index de111818f3a0..9da11a53109b 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -191,7 +191,9 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
}
if (new_sk_storage) {
- bpf_selem_link_map(smap, copy_selem);
+ ret = bpf_selem_link_map(smap, copy_selem);
+ if (ret)
+ goto out;
bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
} else {
ret = bpf_local_storage_alloc(newsk, smap, copy_selem, GFP_ATOMIC);
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map to failable
2026-01-31 5:09 ` [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map " Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:43 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 2182 bytes --]
> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index de111818f..9da11a531 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -191,7 +191,9 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> }
>
> if (new_sk_storage) {
> - bpf_selem_link_map(smap, copy_selem);
> + ret = bpf_selem_link_map(smap, copy_selem);
> + if (ret)
> + goto out;
> bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
> } else {
> ret = bpf_local_storage_alloc(newsk, smap, copy_selem, GFP_ATOMIC);
When bpf_selem_link_map() fails, should there be cleanup similar to the
else branch? At this point copy_selem has been allocated via
bpf_sk_storage_clone_elem() and the map refcount was incremented via
bpf_map_inc_not_zero().
The existing else branch on bpf_local_storage_alloc() failure does:
bpf_selem_free(copy_selem, true);
atomic_sub(smap->elem_size, &newsk->sk_omem_alloc);
bpf_map_put(map);
But the new if branch just does goto out, which only calls
rcu_read_unlock_migrate() before returning.
The comment at out: says "the caller is responsible to call
bpf_sk_storage_free" but copy_selem was never linked to new_sk_storage
(bpf_selem_link_storage_nolock() never ran), so bpf_sk_storage_free()
iterating local_storage->list would not find it.
This would leak copy_selem, the memory accounting, and a map reference
once bpf_selem_link_map() can actually fail (after commit 3636ab200
changes to rqspinlock in this series).
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The code follows standard kernel development patterns and the commit message has technical specificity typical of experienced kernel developers.
issues-found: 1
issue-severity-score: medium
issue-severity-explanation: Memory leak, memory accounting leak, and map reference leak in socket cloning error path when bpf_selem_link_map fails after rqspinlock change.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map to failable
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:43 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:43 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> > index de111818f..9da11a531 100644
> > --- a/net/core/bpf_sk_storage.c
> > +++ b/net/core/bpf_sk_storage.c
> > @@ -191,7 +191,9 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> > }
> >
> > if (new_sk_storage) {
> > - bpf_selem_link_map(smap, copy_selem);
> > + ret = bpf_selem_link_map(smap, copy_selem);
> > + if (ret)
> > + goto out;
> > bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
> > } else {
> > ret = bpf_local_storage_alloc(newsk, smap, copy_selem, GFP_ATOMIC);
>
> When bpf_selem_link_map() fails, should there be cleanup similar to the
> else branch? At this point copy_selem has been allocated via
> bpf_sk_storage_clone_elem() and the map refcount was incremented via
> bpf_map_inc_not_zero().
>
> The existing else branch on bpf_local_storage_alloc() failure does:
>
> bpf_selem_free(copy_selem, true);
> atomic_sub(smap->elem_size, &newsk->sk_omem_alloc);
> bpf_map_put(map);
>
> But the new if branch just does goto out, which only calls
> rcu_read_unlock_migrate() before returning.
>
> The comment at out: says "the caller is responsible to call
> bpf_sk_storage_free" but copy_selem was never linked to new_sk_storage
> (bpf_selem_link_storage_nolock() never ran), so bpf_sk_storage_free()
> iterating local_storage->list would not find it.
>
> This would leak copy_selem, the memory accounting, and a map reference
> once bpf_selem_link_map() can actually fail (after commit 3636ab200
> changes to rqspinlock in this series).
>
Indeed a memory leak due to missing cleanup. Will fix it in the next iteration.
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The code follows standard kernel development patterns and the commit message has technical specificity typical of experienced kernel developers.
> issues-found: 1
> issue-severity-score: medium
> issue-severity-explanation: Memory leak, memory accounting leak, and map reference leak in socket cloning error path when bpf_selem_link_map fails after rqspinlock change.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 04/16] bpf: Convert bpf_selem_unlink to failable
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (2 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 03/16] bpf: Convert bpf_selem_link_map " Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock Amery Hung
` (11 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
In preparation for changing both bpf_local_storage_map_bucket::lock and
bpf_local_storage::lock to rqspinlock, convert bpf_selem_unlink() to
failable. It still always succeeds and returns 0 until the change
happens. No functional change.
Open code bpf_selem_unlink_storage() in the only caller,
bpf_selem_unlink(), since unlink_map and unlink_storage must be done
together after all the necessary locks are acquired.
For bpf_local_storage_map_free(), ignore the return from
bpf_selem_unlink() for now. A later patch will allow it to unlink selems
even when failing to acquire locks.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 2 +-
kernel/bpf/bpf_cgrp_storage.c | 3 +-
kernel/bpf/bpf_inode_storage.c | 4 +-
kernel/bpf/bpf_local_storage.c | 71 +++++++++++++++----------------
kernel/bpf/bpf_task_storage.c | 4 +-
net/core/bpf_sk_storage.c | 4 +-
6 files changed, 39 insertions(+), 49 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 6cabf5154cf6..a94e12ddd83d 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -176,7 +176,7 @@ int bpf_local_storage_map_check_btf(const struct bpf_map *map,
void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
struct bpf_local_storage_elem *selem);
-void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now);
+int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now);
int bpf_selem_link_map(struct bpf_local_storage_map *smap,
struct bpf_local_storage_elem *selem);
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 0687a760974a..8fef24fcac68 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -118,8 +118,7 @@ static int cgroup_storage_delete(struct cgroup *cgroup, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- bpf_selem_unlink(SELEM(sdata), false);
- return 0;
+ return bpf_selem_unlink(SELEM(sdata), false);
}
static long bpf_cgrp_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index e54cce2b9175..cedc99184dad 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -110,9 +110,7 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- bpf_selem_unlink(SELEM(sdata), false);
-
- return 0;
+ return bpf_selem_unlink(SELEM(sdata), false);
}
static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 939102a5f4f7..824d053255cb 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -308,33 +308,6 @@ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_stor
return free_local_storage;
}
-static void bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem,
- bool reuse_now)
-{
- struct bpf_local_storage *local_storage;
- bool free_local_storage = false;
- HLIST_HEAD(selem_free_list);
- unsigned long flags;
-
- if (unlikely(!selem_linked_to_storage_lockless(selem)))
- /* selem has already been unlinked from sk */
- return;
-
- local_storage = rcu_dereference_check(selem->local_storage,
- bpf_rcu_lock_held());
-
- raw_spin_lock_irqsave(&local_storage->lock, flags);
- if (likely(selem_linked_to_storage(selem)))
- free_local_storage = bpf_selem_unlink_storage_nolock(
- local_storage, selem, &selem_free_list);
- raw_spin_unlock_irqrestore(&local_storage->lock, flags);
-
- bpf_selem_free_list(&selem_free_list, reuse_now);
-
- if (free_local_storage)
- bpf_local_storage_free(local_storage, reuse_now);
-}
-
void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
struct bpf_local_storage_elem *selem)
{
@@ -389,19 +362,43 @@ static void bpf_selem_link_map_nolock(struct bpf_local_storage_map_bucket *b,
hlist_add_head_rcu(&selem->map_node, &b->list);
}
-void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
+int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
{
- int err;
+ struct bpf_local_storage *local_storage;
+ bool free_local_storage = false;
+ HLIST_HEAD(selem_free_list);
+ unsigned long flags;
+ int err = 0;
- /* Always unlink from map before unlinking from local_storage
- * because selem will be freed after successfully unlinked from
- * the local_storage.
- */
- err = bpf_selem_unlink_map(selem);
- if (err)
- return;
+ if (unlikely(!selem_linked_to_storage_lockless(selem)))
+ /* selem has already been unlinked from sk */
+ return 0;
+
+ local_storage = rcu_dereference_check(selem->local_storage,
+ bpf_rcu_lock_held());
+
+ raw_spin_lock_irqsave(&local_storage->lock, flags);
+ if (likely(selem_linked_to_storage(selem))) {
+ /* Always unlink from map before unlinking from local_storage
+ * because selem will be freed after successfully unlinked from
+ * the local_storage.
+ */
+ err = bpf_selem_unlink_map(selem);
+ if (err)
+ goto out;
+
+ free_local_storage = bpf_selem_unlink_storage_nolock(
+ local_storage, selem, &selem_free_list);
+ }
+out:
+ raw_spin_unlock_irqrestore(&local_storage->lock, flags);
- bpf_selem_unlink_storage(selem, reuse_now);
+ bpf_selem_free_list(&selem_free_list, reuse_now);
+
+ if (free_local_storage)
+ bpf_local_storage_free(local_storage, reuse_now);
+
+ return err;
}
void __bpf_local_storage_insert_cache(struct bpf_local_storage *local_storage,
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index a1dc1bf0848a..ab902364ac23 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -167,9 +167,7 @@ static int task_storage_delete(struct task_struct *task, struct bpf_map *map,
if (!nobusy)
return -EBUSY;
- bpf_selem_unlink(SELEM(sdata), false);
-
- return 0;
+ return bpf_selem_unlink(SELEM(sdata), false);
}
static long bpf_pid_task_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 9da11a53109b..78e936288879 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -40,9 +40,7 @@ static int bpf_sk_storage_del(struct sock *sk, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- bpf_selem_unlink(SELEM(sdata), false);
-
- return 0;
+ return bpf_selem_unlink(SELEM(sdata), false);
}
/* Called by __sk_destruct() & bpf_sk_storage_clone() */
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (3 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 04/16] bpf: Convert bpf_selem_unlink " Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 06/16] bpf: Remove task local storage percpu counter Amery Hung
` (10 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Change bpf_local_storage::lock and bpf_local_storage_map_bucket::lock to
from raw_spin_lock to rqspinlock.
Finally, propagate errors from raw_res_spin_lock_irqsave() to syscall
return or BPF helper return.
In bpf_local_storage_destroy(), ignore return from
raw_res_spin_lock_irqsave() for now. A later patch will allow
bpf_local_storage_destroy() to unlink selems even when failing to
acquire locks.
For, __bpf_local_storage_map_cache(), instead of handling the error,
skip updating the cache.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 5 ++-
kernel/bpf/bpf_local_storage.c | 63 +++++++++++++++++++++----------
2 files changed, 46 insertions(+), 22 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index a94e12ddd83d..903559e2ca91 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -15,12 +15,13 @@
#include <linux/types.h>
#include <linux/bpf_mem_alloc.h>
#include <uapi/linux/btf.h>
+#include <asm/rqspinlock.h>
#define BPF_LOCAL_STORAGE_CACHE_SIZE 16
struct bpf_local_storage_map_bucket {
struct hlist_head list;
- raw_spinlock_t lock;
+ rqspinlock_t lock;
};
/* Thp map is not the primary owner of a bpf_local_storage_elem.
@@ -94,7 +95,7 @@ struct bpf_local_storage {
* bpf_local_storage_elem.
*/
struct rcu_head rcu;
- raw_spinlock_t lock; /* Protect adding/removing from the "list" */
+ rqspinlock_t lock; /* Protect adding/removing from the "list" */
bool use_kmalloc_nolock;
};
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 824d053255cb..7661319ad2e3 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -322,14 +322,18 @@ static int bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
struct bpf_local_storage_map *smap;
struct bpf_local_storage_map_bucket *b;
unsigned long flags;
+ int err;
local_storage = rcu_dereference_check(selem->local_storage,
bpf_rcu_lock_held());
smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
b = select_bucket(smap, local_storage);
- raw_spin_lock_irqsave(&b->lock, flags);
+ err = raw_res_spin_lock_irqsave(&b->lock, flags);
+ if (err)
+ return err;
+
hlist_del_init_rcu(&selem->map_node);
- raw_spin_unlock_irqrestore(&b->lock, flags);
+ raw_res_spin_unlock_irqrestore(&b->lock, flags);
return 0;
}
@@ -345,13 +349,18 @@ int bpf_selem_link_map(struct bpf_local_storage_map *smap,
struct bpf_local_storage *local_storage;
struct bpf_local_storage_map_bucket *b;
unsigned long flags;
+ int err;
local_storage = rcu_dereference_check(selem->local_storage,
bpf_rcu_lock_held());
b = select_bucket(smap, local_storage);
- raw_spin_lock_irqsave(&b->lock, flags);
+
+ err = raw_res_spin_lock_irqsave(&b->lock, flags);
+ if (err)
+ return err;
+
hlist_add_head_rcu(&selem->map_node, &b->list);
- raw_spin_unlock_irqrestore(&b->lock, flags);
+ raw_res_spin_unlock_irqrestore(&b->lock, flags);
return 0;
}
@@ -368,7 +377,7 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
bool free_local_storage = false;
HLIST_HEAD(selem_free_list);
unsigned long flags;
- int err = 0;
+ int err;
if (unlikely(!selem_linked_to_storage_lockless(selem)))
/* selem has already been unlinked from sk */
@@ -377,7 +386,10 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
local_storage = rcu_dereference_check(selem->local_storage,
bpf_rcu_lock_held());
- raw_spin_lock_irqsave(&local_storage->lock, flags);
+ err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
+ if (err)
+ return err;
+
if (likely(selem_linked_to_storage(selem))) {
/* Always unlink from map before unlinking from local_storage
* because selem will be freed after successfully unlinked from
@@ -391,7 +403,7 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
local_storage, selem, &selem_free_list);
}
out:
- raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+ raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
bpf_selem_free_list(&selem_free_list, reuse_now);
@@ -406,16 +418,20 @@ void __bpf_local_storage_insert_cache(struct bpf_local_storage *local_storage,
struct bpf_local_storage_elem *selem)
{
unsigned long flags;
+ int err;
/* spinlock is needed to avoid racing with the
* parallel delete. Otherwise, publishing an already
* deleted sdata to the cache will become a use-after-free
* problem in the next bpf_local_storage_lookup().
*/
- raw_spin_lock_irqsave(&local_storage->lock, flags);
+ err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
+ if (err)
+ return;
+
if (selem_linked_to_storage(selem))
rcu_assign_pointer(local_storage->cache[smap->cache_idx], SDATA(selem));
- raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+ raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
}
static int check_flags(const struct bpf_local_storage_data *old_sdata,
@@ -460,14 +476,17 @@ int bpf_local_storage_alloc(void *owner,
RCU_INIT_POINTER(storage->smap, smap);
INIT_HLIST_HEAD(&storage->list);
- raw_spin_lock_init(&storage->lock);
+ raw_res_spin_lock_init(&storage->lock);
storage->owner = owner;
storage->use_kmalloc_nolock = smap->use_kmalloc_nolock;
bpf_selem_link_storage_nolock(storage, first_selem);
b = select_bucket(smap, storage);
- raw_spin_lock_irqsave(&b->lock, flags);
+ err = raw_res_spin_lock_irqsave(&b->lock, flags);
+ if (err)
+ goto uncharge;
+
bpf_selem_link_map_nolock(b, first_selem);
owner_storage_ptr =
@@ -485,11 +504,11 @@ int bpf_local_storage_alloc(void *owner,
prev_storage = cmpxchg(owner_storage_ptr, NULL, storage);
if (unlikely(prev_storage)) {
bpf_selem_unlink_map_nolock(first_selem);
- raw_spin_unlock_irqrestore(&b->lock, flags);
+ raw_res_spin_unlock_irqrestore(&b->lock, flags);
err = -EAGAIN;
goto uncharge;
}
- raw_spin_unlock_irqrestore(&b->lock, flags);
+ raw_res_spin_unlock_irqrestore(&b->lock, flags);
return 0;
@@ -572,7 +591,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
if (!alloc_selem)
return ERR_PTR(-ENOMEM);
- raw_spin_lock_irqsave(&local_storage->lock, flags);
+ err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
+ if (err)
+ return ERR_PTR(err);
/* Recheck local_storage->list under local_storage->lock */
if (unlikely(hlist_empty(&local_storage->list))) {
@@ -599,7 +620,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
b = select_bucket(smap, local_storage);
- raw_spin_lock_irqsave(&b->lock, b_flags);
+ err = raw_res_spin_lock_irqsave(&b->lock, b_flags);
+ if (err)
+ goto unlock;
alloc_selem = NULL;
/* First, link the new selem to the map */
@@ -615,9 +638,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
&old_selem_free_list);
}
- raw_spin_unlock_irqrestore(&b->lock, b_flags);
+ raw_res_spin_unlock_irqrestore(&b->lock, b_flags);
unlock:
- raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+ raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
bpf_selem_free_list(&old_selem_free_list, false);
if (alloc_selem) {
mem_uncharge(smap, owner, smap->elem_size);
@@ -702,7 +725,7 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
* when unlinking elem from the local_storage->list and
* the map's bucket->list.
*/
- raw_spin_lock_irqsave(&local_storage->lock, flags);
+ raw_res_spin_lock_irqsave(&local_storage->lock, flags);
hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
/* Always unlink from map before unlinking from
* local_storage.
@@ -717,7 +740,7 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
free_storage = bpf_selem_unlink_storage_nolock(
local_storage, selem, &free_selem_list);
}
- raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+ raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
bpf_selem_free_list(&free_selem_list, true);
@@ -764,7 +787,7 @@ bpf_local_storage_map_alloc(union bpf_attr *attr,
for (i = 0; i < nbuckets; i++) {
INIT_HLIST_HEAD(&smap->buckets[i].list);
- raw_spin_lock_init(&smap->buckets[i].lock);
+ raw_res_spin_lock_init(&smap->buckets[i].lock);
}
smap->elem_size = offsetof(struct bpf_local_storage_elem,
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock
2026-01-31 5:09 ` [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:46 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 2045 bytes --]
> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> index 824d05325..7661319ad 100644
> --- a/kernel/bpf/bpf_local_storage.c
> +++ b/kernel/bpf/bpf_local_storage.c
[ ... ]
> @@ -572,7 +591,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
> if (!alloc_selem)
> return ERR_PTR(-ENOMEM);
>
> - raw_spin_lock_irqsave(&local_storage->lock, flags);
> + err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
> + if (err)
> + return ERR_PTR(err);
If the lock acquisition fails here, doesn't this leak alloc_selem? The
allocation at the top of this function succeeded:
alloc_selem = selem = bpf_selem_alloc(smap, owner, value, ...);
if (!alloc_selem)
return ERR_PTR(-ENOMEM);
But on lock failure, the early return bypasses the cleanup at the end of
the function:
if (alloc_selem) {
mem_uncharge(smap, owner, smap->elem_size);
bpf_selem_free(alloc_selem, true);
}
Should this goto unlock instead of returning directly, or add explicit
cleanup before returning?
[ ... ]
This isn't a bug, but the commit message has a minor wording issue:
> Change bpf_local_storage::lock and bpf_local_storage_map_bucket::lock to
> from raw_spin_lock to rqspinlock.
"to from" should be "from".
> For, __bpf_local_storage_map_cache(), instead of handling the error,
> skip updating the cache.
There's an extra comma after "For".
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The commit follows standard kernel development patterns with appropriate technical detail and no AI-typical phrasing or structure.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: Memory leak on error path in bpf_local_storage_update when rqspinlock times out; affects only rare lock failure scenarios.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:46 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:46 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> > index 824d05325..7661319ad 100644
> > --- a/kernel/bpf/bpf_local_storage.c
> > +++ b/kernel/bpf/bpf_local_storage.c
>
> [ ... ]
>
> > @@ -572,7 +591,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
> > if (!alloc_selem)
> > return ERR_PTR(-ENOMEM);
> >
> > - raw_spin_lock_irqsave(&local_storage->lock, flags);
> > + err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
> > + if (err)
> > + return ERR_PTR(err);
>
> If the lock acquisition fails here, doesn't this leak alloc_selem? The
> allocation at the top of this function succeeded:
>
> alloc_selem = selem = bpf_selem_alloc(smap, owner, value, ...);
> if (!alloc_selem)
> return ERR_PTR(-ENOMEM);
>
> But on lock failure, the early return bypasses the cleanup at the end of
> the function:
>
> if (alloc_selem) {
> mem_uncharge(smap, owner, smap->elem_size);
> bpf_selem_free(alloc_selem, true);
> }
>
> Should this goto unlock instead of returning directly, or add explicit
> cleanup before returning?
Right. It should be "goto unlock".
>
> [ ... ]
>
> This isn't a bug, but the commit message has a minor wording issue:
>
> > Change bpf_local_storage::lock and bpf_local_storage_map_bucket::lock to
> > from raw_spin_lock to rqspinlock.
>
> "to from" should be "from".
>
> > For, __bpf_local_storage_map_cache(), instead of handling the error,
> > skip updating the cache.
>
> There's an extra comma after "For".
>
Ack.
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit follows standard kernel development patterns with appropriate technical detail and no AI-typical phrasing or structure.
> issues-found: 1
> issue-severity-score: low
> issue-severity-explanation: Memory leak on error path in bpf_local_storage_update when rqspinlock times out; affects only rare lock failure scenarios.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 06/16] bpf: Remove task local storage percpu counter
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (4 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 07/16] bpf: Remove cgroup " Amery Hung
` (9 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
The percpu counter in task local storage is no longer needed as the
underlying bpf_local_storage can now handle deadlock with the help of
rqspinlock. Remove the percpu counter and related migrate_{disable,
enable}.
Since the percpu counter is removed, merge back bpf_task_storage_get()
and bpf_task_storage_get_recur(). This will allow the bpf syscalls and
helpers to run concurrently on the same CPU, removing the spurious
-EBUSY error. bpf_task_storage_get(..., F_CREATE) will now always
succeed when there is enough free memory, unless it is called recursively.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/bpf_task_storage.c | 150 ++++------------------------------
kernel/bpf/helpers.c | 4 -
2 files changed, 18 insertions(+), 136 deletions(-)
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index ab902364ac23..dd858226ada2 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -20,29 +20,6 @@
DEFINE_BPF_STORAGE_CACHE(task_cache);
-static DEFINE_PER_CPU(int, bpf_task_storage_busy);
-
-static void bpf_task_storage_lock(void)
-{
- cant_migrate();
- this_cpu_inc(bpf_task_storage_busy);
-}
-
-static void bpf_task_storage_unlock(void)
-{
- this_cpu_dec(bpf_task_storage_busy);
-}
-
-static bool bpf_task_storage_trylock(void)
-{
- cant_migrate();
- if (unlikely(this_cpu_inc_return(bpf_task_storage_busy) != 1)) {
- this_cpu_dec(bpf_task_storage_busy);
- return false;
- }
- return true;
-}
-
static struct bpf_local_storage __rcu **task_storage_ptr(void *owner)
{
struct task_struct *task = owner;
@@ -70,17 +47,15 @@ void bpf_task_storage_free(struct task_struct *task)
{
struct bpf_local_storage *local_storage;
- rcu_read_lock_dont_migrate();
+ rcu_read_lock();
local_storage = rcu_dereference(task->bpf_storage);
if (!local_storage)
goto out;
- bpf_task_storage_lock();
bpf_local_storage_destroy(local_storage);
- bpf_task_storage_unlock();
out:
- rcu_read_unlock_migrate();
+ rcu_read_unlock();
}
static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
@@ -106,9 +81,7 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
goto out;
}
- bpf_task_storage_lock();
sdata = task_storage_lookup(task, map, true);
- bpf_task_storage_unlock();
put_pid(pid);
return sdata ? sdata->data : NULL;
out:
@@ -143,11 +116,9 @@ static long bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key,
goto out;
}
- bpf_task_storage_lock();
sdata = bpf_local_storage_update(
task, (struct bpf_local_storage_map *)map, value, map_flags,
true, GFP_ATOMIC);
- bpf_task_storage_unlock();
err = PTR_ERR_OR_ZERO(sdata);
out:
@@ -155,8 +126,7 @@ static long bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key,
return err;
}
-static int task_storage_delete(struct task_struct *task, struct bpf_map *map,
- bool nobusy)
+static int task_storage_delete(struct task_struct *task, struct bpf_map *map)
{
struct bpf_local_storage_data *sdata;
@@ -164,9 +134,6 @@ static int task_storage_delete(struct task_struct *task, struct bpf_map *map,
if (!sdata)
return -ENOENT;
- if (!nobusy)
- return -EBUSY;
-
return bpf_selem_unlink(SELEM(sdata), false);
}
@@ -192,111 +159,50 @@ static long bpf_pid_task_storage_delete_elem(struct bpf_map *map, void *key)
goto out;
}
- bpf_task_storage_lock();
- err = task_storage_delete(task, map, true);
- bpf_task_storage_unlock();
+ err = task_storage_delete(task, map);
out:
put_pid(pid);
return err;
}
-/* Called by bpf_task_storage_get*() helpers */
-static void *__bpf_task_storage_get(struct bpf_map *map,
- struct task_struct *task, void *value,
- u64 flags, gfp_t gfp_flags, bool nobusy)
+/* *gfp_flags* is a hidden argument provided by the verifier */
+BPF_CALL_5(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
+ task, void *, value, u64, flags, gfp_t, gfp_flags)
{
struct bpf_local_storage_data *sdata;
- sdata = task_storage_lookup(task, map, nobusy);
+ WARN_ON_ONCE(!bpf_rcu_lock_held());
+ if (flags & ~BPF_LOCAL_STORAGE_GET_F_CREATE || !task)
+ return (unsigned long)NULL;
+
+ sdata = task_storage_lookup(task, map, true);
if (sdata)
- return sdata->data;
+ return (unsigned long)sdata->data;
/* only allocate new storage, when the task is refcounted */
if (refcount_read(&task->usage) &&
- (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy) {
+ (flags & BPF_LOCAL_STORAGE_GET_F_CREATE)) {
sdata = bpf_local_storage_update(
task, (struct bpf_local_storage_map *)map, value,
BPF_NOEXIST, false, gfp_flags);
- return IS_ERR(sdata) ? NULL : sdata->data;
+ return IS_ERR(sdata) ? (unsigned long)NULL : (unsigned long)sdata->data;
}
- return NULL;
-}
-
-/* *gfp_flags* is a hidden argument provided by the verifier */
-BPF_CALL_5(bpf_task_storage_get_recur, struct bpf_map *, map, struct task_struct *,
- task, void *, value, u64, flags, gfp_t, gfp_flags)
-{
- bool nobusy;
- void *data;
-
- WARN_ON_ONCE(!bpf_rcu_lock_held());
- if (flags & ~BPF_LOCAL_STORAGE_GET_F_CREATE || !task)
- return (unsigned long)NULL;
-
- nobusy = bpf_task_storage_trylock();
- data = __bpf_task_storage_get(map, task, value, flags,
- gfp_flags, nobusy);
- if (nobusy)
- bpf_task_storage_unlock();
- return (unsigned long)data;
-}
-
-/* *gfp_flags* is a hidden argument provided by the verifier */
-BPF_CALL_5(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
- task, void *, value, u64, flags, gfp_t, gfp_flags)
-{
- void *data;
-
- WARN_ON_ONCE(!bpf_rcu_lock_held());
- if (flags & ~BPF_LOCAL_STORAGE_GET_F_CREATE || !task)
- return (unsigned long)NULL;
-
- bpf_task_storage_lock();
- data = __bpf_task_storage_get(map, task, value, flags,
- gfp_flags, true);
- bpf_task_storage_unlock();
- return (unsigned long)data;
-}
-
-BPF_CALL_2(bpf_task_storage_delete_recur, struct bpf_map *, map, struct task_struct *,
- task)
-{
- bool nobusy;
- int ret;
-
- WARN_ON_ONCE(!bpf_rcu_lock_held());
- if (!task)
- return -EINVAL;
-
- nobusy = bpf_task_storage_trylock();
- /* This helper must only be called from places where the lifetime of the task
- * is guaranteed. Either by being refcounted or by being protected
- * by an RCU read-side critical section.
- */
- ret = task_storage_delete(task, map, nobusy);
- if (nobusy)
- bpf_task_storage_unlock();
- return ret;
+ return (unsigned long)NULL;
}
BPF_CALL_2(bpf_task_storage_delete, struct bpf_map *, map, struct task_struct *,
task)
{
- int ret;
-
WARN_ON_ONCE(!bpf_rcu_lock_held());
if (!task)
return -EINVAL;
- bpf_task_storage_lock();
/* This helper must only be called from places where the lifetime of the task
* is guaranteed. Either by being refcounted or by being protected
* by an RCU read-side critical section.
*/
- ret = task_storage_delete(task, map, true);
- bpf_task_storage_unlock();
- return ret;
+ return task_storage_delete(task, map);
}
static int notsupp_get_next_key(struct bpf_map *map, void *key, void *next_key)
@@ -311,7 +217,7 @@ static struct bpf_map *task_storage_map_alloc(union bpf_attr *attr)
static void task_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &task_cache, &bpf_task_storage_busy);
+ bpf_local_storage_map_free(map, &task_cache, NULL);
}
BTF_ID_LIST_GLOBAL_SINGLE(bpf_local_storage_map_btf_id, struct, bpf_local_storage_map)
@@ -330,17 +236,6 @@ const struct bpf_map_ops task_storage_map_ops = {
.map_owner_storage_ptr = task_storage_ptr,
};
-const struct bpf_func_proto bpf_task_storage_get_recur_proto = {
- .func = bpf_task_storage_get_recur,
- .gpl_only = false,
- .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
- .arg1_type = ARG_CONST_MAP_PTR,
- .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL,
- .arg2_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
- .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
- .arg4_type = ARG_ANYTHING,
-};
-
const struct bpf_func_proto bpf_task_storage_get_proto = {
.func = bpf_task_storage_get,
.gpl_only = false,
@@ -352,15 +247,6 @@ const struct bpf_func_proto bpf_task_storage_get_proto = {
.arg4_type = ARG_ANYTHING,
};
-const struct bpf_func_proto bpf_task_storage_delete_recur_proto = {
- .func = bpf_task_storage_delete_recur,
- .gpl_only = false,
- .ret_type = RET_INTEGER,
- .arg1_type = ARG_CONST_MAP_PTR,
- .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL,
- .arg2_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
-};
-
const struct bpf_func_proto bpf_task_storage_delete_proto = {
.func = bpf_task_storage_delete,
.gpl_only = false,
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index b54ec0e945aa..1f9f543bf7c5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2100,12 +2100,8 @@ bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_get_cgroup_classid_curr_proto;
#endif
case BPF_FUNC_task_storage_get:
- if (bpf_prog_check_recur(prog))
- return &bpf_task_storage_get_recur_proto;
return &bpf_task_storage_get_proto;
case BPF_FUNC_task_storage_delete:
- if (bpf_prog_check_recur(prog))
- return &bpf_task_storage_delete_recur_proto;
return &bpf_task_storage_delete_proto;
default:
break;
--
2.47.3
* [PATCH bpf-next v4 07/16] bpf: Remove cgroup local storage percpu counter
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (5 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 06/16] bpf: Remove task local storage percpu counter Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 08/16] bpf: Remove unused percpu counter from bpf_local_storage_map_free Amery Hung
` (8 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
The percpu counter in cgroup local storage is no longer needed as the
underlying bpf_local_storage can now handle deadlocks with the help of
rqspinlock. Remove the percpu counter and the related migrate_{disable,
enable} calls.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/bpf_cgrp_storage.c | 59 +++++------------------------------
1 file changed, 8 insertions(+), 51 deletions(-)
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 8fef24fcac68..4d84611d8222 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -11,29 +11,6 @@
DEFINE_BPF_STORAGE_CACHE(cgroup_cache);
-static DEFINE_PER_CPU(int, bpf_cgrp_storage_busy);
-
-static void bpf_cgrp_storage_lock(void)
-{
- cant_migrate();
- this_cpu_inc(bpf_cgrp_storage_busy);
-}
-
-static void bpf_cgrp_storage_unlock(void)
-{
- this_cpu_dec(bpf_cgrp_storage_busy);
-}
-
-static bool bpf_cgrp_storage_trylock(void)
-{
- cant_migrate();
- if (unlikely(this_cpu_inc_return(bpf_cgrp_storage_busy) != 1)) {
- this_cpu_dec(bpf_cgrp_storage_busy);
- return false;
- }
- return true;
-}
-
static struct bpf_local_storage __rcu **cgroup_storage_ptr(void *owner)
{
struct cgroup *cg = owner;
@@ -45,16 +22,14 @@ void bpf_cgrp_storage_free(struct cgroup *cgroup)
{
struct bpf_local_storage *local_storage;
- rcu_read_lock_dont_migrate();
+ rcu_read_lock();
local_storage = rcu_dereference(cgroup->bpf_cgrp_storage);
if (!local_storage)
goto out;
- bpf_cgrp_storage_lock();
bpf_local_storage_destroy(local_storage);
- bpf_cgrp_storage_unlock();
out:
- rcu_read_unlock_migrate();
+ rcu_read_unlock();
}
static struct bpf_local_storage_data *
@@ -83,9 +58,7 @@ static void *bpf_cgrp_storage_lookup_elem(struct bpf_map *map, void *key)
if (IS_ERR(cgroup))
return ERR_CAST(cgroup);
- bpf_cgrp_storage_lock();
sdata = cgroup_storage_lookup(cgroup, map, true);
- bpf_cgrp_storage_unlock();
cgroup_put(cgroup);
return sdata ? sdata->data : NULL;
}
@@ -102,10 +75,8 @@ static long bpf_cgrp_storage_update_elem(struct bpf_map *map, void *key,
if (IS_ERR(cgroup))
return PTR_ERR(cgroup);
- bpf_cgrp_storage_lock();
sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map,
value, map_flags, false, GFP_ATOMIC);
- bpf_cgrp_storage_unlock();
cgroup_put(cgroup);
return PTR_ERR_OR_ZERO(sdata);
}
@@ -131,9 +102,7 @@ static long bpf_cgrp_storage_delete_elem(struct bpf_map *map, void *key)
if (IS_ERR(cgroup))
return PTR_ERR(cgroup);
- bpf_cgrp_storage_lock();
err = cgroup_storage_delete(cgroup, map);
- bpf_cgrp_storage_unlock();
cgroup_put(cgroup);
return err;
}
@@ -150,7 +119,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
static void cgroup_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &cgroup_cache, &bpf_cgrp_storage_busy);
+ bpf_local_storage_map_free(map, &cgroup_cache, NULL);
}
/* *gfp_flags* is a hidden argument provided by the verifier */
@@ -158,7 +127,6 @@ BPF_CALL_5(bpf_cgrp_storage_get, struct bpf_map *, map, struct cgroup *, cgroup,
void *, value, u64, flags, gfp_t, gfp_flags)
{
struct bpf_local_storage_data *sdata;
- bool nobusy;
WARN_ON_ONCE(!bpf_rcu_lock_held());
if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
@@ -167,38 +135,27 @@ BPF_CALL_5(bpf_cgrp_storage_get, struct bpf_map *, map, struct cgroup *, cgroup,
if (!cgroup)
return (unsigned long)NULL;
- nobusy = bpf_cgrp_storage_trylock();
-
- sdata = cgroup_storage_lookup(cgroup, map, nobusy);
+ sdata = cgroup_storage_lookup(cgroup, map, true);
if (sdata)
- goto unlock;
+ goto out;
/* only allocate new storage, when the cgroup is refcounted */
if (!percpu_ref_is_dying(&cgroup->self.refcnt) &&
- (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy)
+ (flags & BPF_LOCAL_STORAGE_GET_F_CREATE))
sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map,
value, BPF_NOEXIST, false, gfp_flags);
-unlock:
- if (nobusy)
- bpf_cgrp_storage_unlock();
+out:
return IS_ERR_OR_NULL(sdata) ? (unsigned long)NULL : (unsigned long)sdata->data;
}
BPF_CALL_2(bpf_cgrp_storage_delete, struct bpf_map *, map, struct cgroup *, cgroup)
{
- int ret;
-
WARN_ON_ONCE(!bpf_rcu_lock_held());
if (!cgroup)
return -EINVAL;
- if (!bpf_cgrp_storage_trylock())
- return -EBUSY;
-
- ret = cgroup_storage_delete(cgroup, map);
- bpf_cgrp_storage_unlock();
- return ret;
+ return cgroup_storage_delete(cgroup, map);
}
const struct bpf_map_ops cgrp_storage_map_ops = {
--
2.47.3
* [PATCH bpf-next v4 08/16] bpf: Remove unused percpu counter from bpf_local_storage_map_free
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (6 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 07/16] bpf: Remove cgroup " Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 09/16] bpf: Prepare for bpf_selem_unlink_nofail() Amery Hung
` (7 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Percpu locks have been removed from cgroup and task local storage. Now
that no local storage uses percpu variables as locks to prevent
recursion, there is no need to pass them to bpf_local_storage_map_free().
Remove the argument from the function.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 3 +--
kernel/bpf/bpf_cgrp_storage.c | 2 +-
kernel/bpf/bpf_inode_storage.c | 2 +-
kernel/bpf/bpf_local_storage.c | 7 +------
kernel/bpf/bpf_task_storage.c | 2 +-
net/core/bpf_sk_storage.c | 2 +-
6 files changed, 6 insertions(+), 12 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 903559e2ca91..70b35dfc01c9 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -166,8 +166,7 @@ bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
void bpf_local_storage_destroy(struct bpf_local_storage *local_storage);
void bpf_local_storage_map_free(struct bpf_map *map,
- struct bpf_local_storage_cache *cache,
- int __percpu *busy_counter);
+ struct bpf_local_storage_cache *cache);
int bpf_local_storage_map_check_btf(const struct bpf_map *map,
const struct btf *btf,
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 4d84611d8222..853183eead2c 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -119,7 +119,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
static void cgroup_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &cgroup_cache, NULL);
+ bpf_local_storage_map_free(map, &cgroup_cache);
}
/* *gfp_flags* is a hidden argument provided by the verifier */
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index cedc99184dad..470f4b02c79e 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -184,7 +184,7 @@ static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
static void inode_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &inode_cache, NULL);
+ bpf_local_storage_map_free(map, &inode_cache);
}
const struct bpf_map_ops inode_storage_map_ops = {
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 7661319ad2e3..fbf41e00be8a 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -809,8 +809,7 @@ bpf_local_storage_map_alloc(union bpf_attr *attr,
}
void bpf_local_storage_map_free(struct bpf_map *map,
- struct bpf_local_storage_cache *cache,
- int __percpu *busy_counter)
+ struct bpf_local_storage_cache *cache)
{
struct bpf_local_storage_map_bucket *b;
struct bpf_local_storage_elem *selem;
@@ -843,11 +842,7 @@ void bpf_local_storage_map_free(struct bpf_map *map,
while ((selem = hlist_entry_safe(
rcu_dereference_raw(hlist_first_rcu(&b->list)),
struct bpf_local_storage_elem, map_node))) {
- if (busy_counter)
- this_cpu_inc(*busy_counter);
bpf_selem_unlink(selem, true);
- if (busy_counter)
- this_cpu_dec(*busy_counter);
cond_resched_rcu();
}
rcu_read_unlock();
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index dd858226ada2..4d53aebe6784 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -217,7 +217,7 @@ static struct bpf_map *task_storage_map_alloc(union bpf_attr *attr)
static void task_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &task_cache, NULL);
+ bpf_local_storage_map_free(map, &task_cache);
}
BTF_ID_LIST_GLOBAL_SINGLE(bpf_local_storage_map_btf_id, struct, bpf_local_storage_map)
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 78e936288879..7ec8a74e7ce5 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -60,7 +60,7 @@ void bpf_sk_storage_free(struct sock *sk)
static void bpf_sk_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &sk_cache, NULL);
+ bpf_local_storage_map_free(map, &sk_cache);
}
static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
--
2.47.3
* [PATCH bpf-next v4 09/16] bpf: Prepare for bpf_selem_unlink_nofail()
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (7 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 08/16] bpf: Remove unused percpu counter from bpf_local_storage_map_free Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 10/16] bpf: Support lockless unlink when freeing map or local storage Amery Hung
` (6 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
The next patch will introduce bpf_selem_unlink_nofail() to handle
rqspinlock errors. bpf_selem_unlink_nofail() will allow an selem to be
partially unlinked from the map or the local storage. Save the memory
allocation method in the selem so that the selem can later be freed
correctly even after SDATA(selem)->smap has been set to NULL.
In addition, keep track of the memory charged to the owner in the local
storage so that bpf_selem_unlink_nofail() can later return the correct
memory charge to the owner. Updates to selems_size are protected by
local_storage->lock.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 4 +++-
kernel/bpf/bpf_local_storage.c | 10 +++++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 70b35dfc01c9..ece32f756d86 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -80,7 +80,8 @@ struct bpf_local_storage_elem {
* after raw_spin_unlock
*/
};
- /* 8 bytes hole */
+ bool use_kmalloc_nolock;
+ /* 7 bytes hole */
/* The data is stored in another cacheline to minimize
* the number of cachelines access during a cache hit.
*/
@@ -96,6 +97,7 @@ struct bpf_local_storage {
*/
struct rcu_head rcu;
rqspinlock_t lock; /* Protect adding/removing from the "list" */
+ u64 selems_size; /* Total selem size. Protected by "lock" */
bool use_kmalloc_nolock;
};
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index fbf41e00be8a..b8f146d41ffe 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -85,6 +85,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
if (selem) {
RCU_INIT_POINTER(SDATA(selem)->smap, smap);
+ selem->use_kmalloc_nolock = smap->use_kmalloc_nolock;
if (value) {
/* No need to call check_and_init_map_value as memory is zero init */
@@ -214,7 +215,7 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem,
smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
- if (!smap->use_kmalloc_nolock) {
+ if (!selem->use_kmalloc_nolock) {
/*
* No uptr will be unpin even when reuse_now == false since uptr
* is only supported in task local storage, where
@@ -305,12 +306,19 @@ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_stor
if (rcu_access_pointer(local_storage->smap) == smap)
RCU_INIT_POINTER(local_storage->smap, NULL);
+ local_storage->selems_size -= smap->elem_size;
+
return free_local_storage;
}
void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
struct bpf_local_storage_elem *selem)
{
+ struct bpf_local_storage_map *smap;
+
+ smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
+ local_storage->selems_size += smap->elem_size;
+
RCU_INIT_POINTER(selem->local_storage, local_storage);
hlist_add_head_rcu(&selem->snode, &local_storage->list);
}
--
2.47.3
* [PATCH bpf-next v4 10/16] bpf: Support lockless unlink when freeing map or local storage
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (8 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 09/16] bpf: Prepare for bpf_selem_unlink_nofail() Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 11/16] bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} Amery Hung
` (5 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Introduce bpf_selem_unlink_nofail() to properly handle errors returned
from rqspinlock in bpf_local_storage_map_free() and
bpf_local_storage_destroy(), where the operation must succeed.
The idea of bpf_selem_unlink_nofail() is to allow an selem to be
partially linked and to use a refcount to determine when, and by whom,
the selem can be freed if any unlink under lock fails. An selem is
initially fully linked to a map and a local storage, so selem->link_cnt
is set to 2. Under normal circumstances, bpf_selem_unlink_nofail() will
be able to grab the locks and unlink the selem from the map and the
local storage in sequence, just like bpf_selem_unlink(), and then free
it after an RCU grace period.
However, if any of the lock attempts fails, it will only clear
SDATA(selem)->smap or selem->local_storage, depending on the caller, and
decrement link_cnt to signal that the corresponding data structure
holding a reference to the selem is gone. Then, only when both the map
and the local storage are gone can the selem be freed, by the last
caller that brings link_cnt to 0.
To make sure bpf_obj_free_fields() is done only once, and while the map
is still present, it is called when unlinking an selem from b->list
under b->lock.
To make sure uncharging memory is done only while the owner is still
present in map_free(), block destroy() from returning until there is no
pending map_free().
Later bpf_local_storage_destroy() will return the remaining amount of
memory charge tracked by selems_size to the owner.
Finally, accesses of selem, SDATA(selem)->smap and selem->local_storage
are racy. Callers will protect these fields with RCU.
Co-developed-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 5 +-
kernel/bpf/bpf_local_storage.c | 109 ++++++++++++++++++++++++++++--
2 files changed, 109 insertions(+), 5 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index ece32f756d86..4e1ebfb3b9e8 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -80,8 +80,10 @@ struct bpf_local_storage_elem {
* after raw_spin_unlock
*/
};
+ /* Used by map_free() and destroy() when rqspinlock returns err */
+ atomic_t link_cnt;
bool use_kmalloc_nolock;
- /* 7 bytes hole */
+ /* 3 bytes hole */
/* The data is stored in another cacheline to minimize
* the number of cachelines access during a cache hit.
*/
@@ -98,6 +100,7 @@ struct bpf_local_storage {
struct rcu_head rcu;
rqspinlock_t lock; /* Protect adding/removing from the "list" */
u64 selems_size; /* Total selem size. Protected by "lock" */
+ refcount_t owner_refcnt;
bool use_kmalloc_nolock;
};
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index b8f146d41ffe..54d106ebbfe5 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -93,6 +93,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
if (swap_uptrs)
bpf_obj_swap_uptrs(smap->map.record, SDATA(selem)->data, value);
}
+ atomic_set(&selem->link_cnt, 2);
return selem;
}
@@ -194,9 +195,11 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
/* The bpf_local_storage_map_free will wait for rcu_barrier */
smap = rcu_dereference_check(SDATA(selem)->smap, 1);
- migrate_disable();
- bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
- migrate_enable();
+ if (smap) {
+ migrate_disable();
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
+ migrate_enable();
+ }
kfree_nolock(selem);
}
@@ -221,7 +224,8 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem,
* is only supported in task local storage, where
* smap->use_kmalloc_nolock == true.
*/
- bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
+ if (smap)
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
__bpf_selem_free(selem, reuse_now);
return;
}
@@ -421,6 +425,96 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
return err;
}
+/*
+ * Unlink an selem from map and local storage with lockless fallback if callers
+ * are racing or rqspinlock returns error. It should only be called by
+ * bpf_local_storage_destroy() or bpf_local_storage_map_free().
+ */
+static void bpf_selem_unlink_nofail(struct bpf_local_storage_elem *selem,
+ struct bpf_local_storage_map_bucket *b)
+{
+ struct bpf_local_storage *local_storage;
+ struct bpf_local_storage_map *smap;
+ bool in_map_free = !!b;
+ unsigned long flags;
+ int err, unlink = 0;
+
+ local_storage = rcu_dereference_check(selem->local_storage, bpf_rcu_lock_held());
+ smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
+
+ /*
+ * map_free() and destroy() each holds a link_cnt on an selem. Prevent called twice
+ * from the same caller on the same selem.
+ */
+ if ((!smap && in_map_free) || (!local_storage && !in_map_free))
+ return;
+
+ if (smap) {
+ b = b ? : select_bucket(smap, local_storage);
+ err = raw_res_spin_lock_irqsave(&b->lock, flags);
+ if (!err) {
+ /*
+ * Call bpf_obj_free_fields() under b->lock to make sure it is done
+ * exactly once for an selem. Safe to free special fields immediately
+ * as no BPF program should be referencing the selem.
+ */
+ if (likely(selem_linked_to_map(selem))) {
+ hlist_del_init_rcu(&selem->map_node);
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
+ unlink++;
+ }
+ raw_res_spin_unlock_irqrestore(&b->lock, flags);
+ }
+ /*
+ * Highly unlikely scenario: resource leak
+ *
+ * When map_free(selem1), destroy(selem1) and destroy(selem2) are racing
+ * and both selem belong to the same bucket, if destroy(selem2) acquired
+ * b->lock and block for too long, neither map_free(selem1) and
+ * destroy(selem1) will be able to free the special field associated
+ * with selem1 as raw_res_spin_lock_irqsave() returns -ETIMEDOUT.
+ */
+ WARN_ON_ONCE(err && in_map_free);
+ if (!err || in_map_free)
+ RCU_INIT_POINTER(SDATA(selem)->smap, NULL);
+ }
+
+ if (local_storage) {
+ err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
+ if (!err) {
+ /*
+ * In the common path, map_free() can call mem_uncharge() if
+ * destroy() is not about to return to owner, which can then go
+ * away immediately. Otherwise, the charge of the selem will stay
+ * accounted in local_storage->selems_size and uncharged during
+ * destroy().
+ */
+ if (likely(selem_linked_to_storage(selem))) {
+ hlist_del_init_rcu(&selem->snode);
+ if (smap && in_map_free &&
+ refcount_inc_not_zero(&local_storage->owner_refcnt)) {
+ mem_uncharge(smap, local_storage->owner, smap->elem_size);
+ local_storage->selems_size -= smap->elem_size;
+ refcount_dec(&local_storage->owner_refcnt);
+ }
+ unlink++;
+ }
+ raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
+ }
+ if (!err || !in_map_free)
+ RCU_INIT_POINTER(selem->local_storage, NULL);
+ }
+
+ /*
+ * Normally, an selem can be unlink under local_storage->lock and b->lock, and
+ * then added to a local to_free list. However, if destroy() and map_free() are
+ * racing or rqspinlock returns errors in unlikely situations (unlink != 2), free
+ * the selem only after both map_free() and destroy() drop the refcnt.
+ */
+ if (unlink == 2 || atomic_dec_and_test(&selem->link_cnt))
+ bpf_selem_free(selem, false);
+}
+
void __bpf_local_storage_insert_cache(struct bpf_local_storage *local_storage,
struct bpf_local_storage_map *smap,
struct bpf_local_storage_elem *selem)
@@ -487,6 +581,7 @@ int bpf_local_storage_alloc(void *owner,
raw_res_spin_lock_init(&storage->lock);
storage->owner = owner;
storage->use_kmalloc_nolock = smap->use_kmalloc_nolock;
+ refcount_set(&storage->owner_refcnt, 1);
bpf_selem_link_storage_nolock(storage, first_selem);
@@ -754,6 +849,12 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
if (free_storage)
bpf_local_storage_free(local_storage, true);
+
+ if (!refcount_dec_and_test(&local_storage->owner_refcnt)) {
+ while (refcount_read(&local_storage->owner_refcnt))
+ cpu_relax();
+ smp_rmb(); /* pair with refcount_dec in bpf_selem_unlink_nofail */
+ }
}
u64 bpf_local_storage_map_mem_usage(const struct bpf_map *map)
--
2.47.3
* [PATCH bpf-next v4 11/16] bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy}
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (9 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 10/16] bpf: Support lockless unlink when freeing map or local storage Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 12/16] selftests/bpf: Update sk_storage_omem_uncharge test Amery Hung
` (4 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Handle rqspinlock errors in bpf_local_storage_{map_free, destroy}()
properly by switching to bpf_selem_unlink_nofail().
Both functions iterate their own RCU-protected list of selems and call
bpf_selem_unlink_nofail(). In map_free(), to prevent an infinite loop
when both map_free() and destroy() fail to remove an selem from b->list
(extremely unlikely), switch to hlist_for_each_entry_rcu(). In destroy(),
also switch to hlist_for_each_entry_rcu() since we no longer iterate
local_storage->list under local_storage->lock. In addition, defer the
work to a workqueue, since sleeping may not always be allowed in
destroy().
Since selem, SDATA(selem)->smap and selem->local_storage may be seen by
map_free() and destroy() at the same time, protect them with RCU. This
means passing reuse_now == false to bpf_selem_free() and
bpf_local_storage_free(). The local storage map is already protected as
bpf_local_storage_map_free() waits for an RCU grace period after
iterating b->list and before freeing itself.
bpf_selem_unlink() is now dedicated to the helper and syscall paths,
where reuse_now should always be false. Remove the argument and hardcode
the value.
Co-developed-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf_local_storage.h | 5 +-
kernel/bpf/bpf_cgrp_storage.c | 3 +-
kernel/bpf/bpf_inode_storage.c | 3 +-
kernel/bpf/bpf_local_storage.c | 96 +++++++++++++++++--------------
kernel/bpf/bpf_task_storage.c | 3 +-
net/core/bpf_sk_storage.c | 9 ++-
6 files changed, 69 insertions(+), 50 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 4e1ebfb3b9e8..605590a8f98d 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -101,6 +101,7 @@ struct bpf_local_storage {
rqspinlock_t lock; /* Protect adding/removing from the "list" */
u64 selems_size; /* Total selem size. Protected by "lock" */
refcount_t owner_refcnt;
+ struct work_struct work;
bool use_kmalloc_nolock;
};
@@ -168,7 +169,7 @@ bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
return SDATA(selem);
}
-void bpf_local_storage_destroy(struct bpf_local_storage *local_storage);
+u32 bpf_local_storage_destroy(struct bpf_local_storage *local_storage);
void bpf_local_storage_map_free(struct bpf_map *map,
struct bpf_local_storage_cache *cache);
@@ -181,7 +182,7 @@ int bpf_local_storage_map_check_btf(const struct bpf_map *map,
void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
struct bpf_local_storage_elem *selem);
-int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now);
+int bpf_selem_unlink(struct bpf_local_storage_elem *selem);
int bpf_selem_link_map(struct bpf_local_storage_map *smap,
struct bpf_local_storage_elem *selem);
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 853183eead2c..0bc3ab19c7b4 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -27,6 +27,7 @@ void bpf_cgrp_storage_free(struct cgroup *cgroup)
if (!local_storage)
goto out;
+ RCU_INIT_POINTER(cgroup->bpf_cgrp_storage, NULL);
bpf_local_storage_destroy(local_storage);
out:
rcu_read_unlock();
@@ -89,7 +90,7 @@ static int cgroup_storage_delete(struct cgroup *cgroup, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- return bpf_selem_unlink(SELEM(sdata), false);
+ return bpf_selem_unlink(SELEM(sdata));
}
static long bpf_cgrp_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index 470f4b02c79e..eb607156ba35 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -68,6 +68,7 @@ void bpf_inode_storage_free(struct inode *inode)
if (!local_storage)
goto out;
+ RCU_INIT_POINTER(bsb->storage, NULL);
bpf_local_storage_destroy(local_storage);
out:
rcu_read_unlock_migrate();
@@ -110,7 +111,7 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- return bpf_selem_unlink(SELEM(sdata), false);
+ return bpf_selem_unlink(SELEM(sdata));
}
static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 54d106ebbfe5..364198959053 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -383,7 +383,11 @@ static void bpf_selem_link_map_nolock(struct bpf_local_storage_map_bucket *b,
hlist_add_head_rcu(&selem->map_node, &b->list);
}
-int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
+/*
+ * Unlink an selem from map and local storage with lock held.
+ * This is the common path used by local storages to delete an selem.
+ */
+int bpf_selem_unlink(struct bpf_local_storage_elem *selem)
{
struct bpf_local_storage *local_storage;
bool free_local_storage = false;
@@ -417,10 +421,10 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
out:
raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
- bpf_selem_free_list(&selem_free_list, reuse_now);
+ bpf_selem_free_list(&selem_free_list, false);
if (free_local_storage)
- bpf_local_storage_free(local_storage, reuse_now);
+ bpf_local_storage_free(local_storage, false);
return err;
}
@@ -650,7 +654,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
local_storage = rcu_dereference_check(*owner_storage(smap, owner),
bpf_rcu_lock_held());
- if (!local_storage || hlist_empty(&local_storage->list)) {
+ if (!local_storage) {
/* Very first elem for the owner */
err = check_flags(NULL, map_flags);
if (err)
@@ -698,17 +702,6 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
if (err)
return ERR_PTR(err);
- /* Recheck local_storage->list under local_storage->lock */
- if (unlikely(hlist_empty(&local_storage->list))) {
- /* A parallel del is happening and local_storage is going
- * away. It has just been checked before, so very
- * unlikely. Return instead of retry to keep things
- * simple.
- */
- err = -EAGAIN;
- goto unlock;
- }
-
old_sdata = bpf_local_storage_lookup(local_storage, smap, false);
err = check_flags(old_sdata, map_flags);
if (err)
@@ -811,13 +804,16 @@ int bpf_local_storage_map_check_btf(const struct bpf_map *map,
return 0;
}
-void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
+/*
+ * Defer looping over local_storage->list to a workqueue since sleeping may not
+ * be allowed in bpf_local_storage_destroy()
+ */
+static void bpf_local_storage_free_deferred(struct work_struct *work)
{
+ struct bpf_local_storage *local_storage;
struct bpf_local_storage_elem *selem;
- bool free_storage = false;
- HLIST_HEAD(free_selem_list);
- struct hlist_node *n;
- unsigned long flags;
+
+ local_storage = container_of(work, struct bpf_local_storage, work);
/* Neither the bpf_prog nor the bpf_map's syscall
* could be modifying the local_storage->list now.
@@ -828,33 +824,44 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
* when unlinking elem from the local_storage->list and
* the map's bucket->list.
*/
- raw_res_spin_lock_irqsave(&local_storage->lock, flags);
- hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
- /* Always unlink from map before unlinking from
- * local_storage.
- */
- bpf_selem_unlink_map(selem);
- /* If local_storage list has only one element, the
- * bpf_selem_unlink_storage_nolock() will return true.
- * Otherwise, it will return false. The current loop iteration
- * intends to remove all local storage. So the last iteration
- * of the loop will set the free_cgroup_storage to true.
- */
- free_storage = bpf_selem_unlink_storage_nolock(
- local_storage, selem, &free_selem_list);
+ rcu_read_lock();
+restart:
+ hlist_for_each_entry_rcu(selem, &local_storage->list, snode) {
+ bpf_selem_unlink_nofail(selem, NULL);
+
+ if (need_resched()) {
+ cond_resched_rcu();
+ goto restart;
+ }
}
- raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
+ rcu_read_unlock();
- bpf_selem_free_list(&free_selem_list, true);
+ bpf_local_storage_free(local_storage, false);
+}
+
+/*
+ * Destroy local storage when the owner is going away. Caller must clear owner->storage
+ * and uncharge memory if memory charging is used.
+ *
+ * Since smaps associated with selems may already be gone, mem_uncharge() or
+ * owner_storage() cannot be called in this function. Let the owner (i.e., the caller)
+ * do it instead.
+ */
+u32 bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
+{
+ INIT_WORK(&local_storage->work, bpf_local_storage_free_deferred);
- if (free_storage)
- bpf_local_storage_free(local_storage, true);
+ queue_work(system_dfl_wq, &local_storage->work);
if (!refcount_dec_and_test(&local_storage->owner_refcnt)) {
while (refcount_read(&local_storage->owner_refcnt))
cpu_relax();
smp_rmb(); /* pair with refcount_dec in bpf_selem_unlink_nofail */
}
+
+ local_storage->owner = NULL;
+
+ return sizeof(*local_storage) + local_storage->selems_size;
}
u64 bpf_local_storage_map_mem_usage(const struct bpf_map *map)
@@ -948,11 +955,14 @@ void bpf_local_storage_map_free(struct bpf_map *map,
rcu_read_lock();
/* No one is adding to b->list now */
- while ((selem = hlist_entry_safe(
- rcu_dereference_raw(hlist_first_rcu(&b->list)),
- struct bpf_local_storage_elem, map_node))) {
- bpf_selem_unlink(selem, true);
- cond_resched_rcu();
+restart:
+ hlist_for_each_entry_rcu(selem, &b->list, map_node) {
+ bpf_selem_unlink_nofail(selem, b);
+
+ if (need_resched()) {
+ cond_resched_rcu();
+ goto restart;
+ }
}
rcu_read_unlock();
}
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index 4d53aebe6784..ea7ea80d85e7 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -53,6 +53,7 @@ void bpf_task_storage_free(struct task_struct *task)
if (!local_storage)
goto out;
+ RCU_INIT_POINTER(task->bpf_storage, NULL);
bpf_local_storage_destroy(local_storage);
out:
rcu_read_unlock();
@@ -134,7 +135,7 @@ static int task_storage_delete(struct task_struct *task, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- return bpf_selem_unlink(SELEM(sdata), false);
+ return bpf_selem_unlink(SELEM(sdata));
}
static long bpf_pid_task_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 7ec8a74e7ce5..abb0e8713a04 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -40,20 +40,25 @@ static int bpf_sk_storage_del(struct sock *sk, struct bpf_map *map)
if (!sdata)
return -ENOENT;
- return bpf_selem_unlink(SELEM(sdata), false);
+ return bpf_selem_unlink(SELEM(sdata));
}
/* Called by __sk_destruct() & bpf_sk_storage_clone() */
void bpf_sk_storage_free(struct sock *sk)
{
struct bpf_local_storage *sk_storage;
+ u32 uncharge;
rcu_read_lock_dont_migrate();
sk_storage = rcu_dereference(sk->sk_bpf_storage);
if (!sk_storage)
goto out;
- bpf_local_storage_destroy(sk_storage);
+ RCU_INIT_POINTER(sk->sk_bpf_storage, NULL);
+
+ uncharge = bpf_local_storage_destroy(sk_storage);
+ if (uncharge)
+ atomic_sub(uncharge, &sk->sk_omem_alloc);
out:
rcu_read_unlock_migrate();
}
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* [PATCH bpf-next v4 12/16] selftests/bpf: Update sk_storage_omem_uncharge test
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (10 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 11/16] bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test Amery Hung
` (3 subsequent siblings)
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Check sk_omem_alloc when the caller of bpf_local_storage_destroy()
returns. bpf_local_storage_destroy() now returns the amount of memory to
uncharge to the caller instead of uncharging it directly. Therefore, in
the sk_storage_omem_uncharge test, check sk_omem_alloc when
bpf_sk_storage_free() returns instead of when bpf_local_storage_destroy()
returns.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../selftests/bpf/progs/sk_storage_omem_uncharge.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/sk_storage_omem_uncharge.c b/tools/testing/selftests/bpf/progs/sk_storage_omem_uncharge.c
index 46d6eb2a3b17..c8f4815c8dfb 100644
--- a/tools/testing/selftests/bpf/progs/sk_storage_omem_uncharge.c
+++ b/tools/testing/selftests/bpf/progs/sk_storage_omem_uncharge.c
@@ -6,7 +6,6 @@
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
-void *local_storage_ptr = NULL;
void *sk_ptr = NULL;
int cookie_found = 0;
__u64 cookie = 0;
@@ -19,21 +18,17 @@ struct {
__type(value, int);
} sk_storage SEC(".maps");
-SEC("fexit/bpf_local_storage_destroy")
-int BPF_PROG(bpf_local_storage_destroy, struct bpf_local_storage *local_storage)
+SEC("fexit/bpf_sk_storage_free")
+int BPF_PROG(bpf_sk_storage_free, struct sock *sk)
{
- struct sock *sk;
-
- if (local_storage_ptr != local_storage)
+ if (sk_ptr != sk)
return 0;
- sk = bpf_core_cast(sk_ptr, struct sock);
if (sk->sk_cookie.counter != cookie)
return 0;
cookie_found++;
omem = sk->sk_omem_alloc.counter;
- local_storage_ptr = NULL;
return 0;
}
@@ -50,7 +45,6 @@ int BPF_PROG(inet6_sock_destruct, struct sock *sk)
if (value && *value == 0xdeadbeef) {
cookie_found++;
sk_ptr = sk;
- local_storage_ptr = sk->sk_bpf_storage;
}
return 0;
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (11 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 12/16] selftests/bpf: Update sk_storage_omem_uncharge test Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test Amery Hung
` (2 subsequent siblings)
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Update the expected result of the selftest as the recursion rules for
task local storage syscalls and helpers have been relaxed. Now that the
percpu counter is removed, the task local storage helpers
bpf_task_storage_get() and bpf_task_storage_delete() can run on the same
CPU at the same time unless they would cause a deadlock.
Note that since there is no percpu counter preventing recursion in
task local storage helpers, bpf_trampoline now catches the recursion
of on_update, as reported by recursion_misses.
on_enter: tp_btf/sys_enter
on_update: fentry/bpf_local_storage_update
Old behavior
____________
on_enter
  bpf_task_storage_get(&map_a)
    bpf_task_storage_trylock succeed
    bpf_local_storage_update(&map_a)
      on_update
        bpf_task_storage_get(&map_a)
          bpf_task_storage_trylock fail
          return NULL
        bpf_task_storage_get(&map_b)
          bpf_task_storage_trylock fail
          return NULL
      create and return map_a::ptr
  map_a::ptr = 200
  bpf_task_storage_get(&map_b)
    bpf_task_storage_trylock succeed
    bpf_local_storage_update(&map_b)
      on_update
        bpf_task_storage_get(&map_a)
          bpf_task_storage_trylock fail
          lockless lookup succeed
          return map_a::ptr
        map_a::ptr += 1 (201)
        bpf_task_storage_delete(&map_a)
          bpf_task_storage_trylock fail
          return -EBUSY
        nr_del_errs++ (1)
        bpf_task_storage_get(&map_b)
          bpf_task_storage_trylock fail
          return NULL
      create and return ptr
  map_b::ptr = 100
New behavior
____________
on_enter
  bpf_task_storage_get(&map_a)
    bpf_local_storage_update(&map_a)
      on_update
        bpf_task_storage_get(&map_a)
          on_update::misses++ (1)
          create and return map_a::ptr
        map_a::ptr += 1 (1)
        bpf_task_storage_delete(&map_a)
          return 0
        bpf_task_storage_get(&map_b)
          on_update::misses++ (2)
          create and return map_b::ptr
        map_b::ptr += 1 (1)
      create and return map_a::ptr
  map_a::ptr = 200
  bpf_task_storage_get(&map_b)
    lockless lookup succeed
    return map_b::ptr
Expected result:
  Old behavior                          New behavior
  map_a::ptr = 201                      map_a::ptr = 200
  map_b::ptr = 100                      map_b::ptr = 1
  nr_del_errs = 1                       nr_del_errs = 0
  on_update::recursion_misses = 0       on_update::recursion_misses = 2
  on_enter::recursion_misses = 0        on_enter::recursion_misses = 0
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../testing/selftests/bpf/prog_tests/task_local_storage.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
index 42e822ea352f..559727b05e08 100644
--- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
+++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
@@ -117,19 +117,19 @@ static void test_recursion(void)
map_fd = bpf_map__fd(skel->maps.map_a);
err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
ASSERT_OK(err, "lookup map_a");
- ASSERT_EQ(value, 201, "map_a value");
- ASSERT_EQ(skel->bss->nr_del_errs, 1, "bpf_task_storage_delete busy");
+ ASSERT_EQ(value, 200, "map_a value");
+ ASSERT_EQ(skel->bss->nr_del_errs, 0, "bpf_task_storage_delete busy");
map_fd = bpf_map__fd(skel->maps.map_b);
err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
ASSERT_OK(err, "lookup map_b");
- ASSERT_EQ(value, 100, "map_b value");
+ ASSERT_EQ(value, 1, "map_b value");
prog_fd = bpf_program__fd(skel->progs.on_update);
memset(&info, 0, sizeof(info));
err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len);
ASSERT_OK(err, "get prog info");
- ASSERT_EQ(info.recursion_misses, 0, "on_update prog recursion");
+ ASSERT_EQ(info.recursion_misses, 2, "on_update prog recursion");
prog_fd = bpf_program__fd(skel->progs.on_enter);
memset(&info, 0, sizeof(info));
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test
2026-01-31 5:09 ` [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:49 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1886 bytes --]
> diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> index 42e822ea3..559727b05 100644
> --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> @@ -117,19 +117,19 @@ static void test_recursion(void)
The comment just above this hunk at test_recursion() says:
/* Refer to the comment in BPF_PROG(on_update) for
* the explanation on the value 201 and 100.
*/
Should this comment be updated to reference the new expected values 200 and 1
instead of the old values 201 and 100?
> map_fd = bpf_map__fd(skel->maps.map_a);
> err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
> ASSERT_OK(err, "lookup map_a");
> - ASSERT_EQ(value, 201, "map_a value");
> - ASSERT_EQ(skel->bss->nr_del_errs, 1, "bpf_task_storage_delete busy");
> + ASSERT_EQ(value, 200, "map_a value");
> + ASSERT_EQ(skel->bss->nr_del_errs, 0, "bpf_task_storage_delete busy");
>
> map_fd = bpf_map__fd(skel->maps.map_b);
> err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
> ASSERT_OK(err, "lookup map_b");
> - ASSERT_EQ(value, 100, "map_b value");
> + ASSERT_EQ(value, 1, "map_b value");
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The commit message demonstrates deep BPF subsystem expertise with detailed execution flow diagrams that suggest human authorship from a domain expert.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: Stale comment in test code references old expected values, causing potential confusion but no functional impact.
^ permalink raw reply [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:49 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:49 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> > index 42e822ea3..559727b05 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
> > @@ -117,19 +117,19 @@ static void test_recursion(void)
>
> The comment just above this hunk at test_recursion() says:
>
> /* Refer to the comment in BPF_PROG(on_update) for
> * the explanation on the value 201 and 100.
> */
>
> Should this comment be updated to reference the new expected values 200 and 1
> instead of the old values 201 and 100?
Indeed. Will update the commit message.
>
> > map_fd = bpf_map__fd(skel->maps.map_a);
> > err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
> > ASSERT_OK(err, "lookup map_a");
> > - ASSERT_EQ(value, 201, "map_a value");
> > - ASSERT_EQ(skel->bss->nr_del_errs, 1, "bpf_task_storage_delete busy");
> > + ASSERT_EQ(value, 200, "map_a value");
> > + ASSERT_EQ(skel->bss->nr_del_errs, 0, "bpf_task_storage_delete busy");
> >
> > map_fd = bpf_map__fd(skel->maps.map_b);
> > err = bpf_map_lookup_elem(map_fd, &task_fd, &value);
> > ASSERT_OK(err, "lookup map_b");
> > - ASSERT_EQ(value, 100, "map_b value");
> > + ASSERT_EQ(value, 1, "map_b value");
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit message demonstrates deep BPF subsystem expertise with detailed execution flow diagrams that suggest human authorship from a domain expert.
> issues-found: 1
> issue-severity-score: low
> issue-severity-explanation: Stale comment in test code references old expected values, causing potential confusion but no functional impact.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (12 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 13/16] selftests/bpf: Update task_local_storage/recursion test Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 5:09 ` [PATCH bpf-next v4 15/16] selftests/bpf: Remove test_task_storage_map_stress_lookup Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 16/16] selftests/bpf: Choose another percpu variable in bpf for btf_dump test Amery Hung
15 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Adjsut the error code we are checking against as bpf_task_storage_get()
now returns -EDEADLK or -ETIMEDOUT when deadlock happens.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../testing/selftests/bpf/progs/task_storage_nodeadlock.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c b/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
index 986829aaf73a..6ce98fe9f387 100644
--- a/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
+++ b/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
@@ -1,15 +1,12 @@
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
+#include <errno.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
char _license[] SEC("license") = "GPL";
-#ifndef EBUSY
-#define EBUSY 16
-#endif
-
extern bool CONFIG_PREEMPTION __kconfig __weak;
int nr_get_errs = 0;
int nr_del_errs = 0;
@@ -40,7 +37,7 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family, int type,
ret = bpf_task_storage_delete(&task_storage,
bpf_get_current_task_btf());
- if (ret == -EBUSY)
+ if (ret == -EDEADLK || ret == -ETIMEDOUT)
__sync_fetch_and_add(&nr_del_errs, 1);
return 0;
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
2026-01-31 5:09 ` [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test Amery Hung
@ 2026-01-31 5:33 ` bot+bpf-ci
2026-01-31 18:47 ` Amery Hung
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2026-01-31 5:33 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1946 bytes --]
commit d58597062a783f7c4b2997c007bb7178b268dd14
Author: Amery Hung <ameryhung@gmail.com>
selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
This commit updates the task_storage_nodeadlock test to check for the new
error codes (-EDEADLK or -ETIMEDOUT) that bpf_task_storage_delete() returns
after the rqspinlock conversion, replacing the previous -EBUSY check.
> Adjsut the error code we are checking against as bpf_task_storage_get()
^^^^^^
There's a typo here: "Adjsut" should be "Adjust".
> Adjsut the error code we are checking against as bpf_task_storage_get()
> ^^^^^^^^^^^^^^^^^^^^^^
> now returns -EDEADLK or -ETIMEDOUT when deadlock happens.
Should this say bpf_task_storage_delete() instead of bpf_task_storage_get()?
The test code checks the return value of bpf_task_storage_delete():
ret = bpf_task_storage_delete(&task_storage,
bpf_get_current_task_btf());
if (ret == -EDEADLK || ret == -ETIMEDOUT)
Looking at the kernel implementation, bpf_task_storage_get() returns NULL on
failure (not error codes), while bpf_task_storage_delete() propagates integer
error codes from bpf_selem_unlink() which can return -EDEADLK or -ETIMEDOUT
from the rqspinlock.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
AI-authorship-score: low
AI-authorship-explanation: The typo 'Adjsut' and incorrect function name reference suggest human error rather than AI generation, which typically produces more precise but sometimes generic content.
issues-found: 2
issue-severity-score: low
issue-severity-explanation: Both issues are commit message errors (typo and wrong function name) that do not affect functionality of the selftest code.
^ permalink raw reply [flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
2026-01-31 5:33 ` bot+bpf-ci
@ 2026-01-31 18:47 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 18:47 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, memxor,
martin.lau, kpsingh, yonghong.song, song, haoluo, kernel-team,
ast, eddyz87, clm, ihor.solodrai
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@kernel.org> wrote:
>
>
> commit d58597062a783f7c4b2997c007bb7178b268dd14
> Author: Amery Hung <ameryhung@gmail.com>
>
> selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
>
> This commit updates the task_storage_nodeadlock test to check for the new
> error codes (-EDEADLK or -ETIMEDOUT) that bpf_task_storage_delete() returns
> after the rqspinlock conversion, replacing the previous -EBUSY check.
>
> > Adjsut the error code we are checking against as bpf_task_storage_get()
> ^^^^^^
>
> There's a typo here: "Adjsut" should be "Adjust".
>
> > Adjsut the error code we are checking against as bpf_task_storage_get()
> > ^^^^^^^^^^^^^^^^^^^^^^
> > now returns -EDEADLK or -ETIMEDOUT when deadlock happens.
>
> Should this say bpf_task_storage_delete() instead of bpf_task_storage_get()?
Yes. I will change the commit message.
>
> The test code checks the return value of bpf_task_storage_delete():
>
> ret = bpf_task_storage_delete(&task_storage,
> bpf_get_current_task_btf());
> if (ret == -EDEADLK || ret == -ETIMEDOUT)
>
> Looking at the kernel implementation, bpf_task_storage_get() returns NULL on
> failure (not error codes), while bpf_task_storage_delete() propagates integer
> error codes from bpf_selem_unlink() which can return -EDEADLK or -ETIMEDOUT
> from the rqspinlock.
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The typo 'Adjsut' and incorrect function name reference suggest human error rather than AI generation, which typically produces more precise but sometimes generic content.
> issues-found: 2
> issue-severity-score: low
> issue-severity-explanation: Both issues are commit message errors (typo and wrong function name) that do not affect functionality of the selftest code.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 15/16] selftests/bpf: Remove test_task_storage_map_stress_lookup
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (13 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 14/16] selftests/bpf: Update task_local_storage/task_storage_nodeadlock test Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
2026-01-31 5:09 ` [PATCH bpf-next v4 16/16] selftests/bpf: Choose another percpu variable in bpf for btf_dump test Amery Hung
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
Remove the test in test_maps that checks whether updating the percpu
counter in the task local storage map is preemption safe, as the percpu
counter is now removed.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../bpf/map_tests/task_storage_map.c | 128 ------------------
.../bpf/progs/read_bpf_task_storage_busy.c | 38 ------
2 files changed, 166 deletions(-)
delete mode 100644 tools/testing/selftests/bpf/map_tests/task_storage_map.c
delete mode 100644 tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
diff --git a/tools/testing/selftests/bpf/map_tests/task_storage_map.c b/tools/testing/selftests/bpf/map_tests/task_storage_map.c
deleted file mode 100644
index a4121d2248ac..000000000000
--- a/tools/testing/selftests/bpf/map_tests/task_storage_map.c
+++ /dev/null
@@ -1,128 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2022. Huawei Technologies Co., Ltd */
-#define _GNU_SOURCE
-#include <sched.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <stdbool.h>
-#include <errno.h>
-#include <string.h>
-#include <pthread.h>
-
-#include <bpf/bpf.h>
-#include <bpf/libbpf.h>
-
-#include "bpf_util.h"
-#include "test_maps.h"
-#include "task_local_storage_helpers.h"
-#include "read_bpf_task_storage_busy.skel.h"
-
-struct lookup_ctx {
- bool start;
- bool stop;
- int pid_fd;
- int map_fd;
- int loop;
-};
-
-static void *lookup_fn(void *arg)
-{
- struct lookup_ctx *ctx = arg;
- long value;
- int i = 0;
-
- while (!ctx->start)
- usleep(1);
-
- while (!ctx->stop && i++ < ctx->loop)
- bpf_map_lookup_elem(ctx->map_fd, &ctx->pid_fd, &value);
- return NULL;
-}
-
-static void abort_lookup(struct lookup_ctx *ctx, pthread_t *tids, unsigned int nr)
-{
- unsigned int i;
-
- ctx->stop = true;
- ctx->start = true;
- for (i = 0; i < nr; i++)
- pthread_join(tids[i], NULL);
-}
-
-void test_task_storage_map_stress_lookup(void)
-{
-#define MAX_NR_THREAD 4096
- unsigned int i, nr = 256, loop = 8192, cpu = 0;
- struct read_bpf_task_storage_busy *skel;
- pthread_t tids[MAX_NR_THREAD];
- struct lookup_ctx ctx;
- cpu_set_t old, new;
- const char *cfg;
- int err;
-
- cfg = getenv("TASK_STORAGE_MAP_NR_THREAD");
- if (cfg) {
- nr = atoi(cfg);
- if (nr > MAX_NR_THREAD)
- nr = MAX_NR_THREAD;
- }
- cfg = getenv("TASK_STORAGE_MAP_NR_LOOP");
- if (cfg)
- loop = atoi(cfg);
- cfg = getenv("TASK_STORAGE_MAP_PIN_CPU");
- if (cfg)
- cpu = atoi(cfg);
-
- skel = read_bpf_task_storage_busy__open_and_load();
- err = libbpf_get_error(skel);
- CHECK(err, "open_and_load", "error %d\n", err);
-
- /* Only for a fully preemptible kernel */
- if (!skel->kconfig->CONFIG_PREEMPTION) {
- printf("%s SKIP (no CONFIG_PREEMPTION)\n", __func__);
- read_bpf_task_storage_busy__destroy(skel);
- skips++;
- return;
- }
-
- /* Save the old affinity setting */
- sched_getaffinity(getpid(), sizeof(old), &old);
-
- /* Pinned on a specific CPU */
- CPU_ZERO(&new);
- CPU_SET(cpu, &new);
- sched_setaffinity(getpid(), sizeof(new), &new);
-
- ctx.start = false;
- ctx.stop = false;
- ctx.pid_fd = sys_pidfd_open(getpid(), 0);
- ctx.map_fd = bpf_map__fd(skel->maps.task);
- ctx.loop = loop;
- for (i = 0; i < nr; i++) {
- err = pthread_create(&tids[i], NULL, lookup_fn, &ctx);
- if (err) {
- abort_lookup(&ctx, tids, i);
- CHECK(err, "pthread_create", "error %d\n", err);
- goto out;
- }
- }
-
- ctx.start = true;
- for (i = 0; i < nr; i++)
- pthread_join(tids[i], NULL);
-
- skel->bss->pid = getpid();
- err = read_bpf_task_storage_busy__attach(skel);
- CHECK(err, "attach", "error %d\n", err);
-
- /* Trigger program */
- sys_gettid();
- skel->bss->pid = 0;
-
- CHECK(skel->bss->busy != 0, "bad bpf_task_storage_busy", "got %d\n", skel->bss->busy);
-out:
- read_bpf_task_storage_busy__destroy(skel);
- /* Restore affinity setting */
- sched_setaffinity(getpid(), sizeof(old), &old);
- printf("%s:PASS\n", __func__);
-}
diff --git a/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c b/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
deleted file mode 100644
index 69da05bb6c63..000000000000
--- a/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
+++ /dev/null
@@ -1,38 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2022. Huawei Technologies Co., Ltd */
-#include "vmlinux.h"
-#include <bpf/bpf_helpers.h>
-#include <bpf/bpf_tracing.h>
-
-extern bool CONFIG_PREEMPTION __kconfig __weak;
-extern const int bpf_task_storage_busy __ksym;
-
-char _license[] SEC("license") = "GPL";
-
-int pid = 0;
-int busy = 0;
-
-struct {
- __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
- __uint(map_flags, BPF_F_NO_PREALLOC);
- __type(key, int);
- __type(value, long);
-} task SEC(".maps");
-
-SEC("raw_tp/sys_enter")
-int BPF_PROG(read_bpf_task_storage_busy)
-{
- int *value;
-
- if (!CONFIG_PREEMPTION)
- return 0;
-
- if (bpf_get_current_pid_tgid() >> 32 != pid)
- return 0;
-
- value = bpf_this_cpu_ptr(&bpf_task_storage_busy);
- if (value)
- busy = *value;
-
- return 0;
-}
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread

* [PATCH bpf-next v4 16/16] selftests/bpf: Choose another percpu variable in bpf for btf_dump test
2026-01-31 5:09 [PATCH bpf-next v4 00/16] Remove task and cgroup local storage percpu counters Amery Hung
` (14 preceding siblings ...)
2026-01-31 5:09 ` [PATCH bpf-next v4 15/16] selftests/bpf: Remove test_task_storage_map_stress_lookup Amery Hung
@ 2026-01-31 5:09 ` Amery Hung
15 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2026-01-31 5:09 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, martin.lau,
kpsingh, yonghong.song, song, haoluo, ameryhung, kernel-team
bpf_cgrp_storage_busy has been removed, so use bpf_bprintf_nest_level
instead. This percpu variable is also in the bpf subsystem, so if it
is removed in the future, BPF CI will catch this type of CI-breaking
change.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
tools/testing/selftests/bpf/prog_tests/btf_dump.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
index 10cba526d3e6..f1642794f70e 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
@@ -875,8 +875,8 @@ static void test_btf_dump_var_data(struct btf *btf, struct btf_dump *d,
TEST_BTF_DUMP_VAR(btf, d, NULL, str, "cpu_number", int, BTF_F_COMPACT,
"int cpu_number = (int)100", 100);
#endif
- TEST_BTF_DUMP_VAR(btf, d, NULL, str, "bpf_cgrp_storage_busy", int, BTF_F_COMPACT,
- "static int bpf_cgrp_storage_busy = (int)2", 2);
+ TEST_BTF_DUMP_VAR(btf, d, NULL, str, "bpf_bprintf_nest_level", int, BTF_F_COMPACT,
+ "static int bpf_bprintf_nest_level = (int)2", 2);
}
struct btf_dump_string_ctx {
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread