* [PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle
@ 2026-05-31 16:08 Jamal Hadi Salim
2026-05-31 19:14 ` Kyle Zeng
0 siblings, 1 reply; 2+ messages in thread
From: Jamal Hadi Salim @ 2026-05-31 16:08 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, victor, kylebot, jiri,
vladbu, linux-kernel, security, stable, Jamal Hadi Salim, syzbot
When NEWTFILTER and DELFILTER are run concurrently it is possible to create a
race with an associated action.
Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER:
0: mutex_lock() <-- holds the idr lock
0: rcu_read_lock()
0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR)
0: mutex_unlock() <-- releases the idr lock
1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
1: idr_remove(idr, index) <-- Action removed from IDR
1: mutex_unlock() <-- mutex released allowing us to delete the action
1: tcf_action_cleanup(p); kfree(p) <-- Kfrees p immediately, no deferral
0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- ouch, UAF p points to freed memory
This patch fixes the race condition between NEWTFILTER and DELFILTER by
adding struct rcu_head to tc_action used in the deferral and introducing a
call_rcu() in the delete path to defer the final kfree().
Note: this is a revert of commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
but also modernization/simplification to directly use kfree_rcu().
Let's illustrate the new restored code path:
0: rcu_read_lock()
1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
1: idr_remove(idr, index)
1: mutex_unlock()
1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period
0: p = idr_find(idr, index)
0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0
1: rcu_read_unlock() <-- release so freeing can run after grace period
After CPU1 calls idr_remove(), the object is no longer reachable through the IDR.
CPU0's subsequent idr_find() will return NULL, and even if it still held a
stale pointer, the immediate kfree() is now deferred until after the RCU grace
period, so no UAF can occur.
Fixes: d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reported-by: Kyle Zeng <kylebot@openai.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
v1->v2
1) syzbot ci found revealed that the cleanup code could be called under
softirq/atomic context. Because tcf_action_cleanup() calls __tcf_chain_put()
which grabs a mutex (which can sleep). Fix is to move tcf_action_cleanup()
to be invoked before call to rcu.
2) And while looking closely i noticed a bad comment on action code which
claims that "actions are always connected to filters" ;-> Tracing back
on git i noticed that infact we did have exactly this new change circa 2017
that was erronously removed by commit d7fb60b9cafb. Well, at least I
am certain what the "Fixes" tag should be now ;->
3) Patch is much much simpler now than it was in v1
---
include/net/act_api.h | 1 +
net/sched/act_api.c | 7 +------
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/include/net/act_api.h b/include/net/act_api.h
index d11b79107930..fd2967ee08f7 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -45,6 +45,7 @@ struct tc_action {
struct tc_cookie __rcu *user_cookie;
struct tcf_chain __rcu *goto_chain;
u32 tcfa_flags;
+ struct rcu_head tcfa_rcu;
u8 hw_stats;
u8 used_hw_stats;
bool used_hw_stats_valid;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 332fd9695e54..04ea11c90e03 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -112,11 +112,6 @@ struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action,
}
EXPORT_SYMBOL(tcf_action_set_ctrlact);
-/* XXX: For standalone actions, we don't need a RCU grace period either, because
- * actions are always connected to filters and filters are already destroyed in
- * RCU callbacks, so after a RCU grace period actions are already disconnected
- * from filters. Readers later can not find us.
- */
static void free_tcf(struct tc_action *p)
{
struct tcf_chain *chain = rcu_dereference_protected(p->goto_chain, 1);
@@ -129,7 +124,7 @@ static void free_tcf(struct tc_action *p)
if (chain)
tcf_chain_put_by_act(chain);
- kfree(p);
+ kfree_rcu(p, tcfa_rcu);
}
static void offload_action_hw_count_set(struct tc_action *act,
--
2.34.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle
2026-05-31 16:08 [PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle Jamal Hadi Salim
@ 2026-05-31 19:14 ` Kyle Zeng
0 siblings, 0 replies; 2+ messages in thread
From: Kyle Zeng @ 2026-05-31 19:14 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: netdev, davem, edumazet, kuba, pabeni, horms, victor, jiri,
vladbu, linux-kernel, security, stable, syzbot
On Sun, May 31, 2026 at 12:08:12PM -0400, Jamal Hadi Salim wrote:
> When NEWTFILTER and DELFILTER are run concurrently it is possible to create a
> race with an associated action.
>
> Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER:
>
> 0: mutex_lock() <-- holds the idr lock
> 0: rcu_read_lock()
> 0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR)
> 0: mutex_unlock() <-- releases the idr lock
> 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
> 1: idr_remove(idr, index) <-- Action removed from IDR
> 1: mutex_unlock() <-- mutex released allowing us to delete the action
> 1: tcf_action_cleanup(p); kfree(p) <-- Kfrees p immediately, no deferral
> 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- ouch, UAF p points to freed memory
>
> This patch fixes the race condition between NEWTFILTER and DELFILTER by
> adding struct rcu_head to tc_action used in the deferral and introducing a
> call_rcu() in the delete path to defer the final kfree().
>
> Note: this is a revert of commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
> but also modernization/simplification to directly use kfree_rcu().
>
> Let's illustrate the new restored code path:
>
> 0: rcu_read_lock()
> 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
> 1: idr_remove(idr, index)
> 1: mutex_unlock()
> 1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period
> 0: p = idr_find(idr, index)
> 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0
> 1: rcu_read_unlock() <-- release so freeing can run after grace period
>
> After CPU1 calls idr_remove(), the object is no longer reachable through the IDR.
> CPU0's subsequent idr_find() will return NULL, and even if it still held a
> stale pointer, the immediate kfree() is now deferred until after the RCU grace
> period, so no UAF can occur.
>
> Fixes: d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Reported-by: Kyle Zeng <kylebot@openai.com>
> Tested-by: Victor Nogueira <victor@mojatatu.com>
> Tested-by: syzbot@syzkaller.appspotmail.com
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Tested-by: Kyle Zeng <kylebot@openai.com>
Best,
Kyle
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-05-31 19:14 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-31 16:08 [PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle Jamal Hadi Salim
2026-05-31 19:14 ` Kyle Zeng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox