* [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
@ 2024-11-10 17:28 Alexandre Ferrieux
2024-11-10 18:14 ` Victor Nogueira
2024-11-13 5:00 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 8+ messages in thread
From: Alexandre Ferrieux @ 2024-11-10 17:28 UTC (permalink / raw)
To: edumazet; +Cc: jhs, xiyou.wangcong, jiri, horms, alexandre.ferrieux, netdev
To generate hnode handles (in gen_new_htid()), u32 uses IDR and
encodes the returned small integer into a structured 32-bit
word. Unfortunately, at disposal time, the needed decoding
is not done. As a result, idr_remove() fails, and the IDR
fills up. Since its size is 2048, the following script ends up
with "Filter already exists":
tc filter add dev myve $FILTER1
tc filter add dev myve $FILTER2
for i in {1..2048}
do
echo $i
tc filter del dev myve $FILTER2
tc filter add dev myve $FILTER2
done
This patch adds the missing decoding logic for handles that
deserve it.
Fixes: e7614370d6f0 ("net_sched: use idr to allocate u32 filter handles")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
---
v7: static inline -> static
v6: big speedup of the tdc test with batch tc
v5: fix title - again
v4: add tdc test
v3: prepend title with subsystem ident
v2: use u32 type in handle encoder/decoder
net/sched/cls_u32.c | 18 ++++++++++----
.../tc-testing/tc-tests/filters/u32.json | 24 +++++++++++++++++++
2 files changed, 38 insertions(+), 4 deletions(-)
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 9412d88a99bc..d3a03c57545b 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -92,6 +92,16 @@ struct tc_u_common {
long knodes;
};
+static u32 handle2id(u32 h)
+{
+ return ((h & 0x80000000) ? ((h >> 20) & 0x7FF) : h);
+}
+
+static u32 id2handle(u32 id)
+{
+ return (id | 0x800U) << 20;
+}
+
static inline unsigned int u32_hash_fold(__be32 key,
const struct tc_u32_sel *sel,
u8 fshift)
@@ -310,7 +320,7 @@ static u32 gen_new_htid(struct tc_u_common *tp_c, struct tc_u_hnode *ptr)
int id = idr_alloc_cyclic(&tp_c->handle_idr, ptr, 1, 0x7FF, GFP_KERNEL);
if (id < 0)
return 0;
- return (id | 0x800U) << 20;
+ return id2handle(id);
}
static struct hlist_head *tc_u_common_hash;
@@ -360,7 +370,7 @@ static int u32_init(struct tcf_proto *tp)
return -ENOBUFS;
refcount_set(&root_ht->refcnt, 1);
- root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
+ root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : id2handle(0);
root_ht->prio = tp->prio;
root_ht->is_root = true;
idr_init(&root_ht->handle_idr);
@@ -612,7 +622,7 @@ static int u32_destroy_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
if (phn == ht) {
u32_clear_hw_hnode(tp, ht, extack);
idr_destroy(&ht->handle_idr);
- idr_remove(&tp_c->handle_idr, ht->handle);
+ idr_remove(&tp_c->handle_idr, handle2id(ht->handle));
RCU_INIT_POINTER(*hn, ht->next);
kfree_rcu(ht, rcu);
return 0;
@@ -989,7 +999,7 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
err = u32_replace_hw_hnode(tp, ht, userflags, extack);
if (err) {
- idr_remove(&tp_c->handle_idr, handle);
+ idr_remove(&tp_c->handle_idr, handle2id(handle));
kfree(ht);
return err;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
2024-11-10 17:28 [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes Alexandre Ferrieux
@ 2024-11-10 18:14 ` Victor Nogueira
2024-11-13 5:00 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 8+ messages in thread
From: Victor Nogueira @ 2024-11-10 18:14 UTC (permalink / raw)
To: Alexandre Ferrieux, edumazet
Cc: jhs, xiyou.wangcong, jiri, horms, alexandre.ferrieux, netdev
On 10/11/2024 14:28, Alexandre Ferrieux wrote:
> To generate hnode handles (in gen_new_htid()), u32 uses IDR and
> encodes the returned small integer into a structured 32-bit
> word. Unfortunately, at disposal time, the needed decoding
> is not done. As a result, idr_remove() fails, and the IDR
> fills up. Since its size is 2048, the following script ends up
> with "Filter already exists":
>
> tc filter add dev myve $FILTER1
> tc filter add dev myve $FILTER2
> for i in {1..2048}
> do
> echo $i
> tc filter del dev myve $FILTER2
> tc filter add dev myve $FILTER2
> done
>
> This patch adds the missing decoding logic for handles that
> deserve it.
>
> Fixes: e7614370d6f0 ("net_sched: use idr to allocate u32 filter handles")
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
2024-11-10 17:28 [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes Alexandre Ferrieux
2024-11-10 18:14 ` Victor Nogueira
@ 2024-11-13 5:00 ` patchwork-bot+netdevbpf
2024-11-14 18:24 ` RFC: chasing all idr_remove() misses Alexandre Ferrieux
1 sibling, 1 reply; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-11-13 5:00 UTC (permalink / raw)
To: Alexandre Ferrieux
Cc: edumazet, jhs, xiyou.wangcong, jiri, horms, alexandre.ferrieux,
netdev
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Sun, 10 Nov 2024 18:28:36 +0100 you wrote:
> To generate hnode handles (in gen_new_htid()), u32 uses IDR and
> encodes the returned small integer into a structured 32-bit
> word. Unfortunately, at disposal time, the needed decoding
> is not done. As a result, idr_remove() fails, and the IDR
> fills up. Since its size is 2048, the following script ends up
> with "Filter already exists":
>
> [...]
Here is the summary with links:
- [net,v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
https://git.kernel.org/netdev/net/c/73af53d82076
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* RFC: chasing all idr_remove() misses
2024-11-13 5:00 ` patchwork-bot+netdevbpf
@ 2024-11-14 18:24 ` Alexandre Ferrieux
2024-11-19 3:51 ` Cong Wang
0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Ferrieux @ 2024-11-14 18:24 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: edumazet, jhs, xiyou.wangcong, jiri, horms, netdev
Hi,
In the recent fix of u32's IDR leaks, one side remark is that the problem went
unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
this call site.
Now, a cursory grep over the whole Linux tree shows 306 out of 386 call sites
(excluding those hidden in macros, if any) don't bother to extract the value
returned by idr_remove().
Indeed, a failed IDR removal is "mostly harmless" since IDs are not pointers so
the mismatch is detectable (and is detected, returning NULL). However, in racy
situations you may end up killing an innocent fresh entry, which may really
break things a bit later. And in all cases, a true bug is the root cause.
So, unless we have reasons to think cls_u32 was the only place where two ID
encodings might lend themselves to confusion, I'm wondering if it wouldn't make
sense to chase the issue more systematically:
- either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
(a year-long endeavor implying tens of maintainers)
- or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
or even radix_tree_delete_item().
Opinions ?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: RFC: chasing all idr_remove() misses
2024-11-14 18:24 ` RFC: chasing all idr_remove() misses Alexandre Ferrieux
@ 2024-11-19 3:51 ` Cong Wang
2024-11-19 3:57 ` Cong Wang
2024-11-19 6:46 ` Alexandre Ferrieux
0 siblings, 2 replies; 8+ messages in thread
From: Cong Wang @ 2024-11-19 3:51 UTC (permalink / raw)
To: Alexandre Ferrieux; +Cc: Jakub Kicinski, edumazet, jhs, jiri, horms, netdev
On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
> Hi,
>
> In the recent fix of u32's IDR leaks, one side remark is that the problem went
> unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
> this call site.
I'd blame the lack of self test coverage. :)
>
> Now, a cursory grep over the whole Linux tree shows 306 out of 386 call sites
> (excluding those hidden in macros, if any) don't bother to extract the value
> returned by idr_remove().
>
> Indeed, a failed IDR removal is "mostly harmless" since IDs are not pointers so
> the mismatch is detectable (and is detected, returning NULL). However, in racy
> situations you may end up killing an innocent fresh entry, which may really
> break things a bit later. And in all cases, a true bug is the root cause.
>
> So, unless we have reasons to think cls_u32 was the only place where two ID
> encodings might lend themselves to confusion, I'm wondering if it wouldn't make
> sense to chase the issue more systematically:
>
> - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
> (a year-long endeavor implying tens of maintainers)
>
> - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
> or even radix_tree_delete_item().
>
> Opinions ?
Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
common pattern.
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: RFC: chasing all idr_remove() misses
2024-11-19 3:51 ` Cong Wang
@ 2024-11-19 3:57 ` Cong Wang
2024-11-19 6:46 ` Alexandre Ferrieux
1 sibling, 0 replies; 8+ messages in thread
From: Cong Wang @ 2024-11-19 3:57 UTC (permalink / raw)
To: Alexandre Ferrieux; +Cc: Jakub Kicinski, edumazet, jhs, jiri, horms, netdev
On Mon, Nov 18, 2024 at 07:51:47PM -0800, Cong Wang wrote:
> On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
> > Hi,
> >
> > In the recent fix of u32's IDR leaks, one side remark is that the problem went
> > unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
> > this call site.
>
> I'd blame the lack of self test coverage. :)
>
> >
> > Now, a cursory grep over the whole Linux tree shows 306 out of 386 call sites
> > (excluding those hidden in macros, if any) don't bother to extract the value
> > returned by idr_remove().
> >
> > Indeed, a failed IDR removal is "mostly harmless" since IDs are not pointers so
> > the mismatch is detectable (and is detected, returning NULL). However, in racy
> > situations you may end up killing an innocent fresh entry, which may really
> > break things a bit later. And in all cases, a true bug is the root cause.
> >
> > So, unless we have reasons to think cls_u32 was the only place where two ID
> > encodings might lend themselves to confusion, I'm wondering if it wouldn't make
> > sense to chase the issue more systematically:
> >
> > - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
> > (a year-long endeavor implying tens of maintainers)
> >
> > - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
> > or even radix_tree_delete_item().
> >
> > Opinions ?
>
> Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
> common pattern.
Something like this (or, of course, move it to caller to reduce the noise):
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 976b9bd02a1b..20cc690ffb32 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1559,6 +1559,8 @@ void __rcu **idr_get_free(struct radix_tree_root *root,
void idr_destroy(struct idr *idr)
{
struct radix_tree_node *node = rcu_dereference_raw(idr->idr_rt.xa_head);
+
+ WARN_ON(!idr_is_empty(idr));
if (radix_tree_is_internal_node(node))
radix_tree_free_nodes(node);
idr->idr_rt.xa_head = NULL;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: RFC: chasing all idr_remove() misses
2024-11-19 3:51 ` Cong Wang
2024-11-19 3:57 ` Cong Wang
@ 2024-11-19 6:46 ` Alexandre Ferrieux
2024-11-22 21:32 ` Cong Wang
1 sibling, 1 reply; 8+ messages in thread
From: Alexandre Ferrieux @ 2024-11-19 6:46 UTC (permalink / raw)
To: Cong Wang, Alexandre Ferrieux
Cc: Jakub Kicinski, edumazet, jhs, jiri, horms, netdev
On 19/11/2024 04:51, Cong Wang wrote:
> On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
>> Hi,
>>
>> In the recent fix of u32's IDR leaks, one side remark is that the problem went
>> unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
>> this call site.
>> [...]
>> So, unless we have reasons to think cls_u32 was the only place where two ID
>> encodings might lend themselves to confusion, I'm wondering if it wouldn't make
>> sense to chase the issue more systematically:
>>
>> - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
>> (a year-long endeavor implying tens of maintainers)
>>
>> - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
>> or even radix_tree_delete_item().
>>
>> Opinions ?
>
> Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
> common pattern.
No, in the general case, idr_destroy() only happens at the end of life of an IDR
set. Some structures in the kernel have a long lifetime, which means possibly
splipping out of fuzzers' scrutiny.
As an illustration, in cls_u32 itself, in the 2048-delete-add loop I use in the
tdc test committed with the fix, idr_destroy(&tp_c->handle_idr) is called only
at the "cleanup" step, when deleting the interface.
You can only imagine, in the hundreds of other uses of IDR, the "miss rate" that
would follow from targeting idr_destroy() instead of idr_remove().
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: RFC: chasing all idr_remove() misses
2024-11-19 6:46 ` Alexandre Ferrieux
@ 2024-11-22 21:32 ` Cong Wang
0 siblings, 0 replies; 8+ messages in thread
From: Cong Wang @ 2024-11-22 21:32 UTC (permalink / raw)
To: Alexandre Ferrieux; +Cc: Jakub Kicinski, edumazet, jhs, jiri, horms, netdev
On Tue, Nov 19, 2024 at 07:46:32AM +0100, Alexandre Ferrieux wrote:
> On 19/11/2024 04:51, Cong Wang wrote:
> > On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
> >> Hi,
> >>
> >> In the recent fix of u32's IDR leaks, one side remark is that the problem went
> >> unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
> >> this call site.
> >> [...]
> >> So, unless we have reasons to think cls_u32 was the only place where two ID
> >> encodings might lend themselves to confusion, I'm wondering if it wouldn't make
> >> sense to chase the issue more systematically:
> >>
> >> - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
> >> (a year-long endeavor implying tens of maintainers)
> >>
> >> - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
> >> or even radix_tree_delete_item().
> >>
> >> Opinions ?
> >
> > Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
> > common pattern.
>
> No, in the general case, idr_destroy() only happens at the end of life of an IDR
> set. Some structures in the kernel have a long lifetime, which means possibly
> splipping out of fuzzers' scrutiny.
Sure, move it to where you believe is appropriate.
It is a very common pattern we detect resource leakage when destroying,
for a quick example, in inet_sock_destruct() we detect skb accounting
leaks:
153 WARN_ON_ONCE(atomic_read(&sk->sk_rmem_alloc));
154 WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc));
155 WARN_ON_ONCE(sk->sk_wmem_queued);
156 WARN_ON_ONCE(sk_forward_alloc_get(sk));
Another example of IDR leakage detection can be found in
drivers/gpu/drm/vmwgfx/ttm_object.c:
447 void ttm_object_device_release(struct ttm_object_device **p_tdev)
448 {
449 struct ttm_object_device *tdev = *p_tdev;
450
451 *p_tdev = NULL;
452
453 WARN_ON_ONCE(!idr_is_empty(&tdev->idr));
454 idr_destroy(&tdev->idr);
455
456 kfree(tdev);
457 }
>
> As an illustration, in cls_u32 itself, in the 2048-delete-add loop I use in the
> tdc test committed with the fix, idr_destroy(&tp_c->handle_idr) is called only
> at the "cleanup" step, when deleting the interface.
>
> You can only imagine, in the hundreds of other uses of IDR, the "miss rate" that
> would follow from targeting idr_destroy() instead of idr_remove().
>
I am not saying it is suitable for this specific case, I am just saying it is a
common pattern for you to consier, that's all.
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-11-22 21:32 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-10 17:28 [PATCH net v7] net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes Alexandre Ferrieux
2024-11-10 18:14 ` Victor Nogueira
2024-11-13 5:00 ` patchwork-bot+netdevbpf
2024-11-14 18:24 ` RFC: chasing all idr_remove() misses Alexandre Ferrieux
2024-11-19 3:51 ` Cong Wang
2024-11-19 3:57 ` Cong Wang
2024-11-19 6:46 ` Alexandre Ferrieux
2024-11-22 21:32 ` Cong Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).