linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
@ 2025-06-08 15:34 Charalampos Mitrodimas
  2025-06-09 15:51 ` Alexei Starovoitov
  0 siblings, 1 reply; 8+ messages in thread
From: Charalampos Mitrodimas @ 2025-06-08 15:34 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Martin KaFai Lau, Daniel Borkmann, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo
  Cc: netdev, linux-kernel, bpf, syzbot+b4169a1cfb945d2ed0ec,
	Charalampos Mitrodimas

The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
types") made bpf_get_cgroup_classid_curr helper available to all BPF
program types.  This helper used __task_get_classid() which calls
task_cls_state() that requires rcu_read_lock_bh_held().

This triggers an RCU warning when called from BPF syscall programs
which run under rcu_read_lock_trace():

  WARNING: suspicious RCU usage
  6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
  -----------------------------
  net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!

Fix this by replacing __task_get_classid() with task_cls_classid()
which handles RCU locking internally using regular rcu_read_lock() and
is safe to call from any context.

Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
---
 net/core/filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
 #ifdef CONFIG_CGROUP_NET_CLASSID
 BPF_CALL_0(bpf_get_cgroup_classid_curr)
 {
-	return __task_get_classid(current);
+	return task_cls_classid(current);
 }
 
 const struct bpf_func_proto bpf_get_cgroup_classid_curr_proto = {

---
base-commit: 079e5c56a5c41d285068939ff7b0041ab10386fa
change-id: 20250608-rcu-fix-task_cls_state-0ed73f437d1e

Best regards,
-- 
Charalampos Mitrodimas <charmitro@posteo.net>


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-08 15:34 [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper Charalampos Mitrodimas
@ 2025-06-09 15:51 ` Alexei Starovoitov
  2025-06-10 12:58   ` Daniel Borkmann
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2025-06-09 15:51 UTC (permalink / raw)
  To: Charalampos Mitrodimas, Daniel Borkmann
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
<charmitro@posteo.net> wrote:
>
> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> program types.  This helper used __task_get_classid() which calls
> task_cls_state() that requires rcu_read_lock_bh_held().
>
> This triggers an RCU warning when called from BPF syscall programs
> which run under rcu_read_lock_trace():
>
>   WARNING: suspicious RCU usage
>   6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>   -----------------------------
>   net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>
> Fix this by replacing __task_get_classid() with task_cls_classid()
> which handles RCU locking internally using regular rcu_read_lock() and
> is safe to call from any context.
>
> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> ---
>  net/core/filter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>  #ifdef CONFIG_CGROUP_NET_CLASSID
>  BPF_CALL_0(bpf_get_cgroup_classid_curr)
>  {
> -       return __task_get_classid(current);
> +       return task_cls_classid(current);
>  }

Daniel added this helper in
commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
with intention to use it from networking hooks.

But task_cls_classid() has
        if (in_interrupt())
                return 0;

which will trigger in softirq and tc hooks.
So this might break Daniel's use case.

Daniel,
ptal.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-09 15:51 ` Alexei Starovoitov
@ 2025-06-10 12:58   ` Daniel Borkmann
  2025-06-10 14:56     ` Alexei Starovoitov
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Borkmann @ 2025-06-10 12:58 UTC (permalink / raw)
  To: Alexei Starovoitov, Charalampos Mitrodimas
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> <charmitro@posteo.net> wrote:
>>
>> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> program types.  This helper used __task_get_classid() which calls
>> task_cls_state() that requires rcu_read_lock_bh_held().
>>
>> This triggers an RCU warning when called from BPF syscall programs
>> which run under rcu_read_lock_trace():
>>
>>    WARNING: suspicious RCU usage
>>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>>    -----------------------------
>>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>>
>> Fix this by replacing __task_get_classid() with task_cls_classid()
>> which handles RCU locking internally using regular rcu_read_lock() and
>> is safe to call from any context.
>>
>> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> ---
>>   net/core/filter.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>>   #ifdef CONFIG_CGROUP_NET_CLASSID
>>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>>   {
>> -       return __task_get_classid(current);
>> +       return task_cls_classid(current);
>>   }
> 
> Daniel added this helper in
> commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> with intention to use it from networking hooks.
> 
> But task_cls_classid() has
>          if (in_interrupt())
>                  return 0;
> 
> which will trigger in softirq and tc hooks.
> So this might break Daniel's use case.

Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
a new helper implementation for the more generic, non-networking case which
then internally uses task_cls_classid().

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-10 12:58   ` Daniel Borkmann
@ 2025-06-10 14:56     ` Alexei Starovoitov
  2025-06-10 15:23       ` Charalampos Mitrodimas
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2025-06-10 14:56 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Charalampos Mitrodimas, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Martin KaFai Lau,
	John Fastabend, Alexei Starovoitov, Andrii Nakryiko,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Feng Yang, Tejun Heo,
	Network Development, LKML, bpf, syzbot+b4169a1cfb945d2ed0ec

On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> > <charmitro@posteo.net> wrote:
> >>
> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> >> program types.  This helper used __task_get_classid() which calls
> >> task_cls_state() that requires rcu_read_lock_bh_held().
> >>
> >> This triggers an RCU warning when called from BPF syscall programs
> >> which run under rcu_read_lock_trace():
> >>
> >>    WARNING: suspicious RCU usage
> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
> >>    -----------------------------
> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
> >>
> >> Fix this by replacing __task_get_classid() with task_cls_classid()
> >> which handles RCU locking internally using regular rcu_read_lock() and
> >> is safe to call from any context.
> >>
> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> >> ---
> >>   net/core/filter.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> >> --- a/net/core/filter.c
> >> +++ b/net/core/filter.c
> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
> >>   {
> >> -       return __task_get_classid(current);
> >> +       return task_cls_classid(current);
> >>   }
> >
> > Daniel added this helper in
> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> > with intention to use it from networking hooks.
> >
> > But task_cls_classid() has
> >          if (in_interrupt())
> >                  return 0;
> >
> > which will trigger in softirq and tc hooks.
> > So this might break Daniel's use case.
>
> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
> a new helper implementation for the more generic, non-networking case which
> then internally uses task_cls_classid().

Instead of forking the helper I think we can :
rcu_read_lock_bh_held() || rcu_read_lock_held()
in task_cls_state().
And that will do it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-10 14:56     ` Alexei Starovoitov
@ 2025-06-10 15:23       ` Charalampos Mitrodimas
  2025-06-10 15:42         ` Alexei Starovoitov
  0 siblings, 1 reply; 8+ messages in thread
From: Charalampos Mitrodimas @ 2025-06-10 15:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> > <charmitro@posteo.net> wrote:
>> >>
>> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> program types.  This helper used __task_get_classid() which calls
>> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >>
>> >> This triggers an RCU warning when called from BPF syscall programs
>> >> which run under rcu_read_lock_trace():
>> >>
>> >>    WARNING: suspicious RCU usage
>> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >>    -----------------------------
>> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >>
>> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> is safe to call from any context.
>> >>
>> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> >> ---
>> >>   net/core/filter.c | 2 +-
>> >>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> --- a/net/core/filter.c
>> >> +++ b/net/core/filter.c
>> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
>> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >>   {
>> >> -       return __task_get_classid(current);
>> >> +       return task_cls_classid(current);
>> >>   }
>> >
>> > Daniel added this helper in
>> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> > with intention to use it from networking hooks.
>> >
>> > But task_cls_classid() has
>> >          if (in_interrupt())
>> >                  return 0;
>> >
>> > which will trigger in softirq and tc hooks.
>> > So this might break Daniel's use case.
>>
>> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> a new helper implementation for the more generic, non-networking case which
>> then internally uses task_cls_classid().
>
> Instead of forking the helper I think we can :
> rcu_read_lock_bh_held() || rcu_read_lock_held()
> in task_cls_state().

I tested your suggestion with,

  rcu_read_lock_bh_held() || rcu_read_lock_held()

but it still triggers the RCU warning because BPF syscall programs use
rcu_read_lock_trace().

Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
checkpatch warning:

  WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code

I think the best solution here would be to add
local_bh_disable()/enable() protection directly in the BPF helper. This
keeps the fix localized to where the problem exists, and avoids
modifying core cgroup RCU.

> And that will do it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-10 15:23       ` Charalampos Mitrodimas
@ 2025-06-10 15:42         ` Alexei Starovoitov
  2025-06-10 15:51           ` Charalampos Mitrodimas
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2025-06-10 15:42 UTC (permalink / raw)
  To: Charalampos Mitrodimas
  Cc: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
<charmitro@posteo.net> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>
> >> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
> >> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
> >> > <charmitro@posteo.net> wrote:
> >> >>
> >> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
> >> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
> >> >> program types.  This helper used __task_get_classid() which calls
> >> >> task_cls_state() that requires rcu_read_lock_bh_held().
> >> >>
> >> >> This triggers an RCU warning when called from BPF syscall programs
> >> >> which run under rcu_read_lock_trace():
> >> >>
> >> >>    WARNING: suspicious RCU usage
> >> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
> >> >>    -----------------------------
> >> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
> >> >>
> >> >> Fix this by replacing __task_get_classid() with task_cls_classid()
> >> >> which handles RCU locking internally using regular rcu_read_lock() and
> >> >> is safe to call from any context.
> >> >>
> >> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
> >> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
> >> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
> >> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
> >> >> ---
> >> >>   net/core/filter.c | 2 +-
> >> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
> >> >> --- a/net/core/filter.c
> >> >> +++ b/net/core/filter.c
> >> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
> >> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
> >> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
> >> >>   {
> >> >> -       return __task_get_classid(current);
> >> >> +       return task_cls_classid(current);
> >> >>   }
> >> >
> >> > Daniel added this helper in
> >> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
> >> > with intention to use it from networking hooks.
> >> >
> >> > But task_cls_classid() has
> >> >          if (in_interrupt())
> >> >                  return 0;
> >> >
> >> > which will trigger in softirq and tc hooks.
> >> > So this might break Daniel's use case.
> >>
> >> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
> >> a new helper implementation for the more generic, non-networking case which
> >> then internally uses task_cls_classid().
> >
> > Instead of forking the helper I think we can :
> > rcu_read_lock_bh_held() || rcu_read_lock_held()
> > in task_cls_state().
>
> I tested your suggestion with,
>
>   rcu_read_lock_bh_held() || rcu_read_lock_held()
>
> but it still triggers the RCU warning because BPF syscall programs use
> rcu_read_lock_trace().
>
> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
> checkpatch warning:
>
>   WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code

It's safe to ignore checkpatch in this case.

> I think the best solution here would be to add
> local_bh_disable()/enable() protection directly in the BPF helper. This
> keeps the fix localized to where the problem exists, and avoids
> modifying core cgroup RCU.

That works, but it will add runtime overhead.
I doubt the helper is in a critical path, so I don't mind.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-10 15:42         ` Alexei Starovoitov
@ 2025-06-10 15:51           ` Charalampos Mitrodimas
  2025-06-10 16:08             ` Daniel Borkmann
  0 siblings, 1 reply; 8+ messages in thread
From: Charalampos Mitrodimas @ 2025-06-10 15:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
> <charmitro@posteo.net> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> >>
>> >> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> >> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> >> > <charmitro@posteo.net> wrote:
>> >> >>
>> >> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> >> program types.  This helper used __task_get_classid() which calls
>> >> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >> >>
>> >> >> This triggers an RCU warning when called from BPF syscall programs
>> >> >> which run under rcu_read_lock_trace():
>> >> >>
>> >> >>    WARNING: suspicious RCU usage
>> >> >>    6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >> >>    -----------------------------
>> >> >>    net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >> >>
>> >> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> >> is safe to call from any context.
>> >> >>
>> >> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>> >> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> >> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>> >> >> ---
>> >> >>   net/core/filter.c | 2 +-
>> >> >>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >>
>> >> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> >> --- a/net/core/filter.c
>> >> >> +++ b/net/core/filter.c
>> >> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >> >>   #ifdef CONFIG_CGROUP_NET_CLASSID
>> >> >>   BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >> >>   {
>> >> >> -       return __task_get_classid(current);
>> >> >> +       return task_cls_classid(current);
>> >> >>   }
>> >> >
>> >> > Daniel added this helper in
>> >> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> >> > with intention to use it from networking hooks.
>> >> >
>> >> > But task_cls_classid() has
>> >> >          if (in_interrupt())
>> >> >                  return 0;
>> >> >
>> >> > which will trigger in softirq and tc hooks.
>> >> > So this might break Daniel's use case.
>> >>
>> >> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> >> a new helper implementation for the more generic, non-networking case which
>> >> then internally uses task_cls_classid().
>> >
>> > Instead of forking the helper I think we can :
>> > rcu_read_lock_bh_held() || rcu_read_lock_held()
>> > in task_cls_state().
>>
>> I tested your suggestion with,
>>
>>   rcu_read_lock_bh_held() || rcu_read_lock_held()
>>
>> but it still triggers the RCU warning because BPF syscall programs use
>> rcu_read_lock_trace().
>>
>> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
>> checkpatch warning:
>>
>>   WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code
>
> It's safe to ignore checkpatch in this case.

If that is the case I'll move forward with this. It was my initial fix
for this[1] anyway, but checkpatch made me change it.

[1]: https://github.com/charmitro/linux/commit/e5c42d49bfb967c3c35f536971f397492d2f46bf

>
>> I think the best solution here would be to add
>> local_bh_disable()/enable() protection directly in the BPF helper. This
>> keeps the fix localized to where the problem exists, and avoids
>> modifying core cgroup RCU.
>
> That works, but it will add runtime overhead.
> I doubt the helper is in a critical path, so I don't mind.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper
  2025-06-10 15:51           ` Charalampos Mitrodimas
@ 2025-06-10 16:08             ` Daniel Borkmann
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Borkmann @ 2025-06-10 16:08 UTC (permalink / raw)
  To: Charalampos Mitrodimas, Alexei Starovoitov
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Martin KaFai Lau, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Feng Yang, Tejun Heo, Network Development, LKML, bpf,
	syzbot+b4169a1cfb945d2ed0ec

On 6/10/25 5:51 PM, Charalampos Mitrodimas wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
>> <charmitro@posteo.net> wrote:
>>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>>> On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>>>>>> On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>>>>>> <charmitro@posteo.net> wrote:
>>>>>>>
>>>>>>> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>>>>>>> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>>>>>>> program types.  This helper used __task_get_classid() which calls
>>>>>>> task_cls_state() that requires rcu_read_lock_bh_held().
>>>>>>>
>>>>>>> This triggers an RCU warning when called from BPF syscall programs
>>>>>>> which run under rcu_read_lock_trace():
>>>>>>>
>>>>>>>     WARNING: suspicious RCU usage
>>>>>>>     6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>>>>>>>     -----------------------------
>>>>>>>     net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>>>>>>>
>>>>>>> Fix this by replacing __task_get_classid() with task_cls_classid()
>>>>>>> which handles RCU locking internally using regular rcu_read_lock() and
>>>>>>> is safe to call from any context.
>>>>>>>
>>>>>>> Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com
>>>>>>> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>>>>>>> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>>>>>>> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
>>>>>>> ---
>>>>>>>    net/core/filter.c | 2 +-
>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>>>>> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>>>>>>> --- a/net/core/filter.c
>>>>>>> +++ b/net/core/filter.c
>>>>>>> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>>>>>>>    #ifdef CONFIG_CGROUP_NET_CLASSID
>>>>>>>    BPF_CALL_0(bpf_get_cgroup_classid_curr)
>>>>>>>    {
>>>>>>> -       return __task_get_classid(current);
>>>>>>> +       return task_cls_classid(current);
>>>>>>>    }
>>>>>>
>>>>>> Daniel added this helper in
>>>>>> commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>>>>>> with intention to use it from networking hooks.
>>>>>>
>>>>>> But task_cls_classid() has
>>>>>>           if (in_interrupt())
>>>>>>                   return 0;
>>>>>>
>>>>>> which will trigger in softirq and tc hooks.
>>>>>> So this might break Daniel's use case.
>>>>>
>>>>> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>>>>> a new helper implementation for the more generic, non-networking case which
>>>>> then internally uses task_cls_classid().
>>>>
>>>> Instead of forking the helper I think we can :
>>>> rcu_read_lock_bh_held() || rcu_read_lock_held()
>>>> in task_cls_state().
>>>
>>> I tested your suggestion with,
>>>
>>>    rcu_read_lock_bh_held() || rcu_read_lock_held()
>>>
>>> but it still triggers the RCU warning because BPF syscall programs use
>>> rcu_read_lock_trace().
>>>
>>> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
>>> checkpatch warning:
>>>
>>>    WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code
>>
>> It's safe to ignore checkpatch in this case.
> 
> If that is the case I'll move forward with this. It was my initial fix
> for this[1] anyway, but checkpatch made me change it.

Agree that one is better!

> [1]: https://github.com/charmitro/linux/commit/e5c42d49bfb967c3c35f536971f397492d2f46bf

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-06-10 16:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-08 15:34 [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper Charalampos Mitrodimas
2025-06-09 15:51 ` Alexei Starovoitov
2025-06-10 12:58   ` Daniel Borkmann
2025-06-10 14:56     ` Alexei Starovoitov
2025-06-10 15:23       ` Charalampos Mitrodimas
2025-06-10 15:42         ` Alexei Starovoitov
2025-06-10 15:51           ` Charalampos Mitrodimas
2025-06-10 16:08             ` Daniel Borkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).