* Re:
2023-05-20 9:47 ` Ze Gao
@ 2023-05-21 3:58 ` Yonghong Song
2023-05-21 15:10 ` Re: Ze Gao
2023-05-21 8:08 ` Re: Jiri Olsa
1 sibling, 1 reply; 21+ messages in thread
From: Yonghong Song @ 2023-05-21 3:58 UTC (permalink / raw)
To: Ze Gao, jolsa
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On 5/20/23 2:47 AM, Ze Gao wrote:
>
> Hi Jiri,
>
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run
> itself? And accounting of missed runs can be added as well
> to improve observability.
>
> Regards,
> Ze
>
>
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
>
> From the perspective of kprobe_multi_link_prog_run, any traceable
> function can be attached, while bpf progs need special care and
> ought to run under rcu protection. To resolve the likely rcu lockdep
> warnings once and for all, in case (future) functions in the idle path
> are attached accidentally, we had better pay some cost to check, at
> least on the kernel side, and return when rcu is not watching, which
> helps to avoid unpredictable results.
kprobe_multi/fprobe share the same set of attach points with fentry.
Currently, fentry does not filter on !rcu_is_watching(), maybe
because this is an extreme corner case. I am not sure whether the
check is worthwhile.
If you can give a concrete example (e.g., an attachment point)
on the current code base that shows the issue you encountered,
it will be easier to judge whether adding the !rcu_is_watching()
check is necessary.
>
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
> kernel/trace/bpf_trace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> struct bpf_run_ctx *old_run_ctx;
> int err;
>
> - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> + if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> err = 0;
> goto out;
> }
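The guard proposed in the quoted hunk can be modelled in userspace as follows. This is only a sketch: the model_* names are hypothetical stand-ins for the per-CPU bpf_prog_active counter and for rcu_is_watching(), and it assumes that the out: label undoes the increment on every path.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state; not the kernel API. */
static int model_prog_active;          /* models the per-CPU bpf_prog_active counter */
static bool model_rcu_watching = true; /* models what rcu_is_watching() would report */
static int model_prog_runs;            /* how many times the "program" body ran */

/* Models the guarded entry point; returns true if the program body ran. */
static bool model_link_prog_run(void)
{
	bool ran = false;

	/* Bail out on recursion (counter != 1) or when RCU is not watching. */
	if (++model_prog_active != 1 || !model_rcu_watching)
		goto out;

	model_prog_runs++; /* stand-in for running the BPF program */
	ran = true;
out:
	model_prog_active--; /* assumed: the increment is undone on every path */
	return ran;
}
```

With model_rcu_watching set to false, model_link_prog_run() returns false, leaves model_prog_runs unchanged, and still brings model_prog_active back to zero.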
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-20 9:47 ` Ze Gao
2023-05-21 3:58 ` Yonghong Song
@ 2023-05-21 8:08 ` Jiri Olsa
2023-05-21 10:09 ` Re: Masami Hiramatsu
1 sibling, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2023-05-21 8:08 UTC (permalink / raw)
To: Ze Gao
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
>
> Hi Jiri,
>
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run
I think this was discussed here:
https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
It was considered a bug, and a fix is mentioned later in that thread.
There's also this recent patchset, which solves related problems:
https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> itself? And accounting of missed runs can be added as well
> to improve observability.
right, we count fprobe->nmissed but it's not exposed. We should allow
reading 'missed' stats from both fprobe and kprobe_multi later; that
is missing now, will check
thanks,
jirka
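As a rough illustration of the 'missed' accounting mentioned above, here is a minimal userspace sketch. The struct, its field names, and the accessor are hypothetical and only model the idea of counting skipped invocations and exposing that count; they are not the fprobe or kprobe_multi interface.

```c
#include <stdbool.h>

/* Hypothetical model of a probe with a skip counter; names are illustrative. */
struct model_probe {
	unsigned long nhit;    /* invocations whose handler ran */
	unsigned long nmissed; /* invocations skipped by a guard */
};

/* Record one invocation; can_run stands in for the recursion/RCU guards. */
static void model_probe_fire(struct model_probe *p, bool can_run)
{
	if (!can_run) {
		p->nmissed++;
		return;
	}
	p->nhit++;
}

/* Accessor that a stats interface could call to expose the counter. */
static unsigned long model_probe_missed(const struct model_probe *p)
{
	return p->nmissed;
}
```

Firing such a probe once with can_run true and twice with can_run false leaves nhit at 1 and makes model_probe_missed() return 2.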
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 8:08 ` Re: Jiri Olsa
@ 2023-05-21 10:09 ` Masami Hiramatsu
2023-05-21 14:19 ` Re: Ze Gao
0 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:09 UTC (permalink / raw)
To: Jiri Olsa
Cc: Ze Gao, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
kpsingh, netdev, paulmck, songliubraving, Ze Gao
On Sun, 21 May 2023 10:08:46 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> >
> > Hi Jiri,
> >
> > Would you like to consider to add rcu_is_watching check in
> > to solve this from the viewpoint of kprobe_multi_link_prog_run
>
> I think this was discussed in here:
> https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
>
> and was considered a bug, there's fix mentioned later in the thread
>
> there's also this recent patchset:
> https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
>
> that solves related problems
I think this rcu_is_watching() is a slightly different issue. The rcu_is_watching()
check is required if kprobe_multi_link_prog_run() uses any RCU API.
E.g. rethook_try_get() also checks rcu_is_watching() because it uses
call_rcu().
Thank you,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 10:09 ` Re: Masami Hiramatsu
@ 2023-05-21 14:19 ` Ze Gao
0 siblings, 0 replies; 21+ messages in thread
From: Ze Gao @ 2023-05-21 14:19 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Jiri Olsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, May 21, 2023 at 6:09 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 21 May 2023 10:08:46 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > >
> > > Hi Jiri,
> > >
> > > Would you like to consider to add rcu_is_watching check in
> > > to solve this from the viewpoint of kprobe_multi_link_prog_run
> >
> > I think this was discussed in here:
> > https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> >
> > and was considered a bug, there's fix mentioned later in the thread
> >
> > there's also this recent patchset:
> > https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> >
> > that solves related problems
>
> I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
> check is required if the kprobe_multi_link_prog_run() uses any RCU API.
> E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
> call_rcu().
Yes, that's my point!
Regards,
Ze
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 3:58 ` Yonghong Song
@ 2023-05-21 15:10 ` Ze Gao
2023-05-21 20:26 ` Re: Jiri Olsa
0 siblings, 1 reply; 21+ messages in thread
From: Ze Gao @ 2023-05-21 15:10 UTC (permalink / raw)
To: Yonghong Song
Cc: jolsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
kpsingh, netdev, paulmck, songliubraving, Ze Gao
> kprobe_multi/fprobe share the same set of attachments with fentry.
> Currently, fentry does not filter with !rcu_is_watching, maybe
> because this is an extreme corner case. Not sure whether it is
> worthwhile or not.
Agreed, it's rare, especially after Peter's patches, which narrow
down the RCU EQS (extended quiescent state) regions in the idle path
and reduce the chance of any traceable function running in between.
However, from RCU's perspective, we ought in theory to check
rcu_is_watching() whenever there's a chance our code will run in the
idle path and we also need RCU to be alive. We also cannot simply
make assumptions about future changes in the idle path, just like
what was hit in the thread.
> Maybe if you can give a concrete example (e.g., attachment point)
> with current code base to show what the issue you encountered and
> it will make it easier to judge whether adding !rcu_is_watching()
> is necessary or not.
I can reproduce the likely warnings on v6.1.18, where arch_cpu_idle is
traceable, but so far not on the latest version. But as I stated
above, in theory we need it. So here is a gentle ping :) .
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 15:10 ` Re: Ze Gao
@ 2023-05-21 20:26 ` Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
0 siblings, 2 replies; 21+ messages in thread
From: Jiri Olsa @ 2023-05-21 20:26 UTC (permalink / raw)
To: Ze Gao
Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > kprobe_multi/fprobe share the same set of attachments with fentry.
> > Currently, fentry does not filter with !rcu_is_watching, maybe
> > because this is an extreme corner case. Not sure whether it is
> > worthwhile or not.
>
> Agreed, it's rare, especially after Peter's patches which push narrow
> down rcu eqs regions
> in the idle path and reduce the chance of any traceable functions
> happening in between.
>
> However, from RCU's perspective, we ought to check if rcu_is_watching
> theoretically
> when there's a chance our code will run in the idle path and also we
> need rcu to be alive,
> And also we cannot simply make assumptions for any future changes in
> the idle path.
> You know, just like what was hit in the thread.
>
> > Maybe if you can give a concrete example (e.g., attachment point)
> > with current code base to show what the issue you encountered and
> > it will make it easier to judge whether adding !rcu_is_watching()
> > is necessary or not.
>
> I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> traceable but not on the latest version
> so far. But as I state above, in theory we need it. So here is a
> gentle ping :) .
hum, this change [1] added an rcu_is_watching check to ftrace_test_recursion_trylock,
which we use in fprobe_handler and which is coming to fprobe_exit_handler in [2]
I might be missing something, but it seems like we don't need another
rcu_is_watching call at the kprobe_multi level
jirka
[1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
[2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/
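The layering argument above can be sketched in userspace like this. All model_* names are hypothetical; the sketch only assumes, as stated in this message, that the recursion trylock already refuses to proceed when RCU is not watching, so a callback that runs under it needs no second check.

```c
#include <stdbool.h>

static bool model_rcu_watching = true; /* stand-in for rcu_is_watching() */
static bool model_in_handler;          /* stand-in for the recursion bit */
static int model_handler_calls;        /* counts how often the callback ran */

/* Models a recursion trylock that also bails out when RCU is not watching.
 * Returns 0 on success, -1 if the caller must not proceed. */
static int model_recursion_trylock(void)
{
	if (!model_rcu_watching || model_in_handler)
		return -1;
	model_in_handler = true;
	return 0;
}

static void model_recursion_unlock(void)
{
	model_in_handler = false;
}

/* Models an fprobe-style handler: the callback only runs under the lock. */
static void model_fprobe_handler(void)
{
	if (model_recursion_trylock() < 0)
		return;
	model_handler_calls++; /* callback body: no second RCU check needed */
	model_recursion_unlock();
}
```

In this model, once model_rcu_watching is false the handler returns early and the callback count stays unchanged, which is the behaviour the extra check at the kprobe_multi level would have duplicated.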
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 20:26 ` Re: Jiri Olsa
@ 2023-05-22 1:36 ` Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
1 sibling, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2023-05-22 1:36 UTC (permalink / raw)
To: Jiri Olsa
Cc: Ze Gao, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, 21 May 2023 22:26:37 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > > kprobe_multi/fprobe share the same set of attachments with fentry.
> > > Currently, fentry does not filter with !rcu_is_watching, maybe
> > > because this is an extreme corner case. Not sure whether it is
> > > worthwhile or not.
> >
> > Agreed, it's rare, especially after Peter's patches which push narrow
> > down rcu eqs regions
> > in the idle path and reduce the chance of any traceable functions
> > happening in between.
> >
> > However, from RCU's perspective, we ought to check if rcu_is_watching
> > theoretically
> > when there's a chance our code will run in the idle path and also we
> > need rcu to be alive,
> > And also we cannot simply make assumptions for any future changes in
> > the idle path.
> > You know, just like what was hit in the thread.
> >
> > > Maybe if you can give a concrete example (e.g., attachment point)
> > > with current code base to show what the issue you encountered and
> > > it will make it easier to judge whether adding !rcu_is_watching()
> > > is necessary or not.
> >
> > I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> > traceable but not on the latest version
> > so far. But as I state above, in theory we need it. So here is a
> > gentle ping :) .
>
> hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
> which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
>
> I might be missing something, but it seems like we don't need another
> rcu_is_watching call on kprobe_multi level
Good point! OK, then it seems we don't need it. rethook continues to
check rcu_is_watching() because it is also used from kprobes, but
kprobe_multi doesn't need it.
Thank you,
>
> jirka
>
>
> [1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
> [2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-21 20:26 ` Re: Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
@ 2023-05-22 2:07 ` Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
1 sibling, 2 replies; 21+ messages in thread
From: Ze Gao @ 2023-05-22 2:07 UTC (permalink / raw)
To: Jiri Olsa
Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
Oops, I missed that. Thanks for pointing that out; I had thought
rcu_is_watching was only used conditionally there.
One last point: I think we should double-check this
"fentry does not filter with !rcu_is_watching"
as quoted from Yonghong, and discuss whether fentry needs
the same check as well.
Regards,
Ze
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-22 2:07 ` Re: Ze Gao
@ 2023-05-23 4:38 ` Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
1 sibling, 0 replies; 21+ messages in thread
From: Yonghong Song @ 2023-05-23 4:38 UTC (permalink / raw)
To: Ze Gao, Jiri Olsa
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On 5/21/23 7:07 PM, Ze Gao wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
>
> One last point, I think we should double check on this
> "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.
I would suggest that we address the rcu_is_watching issue for fentry
only if we have a reproducible case showing that something goes wrong...
>
> Regards,
> Ze
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-22 2:07 ` Re: Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
@ 2023-05-23 5:30 ` Masami Hiramatsu
2023-05-23 6:59 ` Re: Paul E. McKenney
1 sibling, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2023-05-23 5:30 UTC (permalink / raw)
To: Ze Gao
Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
>
> One last point, I think we should double check on this
> "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.
The rcu_is_watching() comment says:
 * if the current CPU is not in its idle loop or is in an interrupt or
 * NMI handler, return true.
Thus it returns *fault* if the current CPU is in the idle loop and not
in any interrupt (including NMI) context. This means that if any traceable
function is called from the idle loop, it can see !rcu_is_watching(). I mean,
this is a 'context'-based check, thus fentry cannot filter out the case where
some commonly used function is called from that context, but that case can be
detected.
Thank you,
>
> Regards,
> Ze
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2023-05-23 5:30 ` Re: Masami Hiramatsu
@ 2023-05-23 6:59 ` Paul E. McKenney
2023-05-25 0:13 ` Re: Masami Hiramatsu
0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2023-05-23 6:59 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
Ze Gao
On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
>
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> >
> > One last point, I think we should double check on this
> > "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
>
> rcu_is_watching() comment says;
>
> * if the current CPU is not in its idle loop or is in an interrupt or
> * NMI handler, return true.
>
> Thus it returns *fault* if the current CPU is in the idle loop and not
> any interrupt(including NMI) context. This means if any tracable function
> is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> 'context' based check, thus fentry can not filter out that some commonly
> used functions is called from that context but it can be detected.
It really does return false (rather than faulting?) if the current CPU
is deep within the idle loop.
In addition, the recent x86/entry rework (thank you Peter and
Thomas!) means that the "idle loop" is quite restricted, as can be
seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
For example, in default_idle_call(), these are immediately before and
after the call to arch_cpu_idle().
Would the following help? Or am I missing your point?
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1449cb69a0e0..fae9b4e29c93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
/**
* rcu_is_watching - see if RCU thinks that the current CPU is not idle
*
- * Return true if RCU is watching the running CPU, which means that this
- * CPU can safely enter RCU read-side critical sections. In other words,
- * if the current CPU is not in its idle loop or is in an interrupt or
- * NMI handler, return true.
+ * Return @true if RCU is watching the running CPU and @false otherwise.
+ * An @true return means that this CPU can safely enter RCU read-side
+ * critical sections.
+ *
+ * More specifically, if the current CPU is not deep within its idle
+ * loop, return @true. Note that rcu_is_watching() will return @true if
+ * invoked from an interrupt or NMI handler, even if that interrupt or
+ * NMI interrupted the CPU while it was deep within its idle loop.
*
* Make notrace because it can be called by the internal functions of
* ftrace, and making this notrace removes unnecessary recursion calls.
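The behaviour described by the revised comment can be written down as a small truth-table model. This is a hypothetical sketch of the documented semantics only, not the kernel implementation; the function and parameter names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the documented behaviour of rcu_is_watching(). */
static bool model_rcu_is_watching(bool deep_in_idle, bool in_irq_or_nmi)
{
	/* Watching unless the CPU is deep in idle and no interrupt/NMI is running. */
	return !deep_in_idle || in_irq_or_nmi;
}
```

The only combination for which the model returns false is deep_in_idle true with in_irq_or_nmi false, matching the note that an interrupt or NMI taken deep within the idle loop still sees RCU watching.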
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re:
2023-05-23 6:59 ` Re: Paul E. McKenney
@ 2023-05-25 0:13 ` Masami Hiramatsu
0 siblings, 0 replies; 21+ messages in thread
From: Masami Hiramatsu @ 2023-05-25 0:13 UTC (permalink / raw)
To: paulmck
Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
Ze Gao
On Mon, 22 May 2023 23:59:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:
> On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> > On Mon, 22 May 2023 10:07:42 +0800
> > Ze Gao <zegao2021@gmail.com> wrote:
> >
> > > Oops, I missed that. Thanks for pointing that out, which I thought is
> > > conditional use of rcu_is_watching before.
> > >
> > > One last point, I think we should double check on this
> > > "fentry does not filter with !rcu_is_watching"
> > > as quoted from Yonghong and argue whether it needs
> > > the same check for fentry as well.
> >
> > rcu_is_watching() comment says;
> >
> > * if the current CPU is not in its idle loop or is in an interrupt or
> > * NMI handler, return true.
> >
> > Thus it returns *fault* if the current CPU is in the idle loop and not
> > any interrupt(including NMI) context. This means if any tracable function
> > is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> > 'context' based check, thus fentry can not filter out that some commonly
> > used functions is called from that context but it can be detected.
>
> It really does return false (rather than faulting?) if the current CPU
> is deep within the idle loop.
>
> In addition, the recent x86/entry rework (thank you Peter and
> Thomas!) mean that the "idle loop" is quite restricted, as can be
> seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
> For example, in default_idle_call(), these are immediately before and
> after the call to arch_cpu_idle().
Thanks! I also found that default_idle_call() is small enough that
this does not seem to happen with fentry, because there are no commonly
used functions on that path.
>
> Would the following help? Or am I missing your point?
Yes, thank you for the update!
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2024-06-26 6:11 Totoro W
@ 2024-06-26 7:09 ` Eduard Zingerman
0 siblings, 0 replies; 21+ messages in thread
From: Eduard Zingerman @ 2024-06-26 7:09 UTC (permalink / raw)
To: Totoro W, bpf
On Wed, 2024-06-26 at 14:11 +0800, Totoro W wrote:
> Hi folks,
>
> This is my first time to ask questions in this mailing list. I'm the
> author of https://github.com/tw4452852/zbpf which is a framework to
> write BPF programs with Zig toolchain.
> During the development, as the BTF is totally generated by the Zig
> toolchain, some naming conventions will make the BTF verifier refuse
> to load.
> Right now I have to patch the libbpf to do some fixup before loading
> into the kernel
> (https://github.com/tw4452852/libbpf_zig/blob/main/0001-temporary-WA-for-invalid-BTF-info-generated-by-Zig.patch).
> + // https://github.com/tw4452852/zbpf/issues/3
> + else if (btf_is_ptr(t)) {
> + t->name_off = 0;
As far as I understand, you control BTF generation, so why generate names
for pointers in the first place?
> Even though this just work-around the issue, I'm still curious about
> the current naming sanitation, I want to know some background about
> it.
Doing some git digging shows that the name check was first introduced by
the following commit:
2667a2626f4d ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO")
and it has stayed that way since.
My guess is that kernel BTF is used to work with kernel functions and
data structures. All of which follow C naming convention.
> If possible, could we relax this to accept more languages (like Zig)
> to write BPF programs? Thanks in advance.
Could you please elaborate a bit?
Citation from [1]:
Identifiers must start with an alphabetic character or underscore
and may be followed by any number of alphanumeric characters or
underscores. They must not overlap with any keywords.
If a name that does not fit these requirements is needed, such as
for linking with external libraries, the @"" syntax may be used.
Paragraph 1 matches C naming convention and should be accepted by
kernel/bpf/btf.c:btf_name_valid_identifier().
Paragraph 2 is basically any string.
Which one do you want?
[1] https://ziglang.org/documentation/master/#Identifiers
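For reference, the rule in paragraph 1 of the citation can be sketched as below. This is not the kernel's btf_name_valid_identifier(); sketch_is_c_identifier() is a hypothetical helper that only models the quoted identifier rule and does not check for keywords.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the C-style identifier rule from the quoted paragraph 1. */
static bool sketch_is_c_identifier(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return false;
	/* First character: alphabetic or underscore. */
	if (!isalpha((unsigned char)name[0]) && name[0] != '_')
		return false;
	/* Remaining characters: alphanumeric or underscore. */
	for (i = 1; name[i] != '\0'; i++) {
		if (!isalnum((unsigned char)name[i]) && name[i] != '_')
			return false;
	}
	return true;
}
```

Under this rule a name such as "foo_bar1" passes, while a name beginning with a digit or containing punctuation (the kind of string the @"" syntax permits) does not.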
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re:
2025-04-18 7:46 Shung-Hsi Yu
@ 2025-04-18 7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
1 sibling, 0 replies; 21+ messages in thread
From: Shung-Hsi Yu @ 2025-04-18 7:49 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Kumar Kartikeya Dwivedi, Dan Carpenter
On Fri, Apr 18, 2025 at 3:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
> bpf_raw_tp_null_args.mask index
...
Email headers are off, hence no subject. Will resend.
* Re:
2025-04-22 8:04 ` Feng Yang
@ 2025-04-22 14:37 ` Alexei Starovoitov
0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2025-04-22 14:37 UTC (permalink / raw)
To: Feng Yang
Cc: Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eduard,
LKML, linux-trace-kernel, Martin KaFai Lau, Network Development,
Song Liu, Feng Yang, Yonghong Song
On Tue, Apr 22, 2025 at 1:04 AM Feng Yang <yangfeng59949@163.com> wrote:
>
> Subject: Re: [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto
>
> On Mon, 21 Apr 2025 18:53:07 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Thu, Apr 17, 2025 at 8:41 PM Feng Yang <yangfeng59949@163.com> wrote:
> > >
> > > From: Feng Yang <yangfeng@kylinos.cn>
> > >
> > > All BPF programs either disable CPU preemption or CPU migration,
> > > so the bpf_get_smp_processor_id_proto can be safely removed,
> > > and the bpf_get_raw_smp_processor_id_proto in bpf_base_func_proto works perfectly.
> > >
> > > Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
> > > ---
> > > include/linux/bpf.h | 1 -
> > > kernel/bpf/core.c | 1 -
> > > kernel/bpf/helpers.c | 12 ------------
> > > kernel/trace/bpf_trace.c | 2 --
> > > net/core/filter.c | 6 ------
> > > 5 files changed, 22 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 3f0cc89c0622..36e525141556 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -3316,7 +3316,6 @@ extern const struct bpf_func_proto bpf_map_peek_elem_proto;
> > > extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
> > >
> > > extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
> > > -extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
> > > extern const struct bpf_func_proto bpf_get_numa_node_id_proto;
> > > extern const struct bpf_func_proto bpf_tail_call_proto;
> > > extern const struct bpf_func_proto bpf_ktime_get_ns_proto;
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index ba6b6118cf50..1ad41a16b86e 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -2943,7 +2943,6 @@ const struct bpf_func_proto bpf_spin_unlock_proto __weak;
> > > const struct bpf_func_proto bpf_jiffies64_proto __weak;
> > >
> > > const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
> > > const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
> > > const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
> > > const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index e3a2662f4e33..2d2bfb2911f8 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -149,18 +149,6 @@ const struct bpf_func_proto bpf_get_prandom_u32_proto = {
> > > .ret_type = RET_INTEGER,
> > > };
> > >
> > > -BPF_CALL_0(bpf_get_smp_processor_id)
> > > -{
> > > - return smp_processor_id();
> > > -}
> > > -
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > > - .func = bpf_get_smp_processor_id,
> > > - .gpl_only = false,
> > > - .ret_type = RET_INTEGER,
> > > - .allow_fastcall = true,
> > > -};
> > > -
> >
> > bpf_get_raw_smp_processor_id_proto doesn't have
> > allow_fastcall = true
> >
> > so this breaks tests.
> >
> > Instead of removing BPF_CALL_0(bpf_get_smp_processor_id)
> > we should probably remove BPF_CALL_0(bpf_get_raw_cpu_id)
> > and adjust SKF_AD_OFF + SKF_AD_CPU case.
> > I don't recall why raw_ version was used back in 2014.
> >
>
> The following two seem to explain the reason:
> https://lore.kernel.org/all/7103e2085afa29c006cd5b94a6e4a2ac83efc30d.1467106475.git.daniel@iogearbox.net/
> https://lore.kernel.org/all/02fa71ebe1c560cad489967aa29c653b48932596.1474586162.git.daniel@iogearbox.net/
>
Ahh. Socket filters run in an RCU critical section. They don't disable
preemption or migration.
Then let's keep things as-is.
We still want the debugging provided by smp_processor_id().
If we switch everything to raw_ we may miss things, like this example with
socket filters.
* Re:
2025-04-18 7:46 Shung-Hsi Yu
2025-04-18 7:49 ` Shung-Hsi Yu
@ 2025-04-23 17:30 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 21+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-04-23 17:30 UTC (permalink / raw)
To: Shung-Hsi Yu
Cc: bpf, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
memxor, dan.carpenter
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:
On Fri, 18 Apr 2025 15:46:31 +0800 you wrote:
> From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
> bpf_raw_tp_null_args.mask index
>
> The calculation of the index used to access the mask field in 'struct
> bpf_raw_tp_null_args' is done with 'int' type, which could overflow when
> the tracepoint being attached has more than 8 arguments.
>
> [...]
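As a rough illustration of the overflow described in the commit message
quoted above, the sketch below assumes each tracepoint argument owns a
4-bit field in a 64-bit mask, so argument index 8 needs bit 32. The
stride, the helper name null_arg_bit() and the sample mask are
assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 4 mask bits per tracepoint argument. */
#define BITS_PER_ARG 4

/*
 * Hypothetical helper: test the lowest bit of the field for argument arg.
 * Shifting a 64-bit one keeps the result well defined for arg >= 8,
 * whereas shifting a 32-bit int by 32 or more is undefined in C.
 */
bool null_arg_bit(uint64_t mask, unsigned int arg)
{
	unsigned int shift = arg * BITS_PER_ARG;

	if (shift >= 64)
		return false;	/* outside the 64-bit mask */
	return mask & (UINT64_C(1) << shift);
}

/* Usage example with a made-up mask. */
void example_usage(void)
{
	uint64_t mask = UINT64_C(1) << 32;	/* marks argument index 8 */

	assert(null_arg_bit(mask, 8));
	assert(!null_arg_bit(mask, 0));
}
```

With a plain int constant, the same expression would need a shift of 32
for argument index 8, which is where the calculation stops being
reliable.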
Here is the summary with links:
-
https://git.kernel.org/bpf/bpf-next/c/53ebef53a657
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* (no subject)
@ 2025-04-24 0:40 Cong Wang
2025-04-24 0:59 ` Jiayuan Chen
0 siblings, 1 reply; 21+ messages in thread
From: Cong Wang @ 2025-04-24 0:40 UTC (permalink / raw)
To: jiayuan.chen; +Cc: john.fastabend, jakub, netdev, bpf
netdev@vger.kernel.org, bpf@vger.kernel.org
Bcc:
Subject: test_sockmap failures on the latest bpf-next
Reply-To:
Hi all,
The latest bpf-next failed on test_sockmap tests, I got the following
failures (including 1 kernel warning). It is 100% reproducible here.
I don't have time to look into them, a quick glance at the changelog
shows quite some changes from Jiayuan. So please take a look, Jiayuan.
Meanwhile, please let me know if you need more information from me.
Thanks!
--------------->
[root@localhost bpf]# ./test_sockmap
# 1/ 6 sockmap::txmsg test passthrough:OK
# 2/ 6 sockmap::txmsg test redirect:OK
# 3/ 2 sockmap::txmsg test redirect wait send mem:OK
# 4/ 6 sockmap::txmsg test drop:OK
[ 182.498017] perf: interrupt took too long (3406 > 3238), lowering kernel.perf_event_max_sample_rate to 58500
# 5/ 6 sockmap::txmsg test ingress redirect:OK
# 6/ 7 sockmap::txmsg test skb:OK
# 7/12 sockmap::txmsg test apply:OK
# 8/12 sockmap::txmsg test cork:OK
# 9/ 3 sockmap::txmsg test hanging corks:OK
#10/11 sockmap::txmsg test push_data:OK
#11/17 sockmap::txmsg test pull-data:OK
#12/ 9 sockmap::txmsg test pop-data:OK
#13/ 6 sockmap::txmsg test push/pop data:OK
#14/ 1 sockmap::txmsg test ingress parser:OK
#15/ 1 sockmap::txmsg test ingress parser2:OK
#16/ 6 sockhash::txmsg test passthrough:OK
#17/ 6 sockhash::txmsg test redirect:OK
#18/ 2 sockhash::txmsg test redirect wait send mem:OK
#19/ 6 sockhash::txmsg test drop:OK
#20/ 6 sockhash::txmsg test ingress redirect:OK
#21/ 7 sockhash::txmsg test skb:OK
#22/12 sockhash::txmsg test apply:OK
#23/12 sockhash::txmsg test cork:OK
#24/ 3 sockhash::txmsg test hanging corks:OK
#25/11 sockhash::txmsg test push_data:OK
#26/17 sockhash::txmsg test pull-data:OK
#27/ 9 sockhash::txmsg test pop-data:OK
#28/ 6 sockhash::txmsg test push/pop data:OK
#29/ 1 sockhash::txmsg test ingress parser:OK
#30/ 1 sockhash::txmsg test ingress parser2:OK
#31/ 6 sockhash:ktls:txmsg test passthrough:OK
#32/ 6 sockhash:ktls:txmsg test redirect:OK
#33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK
[ 263.509707] ------------[ cut here ]------------
[ 263.510439] WARNING: CPU: 1 PID: 40 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x173/0x1d5
[ 263.511450] CPU: 1 UID: 0 PID: 40 Comm: kworker/1:1 Tainted: G W 6.15.0-rc3+ #238 PREEMPT(voluntary)
[ 263.512683] Tainted: [W]=WARN
[ 263.513062] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 263.514763] Workqueue: events sk_psock_destroy
[ 263.515332] RIP: 0010:inet_sock_destruct+0x173/0x1d5
[ 263.515916] Code: e8 dc dc 3f ff 41 83 bc 24 c0 02 00 00 00 74 02 0f 0b 49 8d bc 24 ac 02 00 00 e8 c2 dc 3f ff 41 83 bc 24 ac 02 00 00 00 74 02 <0f> 0b e8 c7 95 3d 00 49 8d bc 24 b0 05 00 00 e8 c0 dd 3f ff 49 8b
[ 263.518899] RSP: 0018:ffff8880085cfc18 EFLAGS: 00010202
[ 263.519596] RAX: 1ffff11003dbfc00 RBX: ffff88801edfe3e8 RCX: ffffffff822f5af4
[ 263.520502] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88801edfe16c
[ 263.522128] RBP: ffff88801edfe184 R08: ffffed1003dbfc31 R09: 0000000000000000
[ 263.523008] R10: ffffffff822f5ab7 R11: ffff88801edfe187 R12: ffff88801edfdec0
[ 263.523822] R13: ffff888020376ac0 R14: ffff888020376ac0 R15: ffff888020376a60
[ 263.524682] FS: 0000000000000000(0000) GS:ffff8880b0e88000(0000) knlGS:0000000000000000
[ 263.525999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 263.526765] CR2: 0000556365155830 CR3: 000000001d6aa000 CR4: 0000000000350ef0
[ 263.527700] Call Trace:
[ 263.528037] <TASK>
[ 263.528339] __sk_destruct+0x46/0x222
[ 263.528856] sk_psock_destroy+0x22f/0x242
[ 263.529471] process_one_work+0x504/0x8a8
[ 263.530029] ? process_one_work+0x39d/0x8a8
[ 263.530587] ? __pfx_process_one_work+0x10/0x10
[ 263.531195] ? worker_thread+0x44/0x2ae
[ 263.531721] ? __list_add_valid_or_report+0x83/0xea
[ 263.532395] ? srso_return_thunk+0x5/0x5f
[ 263.532929] ? __list_add+0x45/0x52
[ 263.533482] process_scheduled_works+0x73/0x82
[ 263.534079] worker_thread+0x1ce/0x2ae
[ 263.534582] ? _raw_spin_unlock_irqrestore+0x2e/0x44
[ 263.535243] ? __pfx_worker_thread+0x10/0x10
[ 263.535822] kthread+0x32a/0x33c
[ 263.536278] ? kthread+0x13c/0x33c
[ 263.536724] ? __pfx_kthread+0x10/0x10
[ 263.537225] ? srso_return_thunk+0x5/0x5f
[ 263.537869] ? find_held_lock+0x2b/0x75
[ 263.538388] ? __pfx_kthread+0x10/0x10
[ 263.538866] ? srso_return_thunk+0x5/0x5f
[ 263.539523] ? local_clock_noinstr+0x32/0x9c
[ 263.540128] ? srso_return_thunk+0x5/0x5f
[ 263.540677] ? srso_return_thunk+0x5/0x5f
[ 263.541228] ? __lock_release+0xd3/0x1ad
[ 263.541890] ? srso_return_thunk+0x5/0x5f
[ 263.542442] ? tracer_hardirqs_on+0x17/0x149
[ 263.543047] ? _raw_spin_unlock_irq+0x24/0x39
[ 263.543589] ? __pfx_kthread+0x10/0x10
[ 263.544069] ? __pfx_kthread+0x10/0x10
[ 263.544543] ret_from_fork+0x21/0x41
[ 263.545000] ? __pfx_kthread+0x10/0x10
[ 263.545557] ret_from_fork_asm+0x1a/0x30
[ 263.546095] </TASK>
[ 263.546374] irq event stamp: 1094079
[ 263.546798] hardirqs last enabled at (1094089): [<ffffffff813be0f6>] __up_console_sem+0x47/0x4e
[ 263.547762] hardirqs last disabled at (1094098): [<ffffffff813be0d6>] __up_console_sem+0x27/0x4e
[ 263.548817] softirqs last enabled at (1093692): [<ffffffff812f2906>] handle_softirqs+0x48c/0x4de
[ 263.550127] softirqs last disabled at (1094117): [<ffffffff812f29b3>] __irq_exit_rcu+0x4b/0xc3
[ 263.551104] ---[ end trace 0000000000000000 ]---
#34/ 6 sockhash:ktls:txmsg test drop:OK
#35/ 6 sockhash:ktls:txmsg test ingress redirect:OK
#36/ 7 sockhash:ktls:txmsg test skb:OK
#37/12 sockhash:ktls:txmsg test apply:OK
[ 278.915147] perf: interrupt took too long (4331 > 4257), lowering kernel.perf_event_max_sample_rate to 46000
[ 282.974989] test_sockmap (1077) used greatest stack depth: 25072 bytes left
#38/12 sockhash:ktls:txmsg test cork:OK
#39/ 3 sockhash:ktls:txmsg test hanging corks:OK
#40/11 sockhash:ktls:txmsg test push_data:OK
#41/17 sockhash:ktls:txmsg test pull-data:OK
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Bad message
rx thread exited with err 1.
#42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
recv failed(): Bad message
rx thread exited with err 1.
recv failed(): Message too long
rx thread exited with err 1.
#43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
#44/ 1 sockhash:ktls:txmsg test ingress parser:OK
#45/ 0 sockhash:ktls:txmsg test ingress parser2:OK
Pass: 43 Fail: 5
* Re:
2025-04-24 0:40 Cong Wang
@ 2025-04-24 0:59 ` Jiayuan Chen
2025-04-24 9:19 ` Re: Jiayuan Chen
0 siblings, 1 reply; 21+ messages in thread
From: Jiayuan Chen @ 2025-04-24 0:59 UTC (permalink / raw)
To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf
April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>
> netdev@vger.kernel.org, bpf@vger.kernel.org
> Bcc:
> Subject: test_sockmap failures on the latest bpf-next
> Reply-To:
>
> Hi all,
>
> The latest bpf-next failed on test_sockmap tests, I got the following
> failures (including 1 kernel warning). It is 100% reproducible here.
> I don't have time to look into them, a quick glance at the changelog
> shows quite some changes from Jiayuan. So please take a look, Jiayuan.
> Meanwhile, please let me know if you need more information from me.
>
> Thanks!
>
> --------------->
Thanks, I'm working on it.
> [...]
* Re:
2025-04-24 0:59 ` Jiayuan Chen
@ 2025-04-24 9:19 ` Jiayuan Chen
2025-05-19 0:30 ` test_sockmap failures on the latest bpf-next Cong Wang
0 siblings, 1 reply; 21+ messages in thread
From: Jiayuan Chen @ 2025-04-24 9:19 UTC (permalink / raw)
To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf
April 24, 2025 at 08:59, "Jiayuan Chen" <jiayuan.chen@linux.dev> wrote:
>
> April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
> >
> > netdev@vger.kernel.org, bpf@vger.kernel.org
> > Bcc:
> > Subject: test_sockmap failures on the latest bpf-next
> > Reply-To:
> >
> > Hi all,
> >
> > The latest bpf-next failed on test_sockmap tests, I got the following
> > failures (including 1 kernel warning). It is 100% reproducible here.
> > I don't have time to look into them, a quick glance at the changelog
> > shows quite some changes from Jiayuan. So please take a look, Jiayuan.
> > Meanwhile, please let me know if you need more information from me.
> >
> > Thanks!
> >
> > --------------->
>
> Thanks, I'm working on it.
>
After resetting to commit 0bb2f7a1ad1f, which predates my changes, the warning still exists.
The warning originates from test_txmsg_redir_wait_sndmem(), which performs
'KTLS + sockmap with redir EGRESS and limited receive buffer'.
The memory charge/uncharge logic is problematic; I need some time to investigate and fix it.
Thanks.
* Re: test_sockmap failures on the latest bpf-next
2025-04-24 9:19 ` Re: Jiayuan Chen
@ 2025-05-19 0:30 ` Cong Wang
0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2025-05-19 0:30 UTC (permalink / raw)
To: Jiayuan Chen
Cc: John Fastabend, Jakub Sitnicki, Linux Kernel Network Developers,
bpf
Hi Jiayuan,
Thanks for fixing the kernel warning. However, it looks like the
sockmap selftests still fail on the latest bpf-next; see below.
Do you mind taking a look? ;-)
[root@localhost bpf]# ./test_sockmap
# 1/ 6 sockmap::txmsg test passthrough:OK
# 2/ 6 sockmap::txmsg test redirect:OK
# 3/ 2 sockmap::txmsg test redirect wait send mem:OK
# 4/ 6 sockmap::txmsg test drop:OK
# 5/ 6 sockmap::txmsg test ingress redirect:OK
# 6/ 7 sockmap::txmsg test skb:OK
# 7/12 sockmap::txmsg test apply:OK
# 8/12 sockmap::txmsg test cork:OK
[ 347.904830] test_sockmap (676) used greatest stack depth: 24712 bytes left
# 9/ 3 sockmap::txmsg test hanging corks:OK
#10/11 sockmap::txmsg test push_data:OK
#11/17 sockmap::txmsg test pull-data:OK
#12/ 9 sockmap::txmsg test pop-data:OK
#13/ 6 sockmap::txmsg test push/pop data:OK
#14/ 1 sockmap::txmsg test ingress parser:OK
#15/ 1 sockmap::txmsg test ingress parser2:OK
#16/ 6 sockhash::txmsg test passthrough:OK
#17/ 6 sockhash::txmsg test redirect:OK
#18/ 2 sockhash::txmsg test redirect wait send mem:OK
#19/ 6 sockhash::txmsg test drop:OK
#20/ 6 sockhash::txmsg test ingress redirect:OK
#21/ 7 sockhash::txmsg test skb:OK
#22/12 sockhash::txmsg test apply:OK
#23/12 sockhash::txmsg test cork:OK
#24/ 3 sockhash::txmsg test hanging corks:OK
#25/11 sockhash::txmsg test push_data:OK
#26/17 sockhash::txmsg test pull-data:OK
#27/ 9 sockhash::txmsg test pop-data:OK
#28/ 6 sockhash::txmsg test push/pop data:OK
#29/ 1 sockhash::txmsg test ingress parser:OK
#30/ 1 sockhash::txmsg test ingress parser2:OK
#31/ 6 sockhash:ktls:txmsg test passthrough:OK
[ 408.408666] perf: interrupt took too long (12003 > 12002), lowering kernel.perf_event_max_sample_rate to 16500
#32/ 6 sockhash:ktls:txmsg test redirect:OK
#33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK
#34/ 6 sockhash:ktls:txmsg test drop:OK
#35/ 6 sockhash:ktls:txmsg test ingress redirect:OK
#36/ 7 sockhash:ktls:txmsg test skb:OK
#37/12 sockhash:ktls:txmsg test apply:OK
#38/12 sockhash:ktls:txmsg test cork:OK
#39/ 3 sockhash:ktls:txmsg test hanging corks:OK
#40/11 sockhash:ktls:txmsg test push_data:OK
#41/17 sockhash:ktls:txmsg test pull-data:OK
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Bad message
rx thread exited with err 1.
#42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
recv failed(): Bad message
rx thread exited with err 1.
recv failed(): Bad message
rx thread exited with err 1.
#43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
#44/ 1 sockhash:ktls:txmsg test ingress parser:OK
#45/ 0 sockhash:ktls:txmsg test ingress parser2:OK
Pass: 43 Fail: 5
Thanks!
Cong
Thread overview: 21+ messages
2025-04-24 0:40 Cong Wang
2025-04-24 0:59 ` Jiayuan Chen
2025-04-24 9:19 ` Re: Jiayuan Chen
2025-05-19 0:30 ` test_sockmap failures on the latest bpf-next Cong Wang
-- strict thread matches above, loose matches on Subject: below --
2025-04-22 1:53 [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto Alexei Starovoitov
2025-04-22 8:04 ` Feng Yang
2025-04-22 14:37 ` Alexei Starovoitov
2025-04-18 7:46 Shung-Hsi Yu
2025-04-18 7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
2024-06-26 6:11 Totoro W
2024-06-26 7:09 ` Eduard Zingerman
2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
2023-05-20 9:47 ` Ze Gao
2023-05-21 3:58 ` Yonghong Song
2023-05-21 15:10 ` Re: Ze Gao
2023-05-21 20:26 ` Re: Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
2023-05-23 6:59 ` Re: Paul E. McKenney
2023-05-25 0:13 ` Re: Masami Hiramatsu
2023-05-21 8:08 ` Re: Jiri Olsa
2023-05-21 10:09 ` Re: Masami Hiramatsu
2023-05-21 14:19 ` Re: Ze Gao
2022-03-04 8:47 Re: Harald Hauge