* [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
@ 2024-07-13 4:46 Kyle Huey
2024-07-13 20:32 ` Jiri Olsa
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-13 4:46 UTC (permalink / raw)
To: khuey, Ingo Molnar, Namhyung Kim, Jiri Olsa, Linus Torvalds
Cc: robert, Joe Damato, Peter Zijlstra, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
Liang, Kan, Andrii Nakryiko, Kyle Huey, Song Liu,
linux-perf-users, linux-kernel, bpf
The regressing commit is new in 6.10. It assumed that anytime event->prog
is set bpf_overflow_handler() should be invoked to execute the attached bpf
program. This assumption is false for tracing events, and as a result the
regressing commit broke bpftrace by invoking the bpf handler with garbage
inputs on overflow.
Prior to the regression the overflow handlers formed a chain (of length 0,
1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added
bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog()
(the tracing case) did not. Both set event->prog. The chain of overflow
handlers was replaced by a single overflow handler slot and a fixed call to
bpf_overflow_handler() when appropriate. This modifies the condition there
to include !perf_event_is_tracing(), restoring the previous behavior and
fixing bpftrace.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reported-by: Joe Damato <jdamato@fastly.com>
Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery")
Tested-by: Joe Damato <jdamato@fastly.com> # bpftrace
Tested-by: Kyle Huey <khuey@kylehuey.com> # bpf overflow handlers
---
kernel/events/core.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f908f077935..f0d7119585dc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9666,6 +9666,8 @@ static inline void perf_event_free_bpf_handler(struct perf_event *event)
* Generic event overflow handling, sampling.
*/
+static bool perf_event_is_tracing(struct perf_event *event);
+
static int __perf_event_overflow(struct perf_event *event,
int throttle, struct perf_sample_data *data,
struct pt_regs *regs)
@@ -9682,7 +9684,9 @@ static int __perf_event_overflow(struct perf_event *event,
ret = __perf_event_account_interrupt(event, throttle);
- if (event->prog && !bpf_overflow_handler(event, data, regs))
+ if (event->prog &&
+ !perf_event_is_tracing(event) &&
+ !bpf_overflow_handler(event, data, regs))
return ret;
/*
@@ -10612,6 +10616,11 @@ void perf_event_free_bpf_prog(struct perf_event *event)
#else
+static inline bool perf_event_is_tracing(struct perf_event *event)
+{
+ return false;
+}
+
static inline void perf_tp_register(void)
{
}
--
2.34.1
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-13 4:46 [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events Kyle Huey
@ 2024-07-13 20:32 ` Jiri Olsa
2024-07-15 11:12 ` Peter Zijlstra
0 siblings, 1 reply; 17+ messages in thread
From: Jiri Olsa @ 2024-07-13 20:32 UTC (permalink / raw)
To: Kyle Huey
Cc: khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds, robert,
Joe Damato, Peter Zijlstra, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
Liang, Kan, Andrii Nakryiko, Song Liu, linux-perf-users,
linux-kernel, bpf
On Fri, Jul 12, 2024 at 09:46:45PM -0700, Kyle Huey wrote:
> The regressing commit is new in 6.10. It assumed that anytime event->prog
> is set bpf_overflow_handler() should be invoked to execute the attached bpf
> program. This assumption is false for tracing events, and as a result the
> regressing commit broke bpftrace by invoking the bpf handler with garbage
> inputs on overflow.
>
> Prior to the regression the overflow handlers formed a chain (of length 0,
> 1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added
> bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog()
> (the tracing case) did not. Both set event->prog. The chain of overflow
> handlers was replaced by a single overflow handler slot and a fixed call to
> bpf_overflow_handler() when appropriate. This modifies the condition there
> to include !perf_event_is_tracing(), restoring the previous behavior and
> fixing bpftrace.
>
> Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> Reported-by: Joe Damato <jdamato@fastly.com>
> Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery")
> Tested-by: Joe Damato <jdamato@fastly.com> # bpftrace
> Tested-by: Kyle Huey <khuey@kylehuey.com> # bpf overflow handlers
> ---
> kernel/events/core.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8f908f077935..f0d7119585dc 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9666,6 +9666,8 @@ static inline void perf_event_free_bpf_handler(struct perf_event *event)
> * Generic event overflow handling, sampling.
> */
>
> +static bool perf_event_is_tracing(struct perf_event *event);
> +
> static int __perf_event_overflow(struct perf_event *event,
> int throttle, struct perf_sample_data *data,
> struct pt_regs *regs)
> @@ -9682,7 +9684,9 @@ static int __perf_event_overflow(struct perf_event *event,
>
> ret = __perf_event_account_interrupt(event, throttle);
>
> - if (event->prog && !bpf_overflow_handler(event, data, regs))
> + if (event->prog &&
> + !perf_event_is_tracing(event) &&
> + !bpf_overflow_handler(event, data, regs))
> return ret;
ok makes sense, it's better to follow the perf_event_set_bpf_prog condition
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
jirka
>
> /*
> @@ -10612,6 +10616,11 @@ void perf_event_free_bpf_prog(struct perf_event *event)
>
> #else
>
> +static inline bool perf_event_is_tracing(struct perf_event *event)
> +{
> + return false;
> +}
> +
> static inline void perf_tp_register(void)
> {
> }
> --
> 2.34.1
>
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-13 20:32 ` Jiri Olsa
@ 2024-07-15 11:12 ` Peter Zijlstra
2024-07-15 14:33 ` Kyle Huey
0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2024-07-15 11:12 UTC (permalink / raw)
To: Jiri Olsa
Cc: Kyle Huey, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Sat, Jul 13, 2024 at 10:32:07PM +0200, Jiri Olsa wrote:
> On Fri, Jul 12, 2024 at 09:46:45PM -0700, Kyle Huey wrote:
> > The regressing commit is new in 6.10. It assumed that anytime event->prog
> > is set bpf_overflow_handler() should be invoked to execute the attached bpf
> > program. This assumption is false for tracing events, and as a result the
> > regressing commit broke bpftrace by invoking the bpf handler with garbage
> > inputs on overflow.
> >
> > Prior to the regression the overflow handlers formed a chain (of length 0,
> > 1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added
> > bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog()
> > (the tracing case) did not. Both set event->prog. The chain of overflow
> > handlers was replaced by a single overflow handler slot and a fixed call to
> > bpf_overflow_handler() when appropriate. This modifies the condition there
> > to include !perf_event_is_tracing(), restoring the previous behavior and
> > fixing bpftrace.
> >
> > Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> > Reported-by: Joe Damato <jdamato@fastly.com>
> > Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery")
> > Tested-by: Joe Damato <jdamato@fastly.com> # bpftrace
> > Tested-by: Kyle Huey <khuey@kylehuey.com> # bpf overflow handlers
> > ---
> > kernel/events/core.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 8f908f077935..f0d7119585dc 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -9666,6 +9666,8 @@ static inline void perf_event_free_bpf_handler(struct perf_event *event)
> > * Generic event overflow handling, sampling.
> > */
> >
> > +static bool perf_event_is_tracing(struct perf_event *event);
> > +
> > static int __perf_event_overflow(struct perf_event *event,
> > int throttle, struct perf_sample_data *data,
> > struct pt_regs *regs)
> > @@ -9682,7 +9684,9 @@ static int __perf_event_overflow(struct perf_event *event,
> >
> > ret = __perf_event_account_interrupt(event, throttle);
> >
> > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > + if (event->prog &&
> > + !perf_event_is_tracing(event) &&
> > + !bpf_overflow_handler(event, data, regs))
> > return ret;
>
> ok makes sense, it's better to follow the perf_event_set_bpf_prog condition
>
> Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Urgh, so wth does event_is_tracing do with event->prog? And can't we
clean this up?
That whole perf_event_is_tracing() is a pretty gross function.
Also, I think the default return value of bpf_overflow_handler() is
wrong -- note how if !event->prog we won't call bpf_overflow_handler(),
but if we do call it, but then have !event->prog on the re-read, we
still return 0.
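For reference, the 6.10 version of that function looks roughly like this
(an approximation from memory, not a verbatim quote of
kernel/events/core.c):

static int bpf_overflow_handler(struct perf_event *event,
                                struct perf_sample_data *data,
                                struct pt_regs *regs)
{
        struct bpf_perf_event_data_kern ctx = {
                .data = data,
                .event = event,
        };
        struct bpf_prog *prog;
        int ret = 0;            /* default: "drop this overflow" */

        ctx.regs = perf_arch_bpf_user_pt_regs(regs);
        if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1))
                goto out;
        rcu_read_lock();
        prog = READ_ONCE(event->prog);
        if (prog) {
                perf_prepare_sample(data, event, regs);
                ret = bpf_prog_run(prog, &ctx);
        }
        rcu_read_unlock();
out:
        __this_cpu_dec(bpf_prog_active);

        return ret;             /* still 0 if the re-read saw no prog */
}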
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 11:12 ` Peter Zijlstra
@ 2024-07-15 14:33 ` Kyle Huey
2024-07-15 15:04 ` Peter Zijlstra
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-15 14:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Mon, Jul 15, 2024 at 4:12 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sat, Jul 13, 2024 at 10:32:07PM +0200, Jiri Olsa wrote:
> > On Fri, Jul 12, 2024 at 09:46:45PM -0700, Kyle Huey wrote:
> > > The regressing commit is new in 6.10. It assumed that anytime event->prog
> > > is set bpf_overflow_handler() should be invoked to execute the attached bpf
> > > program. This assumption is false for tracing events, and as a result the
> > > regressing commit broke bpftrace by invoking the bpf handler with garbage
> > > inputs on overflow.
> > >
> > > Prior to the regression the overflow handlers formed a chain (of length 0,
> > > 1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added
> > > bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog()
> > > (the tracing case) did not. Both set event->prog. The chain of overflow
> > > handlers was replaced by a single overflow handler slot and a fixed call to
> > > bpf_overflow_handler() when appropriate. This modifies the condition there
> > > to include !perf_event_is_tracing(), restoring the previous behavior and
> > > fixing bpftrace.
> > >
> > > Signed-off-by: Kyle Huey <khuey@kylehuey.com>
> > > Reported-by: Joe Damato <jdamato@fastly.com>
> > > Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery")
> > > Tested-by: Joe Damato <jdamato@fastly.com> # bpftrace
> > > Tested-by: Kyle Huey <khuey@kylehuey.com> # bpf overflow handlers
> > > ---
> > > kernel/events/core.c | 11 ++++++++++-
> > > 1 file changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 8f908f077935..f0d7119585dc 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -9666,6 +9666,8 @@ static inline void perf_event_free_bpf_handler(struct perf_event *event)
> > > * Generic event overflow handling, sampling.
> > > */
> > >
> > > +static bool perf_event_is_tracing(struct perf_event *event);
> > > +
> > > static int __perf_event_overflow(struct perf_event *event,
> > > int throttle, struct perf_sample_data *data,
> > > struct pt_regs *regs)
> > > @@ -9682,7 +9684,9 @@ static int __perf_event_overflow(struct perf_event *event,
> > >
> > > ret = __perf_event_account_interrupt(event, throttle);
> > >
> > > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > > + if (event->prog &&
> > > + !perf_event_is_tracing(event) &&
> > > + !bpf_overflow_handler(event, data, regs))
> > > return ret;
> >
> > ok makes sense, it's better to follow the perf_event_set_bpf_prog condition
> >
> > Reviewed-by: Jiri Olsa <jolsa@kernel.org>
>
> Urgh, so wth does event_is_tracing do with event->prog? And can't we
> clean this up?
Tracing events keep track of the bpf program in event->prog solely for
cleanup. The bpf programs are stored in and invoked from
event->tp_event->prog_array, but when the event is destroyed it needs
to know which bpf program to remove from that array.
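Roughly, the teardown path uses it like this (a simplified sketch, not
the exact kernel code -- the real logic lives in
perf_event_detach_bpf_prog() and rebuilds the prog_array under a mutex;
the helper name below is made up):

/* Sketch only: how event->prog is consumed when a tracing event dies. */
static void detach_bpf_prog_sketch(struct perf_event *event)
{
        struct bpf_prog *prog = event->prog;

        if (!prog)
                return;

        /*
         * event->prog only records *which* program to drop from the
         * per-tp_event array that the programs actually run from.
         * (RCU/locking elided in this sketch.)
         */
        bpf_prog_array_delete_safe(event->tp_event->prog_array, prog);
        bpf_prog_put(prog);
        event->prog = NULL;
}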
> That whole perf_event_is_tracing() is a pretty gross function.
>
> Also, I think the default return value of bpf_overflow_handler() is
> wrong -- note how if !event->prog we won't call bpf_overflow_handler(),
> but if we do call it, but then have !event->prog on the re-read, we
> still return 0.
The synchronization model here isn't quite clear to me but I don't
think this matters in practice. Once event->prog is set the only
allowed change is for it to be cleared when the perf event is freed.
Anything else is refused by perf_event_set_bpf_handler() with EEXIST.
Can that free race with an overflow handler? I'm not sure, but even if
it can, dropping an overflow for an event that's being freed seems
fine to me. If it can't race then we could remove the condition on the
re-read entirely.
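For reference, the set-once behaviour amounts to a guard like this at
the top of perf_event_set_bpf_handler() (simplified sketch, not the
verbatim source -- the real function also rejects mismatched program
types):

static int perf_event_set_bpf_handler(struct perf_event *event,
                                      struct bpf_prog *prog,
                                      u64 bpf_cookie)
{
        if (event->prog)
                return -EEXIST; /* only one handler prog per event, ever */

        event->prog = prog;
        event->bpf_cookie = bpf_cookie;
        return 0;
}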
- Kyle
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 14:33 ` Kyle Huey
@ 2024-07-15 15:04 ` Peter Zijlstra
2024-07-15 15:19 ` Kyle Huey
0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2024-07-15 15:04 UTC (permalink / raw)
To: Kyle Huey
Cc: Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Mon, Jul 15, 2024 at 07:33:57AM -0700, Kyle Huey wrote:
> On Mon, Jul 15, 2024 at 4:12 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > Urgh, so wth does event_is_tracing do with event->prog? And can't we
> > clean this up?
>
> Tracing events keep track of the bpf program in event->prog solely for
> cleanup. The bpf programs are stored in and invoked from
> event->tp_event->prog_array, but when the event is destroyed it needs
> to know which bpf program to remove from that array.
Yeah, figured it out eventually.. Does look like it needs event->prog
and we can't easily remedy this dual use :/
> > That whole perf_event_is_tracing() is a pretty gross function.
> >
> > Also, I think the default return value of bpf_overflow_handler() is
> > wrong -- note how if !event->prog we won't call bpf_overflow_handler(),
> > but if we do call it, but then have !event->prog on the re-read, we
> > still return 0.
>
> The synchronization model here isn't quite clear to me but I don't
> think this matters in practice. Once event->prog is set the only
> allowed change is for it to be cleared when the perf event is freed.
> Anything else is refused by perf_event_set_bpf_handler() with EEXIST.
> Can that free race with an overflow handler? I'm not sure, but even if
> it can, dropping an overflow for an event that's being freed seems
> fine to me. If it can't race then we could remove the condition on the
> re-read entirely.
Right, also rcu_read_lock() is cheap enough to unconditionally do I'm
thinking.
So since we have two distinct users of event->prog, I figured we could
distinguish them from one of the LSB in the pointer value, which then
got me the below.
But now that I see the end result I'm not at all sure this is sane.
But I figure it ought to work...
---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ab6c4c942f79..5ec78346c2a1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9594,6 +9594,13 @@ static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *r
}
#ifdef CONFIG_BPF_SYSCALL
+
+static inline struct bpf_prog *event_prog(struct perf_event *event)
+{
+ unsigned long _prog = (unsigned long)READ_ONCE(event->prog);
+ return (void *)(_prog & ~1);
+}
+
static int bpf_overflow_handler(struct perf_event *event,
struct perf_sample_data *data,
struct pt_regs *regs)
@@ -9603,19 +9610,21 @@ static int bpf_overflow_handler(struct perf_event *event,
.event = event,
};
struct bpf_prog *prog;
- int ret = 0;
+ int ret = 1;
+
+ guard(rcu)();
- ctx.regs = perf_arch_bpf_user_pt_regs(regs);
- if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1))
- goto out;
- rcu_read_lock();
prog = READ_ONCE(event->prog);
- if (prog) {
+ if (!((unsigned long)prog & 1))
+ return ret;
+
+ prog = (void *)((unsigned long)prog & ~1);
+
+ if (unlikely(__this_cpu_inc_return(bpf_prog_active) == 1)) {
perf_prepare_sample(data, event, regs);
+ ctx.regs = perf_arch_bpf_user_pt_regs(regs);
ret = bpf_prog_run(prog, &ctx);
}
- rcu_read_unlock();
-out:
__this_cpu_dec(bpf_prog_active);
return ret;
@@ -9652,14 +9661,14 @@ static inline int perf_event_set_bpf_handler(struct perf_event *event,
return -EPROTO;
}
- event->prog = prog;
+ event->prog = (void *)((unsigned long)prog | 1);
event->bpf_cookie = bpf_cookie;
return 0;
}
static inline void perf_event_free_bpf_handler(struct perf_event *event)
{
- struct bpf_prog *prog = event->prog;
+ struct bpf_prog *prog = event_prog(event);
if (!prog)
return;
@@ -9707,7 +9716,7 @@ static int __perf_event_overflow(struct perf_event *event,
ret = __perf_event_account_interrupt(event, throttle);
- if (event->prog && !bpf_overflow_handler(event, data, regs))
+ if (!bpf_overflow_handler(event, data, regs))
return ret;
/*
@@ -12026,10 +12035,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
context = parent_event->overflow_handler_context;
#if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_EVENT_TRACING)
if (parent_event->prog) {
- struct bpf_prog *prog = parent_event->prog;
+ struct bpf_prog *prog = event_prog(parent_event);
bpf_prog_inc(prog);
- event->prog = prog;
+ event->prog = parent_event->prog;
}
#endif
}
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 15:04 ` Peter Zijlstra
@ 2024-07-15 15:19 ` Kyle Huey
2024-07-15 16:30 ` Peter Zijlstra
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-15 15:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Mon, Jul 15, 2024 at 8:04 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Jul 15, 2024 at 07:33:57AM -0700, Kyle Huey wrote:
> > On Mon, Jul 15, 2024 at 4:12 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > Urgh, so wth does event_is_tracing do with event->prog? And can't we
> > > clean this up?
> >
> > Tracing events keep track of the bpf program in event->prog solely for
> > cleanup. The bpf programs are stored in and invoked from
> > event->tp_event->prog_array, but when the event is destroyed it needs
> > to know which bpf program to remove from that array.
>
> Yeah, figured it out eventually.. Does look like it needs event->prog
> and we can't easily remedy this dual use :/
>
> > > That whole perf_event_is_tracing() is a pretty gross function.
> > >
> > > Also, I think the default return value of bpf_overflow_handler() is
> > > wrong -- note how if !event->prog we won't call bpf_overflow_handler(),
> > > but if we do call it, but then have !event->prog on the re-read, we
> > > still return 0.
> >
> > The synchronization model here isn't quite clear to me but I don't
> > think this matters in practice. Once event->prog is set the only
> > allowed change is for it to be cleared when the perf event is freed.
> > Anything else is refused by perf_event_set_bpf_handler() with EEXIST.
> > Can that free race with an overflow handler? I'm not sure, but even if
> > it can, dropping an overflow for an event that's being freed seems
> > fine to me. If it can't race then we could remove the condition on the
> > re-read entirely.
>
> Right, also rcu_read_lock() is cheap enough to unconditionally do I'm
> thinking.
>
> So since we have two distinct users of event->prog, I figured we could
> distinguish them from one of the LSB in the pointer value, which then
> got me the below.
>
> But now that I see the end result I'm not at all sure this is sane.
>
> But I figure it ought to work...
I think this would probably work but stealing the bit seems far more
complicated than just gating on perf_event_is_tracing().
Would it assuage your concerns at all if I made event->prog a simple
union between say handler_prog and sample_prog (still discriminated by
perf_event_is_tracing() where necessary) with appropriate comments and
changed the two code paths accordingly?
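Something like this inside struct perf_event (illustrative sketch only,
not from any posted patch; which name ends up on which use is just a
placeholder choice here):

        union {
                /*
                 * Discriminated by perf_event_is_tracing(): tracing
                 * events keep the prog only so it can be removed from
                 * tp_event->prog_array at teardown, non-tracing events
                 * run it from __perf_event_overflow().
                 */
                struct bpf_prog *handler_prog;
                struct bpf_prog *sample_prog;
        };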
- Kyle
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index ab6c4c942f79..5ec78346c2a1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9594,6 +9594,13 @@ static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *r
> }
>
> #ifdef CONFIG_BPF_SYSCALL
> +
> +static inline struct bpf_prog *event_prog(struct perf_event *event)
> +{
> + unsigned long _prog = (unsigned long)READ_ONCE(event->prog);
> + return (void *)(_prog & ~1);
> +}
> +
> static int bpf_overflow_handler(struct perf_event *event,
> struct perf_sample_data *data,
> struct pt_regs *regs)
> @@ -9603,19 +9610,21 @@ static int bpf_overflow_handler(struct perf_event *event,
> .event = event,
> };
> struct bpf_prog *prog;
> - int ret = 0;
> + int ret = 1;
> +
> + guard(rcu)();
>
> - ctx.regs = perf_arch_bpf_user_pt_regs(regs);
> - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1))
> - goto out;
> - rcu_read_lock();
> prog = READ_ONCE(event->prog);
> - if (prog) {
> + if (!((unsigned long)prog & 1))
> + return ret;
> +
> + prog = (void *)((unsigned long)prog & ~1);
> +
> + if (unlikely(__this_cpu_inc_return(bpf_prog_active) == 1)) {
> perf_prepare_sample(data, event, regs);
> + ctx.regs = perf_arch_bpf_user_pt_regs(regs);
> ret = bpf_prog_run(prog, &ctx);
> }
> - rcu_read_unlock();
> -out:
> __this_cpu_dec(bpf_prog_active);
>
> return ret;
> @@ -9652,14 +9661,14 @@ static inline int perf_event_set_bpf_handler(struct perf_event *event,
> return -EPROTO;
> }
>
> - event->prog = prog;
> + event->prog = (void *)((unsigned long)prog | 1);
> event->bpf_cookie = bpf_cookie;
> return 0;
> }
>
> static inline void perf_event_free_bpf_handler(struct perf_event *event)
> {
> - struct bpf_prog *prog = event->prog;
> + struct bpf_prog *prog = event_prog(event);
>
> if (!prog)
> return;
> @@ -9707,7 +9716,7 @@ static int __perf_event_overflow(struct perf_event *event,
>
> ret = __perf_event_account_interrupt(event, throttle);
>
> - if (event->prog && !bpf_overflow_handler(event, data, regs))
> + if (!bpf_overflow_handler(event, data, regs))
> return ret;
>
> /*
> @@ -12026,10 +12035,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
> context = parent_event->overflow_handler_context;
> #if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_EVENT_TRACING)
> if (parent_event->prog) {
> - struct bpf_prog *prog = parent_event->prog;
> + struct bpf_prog *prog = event_prog(parent_event);
>
> bpf_prog_inc(prog);
> - event->prog = prog;
> + event->prog = parent_event->prog;
> }
> #endif
> }
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 15:19 ` Kyle Huey
@ 2024-07-15 16:30 ` Peter Zijlstra
2024-07-15 16:48 ` Kyle Huey
0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2024-07-15 16:30 UTC (permalink / raw)
To: Kyle Huey
Cc: Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> I think this would probably work but stealing the bit seems far more
> complicated than just gating on perf_event_is_tracing().
perf_event_is_tracing() is something like 3 branches. It is not a simple
conditional. Combined with that re-load and the wrong return value, this
all wants a cleanup.
Using that LSB works, it's just that the code aint pretty.
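For reference, that function looks roughly like this (from memory; the
exact config guards may differ):

static bool perf_event_is_tracing(struct perf_event *event)
{
        if (event->pmu == &perf_tracepoint)
                return true;
#ifdef CONFIG_KPROBE_EVENTS
        if (event->pmu == &perf_kprobe)
                return true;
#endif
#ifdef CONFIG_UPROBE_EVENTS
        if (event->pmu == &perf_uprobe)
                return true;
#endif
        return false;
}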
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 16:30 ` Peter Zijlstra
@ 2024-07-15 16:48 ` Kyle Huey
2024-07-16 7:25 ` Jiri Olsa
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-15 16:48 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds,
robert, Joe Damato, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Ian Rogers, Adrian Hunter, Liang, Kan,
Andrii Nakryiko, Song Liu, linux-perf-users, linux-kernel, bpf
On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
>
> > I think this would probably work but stealing the bit seems far more
> > complicated than just gating on perf_event_is_tracing().
>
> perf_event_is_tracing() is something like 3 branches. It is not a simple
> conditional. Combined with that re-load and the wrong return value, this
> all wants a cleanup.
>
> Using that LSB works, it's just that the code aint pretty.
Maybe we could gate on !event->tp_event instead. Somebody who is more
familiar with this code than me should probably confirm that tp_event
being non-null and perf_event_is_tracing() being true are equivalent
though.
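Concretely that would turn the check in __perf_event_overflow() into
something like this (sketch only, contingent on that equivalence
actually holding):

        if (event->prog && !event->tp_event &&
            !bpf_overflow_handler(event, data, regs))
                return ret;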
- Kyle
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-15 16:48 ` Kyle Huey
@ 2024-07-16 7:25 ` Jiri Olsa
2024-07-19 18:26 ` Andrii Nakryiko
2024-07-20 16:03 ` Masami Hiramatsu
0 siblings, 2 replies; 17+ messages in thread
From: Jiri Olsa @ 2024-07-16 7:25 UTC (permalink / raw)
To: Kyle Huey, Masami Hiramatsu
Cc: Peter Zijlstra, Jiri Olsa, khuey, Ingo Molnar, Namhyung Kim,
Linus Torvalds, robert, Joe Damato, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
Liang, Kan, Andrii Nakryiko, Song Liu, linux-perf-users,
linux-kernel, bpf
On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> >
> > > I think this would probably work but stealing the bit seems far more
> > > complicated than just gating on perf_event_is_tracing().
> >
> > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > conditional. Combined with that re-load and the wrong return value, this
> > all wants a cleanup.
> >
> > Using that LSB works, it's just that the code aint pretty.
>
> Maybe we could gate on !event->tp_event instead. Somebody who is more
> familiar with this code than me should probably confirm that tp_event
> being non-null and perf_event_is_tracing() being true are equivalent
> though.
>
it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
are the only ones having the tp_event pointer set, Masami?
fwiw I tried to run bpf selftests with that and it's fine
jirka
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-16 7:25 ` Jiri Olsa
@ 2024-07-19 18:26 ` Andrii Nakryiko
2024-07-26 12:37 ` Kyle Huey
2024-07-20 16:03 ` Masami Hiramatsu
1 sibling, 1 reply; 17+ messages in thread
From: Andrii Nakryiko @ 2024-07-19 18:26 UTC (permalink / raw)
To: Jiri Olsa
Cc: Kyle Huey, Masami Hiramatsu, Peter Zijlstra, khuey, Ingo Molnar,
Namhyung Kim, Linus Torvalds, robert, Joe Damato,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > >
> > > > I think this would probably work but stealing the bit seems far more
> > > > complicated than just gating on perf_event_is_tracing().
> > >
> > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > conditional. Combined with that re-load and the wrong return value, this
> > > all wants a cleanup.
> > >
> > > Using that LSB works, it's just that the code aint pretty.
> >
> > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > familiar with this code than me should probably confirm that tp_event
> > being non-null and perf_event_is_tracing() being true are equivalent
> > though.
> >
>
> it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> are the only ones having the tp_event pointer set, Masami?
>
> fwiw I tried to run bpf selftests with that and it's fine
Why can't we do the most straightforward thing in this case?
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ab6c4c942f79..cf4645b26c90 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
ret = __perf_event_account_interrupt(event, throttle);
- if (event->prog && !bpf_overflow_handler(event, data, regs))
+ if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
+ !bpf_overflow_handler(event, data, regs))
return ret;
>
> jirka
>
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-16 7:25 ` Jiri Olsa
2024-07-19 18:26 ` Andrii Nakryiko
@ 2024-07-20 16:03 ` Masami Hiramatsu
1 sibling, 0 replies; 17+ messages in thread
From: Masami Hiramatsu @ 2024-07-20 16:03 UTC (permalink / raw)
To: Jiri Olsa
Cc: Kyle Huey, Peter Zijlstra, khuey, Ingo Molnar, Namhyung Kim,
Linus Torvalds, robert, Joe Damato, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Ian Rogers, Adrian Hunter,
Liang, Kan, Andrii Nakryiko, Song Liu, linux-perf-users,
linux-kernel, bpf
On Tue, 16 Jul 2024 09:25:21 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > >
> > > > I think this would probably work but stealing the bit seems far more
> > > > complicated than just gating on perf_event_is_tracing().
> > >
> > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > conditional. Combined with that re-load and the wrong return value, this
> > > all wants a cleanup.
> > >
> > > Using that LSB works, it's just that the code aint pretty.
> >
> > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > familiar with this code than me should probably confirm that tp_event
> > being non-null and perf_event_is_tracing() being true are equivalent
> > though.
> >
>
> it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> are the only ones having the tp_event pointer set, Masami?
Hmm, I think any dynamic_events has tp_event (which is a struct trace_event_call *)
because it represents the event itself. But yes, if the event is working
like a trace-event, it should have tp_event. So you can use it instead of
perf_event_is_tracing().
Thank you,
>
> fwiw I tried to run bpf selftests with that and it's fine
>
> jirka
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-19 18:26 ` Andrii Nakryiko
@ 2024-07-26 12:37 ` Kyle Huey
2024-07-26 16:34 ` Andrii Nakryiko
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-26 12:37 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Jiri Olsa, Masami Hiramatsu, Peter Zijlstra, khuey, Ingo Molnar,
Namhyung Kim, Linus Torvalds, robert, Joe Damato,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > >
> > > > > I think this would probably work but stealing the bit seems far more
> > > > > complicated than just gating on perf_event_is_tracing().
> > > >
> > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > conditional. Combined with that re-load and the wrong return value, this
> > > > all wants a cleanup.
> > > >
> > > > Using that LSB works, it's just that the code aint pretty.
> > >
> > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > familiar with this code than me should probably confirm that tp_event
> > > being non-null and perf_event_is_tracing() being true are equivalent
> > > though.
> > >
> >
> > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > are the only ones having the tp_event pointer set, Masami?
> >
> > fwiw I tried to run bpf selftests with that and it's fine
>
> Why can't we do the most straightforward thing in this case?
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index ab6c4c942f79..cf4645b26c90 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
>
> ret = __perf_event_account_interrupt(event, throttle);
>
> - if (event->prog && !bpf_overflow_handler(event, data, regs))
> + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> + !bpf_overflow_handler(event, data, regs))
> return ret;
>
>
> >
> > jirka
> >
Yes, that's effectively equivalent to calling perf_event_is_tracing()
and would work too. Do you want to land that patch? It needs to go to
6.10 stable too.
- Kyle
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-26 12:37 ` Kyle Huey
@ 2024-07-26 16:34 ` Andrii Nakryiko
2024-07-26 16:35 ` Kyle Huey
0 siblings, 1 reply; 17+ messages in thread
From: Andrii Nakryiko @ 2024-07-26 16:34 UTC (permalink / raw)
To: Kyle Huey
Cc: Jiri Olsa, Masami Hiramatsu, Peter Zijlstra, khuey, Ingo Molnar,
Namhyung Kim, Linus Torvalds, robert, Joe Damato,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Fri, Jul 26, 2024 at 5:37 AM Kyle Huey <me@kylehuey.com> wrote:
>
> On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > > >
> > > > > > I think this would probably work but stealing the bit seems far more
> > > > > > complicated than just gating on perf_event_is_tracing().
> > > > >
> > > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > > conditional. Combined with that re-load and the wrong return value, this
> > > > > all wants a cleanup.
> > > > >
> > > > > Using that LSB works, it's just that the code aint pretty.
> > > >
> > > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > > familiar with this code than me should probably confirm that tp_event
> > > > being non-null and perf_event_is_tracing() being true are equivalent
> > > > though.
> > > >
> > >
> > > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > > are the only ones having the tp_event pointer set, Masami?
> > >
> > > fwiw I tried to run bpf selftests with that and it's fine
> >
> > Why can't we do the most straightforward thing in this case?
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index ab6c4c942f79..cf4645b26c90 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
> >
> > ret = __perf_event_account_interrupt(event, throttle);
> >
> > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > + !bpf_overflow_handler(event, data, regs))
> > return ret;
> >
> >
> > >
> > > jirka
> > >
>
> Yes, that's effectively equivalent to calling perf_event_is_tracing()
> and would work too. Do you want to land that patch? It needs to go to
> 6.10 stable too.
I'd appreciate it if you can just incorporate that into your patch and
resend it, thank you!
>
> - Kyle
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-26 16:34 ` Andrii Nakryiko
@ 2024-07-26 16:35 ` Kyle Huey
2024-08-05 11:55 ` Joe Damato
0 siblings, 1 reply; 17+ messages in thread
From: Kyle Huey @ 2024-07-26 16:35 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Jiri Olsa, Masami Hiramatsu, Peter Zijlstra, khuey, Ingo Molnar,
Namhyung Kim, Linus Torvalds, robert, Joe Damato,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
Will do.
- Kyle
On Fri, Jul 26, 2024 at 9:34 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jul 26, 2024 at 5:37 AM Kyle Huey <me@kylehuey.com> wrote:
> >
> > On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > >
> > > > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > > > >
> > > > > > > I think this would probably work but stealing the bit seems far more
> > > > > > > complicated than just gating on perf_event_is_tracing().
> > > > > >
> > > > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > > > conditional. Combined with that re-load and the wrong return value, this
> > > > > > all wants a cleanup.
> > > > > >
> > > > > > Using that LSB works, it's just that the code aint pretty.
> > > > >
> > > > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > > > familiar with this code than me should probably confirm that tp_event
> > > > > being non-null and perf_event_is_tracing() being true are equivalent
> > > > > though.
> > > > >
> > > >
> > > > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > > > are the only ones having the tp_event pointer set, Masami?
> > > >
> > > > fwiw I tried to run bpf selftests with that and it's fine
> > >
> > > Why can't we do the most straightforward thing in this case?
> > >
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index ab6c4c942f79..cf4645b26c90 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
> > >
> > > ret = __perf_event_account_interrupt(event, throttle);
> > >
> > > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > > + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > > + !bpf_overflow_handler(event, data, regs))
> > > return ret;
> > >
> > >
> > > >
> > > > jirka
> > > >
> >
> > Yes, that's effectively equivalent to calling perf_event_is_tracing()
> > and would work too. Do you want to land that patch? It needs to go to
> > 6.10 stable too.
>
> I'd appreciate it if you can just incorporate that into your patch and
> resend it, thank you!
>
> >
> > - Kyle
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-07-26 16:35 ` Kyle Huey
@ 2024-08-05 11:55 ` Joe Damato
2024-08-13 10:37 ` Joe Damato
0 siblings, 1 reply; 17+ messages in thread
From: Joe Damato @ 2024-08-05 11:55 UTC (permalink / raw)
To: Kyle Huey
Cc: Andrii Nakryiko, Jiri Olsa, Masami Hiramatsu, Peter Zijlstra,
khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds, robert,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Fri, Jul 26, 2024 at 09:35:43AM -0700, Kyle Huey wrote:
> Will do.
>
> - Kyle
>
> On Fri, Jul 26, 2024 at 9:34 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Jul 26, 2024 at 5:37 AM Kyle Huey <me@kylehuey.com> wrote:
> > >
> > > On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > > > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > > >
> > > > > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > > > > >
> > > > > > > > I think this would probably work but stealing the bit seems far more
> > > > > > > > complicated than just gating on perf_event_is_tracing().
> > > > > > >
> > > > > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > > > > conditional. Combined with that re-load and the wrong return value, this
> > > > > > > all wants a cleanup.
> > > > > > >
> > > > > > > Using that LSB works, it's just that the code aint pretty.
> > > > > >
> > > > > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > > > > familiar with this code than me should probably confirm that tp_event
> > > > > > being non-null and perf_event_is_tracing() being true are equivalent
> > > > > > though.
> > > > > >
> > > > >
> > > > > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > > > > are the only ones having the tp_event pointer set, Masami?
> > > > >
> > > > > fwiw I tried to run bpf selftests with that and it's fine
> > > >
> > > > Why can't we do the most straightforward thing in this case?
> > > >
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index ab6c4c942f79..cf4645b26c90 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
> > > >
> > > > ret = __perf_event_account_interrupt(event, throttle);
> > > >
> > > > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > > > + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > > > + !bpf_overflow_handler(event, data, regs))
> > > > return ret;
> > > >
> > > >
> > > > >
> > > > > jirka
> > > > >
> > >
> > > Yes, that's effectively equivalent to calling perf_event_is_tracing()
> > > and would work too. Do you want to land that patch? It needs to go to
> > > 6.10 stable too.
> >
> > I'd appreciate it if you can just incorporate that into your patch and
> > resend it, thank you!
> >
> > >
> > > - Kyle
I probably missed the updated patch, but I am happy to test any new
versions, if needed, to ensure that the bug I hit is fixed.
Kyle: please let me know if there's a patch you'd like me to test?
- Joe
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-08-05 11:55 ` Joe Damato
@ 2024-08-13 10:37 ` Joe Damato
2024-08-13 13:38 ` Kyle Huey
0 siblings, 1 reply; 17+ messages in thread
From: Joe Damato @ 2024-08-13 10:37 UTC (permalink / raw)
To: Kyle Huey
Cc: Andrii Nakryiko, Jiri Olsa, Masami Hiramatsu, Peter Zijlstra,
khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds, robert,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Mon, Aug 05, 2024 at 12:55:17PM +0100, Joe Damato wrote:
> On Fri, Jul 26, 2024 at 09:35:43AM -0700, Kyle Huey wrote:
> > Will do.
> >
> > - Kyle
> >
> > On Fri, Jul 26, 2024 at 9:34 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Fri, Jul 26, 2024 at 5:37 AM Kyle Huey <me@kylehuey.com> wrote:
> > > >
> > > > On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > > > > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > > > > > >
> > > > > > > > > I think this would probably work but stealing the bit seems far more
> > > > > > > > > complicated than just gating on perf_event_is_tracing().
> > > > > > > >
> > > > > > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > > > > > conditional. Combined with that re-load and the wrong return value, this
> > > > > > > > all wants a cleanup.
> > > > > > > >
> > > > > > > > Using that LSB works, it's just that the code aint pretty.
> > > > > > >
> > > > > > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > > > > > familiar with this code than me should probably confirm that tp_event
> > > > > > > being non-null and perf_event_is_tracing() being true are equivalent
> > > > > > > though.
> > > > > > >
> > > > > >
> > > > > > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > > > > > are the only ones having the tp_event pointer set, Masami?
> > > > > >
> > > > > > fwiw I tried to run bpf selftests with that and it's fine
> > > > >
> > > > > Why can't we do the most straightforward thing in this case?
> > > > >
> > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > > index ab6c4c942f79..cf4645b26c90 100644
> > > > > --- a/kernel/events/core.c
> > > > > +++ b/kernel/events/core.c
> > > > > @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
> > > > >
> > > > > ret = __perf_event_account_interrupt(event, throttle);
> > > > >
> > > > > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > > > > + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > > > > + !bpf_overflow_handler(event, data, regs))
> > > > > return ret;
> > > > >
> > > > >
> > > > > >
> > > > > > jirka
> > > > > >
> > > >
> > > > Yes, that's effectively equivalent to calling perf_event_is_tracing()
> > > > and would work too. Do you want to land that patch? It needs to go to
> > > > 6.10 stable too.
> > >
> > > I'd appreciate it if you can just incorporate that into your patch and
> > > resend it, thank you!
> > >
> > > >
> > > > - Kyle
>
> I probably missed the updated patch, but I am happy to test any new
> versions, if needed, to ensure that the bug I hit is fixed.
>
> Kyle: please let me know if there's a patch you'd like me to test?
Sorry for pinging this thread again; let me know if a fix was merged
and I missed it?
Otherwise, if it'd be helpful, I am happy to modify Kyle's patch to
take Andrii's suggestion and resend, but I don't want to step on
anyone's toes :)
- Joe
* Re: [PATCH] perf/bpf: Don't call bpf_overflow_handler() for tracing events
2024-08-13 10:37 ` Joe Damato
@ 2024-08-13 13:38 ` Kyle Huey
0 siblings, 0 replies; 17+ messages in thread
From: Kyle Huey @ 2024-08-13 13:38 UTC (permalink / raw)
To: Joe Damato
Cc: Andrii Nakryiko, Jiri Olsa, Masami Hiramatsu, Peter Zijlstra,
khuey, Ingo Molnar, Namhyung Kim, Linus Torvalds, robert,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Ian Rogers, Adrian Hunter, Liang, Kan, Andrii Nakryiko, Song Liu,
linux-perf-users, linux-kernel, bpf
On Tue, Aug 13, 2024 at 3:37 AM Joe Damato <jdamato@fastly.com> wrote:
>
> On Mon, Aug 05, 2024 at 12:55:17PM +0100, Joe Damato wrote:
> > On Fri, Jul 26, 2024 at 09:35:43AM -0700, Kyle Huey wrote:
> > > Will do.
> > >
> > > - Kyle
> > >
> > > On Fri, Jul 26, 2024 at 9:34 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 26, 2024 at 5:37 AM Kyle Huey <me@kylehuey.com> wrote:
> > > > >
> > > > > On Fri, Jul 19, 2024 at 11:26 AM Andrii Nakryiko
> > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 16, 2024 at 12:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Jul 15, 2024 at 09:48:58AM -0700, Kyle Huey wrote:
> > > > > > > > On Mon, Jul 15, 2024 at 9:30 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Jul 15, 2024 at 08:19:44AM -0700, Kyle Huey wrote:
> > > > > > > > >
> > > > > > > > > > I think this would probably work but stealing the bit seems far more
> > > > > > > > > > complicated than just gating on perf_event_is_tracing().
> > > > > > > > >
> > > > > > > > > perf_event_is_tracing() is something like 3 branches. It is not a simple
> > > > > > > > > conditional. Combined with that re-load and the wrong return value, this
> > > > > > > > > all wants a cleanup.
> > > > > > > > >
> > > > > > > > > Using that LSB works, it's just that the code aint pretty.
> > > > > > > >
> > > > > > > > Maybe we could gate on !event->tp_event instead. Somebody who is more
> > > > > > > > familiar with this code than me should probably confirm that tp_event
> > > > > > > > being non-null and perf_event_is_tracing() being true are equivalent
> > > > > > > > though.
> > > > > > > >
> > > > > > >
> > > > > > > it looks like that's the case, AFAICS tracepoint/kprobe/uprobe events
> > > > > > > are the only ones having the tp_event pointer set, Masami?
> > > > > > >
> > > > > > > fwiw I tried to run bpf selftests with that and it's fine
> > > > > >
> > > > > > Why can't we do the most straightforward thing in this case?
> > > > > >
> > > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > > > index ab6c4c942f79..cf4645b26c90 100644
> > > > > > --- a/kernel/events/core.c
> > > > > > +++ b/kernel/events/core.c
> > > > > > @@ -9707,7 +9707,8 @@ static int __perf_event_overflow(struct perf_event *event,
> > > > > >
> > > > > > ret = __perf_event_account_interrupt(event, throttle);
> > > > > >
> > > > > > - if (event->prog && !bpf_overflow_handler(event, data, regs))
> > > > > > + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > > > > > + !bpf_overflow_handler(event, data, regs))
> > > > > > return ret;
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > jirka
> > > > > > >
> > > > >
> > > > > Yes, that's effectively equivalent to calling perf_event_is_tracing()
> > > > > and would work too. Do you want to land that patch? It needs to go to
> > > > > 6.10 stable too.
> > > >
> > > > I'd appreciate it if you can just incorporate that into your patch and
> > > > resend it, thank you!
> > > >
> > > > >
> > > > > - Kyle
> >
> > I probably missed the updated patch, but I am happy to test any new
> > versions, if needed, to ensure that the bug I hit is fixed.
> >
> > Kyle: please let me know if there's a patch you'd like me to test?
>
> Sorry for pinging this thread again; let me know if a fix was merged
> and I missed it?
>
> Otherwise, if it'd be helpful, I am happy to modify Kyle's patch to
> take Andrii's suggestion and resend, but I don't want to step on
> anyone's toes :)
>
> - Joe
Hi Joe,
You didn't miss anything. I've been remiss in getting back to this.
I'm currently away from home (and the machine I usually do kernel
development on), so if you want to run with this please do so.
Sorry for the inconvenience,
- Kyle