All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Joe Damato <jdamato@fastly.com>, Kyle Huey <me@kylehuey.com>
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	netdev@vger.kernel.org, acme@kernel.org,
	andrii.nakryiko@gmail.com, elver@google.com, khuey@kylehuey.com,
	mingo@kernel.org, namhyung@kernel.org, peterz@infradead.org,
	robert@ocallahan.org, yonghong.song@linux.dev,
	mkarsten@uwaterloo.ca, kuba@kernel.org
Subject: Re: [bpf?] [net-next ?] [RESEND] possible bpf overflow/output bug introduced in 6.10rc1 ?
Date: Sat, 13 Jul 2024 00:18:26 +0200	[thread overview]
Message-ID: <ZpGrstyKD-PtWyoP@krava> (raw)
In-Reply-To: <ZpFfocvyF3KHaSzF@LQ3V64L9R2>

On Fri, Jul 12, 2024 at 09:53:53AM -0700, Joe Damato wrote:
> Greetings:
> 
> (I am reposting this question after 2 days and to a wider audience
> as I didn't hear back [1]; my apologies it just seemed like a
> possible bug slipped into 6.10-rc1 and I wanted to bring attention
> to it before 6.10 is released.)
> 
> While testing some unrelated networking code with Martin Karsten (cc'd on
> this email) we discovered what appears to be some sort of overflow bug in
> bpf.
> 
> git bisect suggests that commit f11f10bfa1ca ("perf/bpf: Call BPF handler
> directly, not through overflow machinery") is the first commit where the
> (I assume) buggy behavior appears.

heya, nice catch!

I can reproduce.. it seems that after f11f10bfa1ca we allow to run tracepoint
program as perf event overflow program 

bpftrace's bpf program returns 1 which means that perf_trace_run_bpf_submit
will continue to execute perf_tp_event and then:

  perf_tp_event
    perf_swevent_event
      __perf_event_overflow
        bpf_overflow_handler

bpf_overflow_handler then executes event->prog on wrong arguments, which
results in wrong 'work' data in bpftrace output

I can 'fix' that by checking the event type before running the program like
in the change below, but I wonder there's probably better fix

Kyle, any idea?

> 
> Running the following on my machine as of the commit mentioned above:
> 
>   bpftrace -e 'tracepoint:napi:napi_poll { @[args->work] = count(); }'
> 
> while simultaneously transferring data to the target machine (in my case, I
> scp'd a 100MiB file of zeros in a loop) results in very strange output
> (snipped):
> 
>   @[11]: 5
>   @[18]: 5
>   @[-30590]: 6
>   @[10]: 7
>   @[14]: 9
> 
> It does not seem that the driver I am using on my test system (mlx5) would
> ever return a negative value from its napi poll function and likewise for
> the driver Martin is using (mlx4).
> 
> As such, I don't think it is possible for args->work to ever be a large
> negative number, but perhaps I am misunderstanding something?
> 
> I would like to note that commit 14e40a9578b7 ("perf/bpf: Remove #ifdef
> CONFIG_BPF_SYSCALL from struct perf_event members") does not exhibit this
> behavior and the output seems reasonable on my test system. Martin confirms
> the same for both commits on his test system, which uses different hardware
> than mine.
> 
> Is this an expected side effect of this change? I would expect it is not
> and that the output is a bug of some sort. My apologies in that I am not
> particularly familiar with the bpf code and cannot suggest what the root
> cause might be.
> 
> If it is not a bug:
>   1. Sorry for the noise :(

your report is great, thanks a lot!

jirka


>   2. Can anyone suggest what this output might mean or how the
>      script run above should be modified? AFAIK this is a fairly
>      common bpftrace that many folks run for profiling/debugging
>      purposes.
> 
> Thanks,
> Joe
> 
> [1]: https://lore.kernel.org/bpf/Zo64cpho2cFQiOeE@LQ3V64L9R2/T/#u

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6a6936183d5..0045dc754ef7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9580,7 +9580,7 @@ static int bpf_overflow_handler(struct perf_event *event,
 		goto out;
 	rcu_read_lock();
 	prog = READ_ONCE(event->prog);
-	if (prog) {
+	if (prog && prog->type == BPF_PROG_TYPE_PERF_EVENT) {
 		perf_prepare_sample(data, event, regs);
 		ret = bpf_prog_run(prog, &ctx);
 	}

  reply	other threads:[~2024-07-12 22:18 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-12 16:53 [bpf?] [net-next ?] [RESEND] possible bpf overflow/output bug introduced in 6.10rc1 ? Joe Damato
2024-07-12 22:18 ` Jiri Olsa [this message]
2024-07-12 22:39   ` Jiri Olsa
2024-07-12 22:49   ` Kyle Huey
2024-07-12 23:05     ` Kyle Huey
2024-07-12 23:30       ` Kyle Huey
2024-07-13  0:45         ` Joe Damato
2024-07-13  4:47           ` Kyle Huey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZpGrstyKD-PtWyoP@krava \
    --to=olsajiri@gmail.com \
    --cc=acme@kernel.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=elver@google.com \
    --cc=jdamato@fastly.com \
    --cc=khuey@kylehuey.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=me@kylehuey.com \
    --cc=mingo@kernel.org \
    --cc=mkarsten@uwaterloo.ca \
    --cc=namhyung@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=robert@ocallahan.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.