linux-trace-kernel.vger.kernel.org archive mirror
From: Eduard Zingerman <eddyz87@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Leon Hwang <hffilwlqm@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Menglong Dong <menglong.dong@linux.dev>,
	Menglong Dong <menglong8.dong@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	bpf <bpf@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-trace-kernel <linux-trace-kernel@vger.kernel.org>,
	jiang.biao@linux.dev
Subject: Re: bpf_errno. Was: [PATCH RFC bpf-next 1/3] bpf: report probe fault to BPF stderr
Date: Wed, 08 Oct 2025 12:34:08 -0700
Message-ID: <0adc5d8a299483004f4796a418420fe1c69f24bc.camel@gmail.com>
In-Reply-To: <CAP01T75TegFO0DrZ=DvpNQBSnJqjn4HvM9OLsbJWFKJwzZeYXw@mail.gmail.com>

On Wed, 2025-10-08 at 19:08 +0200, Kumar Kartikeya Dwivedi wrote:
> On Wed, 8 Oct 2025 at 18:27, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > 
> > On Wed, Oct 8, 2025 at 7:41 AM Leon Hwang <hffilwlqm@gmail.com> wrote:
> > > 
> > > 
> > > 
> > > On 2025/10/7 14:14, Menglong Dong wrote:
> > > > On 2025/10/2 10:03, Alexei Starovoitov wrote:
> > > > > On Fri, Sep 26, 2025 at 11:12 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > > > > 
> > > > > > > Introduce the function bpf_prog_report_probe_violation(), which
> > > > > > > reports memory probe faults to the user via BPF stderr.
> > > > > > 
> > > > > > Signed-off-by: Menglong Dong <menglong.dong@linux.dev>
> > > 
> > > [...]
> > > 
> > > > > 
> > > > > Interesting idea, but the above message is not helpful.
> > > > > Users cannot decipher a fault_ip within a bpf prog.
> > > > > It's just a random number.
> > > > 
> > > > Yeah, I have noticed this too. What is useful is
> > > > bpf_stream_dump_stack(), which prints the code
> > > > line that triggered the fault.
> > > > 
> > > > > But stepping back... just faults are common in tracing.
> > > > > If we start printing them we will just fill the stream to the max,
> > > > > but users won't know that the message is there, since no one
> > > > 
> > > > You are right, we definitely can't output this message
> > > > to STDERR directly. We can add an extra flag for it, as you
> > > > said below.
> > > > 
> > > > Or, maybe we can introduce an enum stream_type, so
> > > > that users can subscribe to the kinds of messages they
> > > > want to receive.
> > > > 
> > > > > expects it. arena and lock errors are rare and arena faults
> > > > > were specifically requested by folks who develop progs that use arena.
> > > > > This one is different. These faults have been around for a long time
> > > > > and I don't recall people asking for more verbosity.
> > > > > We can add them with an extra flag specified at prog load time,
> > > > > but even then. Doesn't feel that useful.
> > > > 
> > > > Generally speaking, users can validate a pointer before
> > > > reading through it, for example with a NULL check. And
> > > > the pointers in the arguments of the functions we hook are
> > > > initialized in most cases. So the fault is something that can be prevented.
> > > > 
> > > > I have a BPF tool that was written for 4.X kernels and uses
> > > > kprobe-based BPF. Now I'm planning to migrate it to 6.X kernels
> > > > and replace bpf_probe_read_kernel() with bpf_core_cast() to
> > > > get better performance. But then I can't check whether a
> > > > memory read succeeded, which is a potential risk.
> > > > So my tool would be happy to get such fault events :)
> > > > 
> > > > Leon suggested adding a global errno for each BPF program,
> > > > and I haven't dug deeply into this idea yet.
> > > > 
> > > 
> > > Yeah, as we discussed, a global errno would be a much more lightweight
> > > approach for handling such faults.
> > > 
> > > The idea would look like this:
> > > 
> > > DEFINE_PER_CPU(int, bpf_errno);
> > > 
> > > __bpf_kfunc void bpf_errno_clear(void);
> > > __bpf_kfunc void bpf_errno_set(int errno);
> > > __bpf_kfunc int bpf_errno_get(void);
> > > 
> > > When a fault occurs, the kernel can simply call
> > > 'bpf_errno_set(-EFAULT);'.
> > > 
> > > If users want to detect whether a fault happened, they can do:
> > > 
> > > bpf_errno_clear();
> > > header = READ_ONCE(skb->network_header);
> > > if (header == 0 && bpf_errno_get() == -EFAULT)
> > >         /* handle fault */;
> > > 
> > > This way, users can identify faults immediately and handle them gracefully.
> > > 
> > > Furthermore, these kfuncs can be inlined by the verifier, so there would
> > > be no runtime function call overhead.
> > 
> > Interesting idea, but errno as-is doesn't quite fit,
> > since we only have 2 (or 3 ?) cases without explicit error return:
> > probe_read_kernel above, arena read, arena write.
> > I guess we can add may_goto to this set as well.
> > But in all these cases we'll struggle to find an appropriate errno code,
> > so it probably should be a custom enum and not called "errno".
> 
> Yeah, agreed that this would be useful, particularly in this case. I'm
> wondering how we'll end up implementing this.
> Sounds like it needs to be tied to the program's invocation, so it
> cannot be per-cpu per-program, since they nest. Most likely should be
> backed by run_ctx, but that is not available in every program type. Next
> best thing that comes to mind is reserving some space in the stack
> frame at a known offset in each subprog that invokes this helper, and
> use that to signal (by finding the program's bp and writing to the
> stack), the downside being it likely becomes yet-another arch-specific
> thing. Any other better ideas?

Another option is to lower probe_read to a BPF_PROBE_MEM instruction
and generate a special kind of exception handler that would set r0 to
-EFAULT. (We don't do this already, right? I don't see anything like
that in verifier.c or x86/../bpf_jit_comp.c).

This would avoid any user-visible changes and address the performance
concern. It is less convenient for a chain of dereferences such as
a->b->c->d, though.
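
To illustrate the a->b->c->d point, here is a rough user-space model,
not kernel or verifier code. model_load() is a hypothetical stand-in for
a load whose exception handler puts -EFAULT in r0; it treats a NULL base
pointer as the only kind of fault, and the struct names are made up for
the example. Zeroing the destination on failure is also an assumption of
the model.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical four-level chain: a->b->c->d->value. */
struct d_node { int value; };
struct c_node { struct d_node *d; };
struct b_node { struct c_node *c; };
struct a_node { struct b_node *b; };

/*
 * Model of a faulting load (assumption: a NULL base is the only fault).
 * Returns 0 on success, or -EFAULT with the destination zeroed.
 */
static long model_load(void *dst, size_t size, const void *base, size_t off)
{
	if (!base) {
		memset(dst, 0, size);
		return -EFAULT;
	}
	memcpy(dst, (const char *)base + off, size);
	return 0;
}

/* Each hop needs its own check of the returned "r0". */
static long read_a_b_c_d(const struct a_node *a, int *val)
{
	struct b_node *b;
	struct c_node *c;
	struct d_node *d;
	long r0;

	r0 = model_load(&b, sizeof(b), a, offsetof(struct a_node, b));
	if (r0)
		return r0;
	r0 = model_load(&c, sizeof(c), b, offsetof(struct b_node, c));
	if (r0)
		return r0;
	r0 = model_load(&d, sizeof(d), c, offsetof(struct c_node, d));
	if (r0)
		return r0;
	return model_load(val, sizeof(*val), d, offsetof(struct d_node, value));
}
```

So a single source-level expression turns into four loads and four
checks, which is the inconvenience I mean.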


Thread overview: 22+ messages
2025-09-27  6:12 [PATCH RFC bpf-next 0/3] bpf: report probe fault to BPF stderr Menglong Dong
2025-09-27  6:12 ` [PATCH RFC bpf-next 1/3] " Menglong Dong
2025-10-02  2:03   ` Alexei Starovoitov
2025-10-07  6:14     ` Menglong Dong
2025-10-08 14:40       ` Leon Hwang
2025-10-08 16:27         ` bpf_errno. Was: " Alexei Starovoitov
2025-10-08 17:08           ` Kumar Kartikeya Dwivedi
2025-10-08 19:34             ` Eduard Zingerman [this message]
2025-10-08 20:08               ` Kumar Kartikeya Dwivedi
2025-10-08 20:30                 ` Eduard Zingerman
2025-10-08 20:59                   ` Kumar Kartikeya Dwivedi
2025-10-09 14:29                 ` Leon Hwang
2025-10-09 15:15                   ` Leon Hwang
2025-10-10 12:05                 ` Menglong Dong
2025-10-10 15:10                   ` Menglong Dong
2025-10-10 18:55                   ` Eduard Zingerman
2025-10-11  1:23                     ` Menglong Dong
2025-10-09 14:15           ` Leon Hwang
2025-10-09 14:45             ` Alexei Starovoitov
2025-10-10 14:22               ` Leon Hwang
2025-09-27  6:12 ` [PATCH RFC bpf-next 2/3] x86,bpf: use bpf_prog_report_probe_violation for x86 Menglong Dong
2025-09-27  6:12 ` [PATCH RFC bpf-next 3/3] selftests/bpf: add testcase for probe read fault Menglong Dong
