linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marco Elver <elver@google.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
	 Yonghong Song <yonghong.song@linux.dev>,
	John Fastabend <john.fastabend@gmail.com>,
	 KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	 Jiri Olsa <jolsa@kernel.org>, Dmitry Vyukov <dvyukov@google.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	bpf <bpf@vger.kernel.org>,
	 "open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	linux-trace-kernel@vger.kernel.org,
	 LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: Introduce bpf_probe_write_user_registered()
Date: Mon, 8 Apr 2024 11:30:11 +0200	[thread overview]
Message-ID: <CANpmjNMCJwCaGiUpMCukgruNJ9k120sJ8pVkrdpyo-Tonve2Sw@mail.gmail.com> (raw)
In-Reply-To: <CAEf4BzZCj=3hevf+Je=oed9Nisctotp_CX00NrLaO6_7+-0LSQ@mail.gmail.com>

On Fri, 5 Apr 2024 at 22:28, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Apr 5, 2024 at 1:28 AM Marco Elver <elver@google.com> wrote:
> >
> > On Fri, 5 Apr 2024 at 01:23, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
[...]
> > > and the tasks can use mmaped array shared across all or unique to each
> > > process.
> > > And both bpf and user space can read/write them with a single instruction.
> >
> > That's BPF_F_MMAPABLE, right?
> >
> > That does not work because the mmapped region is global. Our requirements are:
> >
> > 1. Single tracing BPF program.
> >
> > 2. Per-process (per VM) memory region (here it's per-thread, but each
> > thread just registers the same process-wide region).  No sharing
> > between processes.
> >
> > 3. From #2 it follows: exec unregisters the registered memory region;
> > fork gets a cloned region.
> >
> > 4. Unprivileged processes can do prctl(REGISTER). Some of them might
> > not be able to use the bpf syscall.
> >
> > The reason for #2 is that each user space process also writes to the
> > memory region (read by the BPF program to make updates depending on
> > what state it finds), and having shared state between processes
> > doesn't work here.
> >
> > Is there any reasonable BPF facility that can do this today? (If
> > BPF_F_MMAPABLE could do it while satisfying requirements 2-4, I'd be a
> > happy camper.)
>
> You could simulate something like this with multi-element ARRAY +
> BPF_F_MMAPABLE, though you'd need to pre-allocate up to max number of
> processes, so it's not an exact fit.

Right, for production use this is infeasible.

> But what seems to be much closer is using BPF task-local storage, if
> we support mmap()'ing its memory into user-space. We've had previous
> discussions on how to achieve this (the simplest being that
> mmap(task_local_map_fd, ...) maps current thread's part of BPF task
> local storage). You won't get automatic cloning (you'd have to do that
> from the BPF program on fork/exec tracepoint, for example), and within
> the process you'd probably want to have just one thread (main?) to
> mmap() initially and just share the pointer across all relevant
> threads.

In the way you imagine it, would that allow all threads sharing the
same memory, despite it being task-local? Presumably each task's local
storage would be mapped to just point to the same memory?

> But this is a more generic building block, IMO. This relying
> on BPF map also means pinning is possible and all the other BPF map
> abstraction benefits.

Deployment-wise it will make things harder because unprivileged
processes still have to somehow get the map's shared fd somehow to
mmap() it. Not unsolvable, and in general what you describe looks
interesting, but I currently can't see how it will be simpler.

In absence of all that, is a safer "bpf_probe_write_user()" like I
proposed in this patch ("bpf_probe_write_user_registered()") at all
appealing?

Thanks,
-- Marco

  reply	other threads:[~2024-04-08  9:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-04 19:01 [PATCH bpf-next 1/2] bpf: Introduce bpf_probe_write_user_registered() Marco Elver
2024-04-04 19:01 ` [PATCH bpf-next 2/2] selftests/bpf: Add test for bpf_probe_write_user_registered() Marco Elver
2024-04-04 23:23 ` [PATCH bpf-next 1/2] bpf: Introduce bpf_probe_write_user_registered() Alexei Starovoitov
2024-04-05  8:28   ` Marco Elver
2024-04-05 20:28     ` Andrii Nakryiko
2024-04-08  9:30       ` Marco Elver [this message]
2024-04-08 18:24         ` Alexei Starovoitov
2024-04-09  8:10           ` Marco Elver
2024-04-09 21:41             ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANpmjNMCJwCaGiUpMCukgruNJ9k120sJ8pVkrdpyo-Tonve2Sw@mail.gmail.com \
    --to=elver@google.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dvyukov@google.com \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).