From: Dave Marchevsky <davemarchevsky@fb.com>
To: <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Kernel Team <kernel-team@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Dave Marchevsky <davemarchevsky@fb.com>
Subject: [PATCH v1 bpf-next 0/2] bpf: Add mmapable task_local storage
Date: Mon, 20 Nov 2023 09:59:23 -0800
Message-ID: <20231120175925.733167-1-davemarchevsky@fb.com>
This series adds support for mmap()ing single task_local storage mapvals
into userspace. Two motivating usecases:
* sched_ext ([0]) schedulers might want to act on 'scheduling hints'
provided by userspace tasks. For example, a task can tag itself as
latency-sensitive but not particularly computationally intensive, and
a BPF scheduler can use this information to make better scheduling
decisions. Similarly, a database task about to start a transaction
can cheaply tag itself as doing so by writing to the mmap'd mapval.
In both cases the information is task-specific, and in the latter
case it'd be preferable to avoid incurring syscall overhead, as the
hint would change often.
* The strobemeta ([1]) technique for reading thread_local storage is
used by tracing programs at Meta to annotate tracing data with
task-specific metadata. For example, a multithreaded webserver with
a pool of worker threads preparing responses and other threads
handling request connections might want to tag threads by type, and
further tag worker threads with feature flags enabled during request
processing.
  * The strobemeta technique predates the existence of the task_local
  storage map, instead relying on reverse-engineering thread_local
  storage implementation specifics. The approach enabled here
  avoids much of this complexity.
The general thrust of this series' implementation is "simplest thing
that works". A userspace thread can mmap() a task_local storage map fd
and receive the map_value corresponding to its task. In the future we
can support mmap()ing in other threads' map_values via an offset parameter
or some other approach. Similarly, this series makes no attempt to pack
multiple map_values into a userspace-mappable page - each map_value for
a BPF_F_MMAPABLE task_local storage map is given its own page. For the
motivating usecases above neither of those potential improvements is
necessary. Patch 1's summary digs deeper into implementation details.
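The userspace side described above can be sketched roughly as follows. This is an illustration only, not code from the series: struct task_hint, mapval_mmap_len(), map_own_hint() and make_standin_fd() are hypothetical names, and the mmap() length, protection flags and offset 0 are assumptions about how such a map fd would be mapped. So that the sketch runs on kernels without this series, a temporary file stands in for the BPF_F_MMAPABLE task_local storage map fd.

```c
#include <assert.h>	/* assert() for checks in calling code */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical per-task hint layout shared with a BPF scheduler. */
struct task_hint {
	uint32_t latency_sensitive;
	uint32_t in_transaction;
};

/* Each mmapable map_value gets its own page, so round up to whole pages. */
static size_t mapval_mmap_len(size_t value_size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return (value_size + page - 1) / page * page;
}

/*
 * Map the calling task's map_value. map_fd is assumed to refer to a
 * BPF_F_MMAPABLE task_local storage map; offset 0 is an assumption.
 * Returns NULL on failure.
 */
static struct task_hint *map_own_hint(int map_fd)
{
	void *p = mmap(NULL, mapval_mmap_len(sizeof(struct task_hint)),
		       PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Stand-in for the map fd: an anonymous temporary file of one page. */
static int make_standin_fd(void)
{
	FILE *f = tmpfile();
	int fd;

	if (!f)
		return -1;
	fd = dup(fileno(f));	/* keep an fd that outlives the FILE */
	fclose(f);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)mapval_mmap_len(sizeof(struct task_hint))) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With a real map fd in place of make_standin_fd(), a database thread would call map_own_hint() once at startup and then flip hint->in_transaction with a plain store around each transaction, with no syscall on that path.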
This series' changes to the generic local_storage implementation shared by
cgroup_local storage and others will make extending this support to
those local storage types straightforward in the future.
Summary of patches:
* Patch 1 adds support for mmapable map_values in the generic
bpf_local_storage infrastructure and uses the new feature in
task_local storage
* Patch 2 adds tests
[0]: https://lore.kernel.org/bpf/20231111024835.2164816-1-tj@kernel.org/
[1]: tools/testing/selftests/bpf/progs/strobemeta*
Dave Marchevsky (2):
bpf: Support BPF_F_MMAPABLE task_local storage
selftests/bpf: Add test exercising mmapable task_local_storage
include/linux/bpf_local_storage.h | 14 +-
kernel/bpf/bpf_local_storage.c | 145 +++++++++++---
kernel/bpf/bpf_task_storage.c | 35 +++-
kernel/bpf/syscall.c | 2 +-
.../bpf/prog_tests/task_local_storage.c | 177 ++++++++++++++++++
.../bpf/progs/task_local_storage__mmap.c | 59 ++++++
.../bpf/progs/task_local_storage__mmap.h | 7 +
.../bpf/progs/task_local_storage__mmap_fail.c | 39 ++++
8 files changed, 445 insertions(+), 33 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/task_local_storage__mmap.c
create mode 100644 tools/testing/selftests/bpf/progs/task_local_storage__mmap.h
create mode 100644 tools/testing/selftests/bpf/progs/task_local_storage__mmap_fail.c
--
2.34.1