From: Akihiko Odaki <akihiko.odaki@daynix.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Jason Wang <jasowang@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Mykola Lysenko <mykolal@fb.com>, Shuah Khan <shuah@kernel.org>,
Yuri Benditovich <yuri.benditovich@daynix.com>,
Andrew Melnychenko <andrew@daynix.com>,
Benjamin Tissoires <bentiss@kernel.org>
Cc: bpf <bpf@vger.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
kvm@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
virtualization@lists.linux-foundation.org,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@vger.kernel.org>,
Network Development <netdev@vger.kernel.org>
Subject: Should I add BPF kfuncs for userspace apps? And how?
Date: Tue, 12 Dec 2023 17:05:15 +0900 [thread overview]
Message-ID: <2f33be45-fe11-4b69-8e89-4d2824a0bf01@daynix.com> (raw)
Hi,
It is said eBPF is a safe way to extend kernels and that is very
attarctive, but we need to use kfuncs to add new usage of eBPF and
kfuncs are said as unstable as EXPORT_SYMBOL_GPL. So now I'd like to ask
some questions:
1) Which should I choose, BPF kfuncs or ioctl, when adding a new feature
for userspace apps?
2) How should I use BPF kfuncs from userspace apps if I add them?
Here, a "userspace app" means something not like a system-wide daemon
like systemd (particularly, I have QEMU in mind). I'll describe the
context more below:
---
I'm working on a new feature that aids virtio-net implementations using
tuntap virtual network device. You can see [1] for details, but
basically it's to extend BPF_PROG_TYPE_SOCKET_FILTER to report four more
bytes.
However, with long discussions we have confirmed extending
BPF_PROG_TYPE_SOCKET_FILTER is not going to happen, and adding kfuncs is
the way forward. So I decided how to add kfuncs to the kernel and how to
use it. There are rich documentations for the kernel side, but I found
little about the userspace. The best I could find is a systemd change
proposal that is based on WIP kernel changes[2].
So now I'm wondering how I should use BPF kfuncs from userspace apps if
I add them. In the systemd discussion, it is told that Linus said it's
fine to use BPF kfuncs in a private infrastructure big companies own, or
in systemd as those users know well about the system[3]. Indeed, those
users should be able to make more assumptions on the kernel than
"normal" userspace applications can.
Returning to my proposal, I'm proposing a new feature to be used by QEMU
or other VMM applications. QEMU is more like a normal userspace
application, and usually does not make much assumptions on the kernel it
runs on. For example, it's generally safe to run a Debian container
including QEMU installed with apt on Fedora. BPF kfuncs may work even in
such a situation thanks to CO-RE, but it sounds like *accidentally*
creating UAPIs.
Considering all above, how can I integrate BPF kfuncs to the application?
If BPF kfuncs are like EXPORT_SYMBOL_GPL, the natural way to handle them
is to think of BPF programs as some sort of kernel modules and
incorporate logic that behaves like modprobe. More concretely, I can put
eBPF binaries to a directory like:
/usr/local/share/qemu/ebpf/$KERNEL_RELEASE
Then, QEMU can uname() and get the path to the binary. It will give an
error if it can't find the binary for the current kernel so that it
won't create accidental UAPIs.
The obvious downside of this is that it complicates packaging a lot; it
requires packaging QEMU eBPF binaries each time a new kernel comes up.
This complexity is centrally managed by modprobe for kernel modules, but
apparently each application needs to take care of it for BPF programs.
In conclusion, I see too much complexity to use BPF in a userspace
application, which we didn't have to care for
BPF_PROG_TYPE_SOCKET_FILTER. Isn't there a better way? Or shouldn't I
use BPF in my case in the first place?
Thanks,
Akihiko Odaki
[1]
https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.com/
[2] https://github.com/systemd/systemd/pull/29797
[3] https://github.com/systemd/systemd/pull/29797#discussion_r1384637939
next reply other threads:[~2023-12-12 8:05 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-12 8:05 Akihiko Odaki [this message]
2023-12-12 10:39 ` Should I add BPF kfuncs for userspace apps? And how? Benjamin Tissoires
2023-12-12 12:41 ` Akihiko Odaki
2023-12-13 10:22 ` Benjamin Tissoires
2023-12-14 5:51 ` Akihiko Odaki
2023-12-14 17:40 ` Stephen Hemminger
2023-12-15 5:49 ` Akihiko Odaki
2023-12-15 16:36 ` Stephen Hemminger
2023-12-16 8:15 ` Akihiko Odaki
2023-12-18 19:56 ` Song Liu
2023-12-19 12:16 ` Akihiko Odaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2f33be45-fe11-4b69-8e89-4d2824a0bf01@daynix.com \
--to=akihiko.odaki@daynix.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrew@daynix.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bentiss@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=mst@redhat.com \
--cc=mykolal@fb.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@google.com \
--cc=shuah@kernel.org \
--cc=virtualization@lists.linux-foundation.org \
--cc=willemdebruijn.kernel@gmail.com \
--cc=xuanzhuo@linux.alibaba.com \
--cc=yonghong.song@linux.dev \
--cc=yuri.benditovich@daynix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).