From: Tejun Heo <tj@kernel.org>
To: Josh Don <joshdon@google.com>
Cc: Rohan Kakulawaram <rohanka@google.com>,
bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Jiri Olsa <jolsa@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Matt Bobrowski <mattbobrowski@google.com>
Subject: Re: [RFC PATCH bpf-next] bpf: ephemeral cgroup BPF control programs
Date: Wed, 4 Feb 2026 10:25:14 -1000
Message-ID: <aYOrKpI7vxCUdZVr@slm.duckdns.org>
In-Reply-To: <CABk29NuyzS_XSUQM3yA0a2rqpNXg6xaGP5pVGXJoJa-xaOBdOQ@mail.gmail.com>
Hello, Josh.
On Tue, Feb 03, 2026 at 05:04:09PM -0800, Josh Don wrote:
> > Can you elaborate why this *needs* to be a separate file interface? Note
> > that this doesn't really expand what BPF progs can do with cgroups. The only
> > thing being added is a different and not-particularly-efficient way to
> > communicate with BPF progs.
>
> Each of those existing communication mechanisms has advantages and
> disadvantages, and my take is that none is really optimal for the use
> case described/implied here.
>
> For starters, I think it is important to have the interface be
> synchronous. Stat collection and reporting for example makes much more
> sense to do on a read() edge rather than arbitrarily dumping info
> continuously into a map or ring buffer or something.
>
> For the BPF iterators we already have, you could in theory pin and
> unpin as cgroups are created and destroyed, but that feels like a bit
> of a hack; at that point you don't really care about it being an
> iterator program, you're just piggy-backing off the fact that it
> exposes a seqfile interface. Add to that the trickiness of keeping
> everything in sync as the cgroup tree is modified, plus there will
> always be latency between cgroups getting created and userspace
> going to pin an iterator (especially if the jobs creating the cgroups
> are not the ones that care to pin the program).
Wouldn't a pinned BPF_PROG_RUN program fit the bill? It can serve as a generic
entry point with arbitrary input and output data. It can take the cgroup ID
along with other params, do whatever operations are necessary and then return
output in whatever format. Users don't have to know much either - just the
name of the pinned program and the input/output formats - and then call
bpf_prog_test_run_opts(). It's not a whole lot different from doing an
ioctl call.
> I also find the file-based interface incredibly convenient. You don't
> need code to make BPF upcalls or read() from an iterator fd; instead
> you can use traditional file-based APIs. Exposing a file-based
> interface also lets scripts and manual observation/manipulation work
> easily, as you can cat/grep/etc. just as with any other file. I have
> to imagine the motivation for allowing file-based pinning of
> iterators was similar.
AFAICS, this is the only actual benefit, right? Having text files as the
interface.
> Typically cgroupfs interfaces are low bandwidth communication
> mechanisms to occasionally set/get resource limits and stats. So, in
> contrast to the APIs you describe, this is also about offering a more
> flexible and convenient solution without needing to worry as much
> about efficiency.
>
> I also think this pairs pretty nicely with sched_ext as schedulers can
> define custom tuning knobs that will be automatically exposed for
> manipulation on a per-job (cgroup) basis.
Maybe, but, for cgroup-level low-frequency hinting, being able to read xattrs
on cgroupfs should be enough. For anything high-volume/high-frequency or
needing finer granularity, the cgroupfs file interface is far from ideal.
So, I don't know. I'm not dead against it, but unless I'm misunderstanding
something, the rationale seems pretty weak.
Thanks.
--
tejun
Thread overview: 4+ messages
2026-02-03 10:20 [RFC PATCH bpf-next] bpf: ephemeral cgroup BPF control programs Rohan Kakulawaram
2026-02-03 20:26 ` Tejun Heo
2026-02-04 1:04 ` Josh Don
2026-02-04 20:25 ` Tejun Heo [this message]