From: Matt Bobrowski <mattbobrowski@google.com>
To: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>,
Yonghong Song <yonghong.song@linux.dev>,
ohn Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Jiri Olsa <jolsa@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Chuyi Zhou <zhouchuyi@bytedance.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: add new BPF_CGROUP_ITER_CHILDREN_ONLY control option
Date: Thu, 22 Jan 2026 12:31:44 +0000 [thread overview]
Message-ID: <aXIYsFwaov8SQczm@google.com> (raw)
In-Reply-To: <CAPhsuW6B27WYpkT-ZY1wgBUNQknM9pKZm+oGk+bM_p33mVPx_A@mail.gmail.com>
On Wed, Jan 21, 2026 at 11:14:10AM -0800, Song Liu wrote:
> On Wed, Jan 21, 2026 at 5:54 AM Matt Bobrowski <mattbobrowski@google.com> wrote:
> >
> > Currently, the BPF cgroup iterator supports walking descendants in
> > either pre-order (BPF_CGROUP_ITER_DESCENDANTS_PRE) or post-order
> > (BPF_CGROUP_ITER_DESCENDANTS_POST). These modes perform an exhaustive
> > depth-first search (DFS) of the hierarchy. In scenarios where a BPF
> > program may need to inspect only the direct children of a given parent
> > cgroup, a full DFS is unnecessarily expensive.
> >
> > This patch introduces a new BPF cgroup iterator control option,
> > BPF_CGROUP_ITER_CHILDREN_ONLY. This control option restricts the
> > traversal to the immediate children of a specified parent cgroup,
> > allowing for more targeted and efficient iteration, particularly when
> > exhaustive depth-first search (DFS) traversal is not required.
> >
> > Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
>
> The code looks good to me.
>
> Some nitpick and some high level questions/ideas below
Sure, thank you for taking a look!
> > enum bpf_cgroup_iter_order {
> > BPF_CGROUP_ITER_ORDER_UNSPEC = 0,
> > - BPF_CGROUP_ITER_SELF_ONLY, /* process only a single object. */
> > - BPF_CGROUP_ITER_DESCENDANTS_PRE, /* walk descendants in pre-order. */
> > - BPF_CGROUP_ITER_DESCENDANTS_POST, /* walk descendants in post-order. */
> > - BPF_CGROUP_ITER_ANCESTORS_UP, /* walk ancestors upward. */
> > + BPF_CGROUP_ITER_SELF_ONLY, /* process only a single object. */
> > + BPF_CGROUP_ITER_DESCENDANTS_PRE, /* walk descendants in pre-order. */
> > + BPF_CGROUP_ITER_DESCENDANTS_POST, /* walk descendants in post-order. */
> > + BPF_CGROUP_ITER_ANCESTORS_UP, /* walk ancestors upward. */
>
> Changes above seem unnecessary.
This is just noise, sorry. Will revert this once I send out v2.
> > + /*
> > + * Walks the immediate children of the specified parent
> > + * cgroup_subsys_state. Unlike BPF_CGROUP_ITER_DESCENDANTS_PRE,
> > + * BPF_CGROUP_ITER_DESCENDANTS_POST, and BPF_CGROUP_ITER_ANCESTORS_UP
> > + * the iterator does not include the specified parent as one of the
> > + * returned iterator elements.
> > + */
> > + BPF_CGROUP_ITER_CHILDREN_ONLY,
> > };
>
> [...]
>
> > @@ -320,6 +332,7 @@ __bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css *it,
> > case BPF_CGROUP_ITER_DESCENDANTS_PRE:
> > case BPF_CGROUP_ITER_DESCENDANTS_POST:
> > case BPF_CGROUP_ITER_ANCESTORS_UP:
> > + case BPF_CGROUP_ITER_CHILDREN_ONLY:
> > break;
> > default:
> > return -EINVAL;
> > @@ -345,6 +358,9 @@ __bpf_kfunc struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *i
> > case BPF_CGROUP_ITER_DESCENDANTS_POST:
> > kit->pos = css_next_descendant_post(kit->pos, kit->start);
> > break;
> > + case BPF_CGROUP_ITER_CHILDREN_ONLY:
> > + kit->pos = css_next_child(kit->pos, kit->start);
> > + break;
> > case BPF_CGROUP_ITER_ANCESTORS_UP:
> > kit->pos = kit->pos ? kit->pos->parent : kit->start;
> > }
>
> I wonder whether we can use the return values and/or the kfuncs to
> enable more flexible walks. For example, some return value means
> "do not walk more children, go back to the parent". This will enable
> use cases like "walk children and grandchildren nodes from here,
> but not any deeper". WDYT?
Some general control over whether the DFS traversal should continue in
a specific direction could be nice. That way you could reverse out of
a specific subtree/branch early without necessarily having to walk all
nodes under a given subtree/branch.
With the above said however, it's not really a control/semantic which
I'm after at this point. I'm purely interested in exhaustively
iterating children of a given parent before selectively deciding which
subtree/branch, and therefore parent, I'd like to explore next.
next prev parent reply other threads:[~2026-01-22 12:31 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-21 13:54 [PATCH bpf-next 1/2] bpf: add new BPF_CGROUP_ITER_CHILDREN_ONLY control option Matt Bobrowski
2026-01-21 13:54 ` [PATCH bpf-next 2/2] bpf/selftests: cover " Matt Bobrowski
2026-01-21 19:14 ` [PATCH bpf-next 1/2] bpf: add new " Song Liu
2026-01-22 12:31 ` Matt Bobrowski [this message]
2026-01-23 4:26 ` Alexei Starovoitov
2026-01-23 11:06 ` Matt Bobrowski
2026-01-23 17:17 ` Alexei Starovoitov
2026-01-26 9:03 ` Matt Bobrowski
2026-01-27 2:26 ` Alexei Starovoitov
2026-01-27 8:28 ` Matt Bobrowski
2026-01-23 18:50 ` Tejun Heo
2026-01-26 9:14 ` Matt Bobrowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXIYsFwaov8SQczm@google.com \
--to=mattbobrowski@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=roman.gushchin@linux.dev \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=tj@kernel.org \
--cc=yonghong.song@linux.dev \
--cc=zhouchuyi@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox