public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Matt Bobrowski <mattbobrowski@google.com>
To: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	ohn Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Jiri Olsa <jolsa@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Chuyi Zhou <zhouchuyi@bytedance.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: add new BPF_CGROUP_ITER_CHILDREN_ONLY control option
Date: Thu, 22 Jan 2026 12:31:44 +0000	[thread overview]
Message-ID: <aXIYsFwaov8SQczm@google.com> (raw)
In-Reply-To: <CAPhsuW6B27WYpkT-ZY1wgBUNQknM9pKZm+oGk+bM_p33mVPx_A@mail.gmail.com>

On Wed, Jan 21, 2026 at 11:14:10AM -0800, Song Liu wrote:
> On Wed, Jan 21, 2026 at 5:54 AM Matt Bobrowski <mattbobrowski@google.com> wrote:
> >
> > Currently, the BPF cgroup iterator supports walking descendants in
> > either pre-order (BPF_CGROUP_ITER_DESCENDANTS_PRE) or post-order
> > (BPF_CGROUP_ITER_DESCENDANTS_POST). These modes perform an exhaustive
> > depth-first search (DFS) of the hierarchy. In scenarios where a BPF
> > program may need to inspect only the direct children of a given parent
> > cgroup, a full DFS is unnecessarily expensive.
> >
> > This patch introduces a new BPF cgroup iterator control option,
> > BPF_CGROUP_ITER_CHILDREN_ONLY. This control option restricts the
> > traversal to the immediate children of a specified parent cgroup,
> > allowing for more targeted and efficient iteration, particularly when
> > exhaustive depth-first search (DFS) traversal is not required.
> >
> > Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
> 
> The code looks good to me.
>
> Some nitpick and some high level questions/ideas below

Sure, thank you for taking a look!

> >  enum bpf_cgroup_iter_order {
> >         BPF_CGROUP_ITER_ORDER_UNSPEC = 0,
> > -       BPF_CGROUP_ITER_SELF_ONLY,              /* process only a single object. */
> > -       BPF_CGROUP_ITER_DESCENDANTS_PRE,        /* walk descendants in pre-order. */
> > -       BPF_CGROUP_ITER_DESCENDANTS_POST,       /* walk descendants in post-order. */
> > -       BPF_CGROUP_ITER_ANCESTORS_UP,           /* walk ancestors upward. */
> > +       BPF_CGROUP_ITER_SELF_ONLY,              /* process only a single object. */
> > +       BPF_CGROUP_ITER_DESCENDANTS_PRE,        /* walk descendants in pre-order. */
> > +       BPF_CGROUP_ITER_DESCENDANTS_POST,       /* walk descendants in post-order. */
> > +       BPF_CGROUP_ITER_ANCESTORS_UP,           /* walk ancestors upward. */
> 
> Changes above seem unnecessary.

This is just noise, sorry. Will revert this once I send out v2.

> > +       /*
> > +        * Walks the immediate children of the specified parent
> > +        * cgroup_subsys_state. Unlike BPF_CGROUP_ITER_DESCENDANTS_PRE,
> > +        * BPF_CGROUP_ITER_DESCENDANTS_POST, and BPF_CGROUP_ITER_ANCESTORS_UP
> > +        * the iterator does not include the specified parent as one of the
> > +        * returned iterator elements.
> > +        */
> > +       BPF_CGROUP_ITER_CHILDREN_ONLY,
> >  };
> 
> [...]
> 
> > @@ -320,6 +332,7 @@ __bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css *it,
> >         case BPF_CGROUP_ITER_DESCENDANTS_PRE:
> >         case BPF_CGROUP_ITER_DESCENDANTS_POST:
> >         case BPF_CGROUP_ITER_ANCESTORS_UP:
> > +       case BPF_CGROUP_ITER_CHILDREN_ONLY:
> >                 break;
> >         default:
> >                 return -EINVAL;
> > @@ -345,6 +358,9 @@ __bpf_kfunc struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *i
> >         case BPF_CGROUP_ITER_DESCENDANTS_POST:
> >                 kit->pos = css_next_descendant_post(kit->pos, kit->start);
> >                 break;
> > +       case BPF_CGROUP_ITER_CHILDREN_ONLY:
> > +               kit->pos = css_next_child(kit->pos, kit->start);
> > +               break;
> >         case BPF_CGROUP_ITER_ANCESTORS_UP:
> >                 kit->pos = kit->pos ? kit->pos->parent : kit->start;
> >         }
> 
> I wonder whether we can use the return values and/or the kfuncs to
> enable more flexible walks. For example, some return value means
> "do not walk more children, go back to the parent". This will enable
> use cases like "walk children and grandchildren nodes from here,
> but not any deeper". WDYT?

Some general control over whether the DFS traversal should continue in
a specific direction could be nice. That way you could reverse out of
a specific subtree/branch early without necessarily having to walk all
nodes under a given subtree/branch.

With the above said however, it's not really a control/semantic which
I'm after at this point. I'm purely interested in exhaustively
iterating children of a given parent before selectively deciding which
subtree/branch, and therefore parent, I'd like to explore next.

  reply	other threads:[~2026-01-22 12:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-21 13:54 [PATCH bpf-next 1/2] bpf: add new BPF_CGROUP_ITER_CHILDREN_ONLY control option Matt Bobrowski
2026-01-21 13:54 ` [PATCH bpf-next 2/2] bpf/selftests: cover " Matt Bobrowski
2026-01-21 19:14 ` [PATCH bpf-next 1/2] bpf: add new " Song Liu
2026-01-22 12:31   ` Matt Bobrowski [this message]
2026-01-23  4:26 ` Alexei Starovoitov
2026-01-23 11:06   ` Matt Bobrowski
2026-01-23 17:17     ` Alexei Starovoitov
2026-01-26  9:03       ` Matt Bobrowski
2026-01-27  2:26         ` Alexei Starovoitov
2026-01-27  8:28           ` Matt Bobrowski
2026-01-23 18:50     ` Tejun Heo
2026-01-26  9:14       ` Matt Bobrowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXIYsFwaov8SQczm@google.com \
    --to=mattbobrowski@google.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    --cc=yonghong.song@linux.dev \
    --cc=zhouchuyi@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox