Netdev List
 help / color / mirror / Atom feed
From: "teawater" <hui.zhu@linux.dev>
To: "Usama Arif" <usama.arif@linux.dev>
Cc: "Usama Arif" <usama.arif@linux.dev>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
	"Song Liu" <song@kernel.org>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"JP Kobryn" <inwardvessel@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Shuah Khan" <shuah@kernel.org>,
	davem@davemloft.net, "Jakub Kicinski" <kuba@kernel.org>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Stanislav Fomichev" <sdf@fomichev.me>,
	"KP Singh" <kpsingh@kernel.org>,
	"Tao Chen" <chen.dylane@linux.dev>,
	"Mykyta Yatsenko" <yatsenko@meta.com>,
	"Leon Hwang" <leon.hwang@linux.dev>,
	"Anton Protopopov" <a.s.protopopov@gmail.com>,
	"Amery Hung" <ameryhung@gmail.com>,
	"Tobias Klauser" <tklauser@distanz.ch>,
	"Eyal Birger" <eyal.birger@gmail.com>,
	"Rong Tao" <rongtao@cestc.cn>, "Hao Luo" <haoluo@google.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Nathan Chancellor" <nathan@kernel.org>,
	"Kees Cook" <kees@kernel.org>, "Tejun Heo" <tj@kernel.org>,
	"Jeff Xu" <jeffxu@chromium.org>,
	mkoutny@suse.com, "Jan Hendrik Farr" <kernel@jfarr.cc>,
	"Christian Brauner" <brauner@kernel.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Brian Gerst" <brgerst@gmail.com>,
	"Masahiro Yamada" <masahiroy@kernel.org>,
	"Willem de Bruijn" <willemb@google.com>,
	"Jason Xing" <kerneljasonxing@gmail.com>,
	"Paul Chaignon" <paul.chaignon@gmail.com>,
	"Chen Ridong" <chenridong@huaweicloud.com>,
	"Lance Yang" <lance.yang@linux.dev>,
	"Jiayuan Chen" <jiayuan.chen@linux.dev>,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, linux-kselftest@vger.kernel.org,
	geliang@kernel.org, baohua@kernel.org,
	"Hui Zhu" <zhuhui@kylinos.cn>
Subject: Re: [RFC PATCH bpf-next v7 00/11] mm: BPF struct_ops for dynamic memory protection and async reclaim
Date: Wed, 27 May 2026 09:31:25 +0000	[thread overview]
Message-ID: <7f440116b23fba1e20fe70fda502ad66f7dbf158@linux.dev> (raw)
In-Reply-To: <20260526134115.816081-1-usama.arif@linux.dev>

> 
> On Tue, 26 May 2026 10:20:00 +0800 Hui Zhu <hui.zhu@linux.dev> wrote:
> 
> > 
> > From: Hui Zhu <zhuhui@kylinos.cn>
> >  
> >  Overview:
> >  This series introduces BPF struct_ops support for the memory controller,
> >  enabling userspace BPF programs to implement custom, dynamic memory
> >  management policies per cgroup. The feature allows BPF programs to hook
> >  into the core reclaim and charge paths without requiring kernel
> >  modifications, providing a flexible alternative to static knobs such as
> >  memory.low and memory.min.
> >  
> >  The series enables two complementary use cases.
> >  
...
...
...
> >  
> >  Asynchronous proactive reclaim: the memcg_charged and memcg_uncharged
> >  hooks, combined with the BPF workqueue mechanism and the new
> >  bpf_try_to_free_mem_cgroup_pages() kfunc, enable BPF programs to perform
> >  proactive background reclaim without blocking the charge path. The
> >  pattern works as follows: the memcg_charged callback tracks accumulated
> >  memory usage; when usage crosses a configurable threshold, it enqueues an
> >  asynchronous work item via bpf_wq_start() and returns immediately without
> >  throttling the charging task. The workqueue callback then invokes
> >  bpf_try_to_free_mem_cgroup_pages() to reclaim pages from the target
> >  cgroup; if usage remains elevated after reclaim, the callback re-enqueues
> >  itself to continue. This allows a BPF program to keep a cgroup's
> >  footprint below its hard limit (memory.max) entirely in the background,
> >  avoiding the OOM killer or direct-reclaim stalls that would otherwise
> >  occur. The selftest for this feature (patch 10/11) validates the
> >  mechanism concretely: a workload that writes and mmaps a 64 MB file inside
> >  a 32 MB cgroup reliably triggers memory.events "max" events without BPF;
> >  with the async reclaim program attached, the "max" counter does not
> >  increase at all across the same workload.
> > 
> Hi Hui,
> 
> Thanks for the series.
> Would it not be simpler to just have another memcg knob, something like
> memory.high_async.
> When memory usage > memory.high_async, queue a per-memcg work item that calls
> try_to_free_mem_cgroup_pages() until usage drops back below some threshold.
> I am not sure I see what programability aspect from bpf you need here.
> 
> Thanks

Hi Usama,

That's a good question.

By introducing a new BPF kfunc bpf_try_to_free_mem_cgroup_pages,
a BPF program can flexibly control when to start and stop async
reclaim, rather than being constrained to trigger and stop based
solely on memcg usage or one or two fixed events, as with
traditional proactive reclaim interfaces.

For example, async reclaim could be triggered based on PSI, or
on the number of page faults, or even on a combination of
multiple events working together to decide both when to start
and when to stop async reclaim.

That is the motivation behind adding the BPF kfunc
bpf_try_to_free_mem_cgroup_pages in this patch set.

I admit the cover letter did not explain this well enough, and
the example code does not demonstrate this use case either. I
will address both in the next version.

Best,
Hui

> 
> > 
> > 08/11 selftests/bpf: Add tests for memcg_bpf_ops
> >  Adds prog_tests/memcg_ops.c covering three scenarios:
> >  memcg_charged-only throttling, below_low + memcg_charged
> >  interaction, and below_min + memcg_charged interaction. A
> >  tracepoint on memcg:count_memcg_events (PGFAULT) is used to
> >  detect memory pressure and trigger hooks accordingly.
...
...
...
> >  -- 
> >  2.43.0
> >
>

  reply	other threads:[~2026-05-27  9:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26  2:20 [RFC PATCH bpf-next v7 00/11] mm: BPF struct_ops for dynamic memory protection and async reclaim Hui Zhu
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 01/11] bpf: move bpf_struct_ops_link into bpf.h Hui Zhu
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 02/11] bpf: allow attaching struct_ops to cgroups Hui Zhu
2026-05-26  3:19   ` bot+bpf-ci
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 03/11] libbpf: fix return value on memory allocation failure Hui Zhu
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 04/11] libbpf: introduce bpf_map__attach_struct_ops_opts() Hui Zhu
2026-05-26  3:06   ` bot+bpf-ci
2026-05-27 15:43   ` Yonghong Song
2026-05-26  2:20 ` [RFC PATCH bpf-next v7 05/11] bpf: Pass flags in bpf_link_create for struct_ops Hui Zhu
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 06/11] mm: memcontrol: Add BPF struct_ops for memory controller Hui Zhu
2026-05-26  3:19   ` bot+bpf-ci
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 07/11] mm/bpf: Add bpf_try_to_free_mem_cgroup_pages kfunc Hui Zhu
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  2:24 ` [RFC PATCH bpf-next v7 08/11] selftests/bpf: Add tests for memcg_bpf_ops Hui Zhu
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 09/11] selftests/bpf: Add test for memcg_bpf_ops hierarchies Hui Zhu
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 10/11] selftests/bpf: Add selftest for memcg async reclaim via BPF Hui Zhu
2026-05-26  3:06   ` bot+bpf-ci
2026-05-26  2:27 ` [RFC PATCH bpf-next v7 11/11] samples/bpf: Add memcg priority control and async reclaim example Hui Zhu
2026-05-26 13:41 ` [RFC PATCH bpf-next v7 00/11] mm: BPF struct_ops for dynamic memory protection and async reclaim Usama Arif
2026-05-27  9:31   ` teawater [this message]
2026-05-27  8:47 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7f440116b23fba1e20fe70fda502ad66f7dbf158@linux.dev \
    --to=hui.zhu@linux.dev \
    --cc=a.s.protopopov@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=baohua@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=brgerst@gmail.com \
    --cc=cgroups@vger.kernel.org \
    --cc=chen.dylane@linux.dev \
    --cc=chenridong@huaweicloud.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eddyz87@gmail.com \
    --cc=eyal.birger@gmail.com \
    --cc=geliang@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=inwardvessel@gmail.com \
    --cc=jeffxu@chromium.org \
    --cc=jiayuan.chen@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kees@kernel.org \
    --cc=kernel@jfarr.cc \
    --cc=kerneljasonxing@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=lance.yang@linux.dev \
    --cc=leon.hwang@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=martin.lau@linux.dev \
    --cc=masahiroy@kernel.org \
    --cc=memxor@gmail.com \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=nathan@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=paul.chaignon@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=roman.gushchin@linux.dev \
    --cc=rongtao@cestc.cn \
    --cc=sdf@fomichev.me \
    --cc=shakeel.butt@linux.dev \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    --cc=tklauser@distanz.ch \
    --cc=usama.arif@linux.dev \
    --cc=willemb@google.com \
    --cc=yatsenko@meta.com \
    --cc=yonghong.song@linux.dev \
    --cc=zhuhui@kylinos.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox