From: hui.zhu@linux.dev
To: "Michal Hocko" <mhocko@suse.com>
Cc: "Roman Gushchin" <roman.gushchin@linux.dev>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Song Liu" <song@kernel.org>,
"Yonghong Song" <yonghong.song@linux.dev>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
"Shuah Khan" <shuah@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Nathan Chancellor" <nathan@kernel.org>,
"Kees Cook" <kees@kernel.org>, "Tejun Heo" <tj@kernel.org>,
"Jeff Xu" <jeffxu@chromium.org>,
mkoutny@suse.com, "Jan Hendrik Farr" <kernel@jfarr.cc>,
"Christian Brauner" <brauner@kernel.org>,
"Randy Dunlap" <rdunlap@infradead.org>,
"Brian Gerst" <brgerst@gmail.com>,
"Masahiro Yamada" <masahiroy@kernel.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org, bpf@vger.kernel.org,
linux-kselftest@vger.kernel.org, "Hui Zhu" <zhuhui@kylinos.cn>
Subject: Re: [RFC PATCH 0/3] Memory Controller eBPF support
Date: Fri, 21 Nov 2025 02:46:31 +0000
Message-ID: <f5c4c443f8ba855d329a180a6816fc259eb8dfca@linux.dev>
In-Reply-To: <aR9p8n3VzpNHdPFw@tiehlicka>
On 21 November 2025 at 03:20, "Michal Hocko" <mhocko@suse.com> wrote:
>
> On Thu 20-11-25 09:29:52, hui.zhu@linux.dev wrote:
> [...]
>
> >
> > I generally agree with an idea to use BPF for various memcg-related
> > policies, but I'm not sure how specific callbacks can be used in
> > practice.
> >
> > Hi Roman,
> >
> > Here are some ideas for how eBPF support in memcg could be used:
> >
> > Priority‑Based Reclaim and Limits in Multi‑Tenant Environments:
> > On a single machine with multiple tenants / namespaces / containers,
> > under memory pressure it’s hard to decide “who should be squeezed first”
> > with static policies baked into the kernel.
> > Assign a BPF profile to each tenant’s memcg:
> > Under high global pressure, BPF can decide:
> > Which memcgs’ memory.high should be raised (delaying reclaim),
> > Which memcgs should be scanned and reclaimed more aggressively.
> >
> > Online Profiling / Diagnosing Memory Hotspots:
> > A cgroup’s memory keeps growing, but without patching the kernel it’s
> > difficult to obtain fine‑grained information.
> > Attach BPF to the memcg charge/uncharge path:
> > Record large allocations (greater than N KB) with call stacks and
> > owning file/module, and send them to user space via a BPF ring buffer.
> > Based on sampled data, generate:
> > “Top N memory allocation stacks in this container over the last 10 minutes,”
> > Reports of which objects / call paths are growing fastest.
> > This makes it possible to pinpoint the root cause of host memory
> > anomalies without changing application code, which is very useful
> > in operations scenarios.
> >
> > SLO‑Driven Auto Throttling / Scale‑In/Out Signals:
> > Use eBPF to observe memory usage slope, frequent reclaim,
> > or near‑OOM behavior within a memcg.
> > When it decides “OOM is imminent,” instead of just killing/raising
> > limits, it can emit a signal to a control‑plane component.
> > For example, send an event to a user‑space agent to trigger
> > automatic scaling, QPS adjustment, or throttling.
> >
> > Prevent a cgroup from launching a large‑scale fork+malloc attack:
> > BPF checks per‑uid or per‑cgroup allocation behavior over the
> > last few seconds during memcg charge.
> >
> AFAIU, these are just very high level ideas rather than anything you are
> trying to target with this patch series, right?
>
> All I can see is that you add a reclaim hook but it is not really clear
> to me how feasible it is to actually implement a real memory reclaim
> strategy this way.
>
> In principle I am not really opposed, but the memory reclaim process is
> a rather involved process and I would really like to see that there is
> something real to be done without exporting all the MM code to BPF for
> any practical use. Is there any POC out there?
Hi Michal,
I apologize for not delivering a more substantial POC.
I was hesitant to add extensive eBPF support to memcg
because I wasn't certain it aligned with the community's
vision—and such support would require introducing many
eBPF hooks into memcg.
I will add more eBPF hooks to memcg and provide a more
meaningful POC in the next version.
Best,
Hui
> --
> Michal Hocko
> SUSE Labs
>
Thread overview: 19+ messages
2025-11-19 1:34 [RFC PATCH 0/3] Memory Controller eBPF support Hui Zhu
2025-11-19 1:34 ` [RFC PATCH 1/3] memcg: add eBPF struct ops support for memory charging Hui Zhu
2025-11-19 2:10 ` bot+bpf-ci
2025-11-19 16:07 ` Tejun Heo
2025-11-21 19:24 ` kernel test robot
2025-11-19 1:34 ` [RFC PATCH 2/3] selftests/bpf: add memcg eBPF struct ops test Hui Zhu
2025-11-19 2:19 ` bot+bpf-ci
2025-11-19 1:34 ` [RFC PATCH 3/3] samples/bpf: add example memcg eBPF program Hui Zhu
2025-11-19 2:19 ` bot+bpf-ci
2025-11-20 3:04 ` [RFC PATCH 0/3] Memory Controller eBPF support Roman Gushchin
2025-11-20 9:29 ` hui.zhu
2025-11-20 19:20 ` Michal Hocko
2025-11-21 2:46 ` hui.zhu [this message]
2025-11-25 12:12 ` Michal Hocko
2025-11-25 12:39 ` hui.zhu
2025-11-25 12:55 ` Michal Hocko
2025-11-26 3:05 ` hui.zhu
2025-11-26 16:01 ` Michal Hocko
2025-11-27 8:51 ` hui.zhu