From: Usama Arif <usama.arif@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
david@kernel.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, tj@kernel.org, mkoutny@suse.com,
shakeel.butt@linux.dev, roman.gushchin@linux.dev,
liam@infradead.org, linux-kernel@vger.kernel.org, ljs@kernel.org,
mhocko@suse.com, rppt@kernel.org, surenb@google.com,
vbabka@kernel.org, kernel-team@meta.com,
Usama Arif <usama.arif@linux.dev>
Subject: [PATCH 0/2] mm/vmpressure: reduce CPU, memory and code overhead on cgroup v2
Date: Sat, 6 Jun 2026 04:41:32 -0700
Message-ID: <20260606114158.3126210-1-usama.arif@linux.dev>

The vmpressure subsystem has two distinct consumers, gated by the
@tree argument:

  tree=false : in-kernel socket pressure, consumed by TCP/SCTP. This
               is cgroup v2 only; v1 sockets read memcg->tcpmem_pressure
               instead.

  tree=true  : cgroup v1 userspace eventfd notifications via the
               memory.pressure_level / cgroup.event_control interface.
               v2 has no equivalent (userspace gets reclaim signals
               through memory.pressure / PSI, which doesn't touch
               vmpressure).

So of the four (hierarchy, tree) combinations, only two carry data
that anyone reads. The existing early return in vmpressure() covered
v1 + tree=false; the symmetric v2 + tree=true case was falling through
and doing the full lock / accumulate / schedule_work / parent-walk
dance, even though the events list it eventually iterates is empty
on cgroup v2 (vmpressure_register_event() is wired up only through the
v1 cftype "memory.pressure_level" and can't be reached from a v2
memcg).
Patch 1 extends the existing early return to also skip v2 + tree=true.
On a v2-only host this eliminates a contended path where reclaimers
can serialize on a single global sr_lock. bpftrace on a 176-core
production host (cgroup v2, 285 memcgs, sustained reclaim) showed
~16,200 such calls per minute with tree=true.
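
For illustration, the consumer matrix described above can be modelled
in a few lines of userspace C. This is only a sketch of the decision,
not the code in the patch: vmpressure_has_consumer() and its on_v2
parameter are hypothetical names that do not exist in the kernel.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model (not kernel code): report whether a
 * (hierarchy, tree) combination has anyone reading the accumulated
 * data. A caller would return early whenever this is false.
 */
static bool vmpressure_has_consumer(bool on_v2, bool tree)
{
	/*
	 * cgroup v1: only tree=true is consumed (userspace eventfd
	 * notifications via memory.pressure_level).
	 * cgroup v2: only tree=false is consumed (in-kernel socket
	 * pressure).
	 */
	return on_v2 ? !tree : tree;
}
```

In this model the existing early return corresponds to
(on_v2=false, tree=false), and patch 1 adds the symmetric
(on_v2=true, tree=true) case.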
Patch 2 follows up with a cleanup: it splits the v1 userspace eventfd
interface (struct vmpressure_event, the events list and its mutex, the
work_struct and its handler, the parent walk,
vmpressure_register_event / unregister_event, and vmpressure_prio)
into a new mm/vmpressure-v1.c built only when CONFIG_MEMCG_V1=y,
behind small no-op stubs in the header. mm/vmpressure.c keeps the
shared bits and the tree=false socket-pressure path. vmpressure.c
shrinks to roughly half its size and the code becomes much simpler.
The only #ifdef CONFIG_MEMCG_V1 remaining in the source is around the
v1-only fields inside struct vmpressure itself. Memory savings on
CONFIG_MEMCG_V1=n:
struct vmpressure : 112B -> 24B
struct mem_cgroup : 1664B -> 1536B
Usama Arif (2):
mm/vmpressure: skip tree=true accounting on cgroup v2
mm/vmpressure: split v1 userspace eventfd code into vmpressure-v1.c
include/linux/vmpressure.h | 46 +++++-
mm/Makefile | 2 +-
mm/vmpressure-v1.c | 305 +++++++++++++++++++++++++++++++++++++
mm/vmpressure.c | 303 +++---------------------------------
4 files changed, 364 insertions(+), 292 deletions(-)
create mode 100644 mm/vmpressure-v1.c
--
2.52.0
Thread overview: 9+ messages
2026-06-06 11:41 Usama Arif [this message]
2026-06-06 11:41 ` [PATCH 1/2] mm/vmpressure: skip tree=true accounting on cgroup v2 Usama Arif
2026-06-08 17:06 ` Shakeel Butt
2026-06-06 11:41 ` [PATCH 2/2] mm/vmpressure: split v1 userspace eventfd code into vmpressure-v1.c Usama Arif
2026-06-08 17:05 ` [PATCH 0/2] mm/vmpressure: reduce CPU, memory and code overhead on cgroup v2 Shakeel Butt
2026-06-08 18:49 ` Usama Arif
2026-06-08 19:56 ` Shakeel Butt
2026-06-08 21:19 ` Usama Arif
2026-06-08 22:26 ` Shakeel Butt