From: Shakeel Butt <shakeel.butt@linux.dev>
To: Usama Arif <usama.arif@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
david@kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
tj@kernel.org, mkoutny@suse.com, roman.gushchin@linux.dev,
liam@infradead.org, linux-kernel@vger.kernel.org, ljs@kernel.org,
mhocko@suse.com, rppt@kernel.org, surenb@google.com,
vbabka@kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 0/2] mm/vmpressure: reduce CPU, memory and code overhead on cgroup v2
Date: Mon, 8 Jun 2026 10:05:30 -0700
Message-ID: <aib11jSWosT6635u@linux.dev>
In-Reply-To: <20260606114158.3126210-1-usama.arif@linux.dev>
On Sat, Jun 06, 2026 at 04:41:32AM -0700, Usama Arif wrote:
> The vmpressure subsystem has two distinct consumers, gated by the
> @tree argument:
>
>   tree=false : in-kernel socket pressure, consumed by TCP/SCTP. This
>                is cgroup v2 only; v1 sockets read memcg->tcpmem_pressure
>                instead.

We should really move v2 away from vmpressure.

>   tree=true  : cgroup v1 userspace eventfd notifications via the
>                memory.pressure_level / cgroup.event_control interface.
>                v2 has no equivalent (userspace gets reclaim signals
>                through memory.pressure / PSI, which doesn't touch
>                vmpressure).
>
> So of the four (hierarchy, tree) combinations, only two carry data
> that anyone reads. The existing early return in vmpressure() covered
> v1 + tree=false; the symmetric v2 + tree=true case was falling through
> and doing the full lock / accumulate / schedule_work / parent-walk
> dance, even though the events list it eventually iterates is empty
> on cgroup v2 (vmpressure_register_event() is wired up only through the
> v1 cftype "memory.pressure_level" and can't be reached from a v2
> memcg).
>
> Patch 1 extends the existing early return to also skip v2 + tree=true.
> On a v2-only host this eliminates a contended path where reclaimers
> can serialize on a single global sr_lock. bpftrace on a 176-core production
> host (cgroup v2, 285 memcgs, sustained reclaim) showed ~16,200 such calls
> per minute with tree = true.
This is good.
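
To make the four (hierarchy, tree) cases concrete, here is a minimal
userspace sketch of the check as the cover letter describes it. It is not
the kernel code: the function names and the on_v2 flag (standing in for
"the memory controller is on the v2 hierarchy") are hypothetical and exist
only for illustration.

```c
#include <stdbool.h>

/*
 * Userspace model only, not kernel code. All names are hypothetical.
 * on_v2: the memcg lives on the cgroup v2 hierarchy.
 * tree:  the @tree argument passed to vmpressure().
 */

/* True when some consumer actually reads the data for this combination. */
static bool vmpressure_has_consumer(bool on_v2, bool tree)
{
	/*
	 * v2 + tree=false: in-kernel socket pressure (TCP/SCTP).
	 * v1 + tree=true:  userspace eventfd notifications.
	 * The other two combinations have no reader.
	 */
	return on_v2 != tree;
}

/* Early return as described before patch 1: only v1 + tree=false. */
static bool skipped_before_patch1(bool on_v2, bool tree)
{
	return !on_v2 && !tree;
}

/* Early return as described after patch 1: every unread combination. */
static bool skipped_after_patch1(bool on_v2, bool tree)
{
	return !vmpressure_has_consumer(on_v2, tree);
}
```

In this model the only combination whose outcome changes is v2 + tree=true,
which is the case the bpftrace numbers above were measuring.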
>
> Patch 2 follows up with a cleanup: it splits the v1 userspace eventfd
> interface (struct vmpressure_event, the events list and its mutex, the
> work_struct and its handler, the parent walk,
> vmpressure_register_event / unregister_event, and vmpressure_prio)
> into a new mm/vmpressure-v1.c built only when CONFIG_MEMCG_V1=y,
> behind small no-op stubs in the header. mm/vmpressure.c keeps the
> shared bits and the tree=false socket-pressure path. The size of
> vmpressure.c goes down to half and the code is much simpler.
> The only #ifdef CONFIG_MEMCG_V1 remaining in source is around the
> v1-only fields inside struct vmpressure itself. Memory savings on
> CONFIG_MEMCG_V1=n:
>   struct vmpressure : 112B  -> 24B
>   struct mem_cgroup : 1664B -> 1536B
For this, I wonder whether we should just go ahead and work towards making
vmpressure memcg-v1 only. Patch 2 makes sense only if we foresee that this
needs a lot of work or complex work.