From: YoungJun Park <youngjun.park@lge.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: akpm@linux-foundation.org, chrisl@kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	kasong@tencent.com, hannes@cmpxchg.org, mhocko@kernel.org,
	roman.gushchin@linux.dev, shakeel.butt@linux.dev,
	muchun.song@linux.dev, shikemeng@huaweicloud.com,
	baoquan.he@linux.dev, baohua@kernel.org, yosry@kernel.org,
	gunho.lee@lge.com, taejoon.song@lge.com, hyungjun.cho@lge.com,
	mkoutny@suse.com, baver.bae@lge.com, matia.kim@lge.com
Subject: Re: [PATCH v8 0/4] mm/swap, memcg: Introduce swap tiers for cgroup based swap control
Date: Thu, 18 Jun 2026 10:47:53 +0900
Message-ID: <ajNOSesjwTyZc8EX@yjaykim-PowerEdge-T330>
In-Reply-To: <CAKEwX=NfSy0XiD_UMsDOHGCwpE7sYmBmhV4Y9vk_cbnnr6J6PQ@mail.gmail.com>
On Wed, Jun 17, 2026 at 01:50:49PM -0400, Nhat Pham wrote:
> On Wed, Jun 17, 2026 at 1:34 AM Youngjun Park <youngjun.park@lge.com> wrote:
> >
> > This is the v8 series of the swap tier patchset.
> >
> > Many thanks to Shakeel Butt and Yosry for the reviews and discussions [1].
> > The main change in this version is the interface change to use
> > memory.swap.tiers.max with '0' (disable) and 'max' (enable) values.
> > This mechanism was suggested by Shakeel and Yosry.
>
> I like this interface too :)
Good to hear. Now it looks like we have found a memcg interface that
aligns well with the existing memcg model.
I like this idea as well. Thanks again to Shakeel Butt and Yosry.
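To make the intended semantics concrete, here is a toy userspace model of
the '0' / 'max' behaviour. It is only a sketch and not the kernel code in
this series: the class names, the tier names ("fast", "slow"), the device
names, and the rule that an unset tier defaults to 'max' are all
assumptions made for illustration, and cgroup hierarchy is not modelled.

```python
from dataclasses import dataclass, field

# The two values discussed for memory.swap.tiers.max.
VALID_VALUES = ("0", "max")


@dataclass
class SwapDevice:
    name: str        # hypothetical device name
    tier: str        # tier this device is associated with
    free_slots: int  # remaining swap slots in this toy model


@dataclass
class Cgroup:
    # Per-tier setting: "0" disables the tier, "max" enables it.
    tiers_max: dict = field(default_factory=dict)

    def set_tier(self, tier, value):
        if value not in VALID_VALUES:
            raise ValueError(f"expected one of {VALID_VALUES}, got {value!r}")
        self.tiers_max[tier] = value

    def allows(self, tier):
        # Assumption: a tier with no explicit setting is enabled.
        return self.tiers_max.get(tier, "max") == "max"


def allocate_slot(cg, devices):
    """Take one slot from the first device (in list order) whose tier the
    cgroup allows and which still has free slots; return None otherwise."""
    for dev in devices:
        if dev.free_slots > 0 and cg.allows(dev.tier):
            dev.free_slots -= 1
            return dev
    return None
```

With devices listed in priority order, a cgroup that sets "fast" to '0'
skips the fast device and allocates from the next allowed one, while a
cgroup with no settings falls through the list in order.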
> > Here is a brief summary of our tentative conclusions. Please correct me
> > if anything is misrepresented (details in references):
> >
> > * Zswap tiering [2]:
> > Tiering applies only to the vswap + zswap combo. Zswap itself will
> > not be tiered, as the current architecture requires a physical device
> > for zswap allocation.
>
> I think Yosry wants zswap as a tier, right?
>
> Just that without vswap, maybe don't allow it to be a tier by itself?
With the current architecture, users cannot dynamically specify zswap as
a tier, and zswap is a separate layer, so it is not tiered by itself.
Once your vswap work lands, I think we can make zswap the default,
top-level tier.
After that, we can also look into cleaning up the memory.zswap.writeback
interface together.
> > #2: Inter-tier promotion and demotion:
> > Promotion and demotion apply between tiers, not within a single
> > tier. The current interface defines only tier assignment; it does
> > not yet define when or how pages move between tiers. Two triggering
> > models are possible:
> >
> > (a) User-triggered: userspace explicitly initiates migration between
> > tiers (e.g. via a new interface or existing move_pages semantics).
> > (b) Kernel-triggered: the kernel moves pages between tiers at
> > appropriate points such as reclaim or refault.
>
> We'll likely need some kernel-triggered mechanism, or we'd have LRU inversion :)
>
> Cold pages will fill up fast tiers first, and more recent/warm pages
> will land on slow tiers...
Yeah, good point!
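The inversion can be shown with a toy model. This is only a sketch, not
kernel code: the function name, the page labels, and the tier capacities
are made up for illustration. It assumes reclaim swaps out the coldest
pages first, allocation always prefers the fastest tier with free space,
and nothing is ever demoted.

```python
def fill_tiers(pages_in_swapout_order, capacities):
    """Place pages into tiers in swap-out order, fastest tier first.

    capacities[i] is the number of slots in tier i (tier 0 is fastest).
    Returns a dict mapping each placed page to its tier index. Pages that
    arrive after every tier is full are left out of the result.
    """
    placement = {}
    tier = 0
    used = 0
    for page in pages_in_swapout_order:
        # Move on to the next (slower) tier once the current one is full.
        while tier < len(capacities) and used >= capacities[tier]:
            tier += 1
            used = 0
        if tier == len(capacities):
            break  # all tiers are full
        placement[page] = tier
        used += 1
    return placement


# Coldest pages are reclaimed first, so they come first in the list.
pages = ["cold0", "cold1", "warm0", "warm1"]
placement = fill_tiers(pages, capacities=[2, 4])
for page in pages:
    print(page, "-> tier", placement[page])
```

In this run the two coldest pages occupy the fast tier and the warmer
pages end up on the slow tier, which is the inversion described above.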
> We'll also need to enforce isolation/fairness to make sure no workload
> hoards the fast tiers too (but that probably requires demotion
> support).
Right, that makes sense.
BTW, one thing I am curious about is whether there are strong
real-world use cases that require demotion/promotion.
In theory this looks useful, but it would help to better understand
the requirements from such deployments.
> >
> > #3: Per-VMA, per-process swap and BPF:
> > Beyond memcg-based swap, tiers could be extended to per-VMA or
> > per-process swap, or driven by a BPF program.
> >
> > #4: Zswap and vswap tiering:
> > Tiering applies to the vswap + zswap combination.
> >
> > #5: Vswap on/off control:
> > Currently not supported. If a strong use case arises where vswap needs
> > to be controlled by memcg, the tier interface could be used for it.
>
> +1.
>
> Also, per-si/per-tier per-CPU allocation caching? :) Kairui already
> has a patch for it, IIUC, but if not it's pretty critical I'd say.
Yes, I missed it. Thank you for bringing it up.
We need an implementation that integrates this with the per-CPU
allocation currently implemented on the vswap side.
If Kairui's patch lands, my patch #4 can also be optimized based on it.
> BTW, can we add some selftests, to make sure the new interface works
> as expected, and to have example programs for new users to model their
> scripts after? :)
Yes, I agree. I think selftests are necessary.
Do you want them to be introduced in this patchset, or would it be okay
to add them separately as follow-up work?