From: YoungJun Park <youngjun.park@lge.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: akpm@linux-foundation.org, chrisl@kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
kasong@tencent.com, hannes@cmpxchg.org, mhocko@kernel.org,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
muchun.song@linux.dev, shikemeng@huaweicloud.com,
baoquan.he@linux.dev, baohua@kernel.org, yosry@kernel.org,
gunho.lee@lge.com, taejoon.song@lge.com, hyungjun.cho@lge.com,
mkoutny@suse.com, baver.bae@lge.com, matia.kim@lge.com
Subject: Re: [PATCH v8 0/4] mm/swap, memcg: Introduce swap tiers for cgroup based swap control
Date: Thu, 18 Jun 2026 10:47:53 +0900
Message-ID: <ajNOSesjwTyZc8EX@yjaykim-PowerEdge-T330>
In-Reply-To: <CAKEwX=NfSy0XiD_UMsDOHGCwpE7sYmBmhV4Y9vk_cbnnr6J6PQ@mail.gmail.com>
On Wed, Jun 17, 2026 at 01:50:49PM -0400, Nhat Pham wrote:
> On Wed, Jun 17, 2026 at 1:34 AM Youngjun Park <youngjun.park@lge.com> wrote:
> >
> > This is the v8 series of the swap tier patchset.
> >
> > Great thanks to Shakeel Butt and Yosry for the reviews and discussions [1].
> > The main change in this version is the interface change to use
> > memory.swap.tiers.max with '0' (disable) and 'max' (enable) values.
> > This mechanism was suggested by Shakeel and Yosry
>
> I like this interface too :)
Good to hear. It looks like we have found an interface that aligns
well with the existing memcg model.
I like this idea as well. Thanks again to Shakeel Butt and Yosry.
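For illustration only, here is a minimal sketch of how a script might toggle the setting. It assumes the file accepts a bare '0' or 'max' written into the cgroup directory; the exact write format (for example, whether a tier name is part of the line) is defined by patch 3/4 and is not reproduced here. The helper name set_swap_tiers_max and the directory argument are hypothetical.

```python
import os


def set_swap_tiers_max(cgroup_dir, value):
    """Write '0' (disable) or 'max' (enable) to memory.swap.tiers.max.

    ASSUMPTION: a bare value is a valid write; the real format may differ.
    """
    if value not in ("0", "max"):
        raise ValueError("expected '0' or 'max', got %r" % (value,))
    path = os.path.join(cgroup_dir, "memory.swap.tiers.max")
    with open(path, "w") as f:
        f.write(value + "\n")
    return path
```

On a real system cgroup_dir would be a cgroup v2 directory; pointing it at a scratch directory is enough to exercise the helper without touching a live cgroup.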
> > Here is a brief summary of our tentative conclusions. Please correct me
> > if anything is misrepresented (details in references):
> >
> > * Zswap tiering [2]:
> > Tiering applies only to the vswap + zswap combo. Zswap itself will
> > not be tiered, as the current architecture requires a physical device
> > for zswap allocation.
>
> I think Yosry wants zswap as a tier, right?
>
> Just that without vswap, maybe don't allow it to be a tier by itself?
With the current architecture, users cannot dynamically specify zswap as
a tier. Zswap is a separate layer, so it is not tiered by itself.
Once your vswap work lands, I think we can make zswap the default,
top-level tier.
After that, we can also look into cleaning up the zswap.writeback
interface together.
> > #2: Inter-tier promotion and demotion:
> > Promotion and demotion apply between tiers, not within a single
> > tier. The current interface defines only tier assignment; it does
> > not yet define when or how pages move between tiers. Two triggering
> > models are possible:
> >
> > (a) User-triggered: userspace explicitly initiates migration between
> > tiers (e.g. via a new interface or existing move_pages semantics).
> > (b) Kernel-triggered: the kernel moves pages between tiers at
> > appropriate points such as reclaim or refault.
>
> We'll likely need some kernel-triggered mechanism, or we'd have LRU inversion :)
>
> Cold pages will fill up fast tiers first, and more recent/warm pages
> will land on slow tiers...
Yeah, good point!
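To make the inversion concrete, here is a toy sketch. It is not kernel code: it only assumes that reclaim swaps out the coldest pages first, that the fast tier is filled first, and that nothing is ever demoted afterwards. The page names, the capacity, and the function name are made up for illustration.

```python
def place_without_demotion(reclaim_order, fast_capacity):
    """Fill the fast tier first-come-first-served; never move pages later."""
    fast, slow = [], []
    for page in reclaim_order:
        # A page lands on the fast tier only while it still has room.
        (fast if len(fast) < fast_capacity else slow).append(page)
    return fast, slow


# Reclaim evicts the coldest pages first, so they arrive at swap first.
reclaim_order = ["cold-1", "cold-2", "warm-1", "warm-2"]
fast, slow = place_without_demotion(reclaim_order, fast_capacity=2)
print("fast tier:", fast)  # fast tier: ['cold-1', 'cold-2']
print("slow tier:", slow)  # slow tier: ['warm-1', 'warm-2']
```

The warm pages, which are the most likely to be faulted back in, end up on the slow tier, which is the inversion described above.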
> We'll also need to enforce isolation/fairness to make sure no workload
> hoards the fast tiers too (but that probably requires demotion
> support).
Right, that makes sense.
BTW, one thing I am curious about is whether there are strong
real-world use cases that require demotion/promotion.
In theory this looks useful, but it would help to better understand
the requirements from such deployments.
> >
> > #3: Per-VMA, per-process swap and BPF:
> > Not limited to memcg-based swap; this could be extended to per-VMA or
> > per-process swap, or used from a BPF program.
> >
> > #4: Zswap and vswap tiering:
> > Tiering applies to the vswap + zswap combination.
> >
> > #5: Vswap on/off control:
> > Currently not supported. If a strong use case arises where vswap needs
> > to be controlled by memcg, the tier interface could be used for it.
>
> +1.
>
> Also, per-si/per-tier per-CPU allocation caching? :) Kairui already
> has a patch for it, IIUC, but if not it's pretty critical I'd say.
Yes, I missed it. Thank you for bringing it up.
We need an implementation that integrates this with the per-CPU
allocation currently implemented on the vswap side.
If Kairui's patch lands, my patch #4 can also be optimized based on it.
> BTW, can we add some selftests, to make sure the new interface works
> as expected, and to have example programs for new users to model their
> scripts after? :)
Yes, I agree. I think selftests are necessary.
Do you want them to be introduced in this patchset, or would it be okay
to add them separately as follow-up work?