From: Shakeel Butt <shakeel.butt@linux.dev>
To: YoungJun Park <youngjun.park@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, Chris Li <chrisl@kernel.org>,
Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
gunho.lee@lge.com, taejoon.song@lge.com, austin.kim@lge.com
Subject: Re: [RFC PATCH v2 0/5] mm/swap, memcg: Introduce swap tiers for cgroup based swap control
Date: Sun, 22 Feb 2026 21:56:13 -0800
Message-ID: <aZvX0HZy1PDylL8A@linux.dev>
In-Reply-To: <aZnBo+P3ifskts9J@yjaykim-PowerEdge-T330>
Hi YoungJun,
I see you have sent a separate email with BPF-specific questions, to which I
will respond separately. Here I will respond to the other questions/comments.
On Sat, Feb 21, 2026 at 11:30:59PM +0900, YoungJun Park wrote:
> On Fri, Feb 20, 2026 at 07:47:22PM -0800, Shakeel Butt wrote:
[...]
>
> > Taking a step back, can you describe your use-case a bit more and share
> > requirements?
>
> Our use case is simple for now.
> We have two swap devices with different performance
> characteristics and want to assign different swap devices to different
> workloads (cgroups).
If you don't mind, can you share a bit more about the cgroup hierarchy structure
of your deployment? Do you use cgroup v1 or v2 in your production environment?
>
> For some background, when I initially proposed this, I suggested allowing
> per-cgroup swap device priorities so that it could also accommodate the
> broader scenarios you mentioned. However, since even our own use case
> does not require reversing swap priorities within a cgroup, we pivoted
> to the "swap tier" mechanism that Chris proposed.
>
> > 1. If more than one device is assigned to a workload, do you want to have
> > some kind of ordering between them for the workload, or do you want the
> > option to have a round-robin kind of policy?
>
> Both. If devices are in the same tier with the same priority, round robin.
> If they are in the same tier with different priorities, or in different
> tiers, ordering applies. The current tier structure should be able to
> satisfy either preference.
I assume these are the same swap priorities as today, right? You want similar
priority behavior within a tier.
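The policy described above can be sketched as a toy user-space model. This is
only an illustration, not kernel code and not the interface from the patch
series: the `SwapDevice` and `TierAllocator` names, the tier labels, and the
`free_slots` field are all hypothetical. The one thing taken from existing
behavior is that a higher swap priority is preferred and that devices with
equal priority are used round robin.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class SwapDevice:
    name: str
    tier: str        # hypothetical tier label, e.g. "fast" or "slow"
    priority: int    # higher value is preferred, as with swapon priorities
    free_slots: int  # hypothetical remaining capacity


class TierAllocator:
    """Toy model: pick a device for each swap-out among the allowed tiers."""

    def __init__(self, devices, allowed_tiers):
        # Keep only devices in tiers this (hypothetical) cgroup may use.
        usable = [d for d in devices if d.tier in allowed_tiers]
        # Highest priority first; the sort is stable within a priority.
        usable.sort(key=lambda d: -d.priority)
        # One group per distinct priority value.
        self.groups = [list(g) for _, g in groupby(usable, key=lambda d: d.priority)]
        # One round-robin cursor per priority group.
        self.cursor = [0] * len(self.groups)

    def allocate(self):
        """Return the name of the device chosen for one slot, or None if full."""
        for i, group in enumerate(self.groups):
            # Probe each device of this priority once, starting at the cursor.
            for _ in range(len(group)):
                dev = group[self.cursor[i] % len(group)]
                self.cursor[i] += 1
                if dev.free_slots > 0:
                    dev.free_slots -= 1
                    return dev.name
        return None
```

With two equal-priority devices and one lower-priority device, the model
alternates between the first two until both are full and only then falls back
to the third. Restricting `allowed_tiers` simply removes devices from
consideration.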
>
> > 2. What's the reason to use 'tiers' in the name? Is it similar to memory tiers
> > and you want promotion/demotion among the tiers?
>
> This was originally Chris's idea. I think he explained the rationale
> well in his reply.
>
> > 3. If a workload has multiple swap devices assigned, can you describe the
> > scenario where such workloads need to partition/divide given devices to their
> > sub-workloads?
>
> One possible scenario is reducing lock contention by partitioning swap
> devices between parent and child cgroups.
The lock contention is orthogonal (and a distraction here).
>
> > Let's start with these questions. Please note that I want us to not just look at
> > the current use-case but brainstorm more future use-cases and then come up with
> > the solution which is more future proof.
>
> We have clear production use cases from both us and Chris, and I also
> presented a deployment example in the cover letter.
>
> I think it is hard to design concretely for future use cases at this
> point. When those needs become clearer, BPF with its flexibility
> would be a better fit then. I see BPF as a natural extension path
> rather than a starting point.
>
> For now, guarding the memcg & tier behind a CONFIG option would
> let us move forward without committing to a stable interface, and
> we can always pivot to BPF later if needed.
I think your use-case is very clear. Before committing to any option, I want us
to brainstorm all the options, gather their pros and cons, and then make an
informed decision. Anyway, I will respond to your other email in a day or two.
Shakeel