From: Hao Jia <jiahao.kernel@gmail.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: akpm@linux-foundation.org, tj@kernel.org, hannes@cmpxchg.org,
shakeel.butt@linux.dev, mhocko@kernel.org, mkoutny@suse.com,
nphamcs@gmail.com, chengming.zhou@linux.dev,
muchun.song@linux.dev, roman.gushchin@linux.dev,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, Hao Jia <jiahao1@lixiang.com>
Subject: Re: [PATCH v5 4/6] mm/zswap: Implement proactive writeback
Date: Wed, 1 Jul 2026 17:35:55 +0800
Message-ID: <a1c139d9-08b8-9631-7a85-697df4c23d52@gmail.com>
In-Reply-To: <CAO9r8zNCEis2QHROEsM5QZsb_H4ofNjA_sE-pM7SVxtgHg_rqg@mail.gmail.com>
On 2026/7/1 00:10, Yosry Ahmed wrote:
>>> Before going through more versions we need to figure out if this will
>>> pivot to be a proactive demotion interface for swap tiering.
>>>
>>
>> Yes. Should I drop patches 4-6 in the next version and wait for swap
>> tiering to be finalized?
>> We can try to get the non-memcg parts (patches 1-3) merged upstream
>> first. This would also give them plenty of time to bake and catch any
>> potential regressions. Thoughts?
>
> Patches 1-2 can be sent and merged separately, yes. For patch 2,
> please include some numbers for the writeback performance before and
> after batching.
>
> Patch 3 does refactoring in preparation for patch 4, so I don't think
> it makes sense on its own.
Will do.
>
>>>> +int zswap_proactive_writeback(struct mem_cgroup *memcg, u64 bytes_to_writeback)
>>>> +{
>>>> + struct zswap_shrink_state s = {};
>>>> + struct mem_cgroup *iter = NULL;
>>>> + u64 bytes_written = 0;
>>>> + int ret = 0;
>>>> +
>>>> + if (!memcg)
>>>> + return -EINVAL;
>>>
>>> Can this ever happen? It would be a bug in the caller.
>>
>> IIRC, writing the following to the NUMA node sysfs entry triggers this
>> check:
>> echo "10M source=zswap" > /sys/devices/system/node/nodeN/reclaim
>
> Oh yeah, I forgot about that one :)
>
> If we keep this, probably combine the !memcg and writeback check below.
Will do.
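As a rough userspace sketch of what the combined check could look like (not the actual patch): struct mem_cgroup and mem_cgroup_zswap_writeback_enabled() below are stubs that only read a flag, and zswap_proactive_writeback_check() is a hypothetical name used so the snippet compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub standing in for the kernel's struct mem_cgroup (assumption). */
struct mem_cgroup {
	bool zswap_writeback;
};

/* Stub: this version only reads the flag stored in the stub struct. */
static bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
{
	return memcg->zswap_writeback;
}

/* Hypothetical helper: one early exit covers both the NULL memcg case
 * (node-level reclaim file) and the writeback-disabled case. */
int zswap_proactive_writeback_check(struct mem_cgroup *memcg)
{
	if (!memcg || !mem_cgroup_zswap_writeback_enabled(memcg))
		return -EINVAL;
	return 0;
}
```

The short-circuit || keeps the NULL test ahead of the stub's dereference, so the order of the two conditions matters in this sketch.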
>
>>
>>>
>>>> + if (!mem_cgroup_zswap_writeback_enabled(memcg))
>>>> + return -EINVAL;
>>>> + if (!bytes_to_writeback)
>>>> + return 0;
>>>
>>> Do we need this? I think the loop will just never enter and
>>> mem_cgroup_iter_break() will do nothing.
>>
>> Will do.
>>>
>>>> +
>>>> + while (bytes_written < bytes_to_writeback) {
>>>> + long shrunk;
>>>> +
>>>> + cond_resched();
>>>> +
>>>> + if (signal_pending(current)) {
>>>> + ret = -EINTR;
>>>> + break;
>>>> + }
>>>> +
>>>> + /*
>>>> + * Use a local iterator to walk the memcg and its online descendants
>>>> + * in a round-robin manner. Upon exiting the loop, mem_cgroup_iter_break()
>>>> + * must be called to drop the iterator reference.
>>>> + */
>>>> + do {
>>>> + iter = mem_cgroup_iter(memcg, iter, NULL);
>>>> + } while (iter && !mem_cgroup_tryget_online(iter));
>>>> +
>>>> + shrunk = zswap_shrink_one_memcg(iter, &s);
>>>> + if (shrunk > 0)
>>>> + bytes_written += shrunk;
>>>> +
>>>> + /* drop the extra reference taken by mem_cgroup_tryget_online() */
>>>> + mem_cgroup_put(iter);
>>>
>>>
>>> Can we just use mem_cgroup_online() instead since mem_cgroup_iter()
>>> already grabs a ref?
>>>
>> Will do.
>
> If you're looking for another cleanup to do, shrink_worker() should
> probably also use mem_cgroup_online() and avoid taking/dropping an
> extra ref :)

IIRC, this might not work because zswap_next_shrink is a global variable
and is accessed outside the lock during reclamation.

Consider the following race condition between shrink_worker on CPU0 and
zswap_memcg_offline_cleanup on CPU1:

CPU0                                    CPU1
spin_lock(zswap_shrink_lock)
memcg1 = mem_cgroup_iter()
memcg1.ref = 1
zswap_next_shrink = memcg1
spin_unlock(zswap_shrink_lock)
                                        zswap_memcg_offline_cleanup()
                                          spin_lock(zswap_shrink_lock)
                                          css_put(zswap_next_shrink)
                                          memcg1.ref = 0  <--
shrink_memcg(memcg1) *maybe UAF*

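To make the interleaving above concrete, here is a single-threaded userspace replay of it. This is only a toy model, not kernel code: struct toy_memcg, toy_get(), toy_put(), toy_next_shrink and toy_worker_sees_freed() are all hypothetical names, and the "freed" flag stands in for the memcg actually being released.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memcg: just a reference count and a "freed" marker. */
struct toy_memcg {
	int refs;
	bool freed;
};

static void toy_get(struct toy_memcg *m)
{
	m->refs++;
}

static void toy_put(struct toy_memcg *m)
{
	if (--m->refs == 0)
		m->freed = true;	/* last reference gone */
}

/* Models the global cursor (zswap_next_shrink in the discussion). */
static struct toy_memcg *toy_next_shrink;

/*
 * Replays the CPU0/CPU1 steps in order. Returns true if the memcg has
 * already been released by the time the worker would start shrinking it.
 */
bool toy_worker_sees_freed(bool take_extra_ref)
{
	struct toy_memcg m = { .refs = 0, .freed = false };
	struct toy_memcg *local;
	bool stale;

	/* CPU0, lock held: the iterator's single reference belongs to the cursor. */
	toy_get(&m);
	toy_next_shrink = &m;
	local = toy_next_shrink;
	if (take_extra_ref)
		toy_get(local);		/* models the extra tryget reference */
	/* CPU0 drops the lock here. */

	/* CPU1, lock held: offline cleanup drops the cursor's reference. */
	toy_put(toy_next_shrink);
	toy_next_shrink = NULL;

	/* CPU0, lock not held: this is where shrinking would start. */
	stale = local->freed;

	if (take_extra_ref)
		toy_put(local);		/* worker drops its own reference */
	return stale;
}
```

In this model, toy_worker_sees_freed(false) reports the stale pointer, while toy_worker_sees_freed(true) does not, because the worker's own reference keeps the count above zero until it is done.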
Thanks,
Hao