From: Johannes Weiner <hannes@cmpxchg.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, hughd@google.com,
yosry.ahmed@linux.dev, mhocko@kernel.org,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
muchun.song@linux.dev, len.brown@intel.com,
chengming.zhou@linux.dev, kasong@tencent.com, chrisl@kernel.org,
huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de,
lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu,
pavel@kernel.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-pm@vger.kernel.org
Subject: Re: [RFC PATCH 04/14] mm: swap: swap cache support for virtualized swap
Date: Tue, 8 Apr 2025 11:00:19 -0400 [thread overview]
Message-ID: <20250408150019.GB816@cmpxchg.org> (raw)
In-Reply-To: <20250407234223.1059191-5-nphamcs@gmail.com>
On Mon, Apr 07, 2025 at 04:42:05PM -0700, Nhat Pham wrote:
> Currently, the swap cache code assumes that the swap space is of a fixed
> size. The virtual swap space is dynamically sized, so the existing
> partitioning code cannot be easily reused. A dynamic partitioning is
> planned, but for now keep the design simple and just use a flat
> swapcache for vswap.
>
> Since the vswap's implementation has begun to diverge from the old
> implementation, we also introduce a new build config
> (CONFIG_VIRTUAL_SWAP). Users who do not select this config will get the
> old implementation, with no behavioral change.
>
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>
> ---
> mm/Kconfig | 13 ++++++++++
> mm/swap.h | 22 ++++++++++------
> mm/swap_state.c | 68 +++++++++++++++++++++++++++++++++++++++++--------
> 3 files changed, 85 insertions(+), 18 deletions(-)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 1b501db06417..1a6acdb64333 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -22,6 +22,19 @@ menuconfig SWAP
> used to provide more virtual memory than the actual RAM present
> in your computer. If unsure say Y.
>
> +config VIRTUAL_SWAP
> + bool "Swap space virtualization"
> + depends on SWAP
> + default n
> + help
> + When this is selected, the kernel is built with the new swap
> + design. This will allow us to decouple the swap backends
> + (zswap, on-disk swapfile, etc.), and save disk space when we
> + use zswap (or the zero-filled swap page optimization).
> +
> + There might be more lock contentions with heavy swap use, since
> + the swap cache is no longer range partitioned.
> +
> config ZSWAP
> bool "Compressed cache for swap pages"
> depends on SWAP
> diff --git a/mm/swap.h b/mm/swap.h
> index d5f8effa8015..06e20b1d79c4 100644
> --- a/mm/swap.h
> +++ b/mm/swap.h
> @@ -22,22 +22,27 @@ void swap_write_unplug(struct swap_iocb *sio);
> int swap_writepage(struct page *page, struct writeback_control *wbc);
> void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
>
> -/* linux/mm/swap_state.c */
> -/* One swap address space for each 64M swap space */
> +/* Return the swap device position of the swap slot. */
> +static inline loff_t swap_slot_pos(swp_slot_t slot)
> +{
> + return ((loff_t)swp_slot_offset(slot)) << PAGE_SHIFT;
> +}
In the same vein as the previous email, please avoid mixing moves,
renames and new code as much as possible. This makes it quite hard to
follow what's going on.
I think it would be better if you structure the series as follows:
1. Prep patches. Separate patches for moves, renames, new code.
3. mm: vswap
- config VIRTUAL_SWAP
- mm/vswap.c with skeleton data structures, init/exit, Makefile hookup
4. (temporarily) flatten existing address spaces
IMO you can do the swapcache and zswap in one patch
5+. conversion patches
Grow mm/vswap.c as you add discrete components like the descriptor
allocator, swapoff locking, the swap_cgroup tracker etc.
You're mostly doing this part already. But try to order them by
complexity and on a "core to periphery" gradient. I.e. swapoff
locking should probably come before cgroup stuff.
Insert move and rename patches at points where they make the most
sense. I.e. if they can be understood in the current upstream code
already, put them with step 1 prep patches. If you find a move or a
rename can only be understood in the context of one of the components,
put them in a prep patch right before that one.
> @@ -260,6 +269,28 @@ void delete_from_swap_cache(struct folio *folio)
> folio_ref_sub(folio, folio_nr_pages(folio));
> }
>
> +#ifdef CONFIG_VIRTUAL_SWAP
> +void clear_shadow_from_swap_cache(int type, unsigned long begin,
> + unsigned long end)
> +{
> + swp_slot_t slot = swp_slot(type, begin);
> + swp_entry_t entry = swp_slot_to_swp_entry(slot);
> + unsigned long index = swap_cache_index(entry);
> + struct address_space *address_space = swap_address_space(entry);
> + void *old;
> + XA_STATE(xas, &address_space->i_pages, index);
> +
> + xas_set_update(&xas, workingset_update_node);
> +
> + xa_lock_irq(&address_space->i_pages);
> + xas_for_each(&xas, old, entry.val + end - begin) {
> + if (!xa_is_value(old))
> + continue;
> + xas_store(&xas, NULL);
> + }
> + xa_unlock_irq(&address_space->i_pages);
I don't think you need separate functions for this, init, exit etc. if
you tweak the macros to resolve to one tree. The current code already
works if swapfiles are smaller than SWAP_ADDRESS_SPACE_PAGES and there
is only one tree, after all.
This would save a lot of duplication and keep ifdefs more confined.
next prev parent reply other threads:[~2025-04-08 15:00 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-07 23:42 [RFC PATCH 00/14] Virtual Swap Space Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 01/14] swapfile: rearrange functions Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 02/14] mm: swap: add an abstract API for locking out swapoff Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 03/14] mm: swap: add a separate type for physical swap slots Nhat Pham
2025-04-08 14:15 ` Johannes Weiner
2025-04-08 15:11 ` Nhat Pham
2025-04-22 14:41 ` Yosry Ahmed
[not found] ` <6807ab09.670a0220.152ca3.502fSMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-22 15:50 ` Nhat Pham
2025-04-22 18:50 ` Kairui Song
2025-04-07 23:42 ` [RFC PATCH 04/14] mm: swap: swap cache support for virtualized swap Nhat Pham
2025-04-08 15:00 ` Johannes Weiner [this message]
2025-04-08 15:34 ` Nhat Pham
2025-04-08 15:43 ` Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 05/14] zswap: unify zswap tree " Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 06/14] mm: swap: allocate a virtual swap slot for each swapped out page Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 07/14] swap: implement the swap_cgroup API using virtual swap Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 08/14] swap: manage swap entry lifetime at the virtual swap layer Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 09/14] swap: implement locking out swapoff using virtual swap slot Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 10/14] mm: swap: decouple virtual swap slot from backing store Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 11/14] memcg: swap: only charge physical swap slots Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 12/14] vswap: support THP swapin and batch free_swap_and_cache Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 13/14] swap: simplify swapoff using virtual swap Nhat Pham
2025-04-07 23:42 ` [RFC PATCH 14/14] zswap: do not start zswap shrinker if there is no physical swap slots Nhat Pham
2025-04-08 13:04 ` [RFC PATCH 00/14] Virtual Swap Space Usama Arif
2025-04-08 15:20 ` Nhat Pham
2025-04-08 15:45 ` Johannes Weiner
2025-04-08 16:25 ` Nhat Pham
2025-04-08 16:27 ` Nhat Pham
2025-04-08 16:22 ` Kairui Song
2025-04-08 16:47 ` Nhat Pham
2025-04-08 16:59 ` Kairui Song
2025-04-22 14:43 ` Yosry Ahmed
2025-04-22 14:56 ` Yosry Ahmed
[not found] ` <6807afd0.a70a0220.2ae8b9.e07cSMTPIN_ADDED_BROKEN@mx.google.com>
2025-04-22 17:15 ` Nhat Pham
2025-04-22 19:29 ` Nhat Pham
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250408150019.GB816@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=cgroups@vger.kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=chrisl@kernel.org \
--cc=christophe.leroy@csgroup.eu \
--cc=huang.ying.caritas@gmail.com \
--cc=hughd@google.com \
--cc=kasong@tencent.com \
--cc=kernel-team@meta.com \
--cc=len.brown@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-pm@vger.kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@kernel.org \
--cc=muchun.song@linux.dev \
--cc=nphamcs@gmail.com \
--cc=osalvador@suse.de \
--cc=pavel@kernel.org \
--cc=roman.gushchin@linux.dev \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=viro@zeniv.linux.org.uk \
--cc=yosry.ahmed@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox