From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
To: linux-mm@kvack.org
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
Christophe Leroy <chleroy@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <baoquan.he@linux.dev>,
Barry Song <baohua@kernel.org>,
Youngjun Park <youngjun.park@lge.com>,
David Hildenbrand <david@kernel.org>,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
Sayali Patil <sayalip@linux.ibm.com>,
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Subject: [PATCH v4 1/3] mm, swap: make SWAPFILE_CLUSTER runtime
Date: Fri, 19 Jun 2026 10:10:57 +0530 [thread overview]
Message-ID: <e519e98f51be2d310367e39df0935da79001e535.1781843449.git.ritesh.list@gmail.com> (raw)
In-Reply-To: <cover.1781843449.git.ritesh.list@gmail.com>
On PowerPC Book3S64, MMU is selected at runtime, so macros like
PMD_SHIFT are effectively runtime variables in the Book3S64 code. THP
swap code uses these macros to size some of its array data structures
based on PMD_ORDER e.g. SWAPFILE_CLUSTER macro is used for this very
purpose.
Hence this patch makes the users of SWAPFILE_CLUSTER to use this macro value at
runtime and also modifies swap_table and swap_memcg_table which were earlier
using this macro for defining the number of table entries.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
mm/swap_table.h | 6 ++----
mm/swapfile.c | 17 ++++++++++++-----
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/mm/swap_table.h b/mm/swap_table.h
index e6613e62f8d0..90e2a7852300 100644
--- a/mm/swap_table.h
+++ b/mm/swap_table.h
@@ -8,16 +8,14 @@
/* A typical flat array in each cluster as swap table */
struct swap_table {
- atomic_long_t entries[SWAPFILE_CLUSTER];
+ DECLARE_FLEX_ARRAY(atomic_long_t, entries);
};
/* For storing memcg private id */
struct swap_memcg_table {
- unsigned short id[SWAPFILE_CLUSTER];
+ DECLARE_FLEX_ARRAY(unsigned short, id);
};
-#define SWP_TABLE_USE_PAGE (sizeof(struct swap_table) == PAGE_SIZE)
-
/*
* A swap table entry represents the status of a swap slot on a swap
* (physical or virtual) device. The swap table in each cluster is a
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 78b49b0658ad..4bf11c5b87eb 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -129,6 +129,8 @@ static DEFINE_PER_CPU(struct percpu_swap_cluster, percpu_swap_cluster) = {
.lock = INIT_LOCAL_LOCK(),
};
+static bool swap_table_use_page __ro_after_init;
+
/* May return NULL on invalid type, caller must check for NULL return */
static struct swap_info_struct *swap_type_to_info(int type)
{
@@ -437,7 +439,7 @@ static void swap_cluster_free_table(struct swap_cluster_info *ci)
return;
rcu_assign_pointer(ci->table, NULL);
- if (!SWP_TABLE_USE_PAGE) {
+ if (!swap_table_use_page) {
kmem_cache_free(swap_table_cachep, table);
return;
}
@@ -456,7 +458,7 @@ static int swap_cluster_alloc_table(struct swap_cluster_info *ci, gfp_t gfp)
if (rcu_access_pointer(ci->table))
return 0;
- if (SWP_TABLE_USE_PAGE) {
+ if (swap_table_use_page) {
folio = folio_alloc(gfp | __GFP_ZERO, 0);
if (folio)
table = folio_address(folio);
@@ -471,7 +473,8 @@ static int swap_cluster_alloc_table(struct swap_cluster_info *ci, gfp_t gfp)
#ifdef CONFIG_MEMCG
if (!mem_cgroup_disabled()) {
VM_WARN_ON_ONCE(ci->memcg_table);
- ci->memcg_table = kzalloc_obj(*ci->memcg_table, gfp);
+ ci->memcg_table = kzalloc_flex(*ci->memcg_table, id,
+ SWAPFILE_CLUSTER, gfp);
if (!ci->memcg_table) {
swap_cluster_free_table(ci);
return -ENOMEM;
@@ -3912,14 +3915,18 @@ static int __init swapfile_init(void)
{
swapfile_maximum_size = arch_max_swapfile_size();
+ swap_table_use_page =
+ (SWAPFILE_CLUSTER * sizeof(atomic_long_t) == PAGE_SIZE);
+
/*
* Once a cluster is freed, it's swap table content is read
* only, and all swap cache readers (swap_cache_*) verifies
* the content before use. So it's safe to use RCU slab here.
*/
- if (!SWP_TABLE_USE_PAGE)
+ if (!swap_table_use_page)
swap_table_cachep = kmem_cache_create("swap_table",
- sizeof(struct swap_table),
+ struct_size_t(struct swap_table, entries,
+ SWAPFILE_CLUSTER),
0, SLAB_PANIC | SLAB_TYPESAFE_BY_RCU, NULL);
#ifdef CONFIG_MIGRATION
--
2.39.5
next prev parent reply other threads:[~2026-06-19 4:41 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-19 4:40 [PATCH v4 0/3] mm, swap: Enable THP SWAP for PowerPC Book3S64 Ritesh Harjani (IBM)
2026-06-19 4:40 ` Ritesh Harjani (IBM) [this message]
2026-06-19 4:40 ` [PATCH v4 2/3] mm, swap: allow archs to override SWAP_NR_ORDERS via ARCH_MAX_PMD_ORDER Ritesh Harjani (IBM)
2026-06-19 4:40 ` [PATCH v4 3/3] powerpc: Kconfig: Enable THP_SWAP on Book3S64 Ritesh Harjani (IBM)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e519e98f51be2d310367e39df0935da79001e535.1781843449.git.ritesh.list@gmail.com \
--to=ritesh.list@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baoquan.he@linux.dev \
--cc=chleroy@kernel.org \
--cc=chrisl@kernel.org \
--cc=david@kernel.org \
--cc=kasong@tencent.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=nphamcs@gmail.com \
--cc=npiggin@gmail.com \
--cc=sayalip@linux.ibm.com \
--cc=shikemeng@huaweicloud.com \
--cc=youngjun.park@lge.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox