* + mm-swap-avoid-leaving-unused-extend-table-after-alloc-race.patch added to mm-new branch
@ 2026-05-13 18:29 Andrew Morton
0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2026-05-13 18:29 UTC (permalink / raw)
To: mm-commits, shikemeng, nphamcs, leitao, chrisl, bhe, baohua,
kasong, akpm
The patch titled
Subject: mm, swap: avoid leaving unused extend table after alloc race
has been added to the -mm mm-new branch. Its filename is
mm-swap-avoid-leaving-unused-extend-table-after-alloc-race.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-swap-avoid-leaving-unused-extend-table-after-alloc-race.patch
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
The mm-new branch of mm.git is not included in linux-next
If a few days of testing in mm-new is successful, the patch will me moved
into mm.git's mm-unstable branch, which is included in linux-next
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Kairui Song <kasong@tencent.com>
Subject: mm, swap: avoid leaving unused extend table after alloc race
Date: Wed, 13 May 2026 17:21:11 +0800
Allocating an extend table requires dropping the ci lock first. While the
lock is dropped, a concurrent put can decrease the slot's swap count to a
value that is no longer maxed out, so the extend table is no longer
required. The current allocation path still attach the new extend table
to the cluster anyway, leaving it unused.
It's not really leaked, the next maxed out count on the same cluster
reuses the table, and frees it properly. Swapoff will also clean it up.
The worst case is one unused page pinned per cluster until the next
maxed-out allocation or swapoff. To eliminate the waste, re-check under
the ci lock that the extend table is still needed before publishing it,
and free the local allocation otherwise. The added overhead is ignorable.
Link: https://lore.kernel.org/20260513-swap-extend-table-fix-v1-1-a71dea851fb3@tencent.com
Fixes: 0d6af9bcf383 ("mm, swap: use the swap table to track the swap count")
Signed-off-by: Kairui Song <kasong@tencent.com>
Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/linux-mm/agG6Dp0umhs6O1SY@gmail.com/
Tested-by: Breno Leitao <leitao@debian.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/swapfile.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
--- a/mm/swapfile.c~mm-swap-avoid-leaving-unused-extend-table-after-alloc-race
+++ a/mm/swapfile.c
@@ -1443,8 +1443,10 @@ start_over:
}
static int swap_extend_table_alloc(struct swap_info_struct *si,
- struct swap_cluster_info *ci, gfp_t gfp)
+ struct swap_cluster_info *ci,
+ unsigned int ci_off, gfp_t gfp)
{
+ int count;
void *table;
table = kzalloc(sizeof(ci->extend_table[0]) * SWAPFILE_CLUSTER, gfp);
@@ -1452,11 +1454,27 @@ static int swap_extend_table_alloc(struc
return -ENOMEM;
spin_lock(&ci->lock);
- if (!ci->extend_table)
- ci->extend_table = table;
- else
- kfree(table);
+ /*
+ * Extend table allocation requires releasing ci lock first so it's
+ * possible that the slot has been freed, no longer overflowed, or
+ * a concurrent extend table allocation has already succeeded, so
+ * the allocation is no longer needed.
+ */
+ if (!cluster_table_is_alloced(ci))
+ goto out_free;
+ count = swp_tb_get_count(__swap_table_get(ci, ci_off));
+ if (count < (SWP_TB_COUNT_MAX - 1))
+ goto out_free;
+ if (ci->extend_table)
+ goto out_free;
+
+ ci->extend_table = table;
+ spin_unlock(&ci->lock);
+ return 0;
+
+out_free:
spin_unlock(&ci->lock);
+ kfree(table);
return 0;
}
@@ -1472,7 +1490,7 @@ int swap_retry_table_alloc(swp_entry_t e
return 0;
ci = __swap_offset_to_cluster(si, offset);
- ret = swap_extend_table_alloc(si, ci, gfp);
+ ret = swap_extend_table_alloc(si, ci, swp_cluster_offset(entry), gfp);
put_swap_device(si);
return ret;
@@ -1665,7 +1683,7 @@ restart:
if (unlikely(err)) {
if (err == -ENOMEM) {
spin_unlock(&ci->lock);
- err = swap_extend_table_alloc(si, ci, GFP_ATOMIC);
+ err = swap_extend_table_alloc(si, ci, ci_off, GFP_ATOMIC);
spin_lock(&ci->lock);
if (!err)
goto restart;
_
Patches currently in -mm which might be from kasong@tencent.com are
mm-mglru-consolidate-common-code-for-retrieving-evictable-size.patch
mm-mglru-rename-variables-related-to-aging-and-rotation.patch
mm-mglru-relocate-the-lru-scan-batch-limit-to-callers.patch
mm-mglru-restructure-the-reclaim-loop.patch
mm-mglru-scan-and-count-the-exact-number-of-folios.patch
mm-mglru-use-a-smaller-batch-for-reclaim.patch
mm-mglru-dont-abort-scan-immediately-right-after-aging.patch
mm-mglru-remove-redundant-swap-constrained-check-upon-isolation.patch
mm-mglru-use-the-common-routine-for-dirty-writeback-reactivation.patch
mm-mglru-simplify-and-improve-dirty-writeback-handling.patch
mm-mglru-remove-no-longer-used-reclaim-argument-for-folio-protection.patch
mm-vmscan-remove-sc-file_taken.patch
mm-vmscan-remove-sc-unqueued_dirty.patch
mm-vmscan-unify-writeback-reclaim-statistic-and-throttling.patch
mm-swap-avoid-leaving-unused-extend-table-after-alloc-race.patch
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2026-05-13 18:29 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-13 18:29 + mm-swap-avoid-leaving-unused-extend-table-after-alloc-race.patch added to mm-new branch Andrew Morton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.