* [PATCH] mm, swap: free the cluster extend table on teardown
@ 2026-06-02 22:23 David Carlier
2026-06-03 2:41 ` Kairui Song
0 siblings, 1 reply; 2+ messages in thread
From: David Carlier @ 2026-06-02 22:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, linux-kernel, David Carlier,
syzbot+deedf22929084640666f, stable, Chris Li, Kairui Song,
Kemeng Shi, Nhat Pham, Baoquan He, Barry Song, Youngjun Park

swap_cluster_free_table() frees every per-cluster side table but
ci->extend_table. That table is only released by
swap_extend_table_try_free(), which the teardown path never calls, so a
cluster can be freed with an extend table still attached.

It can also linger while the cluster is live. swap_dup_entries_cluster()
drops the lock to allocate an extend table when a slot reaches
SWP_TB_COUNT_MAX - 1, then retries. If the count dropped in the meantime,
the retry takes the normal path and leaves the table behind, all entries
zero; only the failure path frees it.

Since a swap_cluster_info is reused in place and swap_extend_table_alloc()
skips allocation when ci->extend_table is set, the next user of the
cluster inherits the stale table and its leftover counts, corrupting the
swap count of any slot that overflows. CONFIG_DEBUG_VM catches the
dangling table in swap_cluster_assert_empty(); otherwise it is silent.

Free it in swap_cluster_free_table(), and also on the
swap_dup_entries_cluster() success path to match the failure path.

Reported-by: syzbot+deedf22929084640666f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=deedf22929084640666f
Fixes: 0d6af9bcf383 ("mm, swap: use the swap table to track the swap count")
Cc: <stable@vger.kernel.org>
Signed-off-by: David Carlier <devnexen@gmail.com>
---
mm/swapfile.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 615d90867111..a69a26aec4c0 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -432,6 +432,9 @@ static void swap_cluster_free_table(struct swap_cluster_info *ci)
 	ci->zero_bitmap = NULL;
 #endif
 
+	kfree(ci->extend_table);
+	ci->extend_table = NULL;
+
 	table = (struct swap_table *)rcu_access_pointer(ci->table);
 	if (!table)
 		return;
@@ -1711,6 +1714,7 @@ static int swap_dup_entries_cluster(struct swap_info_struct *si,
 			goto failed;
 		}
 	} while (++ci_off < ci_end);
+	swap_extend_table_try_free(ci);
 	swap_cluster_unlock(ci);
 	return 0;
 failed:
--
2.53.0
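
For readers following the thread, the lifecycle described in the commit message can be sketched as a small userspace model. This is not kernel code: struct toy_cluster and every toy_* function are hypothetical stand-ins, locking is omitted, and the "free only when all entries are zero" behaviour of the try-free helper is an assumption inferred from its name and from the description above. The sketch only shows that a table allocated for a retry stays attached unless the success path releases it, and that an unconditional free at teardown clears it either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct swap_cluster_info; not the kernel layout. */
struct toy_cluster {
	unsigned char *extend_table;	/* overflow counts, one per slot */
	unsigned int nr_slots;
};

/* Allocate only if no table is attached yet, as the commit message describes. */
static int toy_extend_table_alloc(struct toy_cluster *ci)
{
	if (ci->extend_table)
		return 0;
	ci->extend_table = calloc(ci->nr_slots, sizeof(*ci->extend_table));
	return ci->extend_table ? 0 : -1;
}

/* Assumed semantics: release the table only when every entry is zero. */
static void toy_extend_table_try_free(struct toy_cluster *ci)
{
	if (!ci->extend_table)
		return;
	for (unsigned int i = 0; i < ci->nr_slots; i++)
		if (ci->extend_table[i])
			return;
	free(ci->extend_table);
	ci->extend_table = NULL;
}

/* Unconditional release at teardown, mirroring the first hunk. */
static void toy_cluster_free_table(struct toy_cluster *ci)
{
	free(ci->extend_table);
	ci->extend_table = NULL;
}

/*
 * Model of the retry: the table was allocated with the lock dropped, but
 * by the time of the retry the count no longer overflows, so nothing is
 * written to it. free_on_success selects the patched behaviour.
 */
static int toy_dup_after_race(struct toy_cluster *ci, bool free_on_success)
{
	if (toy_extend_table_alloc(ci))
		return -1;
	/* The retry takes the normal path; the table stays all zero. */
	if (free_on_success)
		toy_extend_table_try_free(ci);
	return 0;
}
```

Calling toy_dup_after_race() with free_on_success set to false leaves an all-zero table attached until toy_cluster_free_table() runs; with it set to true the table is released before the function returns.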
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] mm, swap: free the cluster extend table on teardown
2026-06-02 22:23 [PATCH] mm, swap: free the cluster extend table on teardown David Carlier
@ 2026-06-03 2:41 ` Kairui Song
0 siblings, 0 replies; 2+ messages in thread
From: Kairui Song @ 2026-06-03 2:41 UTC (permalink / raw)
To: David Carlier
Cc: akpm, linux-mm, linux-kernel, syzbot+deedf22929084640666f, stable,
Chris Li, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
Youngjun Park
On Wed, Jun 3, 2026 at 6:27 AM David Carlier <devnexen@gmail.com> wrote:
>
> swap_cluster_free_table() frees every per-cluster side table but
> ci->extend_table. That table is only released by
> swap_extend_table_try_free(), which the teardown path never calls, so a
> cluster can be freed with an extend table still attached.
>
> It can also linger while the cluster is live. swap_dup_entries_cluster()
> drops the lock to allocate an extend table when a slot reaches
> SWP_TB_COUNT_MAX - 1, then retries. If the count dropped in the meantime,
> the retry takes the normal path and leaves the table behind, all entries
> zero; only the failure path frees it.
>
> Since a swap_cluster_info is reused in place and swap_extend_table_alloc()
> skips allocation when ci->extend_table is set, the next user of the
> cluster inherits the stale table and its leftover counts, corrupting the
> swap count of any slot that overflows. CONFIG_DEBUG_VM catches the
There won't be any corruption: extend_table is all zero at this point.
The leak on swapoff is real, though.
> dangling table in swap_cluster_assert_empty(); otherwise it is silent.
>
> Free it in swap_cluster_free_table(), and also on the
> swap_dup_entries_cluster() success path to match the failure path.
>
> Reported-by: syzbot+deedf22929084640666f@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=deedf22929084640666f
> Fixes: 0d6af9bcf383 ("mm, swap: use the swap table to track the swap count")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
> mm/swapfile.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 615d90867111..a69a26aec4c0 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -432,6 +432,9 @@ static void swap_cluster_free_table(struct swap_cluster_info *ci)
> ci->zero_bitmap = NULL;
> #endif
>
> + kfree(ci->extend_table);
> + ci->extend_table = NULL;
> +
Isn't this still a bit too late to avoid the WARN? The WARN has already
triggered at this point, since swap_cluster_free_table() is called after
swap_cluster_assert_empty().
> table = (struct swap_table *)rcu_access_pointer(ci->table);
> if (!table)
> return;
> @@ -1711,6 +1714,7 @@ static int swap_dup_entries_cluster(struct swap_info_struct *si,
> goto failed;
> }
> } while (++ci_off < ci_end);
> + swap_extend_table_try_free(ci);
> swap_cluster_unlock(ci);
> return 0;
> failed:
> --
> 2.53.0
I think we have already fixed this?
https://lore.kernel.org/all/6a1eac8e.fbc46276.3c3783.0008.GAE@google.com/T/
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-06-03 2:42 UTC | newest]