public inbox for linux-mm@kvack.org
* [PATCH v2 0/2] mm/swap: enhance swap cluster allocation checks
@ 2026-03-09  8:05 Hui Zhu
  2026-03-09  8:05 ` [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster() Hui Zhu
  2026-03-09  8:05 ` [PATCH v2 2/2] mm/swap: Add lockdep for si->global_cluster_lock in swap_cluster_alloc_table() Hui Zhu
  0 siblings, 2 replies; 6+ messages in thread
From: Hui Zhu @ 2026-03-09  8:05 UTC (permalink / raw)
  To: Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, YoungJun Park, linux-mm, linux-kernel
  Cc: Hui Zhu

From: Hui Zhu <zhuhui@kylinos.cn>

This series enhances safety and debugging for swap cluster allocation
paths in mm/swapfile.c.

Patch 1 adds a VM_WARN_ON in isolate_lock_cluster() to verify that
full clusters (which should already have allocated tables) are not
processed by swap_cluster_alloc_table().

Patch 2 adds lockdep_assert_held for si->global_cluster_lock on non-SSD
devices and adjusts the assertion order to match the actual lock
acquisition sequence.

Changelog:
v2:
Per comments from YoungJun Park, Kairui Song and Chris Li, replaced
acquiring the locks in swap_reclaim_work() with adding a VM_WARN_ON in
isolate_lock_cluster().
Per a comment from YoungJun Park, changed the order of the
lockdep_assert_held() calls in patch 2 to match the actual lock
acquisition order.

Hui Zhu (2):
  mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
  mm/swap: Add lockdep for si->global_cluster_lock in
    swap_cluster_alloc_table()

 mm/swapfile.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

-- 
2.43.0




* [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
  2026-03-09  8:05 [PATCH v2 0/2] mm/swap: enhance swap cluster allocation checks Hui Zhu
@ 2026-03-09  8:05 ` Hui Zhu
  2026-03-09  8:46   ` YoungJun Park
  2026-03-09 13:23   ` Kairui Song
  2026-03-09  8:05 ` [PATCH v2 2/2] mm/swap: Add lockdep for si->global_cluster_lock in swap_cluster_alloc_table() Hui Zhu
  1 sibling, 2 replies; 6+ messages in thread
From: Hui Zhu @ 2026-03-09  8:05 UTC (permalink / raw)
  To: Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, YoungJun Park, linux-mm, linux-kernel
  Cc: Hui Zhu

From: Hui Zhu <zhuhui@kylinos.cn>

swap_cluster_alloc_table() assumes that the caller holds the following
locks:
ci->lock
percpu_swap_cluster.lock
si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)

There are five call paths leading to swap_cluster_alloc_table():
swap_alloc_hibernation_slot->cluster_alloc_swap_entry
->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table

swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
->isolate_lock_cluster->swap_cluster_alloc_table

swap_alloc_hibernation_slot->cluster_alloc_swap_entry
->swap_reclaim_full_clusters->isolate_lock_cluster
->swap_cluster_alloc_table

swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
->isolate_lock_cluster->swap_cluster_alloc_table

swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
->swap_cluster_alloc_table

The first four paths correctly acquire the necessary locks before
calling swap_cluster_alloc_table(). swap_reclaim_work() does not need
to do so, because swap_reclaim_full_clusters() only isolates clusters
from si->full_clusters, whose swap tables must already be allocated.
Therefore isolate_lock_cluster(), when called from
swap_reclaim_full_clusters(), will never reach
swap_cluster_alloc_table().

Add a VM_WARN_ON to warn if a full cluster is ever handled by
swap_cluster_alloc_table().

Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
---
 mm/swapfile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 94af29d1de88..3fc2eb30c187 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
 	spin_unlock(&si->lock);
 
 	if (found && !cluster_table_is_alloced(found)) {
+		/* Table of full cluster must be allocated. */
+		VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);
+
 		/* Only an empty free cluster's swap table can be freed. */
 		VM_WARN_ON_ONCE(list != &si->free_clusters);
 		VM_WARN_ON_ONCE(!cluster_is_empty(found));
-- 
2.43.0




* [PATCH v2 2/2] mm/swap: Add lockdep for si->global_cluster_lock in swap_cluster_alloc_table()
  2026-03-09  8:05 [PATCH v2 0/2] mm/swap: enhance swap cluster allocation checks Hui Zhu
  2026-03-09  8:05 ` [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster() Hui Zhu
@ 2026-03-09  8:05 ` Hui Zhu
  2026-03-09  8:41   ` YoungJun Park
  1 sibling, 1 reply; 6+ messages in thread
From: Hui Zhu @ 2026-03-09  8:05 UTC (permalink / raw)
  To: Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, YoungJun Park, linux-mm, linux-kernel
  Cc: Hui Zhu

From: Hui Zhu <zhuhui@kylinos.cn>

Add a lockdep_assert_held(&si->global_cluster_lock) in
swap_cluster_alloc_table() for non-SWP_SOLIDSTATE devices.

The function already requires the caller to hold both ci->lock and
percpu_swap_cluster.lock, and it additionally requires
si->global_cluster_lock when the device is not SWP_SOLIDSTATE.

Adding this assertion documents the locking contract and helps catch
synchronization issues during development. The lockdep_assert_held()
calls are also reordered to match the actual lock acquisition order.

Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
---
 mm/swapfile.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 3fc2eb30c187..b31b86263b89 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -476,8 +476,10 @@ swap_cluster_alloc_table(struct swap_info_struct *si,
 	 * Only cluster isolation from the allocator does table allocation.
 	 * Swap allocator uses percpu clusters and holds the local lock.
 	 */
-	lockdep_assert_held(&ci->lock);
 	lockdep_assert_held(&this_cpu_ptr(&percpu_swap_cluster)->lock);
+	if (!(si->flags & SWP_SOLIDSTATE))
+		lockdep_assert_held(&si->global_cluster_lock);
+	lockdep_assert_held(&ci->lock);
 
 	/* The cluster must be free and was just isolated from the free list. */
 	VM_WARN_ON_ONCE(ci->flags || !cluster_is_empty(ci));
-- 
2.43.0




* Re: [PATCH v2 2/2] mm/swap: Add lockdep for si->global_cluster_lock in swap_cluster_alloc_table()
  2026-03-09  8:05 ` [PATCH v2 2/2] mm/swap: Add lockdep for si->global_cluster_lock in swap_cluster_alloc_table() Hui Zhu
@ 2026-03-09  8:41   ` YoungJun Park
  0 siblings, 0 replies; 6+ messages in thread
From: YoungJun Park @ 2026-03-09  8:41 UTC (permalink / raw)
  To: Hui Zhu
  Cc: Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, linux-mm, linux-kernel, Hui Zhu

On Mon, Mar 09, 2026 at 04:05:42PM +0800, Hui Zhu wrote:
> From: Hui Zhu <zhuhui@kylinos.cn>
> 
> Add a lockdep_assert_held(&si->global_cluster_lock) in
> swap_cluster_alloc_table() for non-SWP_SOLIDSTATE devices.
> 
> The function already requires the caller to hold both ci->lock and
> percpu_swap_cluster.lock, and it additionally requires
> si->global_cluster_lock when the device is not SWP_SOLIDSTATE.
> 
> Adding this assertion documents the locking contract and helps catch
> synchronization issues during development. The lockdep_assert_held()
> calls are also reordered to match the actual lock acquisition order.
> 
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---
>  mm/swapfile.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 3fc2eb30c187..b31b86263b89 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -476,8 +476,10 @@ swap_cluster_alloc_table(struct swap_info_struct *si,
>  	 * Only cluster isolation from the allocator does table allocation.
>  	 * Swap allocator uses percpu clusters and holds the local lock.
>  	 */
> -	lockdep_assert_held(&ci->lock);
>  	lockdep_assert_held(&this_cpu_ptr(&percpu_swap_cluster)->lock);
> +	if (!(si->flags & SWP_SOLIDSTATE))
> +		lockdep_assert_held(&si->global_cluster_lock);
> +	lockdep_assert_held(&ci->lock);
>  
>  	/* The cluster must be free and was just isolated from the free list. */
>  	VM_WARN_ON_ONCE(ci->flags || !cluster_is_empty(ci));
> -- 
> 2.43.0

The changes look good to me. Thanks!

Reviewed-by: Youngjun Park <youngjun.park@lge.com>



* Re: [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
  2026-03-09  8:05 ` [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster() Hui Zhu
@ 2026-03-09  8:46   ` YoungJun Park
  2026-03-09 13:23   ` Kairui Song
  1 sibling, 0 replies; 6+ messages in thread
From: YoungJun Park @ 2026-03-09  8:46 UTC (permalink / raw)
  To: Hui Zhu
  Cc: Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Baoquan He, Barry Song, linux-mm, linux-kernel, Hui Zhu

On Mon, Mar 09, 2026 at 04:05:41PM +0800, Hui Zhu wrote:
> From: Hui Zhu <zhuhui@kylinos.cn>
> 
> swap_cluster_alloc_table() assumes that the caller holds the following
> locks:
> ci->lock
> percpu_swap_cluster.lock
> si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)
> 
> There are five call paths leading to swap_cluster_alloc_table():
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
> ->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
> 
> swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
> ->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
> 
> The first four paths correctly acquire the necessary locks before
> calling swap_cluster_alloc_table(). swap_reclaim_work() does not need
> to do so, because swap_reclaim_full_clusters() only isolates clusters
> from si->full_clusters, whose swap tables must already be allocated.
> Therefore isolate_lock_cluster(), when called from
> swap_reclaim_full_clusters(), will never reach
> swap_cluster_alloc_table().
> 
> Add a VM_WARN_ON to warn if a full cluster is ever handled by
> swap_cluster_alloc_table().
> 
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---
>  mm/swapfile.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 94af29d1de88..3fc2eb30c187 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
>  	spin_unlock(&si->lock);
>  
>  	if (found && !cluster_table_is_alloced(found)) {
> +		/* Table of full cluster must be allocated. */
> +		VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);


As Kairui suggested, would a VM_WARN_ON check for
`list == &si->free_clusters` be good? Or perhaps
`ci->flags != CLUSTER_FLAG_FREE`?

The new check only covers CLUSTER_FLAG_FULL, so it might be better to
cover every flag except the free flag.

Thanks!
Youngjun Park



* Re: [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
  2026-03-09  8:05 ` [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster() Hui Zhu
  2026-03-09  8:46   ` YoungJun Park
@ 2026-03-09 13:23   ` Kairui Song
  1 sibling, 0 replies; 6+ messages in thread
From: Kairui Song @ 2026-03-09 13:23 UTC (permalink / raw)
  To: Hui Zhu
  Cc: Andrew Morton, Chris Li, Kemeng Shi, Nhat Pham, Baoquan He,
	Barry Song, YoungJun Park, linux-mm, linux-kernel, Hui Zhu

On Mon, Mar 9, 2026 at 4:06 PM Hui Zhu <hui.zhu@linux.dev> wrote:
>
> From: Hui Zhu <zhuhui@kylinos.cn>
>
> swap_cluster_alloc_table() assumes that the caller holds the following
> locks:
> ci->lock
> percpu_swap_cluster.lock
> si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)
>
> There are five call paths leading to swap_cluster_alloc_table():
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
> ->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
>
> swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
> ->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
>
> The first four paths correctly acquire the necessary locks before
> calling swap_cluster_alloc_table(). swap_reclaim_work() does not need
> to do so, because swap_reclaim_full_clusters() only isolates clusters
> from si->full_clusters, whose swap tables must already be allocated.
> Therefore isolate_lock_cluster(), when called from
> swap_reclaim_full_clusters(), will never reach
> swap_cluster_alloc_table().
> 
> Add a VM_WARN_ON to warn if a full cluster is ever handled by
> swap_cluster_alloc_table().
>
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---
>  mm/swapfile.c | 3 +++
>  1 file changed, 3 insertions(+)

Hi Hui,

Thanks for the quick update.

>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 94af29d1de88..3fc2eb30c187 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
>         spin_unlock(&si->lock);
>
>         if (found && !cluster_table_is_alloced(found)) {
> +               /* Table of full cluster must be allocated. */
> +               VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);
> +
>                 /* Only an empty free cluster's swap table can be freed. */
>                 VM_WARN_ON_ONCE(list != &si->free_clusters);
>                 VM_WARN_ON_ONCE(!cluster_is_empty(found));

Hmm, now as I see it, we already have the "list != &si->free_clusters"
and cluster_is_empty() checks, so adding another debug check seems
unnecessary. Or it might be better to check
ci->flags != CLUSTER_FLAG_FREE to be more strict? I see YoungJun has
the same idea here. Also you might want the _ONCE version.

The lockdep check in the later patch could be helpful, but for chores
like this I think one patch is enough if there is no strong reason to
split them. Both patches harden the swap table allocation path with
more sanity checks, right?


