The Linux Kernel Mailing List
* [PATCH] mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()
@ 2026-05-08  8:21 Qing Wang
  2026-05-12  2:56 ` Harry Yoo (Oracle)
  0 siblings, 1 reply; 4+ messages in thread
From: Qing Wang @ 2026-05-08  8:21 UTC (permalink / raw)
  To: Vlastimil Babka, Harry Yoo, Andrew Morton, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin
  Cc: linux-mm, linux-kernel, Qing Wang

flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache().

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -> flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -> flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new
lock order (slab_mutex -> cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -> slab_mutex) is established by

- cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
- kmem_cache_destroy()

The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
 mm/slab_common.c | 6 ++++++
 mm/slub.c        | 1 +
 2 files changed, 7 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..0ee5a4189453 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2110,7 +2110,13 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
 {
 	if (cache_has_sheaves(s)) {
+		/*
+		 * flush_rcu_sheaves_on_cache() uses queue_work_on(), which
+		 * must be called with the CPU hotplug read lock held.
+		 */
+		cpus_read_lock();
 		flush_rcu_sheaves_on_cache(s);
+		cpus_read_unlock();
 		rcu_barrier();
 	}
 
diff --git a/mm/slub.c b/mm/slub.c
index 161079ac5ba1..2a005d1e3a74 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 	struct slub_flush_work *sfw;
 	unsigned int cpu;
 
+	lockdep_assert_cpus_held();
 	mutex_lock(&flush_lock);
 
 	for_each_online_cpu(cpu) {
-- 
2.34.1



* Re: [PATCH] mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()
  2026-05-08  8:21 [PATCH] mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache() Qing Wang
@ 2026-05-12  2:56 ` Harry Yoo (Oracle)
  2026-05-12  3:46   ` [PATCH v2] " Qing Wang
  2026-05-12  3:50   ` [PATCH v3] " Qing Wang
  0 siblings, 2 replies; 4+ messages in thread
From: Harry Yoo (Oracle) @ 2026-05-12  2:56 UTC (permalink / raw)
  To: Qing Wang
  Cc: Vlastimil Babka, Andrew Morton, Hao Li, Christoph Lameter,
	David Rientjes, Roman Gushchin, linux-mm, linux-kernel

On Fri, May 08, 2026 at 04:21:49PM +0800, Qing Wang wrote:
> flush_rcu_sheaves_on_cache() calls queue_work_on() in a
> for_each_online_cpu() loop, which requires the cpu to stay online.

Good catch, and also the set of "online cpus" is subject to
change if we don't hold the lock.

> But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache().
> 
> There are two paths that call flush_rcu_sheaves_on_cache():
> 
>   // has cpus_read_lock()
>   flush_all_rcu_sheaves()
>     -> flush_rcu_sheaves_on_cache()
> 
>   // no cpus_read_lock()
>   kvfree_rcu_barrier_on_cache()
>     -> flush_rcu_sheaves_on_cache()
> 
> Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().
> 
> Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
> flush_rcu_sheaves_on_cache()? The reason is it would introduce a new l

I would not split the word "lock" to "l" and "ock" and instead
start newline before the word :)

> ock order (slab_mutex -> cpu_hotplug_lock). The reverse order
> (cpu_hotplug_lock -> slab_mutex) is established by
> 
> - cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
> - kmem_cache_destroy()
> 
> The two orders together would form an AB-BA deadlock.
> 
> Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
> to catch the same problem in the future.

Should add:

Cc: <stable@vger.kernel.org> 
Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction") 

?

> Signed-off-by: Qing Wang <wangqing7171@gmail.com>
> ---
>  mm/slab_common.c | 6 ++++++
>  mm/slub.c        | 1 +
>  2 files changed, 7 insertions(+)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d5a70a831a2a..0ee5a4189453 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -2110,7 +2110,13 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
>  void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
>  {
>  	if (cache_has_sheaves(s)) {
> +		/*
> +		 * flush_rcu_sheaves_on_cache() uses queue_work_on(), which
> +		 * must be called with the CPU hotplug read lock held.
> +		 */

nit: not sure this comment is really necessary, given that we can detect
it with the lockdep assert added by this patch.

Otherwise LGTM:
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>

> +		cpus_read_lock();
>  		flush_rcu_sheaves_on_cache(s);
> +		cpus_read_unlock();
>  		rcu_barrier();
>  	}
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index 161079ac5ba1..2a005d1e3a74 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
>  	struct slub_flush_work *sfw;
>  	unsigned int cpu;
>  
> +	lockdep_assert_cpus_held();
>  	mutex_lock(&flush_lock);
>  
>  	for_each_online_cpu(cpu) {
> -- 
> 2.34.1

-- 
Cheers,
Harry / Hyeonggon


* [PATCH v2] mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()
  2026-05-12  2:56 ` Harry Yoo (Oracle)
@ 2026-05-12  3:46   ` Qing Wang
  2026-05-12  3:50   ` [PATCH v3] " Qing Wang
  1 sibling, 0 replies; 4+ messages in thread
From: Qing Wang @ 2026-05-12  3:46 UTC (permalink / raw)
  To: harry
  Cc: akpm, cl, hao.li, linux-kernel, linux-mm, rientjes,
	roman.gushchin, vbabka, wangqing7171

flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and the
set of "online cpus" is subject to change.

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -> flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -> flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new lock
order (slab_mutex -> cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -> slab_mutex) is established by

- cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
- kmem_cache_destroy()

The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
Changes in v2:
- Deleted the unnecessary comment.
- Added "Fixes" field in the commit message.

 mm/slab_common.c | 6 ++++++
 mm/slub.c        | 1 +
 2 files changed, 7 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..0ee5a4189453 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2110,7 +2110,13 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
 {
 	if (cache_has_sheaves(s)) {
+		/*
+		 * flush_rcu_sheaves_on_cache() uses queue_work_on(), which
+		 * must be called with the CPU hotplug read lock held.
+		 */
+		cpus_read_lock();
 		flush_rcu_sheaves_on_cache(s);
+		cpus_read_unlock();
 		rcu_barrier();
 	}
 
diff --git a/mm/slub.c b/mm/slub.c
index 161079ac5ba1..2a005d1e3a74 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 	struct slub_flush_work *sfw;
 	unsigned int cpu;
 
+	lockdep_assert_cpus_held();
 	mutex_lock(&flush_lock);
 
 	for_each_online_cpu(cpu) {
-- 
2.34.1



* [PATCH v3] mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()
  2026-05-12  2:56 ` Harry Yoo (Oracle)
  2026-05-12  3:46   ` [PATCH v2] " Qing Wang
@ 2026-05-12  3:50   ` Qing Wang
  1 sibling, 0 replies; 4+ messages in thread
From: Qing Wang @ 2026-05-12  3:50 UTC (permalink / raw)
  To: harry
  Cc: akpm, cl, hao.li, linux-kernel, linux-mm, rientjes,
	roman.gushchin, vbabka, wangqing7171

flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and the
set of "online cpus" is subject to change.

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -> flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -> flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new lock
order (slab_mutex -> cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -> slab_mutex) is established by

- cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
- kmem_cache_destroy()

The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
Changes in v2:
- Deleted the unnecessary comment.
- Added "Fixes" field in the commit message.
Changes in v3:
- Deleted the unnecessary comment.

 mm/slab_common.c | 2 ++
 mm/slub.c        | 1 +
 2 files changed, 3 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..8b661fff5eed 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2110,7 +2110,9 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
 {
 	if (cache_has_sheaves(s)) {
+		cpus_read_lock();
 		flush_rcu_sheaves_on_cache(s);
+		cpus_read_unlock();
 		rcu_barrier();
 	}
 
diff --git a/mm/slub.c b/mm/slub.c
index 161079ac5ba1..2a005d1e3a74 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 	struct slub_flush_work *sfw;
 	unsigned int cpu;
 
+	lockdep_assert_cpus_held();
 	mutex_lock(&flush_lock);
 
 	for_each_online_cpu(cpu) {
-- 
2.34.1


