linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: Fix possible deadlock in console_trylock_spinning
@ 2025-07-30  9:49 Gu Bowen
  2025-08-01 22:33 ` Andrew Morton
  2025-08-01 23:52 ` Andrew Morton
  0 siblings, 2 replies; 12+ messages in thread
From: Gu Bowen @ 2025-07-30  9:49 UTC (permalink / raw)
  To: Catalin Marinas, Andrew Morton; +Cc: stable, linux-mm, Lu Jialin, Gu Bowen

kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
printk() to print warning message. This can cause a deadlock in the
scenario reported below:

       CPU0                    CPU1
       ----                    ----
  lock(kmemleak_lock);
                               lock(&port->lock);
                               lock(kmemleak_lock);
  lock(console_owner);

To solve this problem, switch to printk_safe mode before printing warning
message, this will redirect all printk()-s to a special per-CPU buffer,
which will be flushed later from a safe context (irq work), and this
deadlock problem can be avoided.

Our syztester report the following lockdep error:

======================================================
WARNING: possible circular locking dependency detected
5.10.0-22221-gca646a51dd00 #16 Not tainted
------------------------------------------------------
kmemleak/182 is trying to acquire lock:
ffffffffaf9e9020 (console_owner){-...}-{0:0}, at: console_trylock_spinning+0xda/0x1d0 kernel/printk/printk.c:1900

but task is already holding lock:
ffffffffb007cf58 (kmemleak_lock){-.-.}-{2:2}, at: scan_block+0x3d/0x220 mm/kmemleak.c:1310

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (kmemleak_lock){-.-.}-{2:2}:
       validate_chain+0x5df/0xac0 kernel/locking/lockdep.c:3729
       __lock_acquire+0x514/0x940 kernel/locking/lockdep.c:4958
       lock_acquire+0x15a/0x3a0 kernel/locking/lockdep.c:5569
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3b/0x60 kernel/locking/spinlock.c:164
       create_object.isra.0+0x36/0x80 mm/kmemleak.c:691
       kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
       slab_post_alloc_hook mm/slab.h:518 [inline]
       slab_alloc_node mm/slub.c:2987 [inline]
       slab_alloc mm/slub.c:2995 [inline]
       __kmalloc+0x637/0xb60 mm/slub.c:4100
       kmalloc include/linux/slab.h:620 [inline]
       tty_buffer_alloc+0x127/0x140 drivers/tty/tty_buffer.c:176
       __tty_buffer_request_room+0x9b/0x110 drivers/tty/tty_buffer.c:276
       tty_insert_flip_string_fixed_flag+0x60/0x130 drivers/tty/tty_buffer.c:321
       tty_insert_flip_string include/linux/tty_flip.h:36 [inline]
       tty_insert_flip_string_and_push_buffer+0x3a/0xb0 drivers/tty/tty_buffer.c:578
       process_output_block+0xc2/0x2e0 drivers/tty/n_tty.c:592
       n_tty_write+0x298/0x540 drivers/tty/n_tty.c:2433
       do_tty_write drivers/tty/tty_io.c:1041 [inline]
       file_tty_write.constprop.0+0x29b/0x4b0 drivers/tty/tty_io.c:1147
       redirected_tty_write+0x51/0x90 drivers/tty/tty_io.c:1176
       call_write_iter include/linux/fs.h:2117 [inline]
       do_iter_readv_writev+0x274/0x350 fs/read_write.c:741
       do_iter_write+0xbb/0x1f0 fs/read_write.c:867
       vfs_writev+0xfa/0x380 fs/read_write.c:940
       do_writev+0xd6/0x1d0 fs/read_write.c:983
       do_syscall_64+0x2b/0x40 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x6c/0xd6

-> #2 (&port->lock){-.-.}-{2:2}:
       validate_chain+0x5df/0xac0 kernel/locking/lockdep.c:3729
       __lock_acquire+0x514/0x940 kernel/locking/lockdep.c:4958
       lock_acquire+0x15a/0x3a0 kernel/locking/lockdep.c:5569
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3b/0x60 kernel/locking/spinlock.c:164
       tty_port_tty_get+0x1f/0xa0 drivers/tty/tty_port.c:289
       tty_port_default_wakeup+0xb/0x30 drivers/tty/tty_port.c:48
       serial8250_tx_chars+0x259/0x430 drivers/tty/serial/8250/8250_port.c:1906
       __start_tx drivers/tty/serial/8250/8250_port.c:1598 [inline]
       serial8250_start_tx+0x304/0x320 drivers/tty/serial/8250/8250_port.c:1720
       uart_write+0x1a1/0x2e0 drivers/tty/serial/serial_core.c:635
       do_output_char+0x2c0/0x370 drivers/tty/n_tty.c:444
       process_output drivers/tty/n_tty.c:511 [inline]
       n_tty_write+0x269/0x540 drivers/tty/n_tty.c:2445
       do_tty_write drivers/tty/tty_io.c:1041 [inline]
       file_tty_write.constprop.0+0x29b/0x4b0 drivers/tty/tty_io.c:1147
       call_write_iter include/linux/fs.h:2117 [inline]
       do_iter_readv_writev+0x274/0x350 fs/read_write.c:741
       do_iter_write+0xbb/0x1f0 fs/read_write.c:867
       vfs_writev+0xfa/0x380 fs/read_write.c:940
       do_writev+0xd6/0x1d0 fs/read_write.c:983
       do_syscall_64+0x2b/0x40 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x6c/0xd6

-> #1 (&port_lock_key){-.-.}-{2:2}:
       validate_chain+0x5df/0xac0 kernel/locking/lockdep.c:3729
       __lock_acquire+0x514/0x940 kernel/locking/lockdep.c:4958
       lock_acquire+0x15a/0x3a0 kernel/locking/lockdep.c:5569
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3b/0x60 kernel/locking/spinlock.c:164
       serial8250_console_write+0x292/0x320 drivers/tty/serial/8250/8250_port.c:3458
       call_console_drivers.constprop.0+0x185/0x240 kernel/printk/printk.c:1988
       console_unlock+0x2b4/0x640 kernel/printk/printk.c:2648
       register_console.part.0+0x2a1/0x390 kernel/printk/printk.c:3024
       univ8250_console_init+0x24/0x2b drivers/tty/serial/8250/8250_core.c:724
       console_init+0x188/0x24b kernel/printk/printk.c:3134
       start_kernel+0x2b0/0x41e init/main.c:1072
       secondary_startup_64_no_verify+0xc3/0xcb

-> #0 (console_owner){-...}-{0:0}:
       check_prev_add+0xfa/0x1380 kernel/locking/lockdep.c:2988
       check_prevs_add+0x1d8/0x3c0 kernel/locking/lockdep.c:3113
       validate_chain+0x5df/0xac0 kernel/locking/lockdep.c:3729
       __lock_acquire+0x514/0x940 kernel/locking/lockdep.c:4958
       lock_acquire+0x15a/0x3a0 kernel/locking/lockdep.c:5569
       console_trylock_spinning+0x10d/0x1d0 kernel/printk/printk.c:1921
       vprintk_emit+0x1a5/0x270 kernel/printk/printk.c:2134
       printk+0xb2/0xe7 kernel/printk/printk.c:2183
       lookup_object.cold+0xf/0x24 mm/kmemleak.c:405
       scan_block+0x1fa/0x220 mm/kmemleak.c:1357
       scan_object+0xdd/0x140 mm/kmemleak.c:1415
       scan_gray_list+0x8f/0x1c0 mm/kmemleak.c:1453
       kmemleak_scan+0x649/0xf30 mm/kmemleak.c:1608
       kmemleak_scan_thread+0x94/0xb6 mm/kmemleak.c:1721
       kthread+0x1c4/0x210 kernel/kthread.c:328
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299

other info that might help us debug this:

Chain exists of:
  console_owner --> &port->lock --> kmemleak_lock

Cc: stable@vger.kernel.org # 5.10
Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
 mm/kmemleak.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 4801751cb6b6..d322897a1de1 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -390,9 +390,11 @@ static struct kmemleak_object *lookup_object(unsigned long ptr, int alias)
 		else if (object->pointer == ptr || alias)
 			return object;
 		else {
+			__printk_safe_enter();
 			kmemleak_warn("Found object by alias at 0x%08lx\n",
 				      ptr);
 			dump_object_info(object);
+			__printk_safe_exit();
 			break;
 		}
 	}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-07-30  9:49 [PATCH] mm: Fix possible deadlock in console_trylock_spinning Gu Bowen
@ 2025-08-01 22:33 ` Andrew Morton
  2025-08-02  3:09   ` Waiman Long
  2025-08-01 23:52 ` Andrew Morton
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2025-08-01 22:33 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Catalin Marinas, stable, linux-mm, Lu Jialin, Waiman Long,
	Breno Leitao

On Wed, 30 Jul 2025 17:49:14 +0800 Gu Bowen <gubowen5@huawei.com> wrote:

> kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
> printk() to print warning message. This can cause a deadlock in the
> scenario reported below:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(kmemleak_lock);
>                                lock(&port->lock);
>                                lock(kmemleak_lock);
>   lock(console_owner);
> 
> To solve this problem, switch to printk_safe mode before printing warning
> message, this will redirect all printk()-s to a special per-CPU buffer,
> which will be flushed later from a safe context (irq work), and this
> deadlock problem can be avoided.
> 
> Our syztester report the following lockdep error:
> 
> ...
>
> index 4801751cb6b6..d322897a1de1 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -390,9 +390,11 @@ static struct kmemleak_object *lookup_object(unsigned long ptr, int alias)
>  		else if (object->pointer == ptr || alias)
>  			return object;
>  		else {
> +			__printk_safe_enter();
>  			kmemleak_warn("Found object by alias at 0x%08lx\n",
>  				      ptr);
>  			dump_object_info(object);
> +			__printk_safe_exit();
>  			break;
>  		}
>  	}

Thanks.

There have been a few kmemleak locking fixes lately.

I believe this fix is independent from the previous ones:

https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org
https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com

But can people please check?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-07-30  9:49 [PATCH] mm: Fix possible deadlock in console_trylock_spinning Gu Bowen
  2025-08-01 22:33 ` Andrew Morton
@ 2025-08-01 23:52 ` Andrew Morton
  2025-08-11 14:33   ` Petr Mladek
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2025-08-01 23:52 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Catalin Marinas, stable, linux-mm, Lu Jialin, Catalin Marinas,
	stable, linux-mm, Lu Jialin, Waiman Long, Breno Leitao,
	Petr Mladek, Steven Rostedt, John Ogness

On Wed, 30 Jul 2025 17:49:14 +0800 Gu Bowen <gubowen5@huawei.com> wrote:

> kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
> printk() to print warning message. This can cause a deadlock in the
> scenario reported below:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(kmemleak_lock);
>                                lock(&port->lock);
>                                lock(kmemleak_lock);
>   lock(console_owner);
> 
> To solve this problem, switch to printk_safe mode before printing warning
> message, this will redirect all printk()-s to a special per-CPU buffer,
> which will be flushed later from a safe context (irq work), and this
> deadlock problem can be avoided.
> 
> Our syztester report the following lockdep error:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.10.0-22221-gca646a51dd00 #16 Not tainted
> ------------------------------------------------------
> 
> ...
>
> Chain exists of:
>   console_owner --> &port->lock --> kmemleak_lock
> 
> Cc: stable@vger.kernel.org # 5.10
> Signed-off-by: Gu Bowen <gubowen5@huawei.com>
> ---
>  mm/kmemleak.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 4801751cb6b6..d322897a1de1 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -390,9 +390,11 @@ static struct kmemleak_object *lookup_object(unsigned long ptr, int alias)
>  		else if (object->pointer == ptr || alias)
>  			return object;
>  		else {
> +			__printk_safe_enter();
>  			kmemleak_warn("Found object by alias at 0x%08lx\n",
>  				      ptr);
>  			dump_object_info(object);
> +			__printk_safe_exit();
>  			break;
>  		}
>  	}

umm,

--- a/mm/kmemleak.c~a
+++ a/mm/kmemleak.c
@@ -103,6 +103,8 @@
 #include <linux/kmemleak.h>
 #include <linux/memory_hotplug.h>
 
+#include "../kernel/printk/internal.h"		/* __printk_safe_enter */
+
 /*
  * Kmemleak configuration and common defines.
  */


I'm not sure we're allowed to do that.  Is there an official way?





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-01 22:33 ` Andrew Morton
@ 2025-08-02  3:09   ` Waiman Long
  2025-08-02  3:53     ` Andrew Morton
  2025-08-06 15:34     ` John Ogness
  0 siblings, 2 replies; 12+ messages in thread
From: Waiman Long @ 2025-08-02  3:09 UTC (permalink / raw)
  To: Andrew Morton, Gu Bowen
  Cc: Catalin Marinas, stable, linux-mm, Lu Jialin, Breno Leitao

On 8/1/25 6:33 PM, Andrew Morton wrote:
> On Wed, 30 Jul 2025 17:49:14 +0800 Gu Bowen <gubowen5@huawei.com> wrote:
>
>> kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
>> printk() to print warning message. This can cause a deadlock in the
>> scenario reported below:
>>
>>         CPU0                    CPU1
>>         ----                    ----
>>    lock(kmemleak_lock);
>>                                 lock(&port->lock);
>>                                 lock(kmemleak_lock);
>>    lock(console_owner);
>>
>> To solve this problem, switch to printk_safe mode before printing warning
>> message, this will redirect all printk()-s to a special per-CPU buffer,
>> which will be flushed later from a safe context (irq work), and this
>> deadlock problem can be avoided.
>>
>> Our syztester report the following lockdep error:
>>
>> ...
>>
>> index 4801751cb6b6..d322897a1de1 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -390,9 +390,11 @@ static struct kmemleak_object *lookup_object(unsigned long ptr, int alias)
>>   		else if (object->pointer == ptr || alias)
>>   			return object;
>>   		else {
>> +			__printk_safe_enter();
>>   			kmemleak_warn("Found object by alias at 0x%08lx\n",
>>   				      ptr);
>>   			dump_object_info(object);
>> +			__printk_safe_exit();
>>   			break;
>>   		}
>>   	}
> Thanks.
>
> There have been a few kmemleak locking fixes lately.
>
> I believe this fix is independent from the previous ones:
>
> https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org
> https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com
>
> But can people please check?

I believe that __printk_safe_enter()/_printk_safe_exit() are for printk 
internal use only. The proper API to use should be 
printk_deferred_enter()/printk_deferred_exit() if we want to deferred 
the printing. Since kmemleak_lock will have been acquired with irq 
disabled, it meets the condition that printk_deferred_*() APIs can be used.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-02  3:09   ` Waiman Long
@ 2025-08-02  3:53     ` Andrew Morton
  2025-08-04 12:08       ` Catalin Marinas
  2025-08-06 15:34     ` John Ogness
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2025-08-02  3:53 UTC (permalink / raw)
  To: Waiman Long
  Cc: Gu Bowen, Catalin Marinas, stable, linux-mm, Lu Jialin,
	Breno Leitao

On Fri, 1 Aug 2025 23:09:31 -0400 Waiman Long <llong@redhat.com> wrote:

> > Thanks.
> >
> > There have been a few kmemleak locking fixes lately.
> >
> > I believe this fix is independent from the previous ones:
> >
> > https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org
> > https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com
> >
> > But can people please check?
> 
> I believe that __printk_safe_enter()/_printk_safe_exit() are for printk 
> internal use only. The proper API to use should be 
> printk_deferred_enter()/printk_deferred_exit() if we want to deferred 
> the printing. Since kmemleak_lock will have been acquired with irq 
> disabled, it meets the condition that printk_deferred_*() APIs can be used.

Gotcha, thanks.

kmemleak;c:__lookup_object() has a lot of callers.  I hope you're
correct that all have local irqs enabled, but I'll ask Gu to verify
that then please send along a new patch which uses
printk_deferred_enter()?



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-02  3:53     ` Andrew Morton
@ 2025-08-04 12:08       ` Catalin Marinas
  2025-08-04 12:34         ` Breno Leitao
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2025-08-04 12:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Waiman Long, Gu Bowen, stable, linux-mm, Lu Jialin, Breno Leitao

On Fri, Aug 01, 2025 at 08:53:23PM -0700, Andrew Morton wrote:
> On Fri, 1 Aug 2025 23:09:31 -0400 Waiman Long <llong@redhat.com> wrote:
> > > There have been a few kmemleak locking fixes lately.
> > >
> > > I believe this fix is independent from the previous ones:
> > >
> > > https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org

That's a similar bug in another part of the kmemleak code but fixed
differently (which I actually prefer if feasible).

> > > https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com

That's a soft lockup, unrelated to the printk deadlock.

> > I believe that __printk_safe_enter()/_printk_safe_exit() are for printk 
> > internal use only. The proper API to use should be 
> > printk_deferred_enter()/printk_deferred_exit() if we want to deferred 
> > the printing. Since kmemleak_lock will have been acquired with irq 
> > disabled, it meets the condition that printk_deferred_*() APIs can be used.
> 
> Gotcha, thanks.
> 
> kmemleak;c:__lookup_object() has a lot of callers.  I hope you're
> correct that all have local irqs enabled, but I'll ask Gu to verify
> that then please send along a new patch which uses
> printk_deferred_enter()?

__lookup_object() must be called with kmemleak_lock held (unless we have
a bug in kmemleak).

Using printk_deferred_enter() is more convenient, though I think some of
these places can defer the printing with something similar to the first
patch above from Breno.

For __lookup_object(), we could move the warning outside the lock but
this function would have to lock the respective object, return it and
somehow inform the caller that it was an error and the object needs
unlocking. Given that this is a very rare/never event (only happens if
someone messes up the kmemleak_alloc/free calls), I'd say the
printk_deferred_enter() works best.

We have delete_object_part() which calls kmemleak_warn() with the
kmemleak_lock held. The warning can be moved outside similar to Breno's
patch.

We have a kmemleak_stop() -> kmemleak_warn() called in __link_object()
with the kmemleak_lock held. We could also use the printk deferring here
as well. That's another rare corner case.

The patch should add a code comment on why printk deferring is used in
case others will wonder in the future.

I'm surprised we haven't seen these until recently. Has printk always
allocated memory?

-- 
Catalin


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-04 12:08       ` Catalin Marinas
@ 2025-08-04 12:34         ` Breno Leitao
  2025-08-04 18:28           ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2025-08-04 12:34 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrew Morton, Waiman Long, Gu Bowen, stable, linux-mm, Lu Jialin

On Mon, Aug 04, 2025 at 01:08:15PM +0100, Catalin Marinas wrote:

> I'm surprised we haven't seen these until recently. Has printk always
> allocated memory?

I can talk about netconsole, and the answer is yes!

weird enought, lockdep never picked this issue, and I have a few set of
hosts running kmemleak and lockdep for a while.

This time was different because I have decided to invstiage the code,
and found the deadlock. Still, no lockdep complain at all.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-04 12:34         ` Breno Leitao
@ 2025-08-04 18:28           ` Catalin Marinas
  2025-08-05  7:44             ` Gu Bowen
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2025-08-04 18:28 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Andrew Morton, Waiman Long, Gu Bowen, stable, linux-mm, Lu Jialin

On Mon, Aug 04, 2025 at 05:34:10AM -0700, Breno Leitao wrote:
> On Mon, Aug 04, 2025 at 01:08:15PM +0100, Catalin Marinas wrote:
> 
> > I'm surprised we haven't seen these until recently. Has printk always
> > allocated memory?
> 
> I can talk about netconsole, and the answer is yes!
> 
> weird enought, lockdep never picked this issue, and I have a few set of
> hosts running kmemleak and lockdep for a while.
> 
> This time was different because I have decided to invstiage the code,
> and found the deadlock. Still, no lockdep complain at all.

I guess it's because kmemleak is quiet in general, unless problems are
found, and lockdep never registered this combination - printk() called
with the kmemleak_lock held.

Thanks for investigating.

-- 
Catalin


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-04 18:28           ` Catalin Marinas
@ 2025-08-05  7:44             ` Gu Bowen
  0 siblings, 0 replies; 12+ messages in thread
From: Gu Bowen @ 2025-08-05  7:44 UTC (permalink / raw)
  To: Catalin Marinas, Breno Leitao
  Cc: Andrew Morton, Waiman Long, stable, linux-mm, Lu Jialin



On 8/5/2025 2:28 AM, Catalin Marinas wrote:
> On Mon, Aug 04, 2025 at 05:34:10AM -0700, Breno Leitao wrote:
>> On Mon, Aug 04, 2025 at 01:08:15PM +0100, Catalin Marinas wrote:
>>
>> weird enought, lockdep never picked this issue, and I have a few set of
>> hosts running kmemleak and lockdep for a while.
>>
>> This time was different because I have decided to invstiage the code,
>> and found the deadlock. Still, no lockdep complain at all.
> 
> I guess it's because kmemleak is quiet in general, unless problems are
> found, and lockdep never registered this combination - printk() called
> with the kmemleak_lock held.

As you have guessed, this issue occurred right after the kmemleak warning.

Thanks for the reminder about printk_deferred_* and the suggestion 
regarding code comments..

I will send a new version after verification.

Best regards,
Guber


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-02  3:09   ` Waiman Long
  2025-08-02  3:53     ` Andrew Morton
@ 2025-08-06 15:34     ` John Ogness
  2025-08-07  3:22       ` Gu Bowen
  1 sibling, 1 reply; 12+ messages in thread
From: John Ogness @ 2025-08-06 15:34 UTC (permalink / raw)
  To: Waiman Long, Andrew Morton, Gu Bowen
  Cc: Catalin Marinas, stable, linux-mm, Lu Jialin, Breno Leitao

On 2025-08-01, Waiman Long <llong@redhat.com> wrote:
>>> kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
>>> printk() to print warning message. This can cause a deadlock in the
>>> scenario reported below:
>>>
>>>         CPU0                    CPU1
>>>         ----                    ----
>>>    lock(kmemleak_lock);
>>>                                 lock(&port->lock);
>>>                                 lock(kmemleak_lock);
>>>    lock(console_owner);
>>>
>>> To solve this problem, switch to printk_safe mode before printing warning
>>> message, this will redirect all printk()-s to a special per-CPU buffer,
>>> which will be flushed later from a safe context (irq work), and this
>>> deadlock problem can be avoided.

printk no longer works like this. There are no special per-CPU
buffers. As correctly pointed out by Longman, the proper interface is
printk_deferred_enter()/printk_deferred_exit().

With the deferred interface, the printk message is still immediately
stored in the printk ringbuffer (which is lockless) and only the console
printing itself is deferred.

John Ogness


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-06 15:34     ` John Ogness
@ 2025-08-07  3:22       ` Gu Bowen
  0 siblings, 0 replies; 12+ messages in thread
From: Gu Bowen @ 2025-08-07  3:22 UTC (permalink / raw)
  To: John Ogness, Waiman Long, Andrew Morton
  Cc: Catalin Marinas, stable, linux-mm, Lu Jialin, Breno Leitao



On 8/6/2025 11:34 PM, John Ogness wrote:

> 
> printk no longer works like this. There are no special per-CPU
> buffers. As correctly pointed out by Longman, the proper interface is
> printk_deferred_enter()/printk_deferred_exit().
> 
> With the deferred interface, the printk message is still immediately
> stored in the printk ringbuffer (which is lockless) and only the console
> printing itself is deferred.
> 

Hi,
I agree that printk_deferred_enter()/printk_deferred_exit() is a better 
interface, but I noticed that in stable-5.10, the printk safety mode 
still uses a per-CPU buffer to store messages through the 
printk_safe_log_store function.

Additionally, the mainline version has already refactored printk, so 
this issue no longer exists.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix possible deadlock in console_trylock_spinning
  2025-08-01 23:52 ` Andrew Morton
@ 2025-08-11 14:33   ` Petr Mladek
  0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2025-08-11 14:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Gu Bowen, Catalin Marinas, stable, linux-mm, Lu Jialin,
	Waiman Long, Breno Leitao, Steven Rostedt, John Ogness

On Fri 2025-08-01 16:52:28, Andrew Morton wrote:
> On Wed, 30 Jul 2025 17:49:14 +0800 Gu Bowen <gubowen5@huawei.com> wrote:
> 
> > kmemleak_scan_thread() invokes scan_block() which may invoke a nomal
> > printk() to print warning message. This can cause a deadlock in the
> > scenario reported below:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(kmemleak_lock);
> >                                lock(&port->lock);
> >                                lock(kmemleak_lock);
> >   lock(console_owner);
> > 
> > To solve this problem, switch to printk_safe mode before printing warning
> > message, this will redirect all printk()-s to a special per-CPU buffer,
> > which will be flushed later from a safe context (irq work), and this
> > deadlock problem can be avoided.
> > 
> > Our syztester report the following lockdep error:
> > 
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.10.0-22221-gca646a51dd00 #16 Not tainted
> > ------------------------------------------------------
> > 
> > ...
> >
> > Chain exists of:
> >   console_owner --> &port->lock --> kmemleak_lock
> > 
> > Cc: stable@vger.kernel.org # 5.10
> > Signed-off-by: Gu Bowen <gubowen5@huawei.com>
> > ---
> >  mm/kmemleak.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index 4801751cb6b6..d322897a1de1 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -390,9 +390,11 @@ static struct kmemleak_object *lookup_object(unsigned long ptr, int alias)
> >  		else if (object->pointer == ptr || alias)
> >  			return object;
> >  		else {
> > +			__printk_safe_enter();
> >  			kmemleak_warn("Found object by alias at 0x%08lx\n",
> >  				      ptr);
> >  			dump_object_info(object);
> > +			__printk_safe_exit();
> >  			break;
> >  		}
> >  	}
> 
> umm,
> 
> --- a/mm/kmemleak.c~a
> +++ a/mm/kmemleak.c
> @@ -103,6 +103,8 @@
>  #include <linux/kmemleak.h>
>  #include <linux/memory_hotplug.h>
>  
> +#include "../kernel/printk/internal.h"		/* __printk_safe_enter */
> +
>  /*
>   * Kmemleak configuration and common defines.
>   */
> 
> 
> I'm not sure we're allowed to do that.  Is there an official way?

The official way is to use printk_deferred_enter()/exit().

Note that the API is using a per-CPU variable. It must be called
with CPU migration disabled. The comment suggests disabling IRQs.
But it should be enough to disable preemption.

Best Regards,
Petr


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-08-11 14:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-30  9:49 [PATCH] mm: Fix possible deadlock in console_trylock_spinning Gu Bowen
2025-08-01 22:33 ` Andrew Morton
2025-08-02  3:09   ` Waiman Long
2025-08-02  3:53     ` Andrew Morton
2025-08-04 12:08       ` Catalin Marinas
2025-08-04 12:34         ` Breno Leitao
2025-08-04 18:28           ` Catalin Marinas
2025-08-05  7:44             ` Gu Bowen
2025-08-06 15:34     ` John Ogness
2025-08-07  3:22       ` Gu Bowen
2025-08-01 23:52 ` Andrew Morton
2025-08-11 14:33   ` Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).