linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [syzbot] [crypto?] possible deadlock in crypto_exit_scomp_ops_async
@ 2025-02-24 21:53 syzbot
  2025-02-25  8:53 ` mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead Herbert Xu
  0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2025-02-24 21:53 UTC (permalink / raw)
  To: davem, herbert, linux-crypto, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    e9a8cac0bf89 Merge tag 'v6.14-rc3-smb3-client-fixes' of gi..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17b667f8580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61cbf5ac8a063ad4
dashboard link: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/8441f1b50402/disk-e9a8cac0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/65b1f8d2f790/vmlinux-e9a8cac0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1d6f6d8c3d6b/bzImage-e9a8cac0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com

smpboot: CPU 0 is now offline
crash hp: kexec_trylock() failed, kdump image may be inaccurate
======================================================
WARNING: possible circular locking dependency detected
6.14.0-rc3-syzkaller-00096-ge9a8cac0bf89 #0 Not tainted
------------------------------------------------------
kworker/1:1/46 is trying to acquire lock:
ffffffff8ec41aa8 (scomp_lock){+.+.}-{4:4}, at: crypto_exit_scomp_ops_async+0x42/0x90 crypto/scompress.c:201

but task is already holding lock:
ffffe8ffffc21250 (&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex){+.+.}-{4:4}, at: zswap_cpu_comp_dead+0x80/0x210 mm/zswap.c:885

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19b/0xb10 kernel/locking/mutex.c:730
       acomp_ctx_get_cpu_lock mm/zswap.c:905 [inline]
       zswap_compress mm/zswap.c:937 [inline]
       zswap_store_page mm/zswap.c:1462 [inline]
       zswap_store+0x8fc/0x2690 mm/zswap.c:1571
       swap_writepage+0x3b6/0x1120 mm/page_io.c:278
       pageout+0x3b2/0xaa0 mm/vmscan.c:696
       shrink_folio_list+0x2f7f/0x40c0 mm/vmscan.c:1402
       evict_folios+0x774/0x1ab0 mm/vmscan.c:4660
       try_to_shrink_lruvec+0x5a2/0x9a0 mm/vmscan.c:4821
       shrink_one+0x3e3/0x7b0 mm/vmscan.c:4866
       shrink_many mm/vmscan.c:4929 [inline]
       lru_gen_shrink_node mm/vmscan.c:5007 [inline]
       shrink_node+0x2761/0x3e60 mm/vmscan.c:5978
       kswapd_shrink_node mm/vmscan.c:6807 [inline]
       balance_pgdat+0xbab/0x19c0 mm/vmscan.c:6999
       kswapd+0x590/0xb70 mm/vmscan.c:7264
       kthread+0x3af/0x750 kernel/kthread.c:464
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #1 (fs_reclaim){+.+.}-{0:0}:
       __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
       fs_reclaim_acquire+0x102/0x150 mm/page_alloc.c:3867
       might_alloc include/linux/sched/mm.h:318 [inline]
       slab_pre_alloc_hook mm/slub.c:4066 [inline]
       slab_alloc_node mm/slub.c:4144 [inline]
       __kmalloc_cache_node_noprof+0x54/0x420 mm/slub.c:4333
       kmalloc_node_noprof include/linux/slab.h:924 [inline]
       __get_vm_area_node+0x101/0x2f0 mm/vmalloc.c:3127
       __vmalloc_node_range_noprof+0x26a/0x1530 mm/vmalloc.c:3806
       __vmalloc_node_noprof mm/vmalloc.c:3911 [inline]
       vmalloc_node_noprof+0x6f/0x90 mm/vmalloc.c:4022
       crypto_scomp_alloc_scratches crypto/scompress.c:86 [inline]
       crypto_scomp_init_tfm+0x122/0x270 crypto/scompress.c:107
       crypto_create_tfm_node+0x100/0x320 crypto/api.c:539
       crypto_create_tfm crypto/internal.h:120 [inline]
       crypto_init_scomp_ops_async+0x5d/0x1d0 crypto/scompress.c:217
       crypto_acomp_init_tfm+0x240/0x2e0 crypto/acompress.c:70
       crypto_create_tfm_node+0x100/0x320 crypto/api.c:539
       crypto_alloc_tfm_node+0x102/0x260 crypto/api.c:640
       zswap_cpu_comp_prepare+0xe2/0x420 mm/zswap.c:834
       cpuhp_invoke_callback+0x20c/0xa10 kernel/cpu.c:204
       cpuhp_issue_call+0x1c0/0x980 kernel/cpu.c:2376
       __cpuhp_state_add_instance_cpuslocked+0x1a4/0x3c0 kernel/cpu.c:2438
       __cpuhp_state_add_instance+0xd7/0x2e0 kernel/cpu.c:2459
       cpuhp_state_add_instance include/linux/cpuhotplug.h:387 [inline]
       zswap_pool_create+0x59a/0x7b0 mm/zswap.c:291
       __zswap_pool_create_fallback mm/zswap.c:359 [inline]
       zswap_setup+0x402/0x810 mm/zswap.c:1814
       zswap_init+0x2c/0x40 mm/zswap.c:1850
       do_one_initcall+0x128/0x700 init/main.c:1257
       do_initcall_level init/main.c:1319 [inline]
       do_initcalls init/main.c:1335 [inline]
       do_basic_setup init/main.c:1354 [inline]
       kernel_init_freeable+0x5c7/0x900 init/main.c:1568
       kernel_init+0x1c/0x2b0 init/main.c:1457
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 (scomp_lock){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3163 [inline]
       check_prevs_add kernel/locking/lockdep.c:3282 [inline]
       validate_chain kernel/locking/lockdep.c:3906 [inline]
       __lock_acquire+0x249e/0x3c40 kernel/locking/lockdep.c:5228
       lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5851
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19b/0xb10 kernel/locking/mutex.c:730
       crypto_exit_scomp_ops_async+0x42/0x90 crypto/scompress.c:201
       crypto_exit_ops crypto/api.c:370 [inline]
       crypto_destroy_tfm crypto/api.c:681 [inline]
       crypto_destroy_tfm+0x135/0x2b0 crypto/api.c:668
       crypto_free_acomp include/crypto/acompress.h:159 [inline]
       zswap_cpu_comp_dead+0x169/0x210 mm/zswap.c:891
       cpuhp_invoke_callback+0x528/0xa10 kernel/cpu.c:216
       __cpuhp_invoke_callback_range+0x101/0x200 kernel/cpu.c:966
       cpuhp_invoke_callback_range kernel/cpu.c:990 [inline]
       cpuhp_down_callbacks kernel/cpu.c:1383 [inline]
       _cpu_down+0x422/0xf20 kernel/cpu.c:1444
       __cpu_down_maps_locked+0x6c/0x90 kernel/cpu.c:1474
       work_for_cpu_fn+0x52/0xa0 kernel/workqueue.c:6731
       process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3236
       process_scheduled_works kernel/workqueue.c:3317 [inline]
       worker_thread+0x6c8/0xf00 kernel/workqueue.c:3398
       kthread+0x3af/0x750 kernel/kthread.c:464
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
  scomp_lock --> fs_reclaim --> &per_cpu_ptr(pool->acomp_ctx, cpu)->mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex);
                               lock(fs_reclaim);
                               lock(&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex);
  lock(scomp_lock);

 *** DEADLOCK ***

4 locks held by kworker/1:1/46:
 #0: ffff88801b080d48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3211
 #1: ffffc90000b67d18 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3212
 #2: ffffffff8e060bd0 (cpu_hotplug_lock){++++}-{0:0}, at: cpus_write_lock kernel/cpu.c:508 [inline]
 #2: ffffffff8e060bd0 (cpu_hotplug_lock){++++}-{0:0}, at: _cpu_down+0xe5/0xf20 kernel/cpu.c:1412
 #3: ffffe8ffffc21250 (&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex){+.+.}-{4:4}, at: zswap_cpu_comp_dead+0x80/0x210 mm/zswap.c:885

stack backtrace:
CPU: 1 UID: 0 PID: 46 Comm: kworker/1:1 Not tainted 6.14.0-rc3-syzkaller-00096-ge9a8cac0bf89 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Workqueue: events work_for_cpu_fn
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_circular_bug+0x490/0x760 kernel/locking/lockdep.c:2076
 check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2208
 check_prev_add kernel/locking/lockdep.c:3163 [inline]
 check_prevs_add kernel/locking/lockdep.c:3282 [inline]
 validate_chain kernel/locking/lockdep.c:3906 [inline]
 __lock_acquire+0x249e/0x3c40 kernel/locking/lockdep.c:5228
 lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5851
 __mutex_lock_common kernel/locking/mutex.c:585 [inline]
 __mutex_lock+0x19b/0xb10 kernel/locking/mutex.c:730
 crypto_exit_scomp_ops_async+0x42/0x90 crypto/scompress.c:201
 crypto_exit_ops crypto/api.c:370 [inline]
 crypto_destroy_tfm crypto/api.c:681 [inline]
 crypto_destroy_tfm+0x135/0x2b0 crypto/api.c:668
 crypto_free_acomp include/crypto/acompress.h:159 [inline]
 zswap_cpu_comp_dead+0x169/0x210 mm/zswap.c:891
 cpuhp_invoke_callback+0x528/0xa10 kernel/cpu.c:216
 __cpuhp_invoke_callback_range+0x101/0x200 kernel/cpu.c:966
 cpuhp_invoke_callback_range kernel/cpu.c:990 [inline]
 cpuhp_down_callbacks kernel/cpu.c:1383 [inline]
 _cpu_down+0x422/0xf20 kernel/cpu.c:1444
 __cpu_down_maps_locked+0x6c/0x90 kernel/cpu.c:1474
 work_for_cpu_fn+0x52/0xa0 kernel/workqueue.c:6731
 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3236
 process_scheduled_works kernel/workqueue.c:3317 [inline]
 worker_thread+0x6c8/0xf00 kernel/workqueue.c:3398
 kthread+0x3af/0x750 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 9+ messages in thread

* mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-24 21:53 [syzbot] [crypto?] possible deadlock in crypto_exit_scomp_ops_async syzbot
@ 2025-02-25  8:53 ` Herbert Xu
  2025-02-25 10:49   ` Hillf Danton
  2025-02-25 13:43   ` Yosry Ahmed
  0 siblings, 2 replies; 9+ messages in thread
From: Herbert Xu @ 2025-02-25  8:53 UTC (permalink / raw)
  To: syzbot
  Cc: davem, linux-crypto, linux-kernel, syzkaller-bugs, Andrew Morton,
	Yosry Ahmed, linux-mm

On Mon, Feb 24, 2025 at 01:53:21PM -0800, syzbot wrote:
> 
> syzbot found the following issue on:
> 
> HEAD commit:    e9a8cac0bf89 Merge tag 'v6.14-rc3-smb3-client-fixes' of gi..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=17b667f8580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61cbf5ac8a063ad4
> dashboard link: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/8441f1b50402/disk-e9a8cac0.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/65b1f8d2f790/vmlinux-e9a8cac0.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1d6f6d8c3d6b/bzImage-e9a8cac0.xz

---8<---
Call crypto_free_acomp outside of the mutex in zswap_cpu_comp_dead
as otherwise this could dead-lock as the allocation path may lead
back into zswap while holding the same lock.  Zap the pointers to
acomp and buffer after freeing.

Also move the NULL check on acomp_ctx so that it takes place before
the mutex dereference.

Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/mm/zswap.c b/mm/zswap.c
index 6504174fbc6a..24d36266a791 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -881,18 +881,23 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
+	struct crypto_acomp *acomp = NULL;
+
+	if (IS_ERR_OR_NULL(acomp_ctx))
+		return 0;
 
 	mutex_lock(&acomp_ctx->mutex);
-	if (!IS_ERR_OR_NULL(acomp_ctx)) {
-		if (!IS_ERR_OR_NULL(acomp_ctx->req))
-			acomp_request_free(acomp_ctx->req);
-		acomp_ctx->req = NULL;
-		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
-			crypto_free_acomp(acomp_ctx->acomp);
-		kfree(acomp_ctx->buffer);
-	}
+	if (!IS_ERR_OR_NULL(acomp_ctx->req))
+		acomp_request_free(acomp_ctx->req);
+	acomp_ctx->req = NULL;
+	acomp = acomp_ctx->acomp;
+	acomp_ctx->acomp = NULL;
+	kfree(acomp_ctx->buffer);
+	acomp_ctx->buffer = NULL;
 	mutex_unlock(&acomp_ctx->mutex);
 
+	crypto_free_acomp(acomp);
+
 	return 0;
 }
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-25  8:53 ` mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead Herbert Xu
@ 2025-02-25 10:49   ` Hillf Danton
  2025-02-25 13:43   ` Yosry Ahmed
  1 sibling, 0 replies; 9+ messages in thread
From: Hillf Danton @ 2025-02-25 10:49 UTC (permalink / raw)
  To: Herbert Xu
  Cc: syzbot, linux-crypto, linux-kernel, syzkaller-bugs, Yosry Ahmed,
	linux-mm

On Tue, 25 Feb 2025 16:53:58 +0800 Herbert Xu wrote:
> On Mon, Feb 24, 2025 at 01:53:21PM -0800, syzbot wrote:
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    e9a8cac0bf89 Merge tag 'v6.14-rc3-smb3-client-fixes' of gi..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=17b667f8580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=61cbf5ac8a063ad4
> > dashboard link: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82
> > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > 
> > Unfortunately, I don't have any reproducer for this issue yet.
> > 
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/8441f1b50402/disk-e9a8cac0.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/65b1f8d2f790/vmlinux-e9a8cac0.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/1d6f6d8c3d6b/bzImage-e9a8cac0.xz
> 
> ---8<---
> Call crypto_free_acomp outside of the mutex in zswap_cpu_comp_dead
> as otherwise this could dead-lock as the allocation path may lead
> back into zswap while holding the same lock.  Zap the pointers to
> acomp and buffer after freeing.
> 
> Also move the NULL check on acomp_ctx so that it takes place before
> the mutex dereference.
> 
> Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
> Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 6504174fbc6a..24d36266a791 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -881,18 +881,23 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
>  	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
> +	struct crypto_acomp *acomp = NULL;
> +
> +	if (IS_ERR_OR_NULL(acomp_ctx))
> +		return 0;
>  
>  	mutex_lock(&acomp_ctx->mutex);
> -	if (!IS_ERR_OR_NULL(acomp_ctx)) {
> -		if (!IS_ERR_OR_NULL(acomp_ctx->req))
> -			acomp_request_free(acomp_ctx->req);
> -		acomp_ctx->req = NULL;
> -		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
> -			crypto_free_acomp(acomp_ctx->acomp);
> -		kfree(acomp_ctx->buffer);
> -	}
> +	if (!IS_ERR_OR_NULL(acomp_ctx->req))
> +		acomp_request_free(acomp_ctx->req);
> +	acomp_ctx->req = NULL;
> +	acomp = acomp_ctx->acomp;
> +	acomp_ctx->acomp = NULL;
> +	kfree(acomp_ctx->buffer);
> +	acomp_ctx->buffer = NULL;
>  	mutex_unlock(&acomp_ctx->mutex);
>  
> +	crypto_free_acomp(acomp);
> +
>  	return 0;
>  }
>  
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 
[snippet of the syz report]

>>-> #1 (fs_reclaim){+.+.}-{0:0}:
>>       __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
>>       fs_reclaim_acquire+0x102/0x150 mm/page_alloc.c:3867
>>       might_alloc include/linux/sched/mm.h:318 [inline]
>>       slab_pre_alloc_hook mm/slub.c:4066 [inline]
>>       slab_alloc_node mm/slub.c:4144 [inline]
>>       __kmalloc_cache_node_noprof+0x54/0x420 mm/slub.c:4333
>>       kmalloc_node_noprof include/linux/slab.h:924 [inline]
>>       __get_vm_area_node+0x101/0x2f0 mm/vmalloc.c:3127
>>       __vmalloc_node_range_noprof+0x26a/0x1530 mm/vmalloc.c:3806
>>       __vmalloc_node_noprof mm/vmalloc.c:3911 [inline]
>>       vmalloc_node_noprof+0x6f/0x90 mm/vmalloc.c:4022
>>       crypto_scomp_alloc_scratches crypto/scompress.c:86 [inline]
>>       crypto_scomp_init_tfm+0x122/0x270 crypto/scompress.c:107
>>       crypto_create_tfm_node+0x100/0x320 crypto/api.c:539
>>       crypto_create_tfm crypto/internal.h:120 [inline]
>>       crypto_init_scomp_ops_async+0x5d/0x1d0 crypto/scompress.c:217
>>       crypto_acomp_init_tfm+0x240/0x2e0 crypto/acompress.c:70
>>       crypto_create_tfm_node+0x100/0x320 crypto/api.c:539
>>       crypto_alloc_tfm_node+0x102/0x260 crypto/api.c:640
>>       zswap_cpu_comp_prepare+0xe2/0x420 mm/zswap.c:834
>>       cpuhp_invoke_callback+0x20c/0xa10 kernel/cpu.c:204
>>       cpuhp_issue_call+0x1c0/0x980 kernel/cpu.c:2376
>>       __cpuhp_state_add_instance_cpuslocked+0x1a4/0x3c0 kernel/cpu.c:2438
>>       __cpuhp_state_add_instance+0xd7/0x2e0 kernel/cpu.c:2459
>>       cpuhp_state_add_instance include/linux/cpuhotplug.h:387 [inline]
>>       zswap_pool_create+0x59a/0x7b0 mm/zswap.c:291

Given mutex_init() [1], nobody should take the lock, so the report
sounds false positive.

[1] https://elixir.bootlin.com/linux/v6.14-rc3/source/mm/zswap.c#L289

>>       __zswap_pool_create_fallback mm/zswap.c:359 [inline]
>>       zswap_setup+0x402/0x810 mm/zswap.c:1814
>>       zswap_init+0x2c/0x40 mm/zswap.c:1850
>>       do_one_initcall+0x128/0x700 init/main.c:1257
>>       do_initcall_level init/main.c:1319 [inline]
>>       do_initcalls init/main.c:1335 [inline]
>>       do_basic_setup init/main.c:1354 [inline]
>>       kernel_init_freeable+0x5c7/0x900 init/main.c:1568
>>       kernel_init+0x1c/0x2b0 init/main.c:1457
>>       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
>>       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>
>>-> #0 (scomp_lock){+.+.}-{4:4}:

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-25  8:53 ` mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead Herbert Xu
  2025-02-25 10:49   ` Hillf Danton
@ 2025-02-25 13:43   ` Yosry Ahmed
  2025-02-26  1:25     ` Herbert Xu
  1 sibling, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2025-02-25 13:43 UTC (permalink / raw)
  To: Herbert Xu, syzbot
  Cc: davem, linux-crypto, linux-kernel, syzkaller-bugs, Andrew Morton,
	linux-mm

February 25, 2025 at 12:53 AM, "Herbert Xu" <herbert@gondor.apana.org.au> wrote:
> 
> On Mon, Feb 24, 2025 at 01:53:21PM -0800, syzbot wrote:
> 
> > syzbot found the following issue on:
> > 
> >  HEAD commit: e9a8cac0bf89 Merge tag 'v6.14-rc3-smb3-client-fixes' of gi..
> >  git tree: upstream
> >  console output: https://syzkaller.appspot.com/x/log.txt?x=17b667f8580000
> >  kernel config: https://syzkaller.appspot.com/x/.config?x=61cbf5ac8a063ad4
> >  dashboard link: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82
> > 
> >  compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 
> > 
> >  Unfortunately, I don't have any reproducer for this issue yet.
> >  
> >  Downloadable assets:
> >  disk image: https://storage.googleapis.com/syzbot-assets/8441f1b50402/disk-e9a8cac0.raw.xz
> >  vmlinux: https://storage.googleapis.com/syzbot-assets/65b1f8d2f790/vmlinux-e9a8cac0.xz 
> >  kernel image: https://storage.googleapis.com/syzbot-assets/1d6f6d8c3d6b/bzImage-e9a8cac0.xz
> > 
> 
> ---8<---
> 
> Call crypto_free_acomp outside of the mutex in zswap_cpu_comp_dead
> as otherwise this could dead-lock as the allocation path may lead
> back into zswap while holding the same lock. Zap the pointers to
> acomp and buffer after freeing.
> Also move the NULL check on acomp_ctx so that it takes place before
> the mutex dereference.
> 
> Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
> Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-25 13:43   ` Yosry Ahmed
@ 2025-02-26  1:25     ` Herbert Xu
  2025-02-26  2:08       ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2025-02-26  1:25 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: syzbot, davem, linux-crypto, linux-kernel, syzkaller-bugs,
	Andrew Morton, linux-mm

On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote:
>
> Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path?

crypto_free_acomp does not allocate memory.  However, it takes
the same mutex that is also taken on the allocation path.

The specific call path can be seen in the original report:

https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-26  1:25     ` Herbert Xu
@ 2025-02-26  2:08       ` Yosry Ahmed
  2025-02-26  2:10         ` Herbert Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2025-02-26  2:08 UTC (permalink / raw)
  To: Herbert Xu
  Cc: syzbot, davem, linux-crypto, linux-kernel, syzkaller-bugs,
	Andrew Morton, linux-mm

On Wed, Feb 26, 2025 at 09:25:23AM +0800, Herbert Xu wrote:
> On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote:
> >
> > Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path?
> 
> crypto_free_acomp does not allocate memory.  However, it takes
> the same mutex that is also taken on the allocation path.
> 
> The specific call path can be seen in the original report:
> 
> https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82

After staring at this for a while I think the following situation could
be the problem:

Task A running on CPU #1:
crypto_alloc_acomp_node()
  Holds scomp_lock
  Enters reclaim
  Reads per_cpu_ptr(pool->acomp_ctx, cpu)

Task A is descheduled

zswap_cpu_comp_dead(CPU #1) // CPU #1 going offline
  Holds per_cpu_ptr(pool->acomp_ctx, cpu))
  Calls crypto_free_acomp()
    Waits for scomp_lock

Task A running on CPU #2:
  Waits for per_cpu_ptr(pool->acomp_ctx, cpu)
  DEADLOCK
  
In this case I think the fix is correct, thanks for looking into it.

Could you please:

(1) Explain the exact scenario in the commit log, I did not understand
it at first, only after looking at the syzbot dashboard for a while (and
I am not sure how long this persists).

(2) Move all the freeing operations outside the mutex? Right now
crypto_free_acomp() was the problematic call but it could be 
acomp_request_free() next.

Something like:

static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
        struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
        struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
        struct struct acomp_req *req;
        struct crypto_acomp *acomp;
        u8 *buffer;

        if (IS_ERR_OR_NULL(acomp_ctx))
                return 0;

        mutex_lock(&acomp_ctx->mutex);
        req = acomp_ctx->req;
        acomp_ctx->req = NULL;
        acomp = acomp_ctx->acomp;
        acomp_ctx->acomp = NULL;
        buffer = acomp_ctx->buffer;
        acomp_ctx->buffer = NULL;
        mutex_unlock(&acomp_ctx->mutex);

        /*
         * Do the actual freeing after releasing the mutex to avoid subtle
         * locking dependencies causing deadlocks
         */
        if (!IS_ERR_OR_NULL(req))
                acomp_request_free(req);
        if (!IS_ERR_OR_NULL(acomp))
                crypto_free_acomp(acomp);
        kfree(acomp_ctx->buffer);

        return 0;
}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-26  2:08       ` Yosry Ahmed
@ 2025-02-26  2:10         ` Herbert Xu
  2025-02-26  2:46           ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2025-02-26  2:10 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: syzbot, davem, linux-crypto, linux-kernel, syzkaller-bugs,
	Andrew Morton, linux-mm

On Wed, Feb 26, 2025 at 02:08:14AM +0000, Yosry Ahmed wrote:
>
> Could you please:
> 
> (1) Explain the exact scenario in the commit log, I did not understand
> it at first, only after looking at the syzbot dashboard for a while (and
> I am not sure how long this persists).
> 
> (2) Move all the freeing operations outside the mutex? Right now
> crypto_free_acomp() was the problematic call but it could be 
> acomp_request_free() next.
> 
> Something like:

Looks good to me.  Feel free to send this patch since it is your
system after all :)

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-26  2:10         ` Herbert Xu
@ 2025-02-26  2:46           ` Yosry Ahmed
  2025-02-26  3:08             ` Herbert Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2025-02-26  2:46 UTC (permalink / raw)
  To: Herbert Xu
  Cc: syzbot, davem, linux-crypto, linux-kernel, syzkaller-bugs,
	Andrew Morton, linux-mm

On Wed, Feb 26, 2025 at 10:10:22AM +0800, Herbert Xu wrote:
> On Wed, Feb 26, 2025 at 02:08:14AM +0000, Yosry Ahmed wrote:
> >
> > Could you please:
> > 
> > (1) Explain the exact scenario in the commit log, I did not understand
> > it at first, only after looking at the syzbot dashboard for a while (and
> > I am not sure how long this persists).
> > 
> > (2) Move all the freeing operations outside the mutex? Right now
> > crypto_free_acomp() was the problematic call but it could be 
> > acomp_request_free() next.
> > 
> > Something like:
> 
> Looks good to me.  Feel free to send this patch since it is your
> system after all :)

Can do :) May I add your Co-developed-by and Signed-off-by since this
would be based off your patch?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead
  2025-02-26  2:46           ` Yosry Ahmed
@ 2025-02-26  3:08             ` Herbert Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Herbert Xu @ 2025-02-26  3:08 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: syzbot, davem, linux-crypto, linux-kernel, syzkaller-bugs,
	Andrew Morton, linux-mm

On Wed, Feb 26, 2025 at 02:46:53AM +0000, Yosry Ahmed wrote:
>
> Can do :) May I add your Co-developed-by and Signed-off-by since this
> would be based off your patch?

Sure you can also add my ack:

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-02-26  3:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-24 21:53 [syzbot] [crypto?] possible deadlock in crypto_exit_scomp_ops_async syzbot
2025-02-25  8:53 ` mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead Herbert Xu
2025-02-25 10:49   ` Hillf Danton
2025-02-25 13:43   ` Yosry Ahmed
2025-02-26  1:25     ` Herbert Xu
2025-02-26  2:08       ` Yosry Ahmed
2025-02-26  2:10         ` Herbert Xu
2025-02-26  2:46           ` Yosry Ahmed
2025-02-26  3:08             ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).