From: "Lai, Yi" <yi1.lai@linux.intel.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Barry Song <baohua@kernel.org>,
	Chris Li <chrisl@kernel.org>, Nhat Pham <nphamcs@gmail.com>,
	Yosry Ahmed <yosry.ahmed@linux.dev>,
	David Hildenbrand <david@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Youngjun Park <youngjun.park@lge.com>,
	Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>,
	linux-pm@vger.kernel.org,
	"Rafael J. Wysocki (Intel)" <rafael@kernel.org>
Subject: Re: [PATCH v5 14/19] mm, swap: cleanup swap entry management workflow
Date: Wed, 14 Jan 2026 21:28:00 +0800
Message-ID: <aWeZ4LmfwiS9iwYF@ly-workstation>
In-Reply-To: <20251220-swap-table-p2-v5-14-8862a265a033@tencent.com>

Hi Kairui Song,

Greetings!

I used Syzkaller and found a possible deadlock in swap_free_hibernation_slot in linux-next next-20260113.

After bisection, the first bad commit is:
"
33be6f68989d mm, swap: cleanup swap entry management workflow
"

All detailed info can be found at:
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot
Syzkaller repro code:
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot/repro.c
Syzkaller repro syscall steps:
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot/repro.prog
Syzkaller report:
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot/repro.report
Kconfig(make olddefconfig):
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot/kconfig_origin
Bisect info:
https://github.com/laifryiee/syzkaller_logs/tree/main/260114_102849_swap_free_hibernation_slot/bisect_info.log
bzImage:
https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/260114_102849_swap_free_hibernation_slot/bzImage_0f853ca2a798ead9d24d39cad99b0966815c582a
Issue dmesg:
https://github.com/laifryiee/syzkaller_logs/blob/main/260114_102849_swap_free_hibernation_slot/0f853ca2a798ead9d24d39cad99b0966815c582a_dmesg.log

"
[   62.477554] ============================================
[   62.477802] WARNING: possible recursive locking detected
[   62.478059] 6.19.0-rc5-next-20260113-0f853ca2a798 #1 Not tainted
[   62.478324] --------------------------------------------
[   62.478549] repro/668 is trying to acquire lock:
[   62.478759] ffff888011664018 (&cluster_info[i].lock){+.+.}-{3:3}, at: swap_free_hibernation_slot+0x13e/0x2a0
[   62.479271]
[   62.479271] but task is already holding lock:
[   62.479519] ffff888011664018 (&cluster_info[i].lock){+.+.}-{3:3}, at: swap_free_hibernation_slot+0xfa/0x2a0
[   62.479984]
[   62.479984] other info that might help us debug this:
[   62.480293]  Possible unsafe locking scenario:
[   62.480293]
[   62.480565]        CPU0
[   62.480686]        ----
[   62.480809]   lock(&cluster_info[i].lock);
[   62.481010]   lock(&cluster_info[i].lock);
[   62.481205]
[   62.481205]  *** DEADLOCK ***
[   62.481205]
[   62.481481]  May be due to missing lock nesting notation
[   62.481481]
[   62.481802] 2 locks held by repro/668:
[   62.481981]  #0: ffffffff87542e28 (system_transition_mutex){+.+.}-{4:4}, at: lock_system_sleep+0x92/0xb0
[   62.482439]  #1: ffff888011664018 (&cluster_info[i].lock){+.+.}-{3:3}, at: swap_free_hibernation_slot+0xfa/0x0
[   62.482936]
[   62.482936] stack backtrace:
[   62.483131] CPU: 0 UID: 0 PID: 668 Comm: repro Not tainted 6.19.0-rc5-next-20260113-0f853ca2a798 #1 PREEMPT(l
[   62.483143] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.q4
[   62.483151] Call Trace:
[   62.483156]  <TASK>
[   62.483160]  dump_stack_lvl+0xea/0x150
[   62.483195]  dump_stack+0x19/0x20
[   62.483206]  print_deadlock_bug+0x22e/0x300
[   62.483215]  __lock_acquire+0x1325/0x2210
[   62.483226]  lock_acquire+0x170/0x2f0
[   62.483234]  ? swap_free_hibernation_slot+0x13e/0x2a0
[   62.483249]  _raw_spin_lock+0x38/0x50
[   62.483267]  ? swap_free_hibernation_slot+0x13e/0x2a0
[   62.483279]  swap_free_hibernation_slot+0x13e/0x2a0
[   62.483291]  ? __pfx_swap_free_hibernation_slot+0x10/0x10
[   62.483303]  ? locks_remove_file+0xe2/0x7f0
[   62.483322]  ? __pfx_snapshot_release+0x10/0x10
[   62.483331]  free_all_swap_pages+0xdd/0x160
[   62.483339]  ? __pfx_snapshot_release+0x10/0x10
[   62.483346]  snapshot_release+0xac/0x200
[   62.483353]  __fput+0x41f/0xb70
[   62.483369]  ____fput+0x22/0x30
[   62.483376]  task_work_run+0x19e/0x2b0
[   62.483391]  ? __pfx_task_work_run+0x10/0x10
[   62.483398]  ? nsproxy_free+0x2da/0x5b0
[   62.483410]  ? switch_task_namespaces+0x118/0x130
[   62.483421]  do_exit+0x869/0x2810
[   62.483435]  ? do_group_exit+0x1d8/0x2c0
[   62.483445]  ? __pfx_do_exit+0x10/0x10
[   62.483451]  ? __this_cpu_preempt_check+0x21/0x30
[   62.483463]  ? _raw_spin_unlock_irq+0x2c/0x60
[   62.483474]  ? lockdep_hardirqs_on+0x85/0x110
[   62.483486]  ? _raw_spin_unlock_irq+0x2c/0x60
[   62.483498]  ? trace_hardirqs_on+0x26/0x130
[   62.483516]  do_group_exit+0xe4/0x2c0
[   62.483524]  __x64_sys_exit_group+0x4d/0x60
[   62.483531]  x64_sys_call+0x21a2/0x21b0
[   62.483544]  do_syscall_64+0x6d/0x1180
[   62.483560]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   62.483584] RIP: 0033:0x7fe84fb18a4d
[   62.483595] Code: Unable to access opcode bytes at 0x7fe84fb18a23.
[   62.483602] RSP: 002b:00007fff3e35c928 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[   62.483610] RAX: ffffffffffffffda RBX: 00007fe84fbf69e0 RCX: 00007fe84fb18a4d
[   62.483615] RDX: 00000000000000e7 RSI: ffffffffffffff80 RDI: 0000000000000001
[   62.483620] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000020
[   62.483624] R10: 00007fff3e35c7d0 R11: 0000000000000246 R12: 00007fe84fbf69e0
[   62.483629] R13: 00007fe84fbfbf00 R14: 0000000000000001 R15: 00007fe84fbfbee8
[   62.483640]  </TASK>
"
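
For readers less familiar with this kind of report: lockdep says the task
tried to take &cluster_info[i].lock at swap_free_hibernation_slot+0x13e while
it was already holding a lock of the same class, at the same address
(ffff888011664018), taken at swap_free_hibernation_slot+0xfa. The sketch below
is only a userspace analogy of that pattern and is not the kernel code. It
assumes an error-checking pthread mutex as a stand-in for the cluster
spinlock, and the function name relock_same_mutex() is made up for
illustration. A kernel spinlock taken twice on the same CPU would spin
forever, whereas the error-checking mutex reports the second attempt.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/*
 * Userspace analogy only (hypothetical helper, not kernel code).
 * Locks an error-checking mutex, then tries to lock it again from the
 * same thread and returns the result of that second attempt.
 */
int relock_same_mutex(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int second;

	pthread_mutexattr_init(&attr);
	/* Error-checking type: a relock by the owner fails instead of hanging. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&lock);          /* first acquisition succeeds */
	second = pthread_mutex_lock(&lock); /* same thread, same lock again */

	/* Only the first acquisition took effect, so unlock once. */
	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return second;
}

#ifdef SKETCH_MAIN
int main(void)
{
	int ret = relock_same_mutex();

	/* POSIX specifies EDEADLK for an owner relock of this mutex type. */
	assert(ret == EDEADLK);
	printf("second lock attempt returned %d\n", ret);
	return 0;
}
#endif
```

To try it standalone, build with something like
"gcc -pthread -DSKETCH_MAIN sketch.c" (the file name is arbitrary).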

Hope this could be insightful to you.

Regards,
Yi Lai

---

If you don't need the following environment to reproduce the problem, or if you
already have a reproduction environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // requires qemu-system-x86_64; I used v7.1.0
  // start3.sh loads bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 (a v6.2-rc5 kernel)
  // You can change the bzImage_xxx as needed
  // You may need to remove the line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for a different qemu version
You can log in with the command below; there is no password for root.
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, you can transfer the
reproducer binary to the VM as follows and reproduce the problem in the VM:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for the target kernel:
Use the target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           // x should be equal to or less than the number of CPUs your PC has

Put the bzImage file name into start3.sh above to load the target kernel in the VM.


Tips:
If you already have qemu-system-x86_64, please ignore the info below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install 


