From: Chris Bainbridge <chris.bainbridge@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: surenb@google.com, bsegall@google.com, dietmar.eggemann@arm.com,
mingo@redhat.com, hannes@cmpxchg.org, juri.lelli@redhat.com,
mgorman@suse.de, peterz@infradead.org, rostedt@goodmis.org,
vschneid@redhat.com, vincent.guittot@linaro.org,
regressions@lists.linux.dev
Subject: [REGRESSION] intermittent psi_avgs_work soft lockup
Date: Sun, 3 Aug 2025 23:14:42 +0100
Message-ID: <aI_fUhpBrIBrJ073@debian.local>
Hello,
I'm getting intermittent soft lockups with recent kernel builds. This is
a new error that I haven't seen before.
An example lockup from 6.16.0-08685-g260f6f4fda93:
[39389.154516] iwlwifi 0000:01:00.0: Queue 3 is stuck 4977 5129
[39400.400429] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:1751316]
[39400.400433] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat x_tables nf_tables br_netfilter bridge stp llc ccm overlay qrtr rfcomm cmac algif_hash algif_skcipher af_alg bnep binfmt_misc ext4 mbcache jbd2 nls_ascii nls_cp437 vfat fat snd_hda_codec_generic snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common iwlmvm snd_hda_intel snd_acp3x_pdm_dma snd_soc_dmic snd_acp3x_rn kvm_amd snd_hda_codec uvcvideo snd_soc_core mac80211 snd_usb_audio btusb snd_intel_dspcfg snd_compress videobuf2_vmalloc snd_usbmidi_lib btrtl libarc4 videobuf2_memops kvm snd_rawmidi snd_hwdep snd_pci_acp6x btintel uvc snd_seq_device snd_hda_core snd_pci_acp5x btbcm videobuf2_v4l2 irqbypass snd_pcm btmtk iwlwifi snd_rn_pci_acp3x sg videodev rapl snd_timer videobuf2_common wmi_bmof ee1004 snd_acp_config pcspkr bluetooth cfg80211 snd_soc_acpi k10temp snd mc snd_pci_acp3x soundcore ccp rfkill ac
[39400.400478] battery acpi_tad amd_pmc joydev evdev msr parport_pc ppdev lp parport efi_pstore fuse nvme_fabrics configfs nfnetlink efivarfs autofs4 crc32c_cryptoapi btrfs blake2b_generic xor raid6_pq hid_microsoft ff_memless hid_cmedia r8153_ecm cdc_ether usbnet r8152 mii libphy mdio_bus usbhid dm_crypt dm_mod sd_mod uas usb_storage scsi_mod scsi_common amdgpu drm_client_lib i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks drm_exec drm_suballoc_helper amdxcp drm_buddy gpu_sched hid_multitouch drm_display_helper ucsi_acpi hid_generic drm_kms_helper typec_ucsi sp5100_tco roles xhci_pci cec i2c_hid_acpi watchdog typec xhci_hcd amd_sfh i2c_hid rc_core nvme i2c_piix4 thunderbolt video usbcore ghash_clmulni_intel serio_raw hid crc16 nvme_core fan i2c_smbus usb_common button wmi drm aesni_intel
[39400.400514] irq event stamp: 28884
[39400.400515] hardirqs last enabled at (28883): [<ffffffffb6200dc6>] asm_sysvec_apic_timer_interrupt+0x16/0x20
[39400.400521] hardirqs last disabled at (28884): [<ffffffffb71185fa>] sysvec_apic_timer_interrupt+0xa/0xc0
[39400.400526] softirqs last enabled at (28882): [<ffffffffb64f934d>] __irq_exit_rcu+0xcd/0x140
[39400.400530] softirqs last disabled at (28877): [<ffffffffb64f934d>] __irq_exit_rcu+0xcd/0x140
[39400.400533] CPU: 2 UID: 0 PID: 1751316 Comm: kworker/2:1 Not tainted 6.16.0-08685-g260f6f4fda93 #489 PREEMPT(voluntary)
[39400.400535] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
[39400.400537] Workqueue: events psi_avgs_work
[39400.400541] RIP: 0010:collect_percpu_times+0x2d5/0x440
[39400.400543] Code: 00 00 00 00 00 41 8b 0c 94 48 0f af c8 48 01 4c d5 00 48 83 c2 01 48 83 fa 06 75 e9 8d 53 01 e9 aa fd ff ff f3 90 48 8b 3c 24 <48> 8b 14 fd 20 d0 6d b7 48 01 c2 8b 12 f6 c2 01 0f 84 ab fe ff ff
[39400.400545] RSP: 0018:ffffc06b07823cf8 EFLAGS: 00000202
[39400.400546] RAX: ffffffffb82abc80 RBX: ffffe06aff48f440 RCX: 0000000000000006
[39400.400548] RDX: 00000000000014b7 RSI: ffffffffb76b7293 RDI: 000000000000000d
[39400.400548] RBP: ffffc06b07823d70 R08: 0000000000000001 R09: 0000000000000000
[39400.400549] R10: 0000000000000001 R11: 0000000000000003 R12: ffffc06b07823d50
[39400.400550] R13: ffffe06aff48f454 R14: 000000000000000d R15: ffffffffb82abc80
[39400.400551] FS: 0000000000000000(0000) GS:ffff9d9f4e072000(0000) knlGS:0000000000000000
[39400.400552] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39400.400553] CR2: 00000c2100382000 CR3: 0000000387c3b000 CR4: 0000000000750ef0
[39400.400554] PKRU: 55555554
[39400.400555] Call Trace:
[39400.400557] <TASK>
[39400.400571] psi_avgs_work+0x56/0xe0
[39400.400576] process_one_work+0x22b/0x5b0
[39400.400588] worker_thread+0x1d6/0x3c0
[39400.400592] ? bh_worker+0x260/0x260
[39400.400594] kthread+0x115/0x260
[39400.400599] ? kthreads_online_cpu+0x120/0x120
[39400.400603] ret_from_fork+0x231/0x2a0
[39400.400606] ? kthreads_online_cpu+0x120/0x120
[39400.400610] ret_from_fork_asm+0x11/0x20
[39400.400621] </TASK>
[39400.404429] watchdog: BUG: soft lockup - CPU#4 stuck for 21s! [kworker/4:0:1751752]
It appears to happen randomly when I have been away from the laptop for
some time and return, or sometimes if I leave it overnight. It also
seems to occur on roughly 2% of system boots. Bisecting with such a low
failure probability takes a long time. I haven't identified the bad
commit yet, but I think I have narrowed it down to between v6.16-rc6
(good) and v6.16-rc6-79-g44e4e0297c3c (bad). At this rate, I should have
a more exact bisect result within a week.
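To put a rough number on why this is slow, here is a minimal sketch,
assuming each boot hits the lockup independently with probability 0.02
(the 2% figure is only an estimate). It computes how many consecutive
clean boots are needed before a commit can be called good with a given
confidence. The function name boots_needed is hypothetical and is not
part of any kernel or git tooling.

```python
import math


def boots_needed(p_fail: float, confidence: float) -> int:
    """Return the smallest n such that, if the bug were present and each
    boot failed independently with probability p_fail, seeing n clean
    boots in a row would happen with probability <= 1 - confidence.

    Assumption: boots are independent trials, which may not hold in
    practice (the lockup may depend on uptime or idle time).
    """
    if not (0.0 < p_fail < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p_fail and confidence must be strictly between 0 and 1")
    # Solve (1 - p_fail)**n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for conf in (0.90, 0.95, 0.99):
        print(f"confidence {conf:.2f}: {boots_needed(0.02, conf)} clean boots")
```

Under these assumptions, boots_needed(0.02, 0.95) comes to 149 clean
boots per "good" verdict, and every remaining bisect step pays that cost
again whenever the commit under test really is good.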
#regzbot introduced: v6.16-rc6..v6.16-rc6-79-g44e4e0297c3c