From: Sean Christopherson <seanjc@google.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>, x86-ml <x86@kernel.org>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] locking/urgent for v6.17-rc1
Date: Fri, 22 Aug 2025 17:28:02 -0700
Message-ID: <aKkLEtoDXKxAAWju@google.com>
In-Reply-To: <20250822141654.Sjoffo8F@linutronix.de>
On Fri, Aug 22, 2025, Sebastian Andrzej Siewior wrote:
> On 2025-08-21 12:45:52 [-0700], Sean Christopherson wrote:
> > Piggybacking the futex private hashing attention, the new fanciness is causing
> > crashes in my testing. The crashes are 100% reproducible, but my reproducer is
> > simply running a variety of tests in parallel, i.e. isn't very debug-friendly,
> > and the code itself is black magic to me, so all I've done is bisect.
> >
> > I reported the issue on the original thread, but haven't seen any follow-up.
> >
> > https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
>
> I somehow missed it. Can you try rc2 with the patch I just sent?
No dice, fails with the same signature.
I got a trimmed-down reproducer. Load KVM, run this in the background (in a loop)
to constantly trigger try_to_wake_up() on relevant tasks (needs to be run as root):
  echo Y > /sys/module/kvm/parameters/nx_huge_pages
  echo N > /sys/module/kvm/parameters/nx_huge_pages
  sleep .2
and then run the hardware_disable_test KVM selftest (from
tools/testing/selftests/kvm/hardware_disable_test.c).
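For anyone wanting to script the background loop described above, here is a minimal sketch. Only the sysfs path, the Y/N writes and the .2 second sleep come from the report; the function name toggle_param and its optional iteration-count and delay arguments are hypothetical, added purely so the loop can be bounded or pointed at a scratch file.

```shell
#!/bin/sh
# Sketch only: flip a boolean module parameter between Y and N in a loop.
# The default path is the one from the report. The function name and the
# iteration/delay arguments are assumptions made for illustration.
toggle_param() {
    param=${1:-/sys/module/kvm/parameters/nx_huge_pages}
    iterations=${2:-0}    # 0 means "loop until killed"
    delay=${3:-.2}        # seconds between toggles, as in the report
    i=0
    while [ "$iterations" -eq 0 ] || [ "$i" -lt "$iterations" ]; do
        # Writing the real sysfs file needs root and a loaded kvm module.
        echo Y > "$param" || return 1
        echo N > "$param" || return 1
        sleep "$delay"
        i=$((i + 1))
    done
}
```

With the function defined, running "toggle_param &" as root in one shell and then starting hardware_disable_test in another would approximate the setup described above.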
Strace on hardware_disable_test spewed a whole pile of these
  wait4(32861, 0x7ffc66475dec, WNOHANG, NULL) = 0
  futex(0x7fb735c43000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
immediately before the crash. I assume it corresponds to this:
	/* Child is still running, keep waiting. */
	if (pid != waitpid(pid, &status, WNOHANG))
		continue;
I also got a new splat on the "WARN_ON_ONCE(ret < 0);" at the end of __futex_ref_atomic_end().
This happened during boot; AFAICT our userspace was setting up cgroups. In this
case, the system hung and I had to reboot.
------------[ cut here ]------------
WARNING: CPU: 45 PID: 0 at kernel/futex/core.c:1604 futex_ref_rcu+0xbf/0xf0
Modules linked in: vfat fat i2c_mux_pca954x i2c_mux spidev cdc_acm xhci_pci xhci_hcd gq(O) sha3_generic
CPU: 45 UID: 0 PID: 0 Comm: swapper/45 Tainted: G S O 6.17.0-smp--1278d576b27d-futex #886 NONE
Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
RIP: 0010:futex_ref_rcu+0xbf/0xf0
Code: c7 04 0a 00 00 00 00 48 ff c0 eb c2 65 ff 01 89 e8 4c 01 f0 48 ff c0 48 89 c1 f0 48 0f c1 8b 48 01 00 00 48 01 c1 74 06 79 0c <0f> 0b eb 08 48 89 df e8 55 0a f9 ff 48 89 df 5b 41 5e 5d e9 f9 5c
RSP: 0018:ffffa43c8d440ec8 EFLAGS: 00010286
RAX: 8000000000000000 RBX: ffff933782245080 RCX: ffffffffffffffff
RDX: 0000000000000060 RSI: 0000000000000060 RDI: ffffffffac840520
RBP: 0000000000000000 R08: ffff933680044d00 R09: ffffffff00000000
R10: ffff9365c9b59e00 R11: ffff9365c9b59e00 R12: ffffffffab77ac10
R13: ffff9337822451b8 R14: 7fffffffffffffff R15: ffff9365c749de00
FS: 0000000000000000(0000) GS:ffff9395514f2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fad8a21cf38 CR3: 00000055c062b002 CR4: 00000000007706f0
PKRU: 55555554
Call Trace:
<IRQ>
rcu_do_batch+0x250/0x7e0
rcu_core+0x12f/0x230
handle_softirqs+0xc8/0x280
__irq_exit_rcu+0x48/0x100
sysvec_apic_timer_interrupt+0x74/0x80
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xfb/0x290
Code: bb f6 ff ff 49 89 c4 8b 73 04 bf ff ff ff ff e8 9b 68 d8 ff 31 ff e8 f4 32 48 ff 80 7c 24 04 00 74 05 e8 c8 68 d8 ff fb 85 ed <0f> 88 ba 00 00 00 89 e9 48 6b f9 68 4c 8b 44 24 08 49 8b 54 38 30
RSP: 0018:ffffa43c803d3e80 EFLAGS: 00000206
RAX: ffff9395514f2000 RBX: ffff9394ff776548 RCX: 000000000000001f
RDX: 000000000018ec50 RSI: 000000000000002d RDI: 0000000000000000
RBP: 0000000000000003 R08: 0000000000000002 R09: 0000000000000002
R10: 00000000000003dc R11: 0000000000000389 R12: 00000010fb32644d
R13: 00000010fb2333f7 R14: ffffffffad276f68 R15: 0000000000000003
cpuidle_enter+0x2c/0x40
do_idle+0x1ac/0x250
cpu_startup_entry+0x2a/0x30
start_secondary+0x80/0x80
common_startup_64+0x13e/0x140
</TASK>
---[ end trace 0000000000000000 ]---
Heh, and two more when booting a different system. Guess it's my lucky day.
This time whatever went sideways didn't appear to be fatal as the system booted
and I could ssh in. One is the same WARN as above, and the second WARN on the
system hit the
	WARN_ON_ONCE(atomic_long_read(&mm->futex_atomic) != 0);
in futex_hash_allocate().
------------[ cut here ]------------
WARNING: CPU: 120 PID: 11779 at kernel/futex/core.c:1553 futex_hash_allocate+0x436/0x450
Modules linked in: vfat fat ccp k10temp i2c_piix4 cdc_acm xhci_pci xhci_hcd gq(O) sha3_generic
CPU: 120 UID: 0 PID: 11779 Comm: borglet Tainted: G U W O 6.17.0-smp--1278d576b27d-futex #886 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.64.2-0 12/26/2024
RIP: 0010:futex_hash_allocate+0x436/0x450
Code: 31 ff 65 48 8b 05 ba bc ae 02 48 3b 44 24 48 75 20 44 89 f8 48 83 c4 50 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b e9 9d fe ff ff <0f> 0b e9 c0 fe ff ff e8 ce 99 af 00 66 66 66 66 66 2e 0f 1f 84 00
RSP: 0018:ffffbbc0f1237d10 EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffa25747532180
RDX: 0000000000000400 RSI: 000000000000ffc0 RDI: 00000000000039b8
RBP: ffffa296a2620000 R08: 00000000004029c0 R09: 00000000ffffffff
R10: 00000000ffffffff R11: 0000000000010040 R12: ffffa2571336b700
R13: ffffa2571336b600 R14: ffffa2571336b600 R15: ffffa296b9270000
FS: 00007f6bbd3297c0(0000) GS:ffffa2d5a31b2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6bae810f38 CR3: 00000001330a4001 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
? cgroup_can_fork+0x258/0x420
copy_process+0xae3/0xff0
kernel_clone+0x99/0x320
__x64_sys_clone+0xc8/0xf0
do_syscall_64+0x6f/0x1f0
? arch_exit_to_user_mode_prepare+0x9/0x50
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f6bbd466051
Code: 48 85 ff 74 3d 48 85 f6 74 38 48 83 ee 10 48 89 4e 08 48 89 3e 48 89 d7 4c 89 c2 4d 89 c8 4c 8b 54 24 08 b8 38 00 00 00 0f 05 <48> 85 c0 7c 13 74 01 c3 31 ed 58 5f ff d0 48 89 c7 b8 3c 00 00 00
RSP: 002b:00007fffff2eda98 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007f6bae812700 RCX: 00007f6bbd466051
RDX: 00007f6bae8129d0 RSI: 00007f6bae810f30 RDI: 00000000003d0f00
RBP: 00007fffff2edad0 R08: 00007f6bae812700 R09: 00007f6bae812700
R10: 00007f6bae8129d0 R11: 0000000000000206 R12: 00007f6bae8129d0
R13: 00007fffff2edb66 R14: 00007fffff2edbd0 R15: 00007f6bae810f40
</TASK>
---[ end trace 0000000000000000 ]---