From: Sean Christopherson <seanjc@google.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>, x86-ml <x86@kernel.org>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] locking/urgent for v6.17-rc1
Date: Mon, 25 Aug 2025 16:55:29 -0700
Message-ID: <aKz38QetUrDfKP8P@google.com>
In-Reply-To: <20250825160406.ZVcVPStz@linutronix.de>
On Mon, Aug 25, 2025, Sebastian Andrzej Siewior wrote:
> On 2025-08-22 17:28:02 [-0700], Sean Christopherson wrote:
> kvm-nx-lpage-recovery shares the mm but it grabs a reference.
> It might be a coincidence but the task, on which the wakeup chokes,
> seems to be gone according to my traces. And with
>
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c
> @@ -75,7 +84,10 @@ static int vhost_task_fn(void *data)
> */
> void vhost_task_wake(struct vhost_task *vtsk)
> {
> - wake_up_process(vtsk->task);
> + mutex_lock(&vtsk->exit_mutex);
> + if (!test_bit(VHOST_TASK_FLAGS_KILLED, &vtsk->flags))
> + wake_up_process(vtsk->task);
> + mutex_unlock(&vtsk->exit_mutex);
> }
> EXPORT_SYMBOL_GPL(vhost_task_wake);
>
> it doesn't crash anymore. Could it be attempting to wake a task that is gone?
Oh fudge, that indeed is what's happening.
Each VM that KVM creates has a kvm-nx-lpage-recovery task, and KVM wakes all such
tasks across all VMs in response to any change to the hugepage recovery settings,
i.e. when privileged userspace changes any of the associated module params.
KVM holds a global lock when walking the list of VMs and so guarantees the VM
hasn't fully exited, but nothing prevents the recovery task from getting a signal
and exiting long before the VM is destroyed. hardware_disable_test is (deliberately?)
not very tidy, and exits without explicitly closing the VM and vCPU fds, and so
its recovery task gets terminated via signal instead of by KVM explicitly calling
vhost_task_stop() when the VM is being destroyed.
The basic gist of the above diff works, but unfortunately simply taking
vtsk->exit_mutex in vhost_task_wake() doesn't appear to be an option because the
vhost code appears to have gone through a lot of effort to avoid waking an exited
task.
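To make the idea in the quoted diff concrete: the pattern is "check a killed flag under the exit mutex, and only then wake". Below is a rough user-space model of that pattern, not the vhost code and not necessarily what the eventual fix will look like. Everything in it is hypothetical: struct worker, worker_init(), worker_mark_killed() and worker_wake() are made-up names, and pthread primitives stand in for the kernel's mutex and wake_up_process().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a worker such as a vhost task. */
struct worker {
	pthread_mutex_t exit_mutex;	/* serializes wake vs. exit */
	pthread_cond_t  cond;		/* what a "wake" signals */
	bool            killed;		/* set once the worker has exited */
	unsigned int    wakeups;	/* number of wakeups actually delivered */
};

static void worker_init(struct worker *w)
{
	pthread_mutex_init(&w->exit_mutex, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->killed = false;
	w->wakeups = 0;
}

/* Exit path: mark the worker as gone under the same mutex the waker takes. */
static void worker_mark_killed(struct worker *w)
{
	pthread_mutex_lock(&w->exit_mutex);
	w->killed = true;
	pthread_mutex_unlock(&w->exit_mutex);
}

/* Returns true if a wakeup was delivered, false if the worker is already gone. */
static bool worker_wake(struct worker *w)
{
	bool woke = false;

	pthread_mutex_lock(&w->exit_mutex);
	if (!w->killed) {
		w->wakeups++;
		pthread_cond_signal(&w->cond);
		woke = true;
	}
	pthread_mutex_unlock(&w->exit_mutex);
	return woke;
}
```

Because the flag is set and tested under the same mutex, a waker either sees the worker as still alive and signals it, or sees it as killed and skips the wake. It can never signal a worker that has already been marked as gone.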
I think we can also add some sanity checks and hints to help future users of the
vhost task code avoid running into the same problem.
I'll post a proper series.
Thanks a ton, I owe you a drink of your choice :-)
> > Strace on hardware_disable_test spewed a whole pile of these
> >
> > wait4(32861, 0x7ffc66475dec, WNOHANG, NULL) = 0
> > futex(0x7fb735c43000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
>
> That is a shared FUTEX and is probably part of pthread_join().
>
> > immediately before the crash. I assume it corresponds to this:
> >
> > /* Child is still running, keep waiting. */
> > if (pid != waitpid(pid, &status, WNOHANG))
> > continue;
> >
> > I also got a new splat on the "WARN_ON_ONCE(ret < 0);" at the end of __futex_ref_atomic_end().
> > This happened during boot; AFAICT our userspace was setting up cgroups. In this
> > case, the system hung and I had to reboot.
>
> This is odd
>
> > ------------[ cut here ]------------
> > WARNING: CPU: 45 PID: 0 at kernel/futex/core.c:1604 futex_ref_rcu+0xbf/0xf0
> …
> > Heh, and two more when booting a different system. Guess it's my lucky day.
> > This time whatever went sideways didn't appear to be fatal as the system booted
> > and I could ssh in. One is the same WARN as above, and the second WARN on the
> > system hit the
> >
> > WARN_ON_ONCE(atomic_long_read(&mm->futex_atomic) != 0);
> >
> > in futex_hash_allocate().
>
> This means the counters don't add up after the switch. Not sure how. This
> seems to be a random task but it might be part of the previous splat.
Yeah, IIRC, those only showed up when I kexec'd into a new kernel instead of doing
a normal reboot, so it may have been some weird leftovers and/or PEBKAC? I'll
file a new bug report if I see either of those warnings again.