From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Derek Yerger <derek@djy.llc>
Cc: Alex Williamson <alex.williamson@redhat.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"Bonzini, Paolo" <pbonzini@redhat.com>
Subject: Re: PROBLEM: Regression of MMU causing guest VM application errors
Date: Wed, 20 Nov 2019 10:19:14 -0800 [thread overview]
Message-ID: <20191120181913.GA11521@linux.intel.com> (raw)
In-Reply-To: <20191119200133.GD25672@linux.intel.com>
[-- Attachment #1: Type: text/plain, Size: 1786 bytes --]
On Tue, Nov 19, 2019 at 12:01:33PM -0800, Sean Christopherson wrote:
> On Wed, Oct 30, 2019 at 11:44:09PM -0400, Derek Yerger wrote:
> > I noticed the following in the host kernel log around the time the guest
> > encountered BSOD on 5.2.7:
> >
> > [ 337.841491] WARNING: CPU: 6 PID: 7548 at arch/x86/kvm/x86.c:7963
> > kvm_arch_vcpu_ioctl_run+0x19b1/0x1b00 [kvm]
>
> Rats, I overlooked this first time round. In the future, if you get a
> WARN splat, try to make it very obvious in the bug report, they're almost
> always a smoking gun.
>
> That WARN that fired is:
>
> /* The preempt notifier should have taken care of the FPU already. */
> WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD));
>
> which was added part of a bug fix by commit:
>
> 240c35a3783a ("kvm: x86: Use task structs fpu field for user")
>
> the buggy commit that was fixed is
>
> 5f409e20b794 ("x86/fpu: Defer FPU state load until return to userspace")
>
> which was part of a FPU rewrite that went into 5.2[*]. So yep, big
> smoking gun :-)
>
> My understanding of the WARN is that it means the kernel's FPU state is
> unexpectedly loaded when entry to the KVM guest is imminent. As for *how*
> the kernel's FPU state is getting loaded, no clue. But, I think it'd be
> pretty easy to find the the culprit by adding a debug flag into struct
> thread_info that gets set in vcpu_load() and clearing it in vcpu_put(),
> and then WARN in set_ti_thread_flag() if the debug flag is true when
> TIF_NEED_FPU_LOAD is being set. I'll put together a debugging patch later
> today and send it your way.
Debug patch attached. Hopefully it finds something, it took me an
embarassing number of attempts to get correct, I kept screwing up checking
a bit number versus checking a bit mask...
[-- Attachment #2: 0001-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w.patch --]
[-- Type: text/x-diff, Size: 1942 bytes --]
From 6288031dacbe753b84515d330f62c1f8ed31d932 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Wed, 20 Nov 2019 10:12:56 -0800
Subject: [PATCH] thread_info: Add a debug hook to detect FPU changes while a
vCPU is loaded
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/kvm/x86.c | 4 ++++
include/linux/thread_info.h | 1 +
3 files changed, 7 insertions(+)
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index f9453536f9bb..7b697005cc51 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -56,6 +56,8 @@ struct task_struct;
struct thread_info {
unsigned long flags; /* low level flags */
u32 status; /* thread synchronous flags */
+ bool vcpu_loaded;
+
};
#define INIT_THREAD_INFO(tsk) \
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a8ad3a4d86b1..3d9c049e749e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3303,6 +3303,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
+
+ current_thread_info()->vcpu_loaded = 1;
}
static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
@@ -3322,6 +3324,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
int idx;
+ current_thread_info()->vcpu_loaded = 0;
+
if (vcpu->preempted)
vcpu->arch.preempted_in_kernel = !kvm_x86_ops->get_cpl(vcpu);
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 8d8821b3689a..016c2c887354 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -52,6 +52,7 @@ enum {
static inline void set_ti_thread_flag(struct thread_info *ti, int flag)
{
+ WARN_ON_ONCE(ti->vcpu_loaded && flag == TIF_NEED_FPU_LOAD);
set_bit(flag, (unsigned long *)&ti->flags);
}
--
2.24.0
next prev parent reply other threads:[~2019-11-20 18:19 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-16 4:49 PROBLEM: Regression of MMU causing guest VM application errors Derek Yerger
2019-10-16 7:28 ` Paolo Bonzini
2019-10-16 17:28 ` Alex Williamson
2019-10-16 17:49 ` Sean Christopherson
2019-10-17 23:57 ` Derek Yerger
2019-10-22 20:28 ` Sean Christopherson
2019-10-24 15:18 ` Derek Yerger
2019-10-24 17:32 ` Sean Christopherson
2019-10-31 3:44 ` Derek Yerger
2019-11-19 20:01 ` Sean Christopherson
2019-11-20 9:19 ` Wanpeng Li
2019-11-20 9:57 ` Paolo Bonzini
2019-11-20 18:19 ` Sean Christopherson [this message]
2019-11-20 19:04 ` Derek Yerger
2019-11-20 19:28 ` Sean Christopherson
2019-11-27 15:24 ` Sean Christopherson
2019-12-17 23:11 ` Sean Christopherson
2019-12-17 23:13 ` Derek Yerger
2020-01-02 13:42 ` Derek Yerger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191120181913.GA11521@linux.intel.com \
--to=sean.j.christopherson@intel.com \
--cc=alex.williamson@redhat.com \
--cc=derek@djy.llc \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.