From: Sean Christopherson <seanjc@google.com>
To: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Joerg Roedel <joro@8bytes.org>,
den@virtuozzo.com, ptikhomirov@virtuozzo.com,
alexander@mihalicyn.com
Subject: Re: [Question] debugging VM cpu hotplug (#GP -> #DF) which results in reset
Date: Wed, 15 Jun 2022 15:00:42 +0000
Message-ID: <Yqn0GofIXFOHk6k4@google.com>
In-Reply-To: <20220615171410.ab537c7af3691a0d91171a76@virtuozzo.com>
On Wed, Jun 15, 2022, Alexander Mikhalitsyn wrote:
> Dear friends,
>
> I'm sorry for disturbing you but I've getting stuck with debugging KVM
> problem and looking for an advice. I'm working mostly on kernel
> containers/CRIU and am newbie with KVM so, I believe that I'm missing
> something very simple.
>
> My case:
> - AMD EPYC 7443P 24-Core Processor (Milan family processor)
> - OpenVZ kernel (based on RHEL7 3.10.0-1160.53.1) on the Host Node (HN)
> - Qemu/KVM VM (8 vCPU assigned) with many different kernels from 3.10.0-1160 RHEL7 to mainline 5.18
>
> Reproducer (run inside VM):
> echo 0 > /sys/devices/system/cpu/cpu3/online
> echo 1 > /sys/devices/system/cpu/cpu3/online <- got reset here
>
> *Not* reproducible on:
> - any Intel which we tried
> - AMD EPYC 7261 (Rome family)

Hmm, given that Milan is problematic but Rome isn't, that implies the bug is related
to a feature that's new in Milan.  PCID is the one that comes to mind, and IIRC there
were issues with PCID (or INVPCID?) in various kernels when running on Milan.

Can you try hiding PCID and INVPCID from the guest?

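One way to confirm from inside the guest that the two features are really hidden
is to read the CPUID bits directly.  The following is a minimal sketch, not
something from the thread: it assumes an x86 guest with GCC or Clang (for
<cpuid.h>), and the helper names has_pcid(), has_invpcid() and
print_pcid_visibility() are made up for illustration.  How the bits get hidden
depends on the VMM; with a directly launched QEMU, something along the lines of
"-cpu host,-pcid,-invpcid" is the usual approach, but treat that as an
assumption about the setup.

```c
#include <cpuid.h>
#include <stdbool.h>
#include <stdio.h>

/* CPUID.01H:ECX bit 17 advertises PCID. */
static bool has_pcid(unsigned int leaf1_ecx)
{
	return (leaf1_ecx & (1u << 17)) != 0;
}

/* CPUID.(EAX=07H,ECX=0):EBX bit 10 advertises INVPCID. */
static bool has_invpcid(unsigned int leaf7_ebx)
{
	return (leaf7_ebx & (1u << 10)) != 0;
}

/* Hypothetical helper: print what the current (v)CPU advertises. */
static void print_pcid_visibility(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is unsupported. */
	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		printf("PCID:    %s\n", has_pcid(ecx) ? "visible" : "hidden");

	/* Leaf 7 needs an explicit subleaf, hence __get_cpuid_count(). */
	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		printf("INVPCID: %s\n", has_invpcid(ebx) ? "visible" : "hidden");
}
```

Calling print_pcid_visibility() from a small main() in the guest before and
after changing the CPU model should show whether the guest still sees either
feature.  Checking for "pcid" and "invpcid" in /proc/cpuinfo gives the same
information without compiling anything.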
> - without KVM (on Host)
...
> ==== trace-cmd record -b 20000 -e kvm:kvm_cr -e kvm:kvm_userspace_exit -e probe:* =====
>
> CPU-1834 [003] 69194.833364: kvm_userspace_exit: reason KVM_EXIT_IO (2)
> CPU-1838 [000] 69194.834177: kvm_multiple_exception_L9: (ffffffff814313c6) vcpu=0xffff93ee9a528000
> CPU-1838 [000] 69194.834180: kvm_multiple_exception_L41: (ffffffff81431493) vcpu=0xffff93ee9a528000 exception=0xd000001 has_error=0x0 nr=0xd error_code=0x0 has_payload=0x0
> CPU-1838 [000] 69194.834195: kvm_multiple_exception_L9: (ffffffff814313c6) vcpu=0xffff93ee9a528000
> CPU-1838 [000] 69194.834196: kvm_multiple_exception_L41: (ffffffff81431493) vcpu=0xffff93ee9a528000 exception=0x8000100 has_error=0x0 nr=0x8 error_code=0x0 has_payload=0x0
> CPU-1838 [000] 69194.834200: shutdown_interception_L8: (ffffffff8146e4a0)

If you can modify the host kernel, throwing a WARN in kvm_multiple_exception() should
pinpoint the source of the #GP.  Though you may get unlucky and find that KVM is just
reflecting an intercepted #GP that was first "injected" by hardware.  Note that this
could spam the log if KVM is injecting a large number of #GPs.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9cea051ca62e..19d959bf97cc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -612,6 +612,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 	u32 prev_nr;
 	int class1, class2;
 
+	WARN_ON(nr == GP_VECTOR);
+
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 
 	if (!vcpu->arch.exception.pending && !vcpu->arch.exception.injected) {
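
For reading the trace above, it may help to spell out why a second #GP shows up
as nr=0x8.  The sketch below is a standalone userspace illustration of the
architectural double-fault rule (benign / contributory / page-fault classes),
which is the rule kvm_multiple_exception() applies when an exception is queued
while another is still pending.  It is not the kernel's code: the enum and
function names are hypothetical, and only the vector numbers are architectural.

```c
#include <stdbool.h>

/* Architectural x86 exception vectors used by the classification. */
enum {
	DE_VECTOR = 0,   /* divide error */
	DF_VECTOR = 8,   /* double fault */
	TS_VECTOR = 10,  /* invalid TSS */
	NP_VECTOR = 11,  /* segment not present */
	SS_VECTOR = 12,  /* stack-segment fault */
	GP_VECTOR = 13,  /* general protection */
	PF_VECTOR = 14,  /* page fault */
};

/* Hypothetical names for the three exception classes. */
enum exc_class { EXC_BENIGN, EXC_CONTRIBUTORY, EXC_PF };

static enum exc_class exc_class_of(unsigned int vector)
{
	switch (vector) {
	case PF_VECTOR:
		return EXC_PF;
	case DE_VECTOR:
	case TS_VECTOR:
	case NP_VECTOR:
	case SS_VECTOR:
	case GP_VECTOR:
		return EXC_CONTRIBUTORY;
	default:
		return EXC_BENIGN;
	}
}

/*
 * True if raising 'next' while 'prev' is still being delivered is
 * escalated to #DF instead of being handled serially.
 */
static bool escalates_to_df(unsigned int prev, unsigned int next)
{
	enum exc_class c1 = exc_class_of(prev);
	enum exc_class c2 = exc_class_of(next);

	return (c1 == EXC_CONTRIBUTORY && c2 == EXC_CONTRIBUTORY) ||
	       (c1 == EXC_PF && c2 != EXC_BENIGN);
}
```

With these rules, escalates_to_df(GP_VECTOR, GP_VECTOR) is true, which is
consistent with the nr=0xd entry followed by nr=0x8 in the trace: the
interesting event is the first #GP, and that is the one the WARN is meant to
catch.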