From: Sean Christopherson <seanjc@google.com>
To: Jari Ruusu <jariruusu@protonmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH] kvm ignores ignore_msrs=1 VETO for some MSRs
Date: Tue, 5 Sep 2023 19:27:15 +0000
Message-ID: <ZPeBE5aZqLwdnspl@google.com>
In-Reply-To: <NOTSPohUo5EZSaOrRTX88K-vU9QJqeV2Vqti75bEwTpckXBiudKyWw97EDAbgp9ODnk8-lCVBVNCYdd7YygWY5S2n-Yoz_BiJ13DeNLEItI=@protonmail.com>
On Tue, Sep 05, 2023, Jari Ruusu wrote:
> This problem is an old regression. This type of setup worked fine on older
> linux-4.x hosts but fails on linux-5.10.x hosts. I remember seeing this fail
> as early as 2021. I just haven't had time to look at it earlier.
>
> Relevant qemu parameters:
> -machine pc-1.0
> -cpu Skylake-Server-IBRS,+md-clear,+pcid,+invpcid,+ssbd,+clflushopt
> -enable-kvm
> If I change CPU model to "Nehalem" then it boots OK.
>
> KVM stuff is built-in to host kernel and my kernel boot parameters include:
> kvm-intel.ept=0 l1tf=off kvm.ignore_msrs=1
> so any invalid RDMSR reads should not fail because of ignore_msrs=1 VETO,
> but at least MSR_IA32_PERF_CAPABILITIES RDMSR read does indeed fail.

No, as documented in Documentation/admin-guide/kernel-parameters.txt, ignore_msrs
only applies to _unhandled_ MSRs, i.e. MSRs that KVM knows nothing about.

	kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.

The reason this introduces a failure in your setup is that KVM didn't have any
handling for MSR_IA32_PERF_CAPABILITIES prior to commit 27461da31089 ("KVM: x86/pmu:
Support full width counting").

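As a rough illustration of the handled/unhandled split, here is a toy model in C. It is not KVM's code: the function name, the enum, and the placeholder values are invented for this sketch, and it assumes the host rejects the read when the guest's CPUID lacks the flag that enumerates the MSR. Only the MSR index 0x345 (IA32_PERF_CAPABILITIES) is taken from the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_PERF_CAPABILITIES 0x345u

enum rdmsr_result { RDMSR_OK, RDMSR_FAULT };

/*
 * Hypothetical model of a host that has handling for exactly one MSR.
 * guest_has_pdcm: whether the guest's CPUID enumerates the MSR.
 * ignore_msrs:    the module parameter discussed in this thread.
 */
enum rdmsr_result model_rdmsr(uint32_t msr, bool guest_has_pdcm,
			      bool ignore_msrs, uint64_t *val)
{
	switch (msr) {
	case MSR_IA32_PERF_CAPABILITIES:
		/* Handled MSR: the host applies its own rules and
		 * never consults ignore_msrs on this path. */
		if (!guest_has_pdcm)
			return RDMSR_FAULT;
		*val = 0;	/* placeholder capabilities value */
		return RDMSR_OK;
	default:
		/* Unhandled MSR: the only place ignore_msrs matters. */
		if (ignore_msrs) {
			*val = 0;
			return RDMSR_OK;
		}
		return RDMSR_FAULT;
	}
}
```

In this model, a read of 0x345 with guest_has_pdcm == false faults regardless of ignore_msrs, which mirrors the failure reported above. Before the host had any handling for the MSR, the same read would have fallen into the default branch and been ignored.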
> Full C-language source file can be viewed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/kernel/cpu/perf_event_intel.c?h=linux-3.10.y#n2023
>
> My understanding of this failure is that it is a combination of many factors,
> including:
>
> 1) Qemu version is old
> 2) Qemu guest CPUID flags may be "Frankenstein"
It's a bit Frankenstein, but architecturally it's completely valid.
> 3) old linux-3.10.108 x86_64 kernel may be doing something questionable

The guest kernel is the real culprit. It is assuming that an MSR exists based on
the PMU version instead of checking the CPUID feature flag that enumerates the
existence of the MSR.

The bug was fixed almost a decade ago, but that fix obviously didn't make it to
the 3.10 kernel.

  commit c9b08884c9c98929ec2d8abafd78e89062d01ee7
  Author: Peter Zijlstra <peterz@infradead.org>
  Date:   Mon Feb 3 14:29:03 2014 +0100

    perf/x86: Correctly use FEATURE_PDCM

    The current code simply assumes Intel Arch PerfMon v2+ to have
    the IA32_PERF_CAPABILITIES MSR; the SDM specifies that we should check
    CPUID[1].ECX[15] (aka, FEATURE_PDCM) instead.

    This was found by KVM which implements v2+ but didn't provide the
    capabilities MSR. Change the code to DTRT; KVM will also implement the
    MSR and return 0.

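The difference between the two checks can be sketched in C as follows. The helper names are hypothetical and are not the kernel's; the bit positions follow the commit message (CPUID[1].ECX[15]) and the SDM's layout of CPUID leaf 0xA (PerfMon version in EAX[7:0]). The functions take raw register values so the logic can be read and exercised without executing CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX[15] (PDCM): enumerates IA32_PERF_CAPABILITIES. */
bool cpuid_has_pdcm(uint32_t leaf1_ecx)
{
	return (leaf1_ecx >> 15) & 1u;
}

/* CPUID.0AH:EAX[7:0]: architectural PerfMon version. */
unsigned int perfmon_version(uint32_t leafa_eax)
{
	return leafa_eax & 0xffu;
}

/* Pre-fix behaviour: infer the MSR's existence from PerfMon v2+. */
bool may_read_perf_caps_buggy(uint32_t leaf1_ecx, uint32_t leafa_eax)
{
	(void)leaf1_ecx;	/* the feature flag is never looked at */
	return perfmon_version(leafa_eax) > 1;
}

/* Post-fix behaviour: trust only the flag that enumerates the MSR. */
bool may_read_perf_caps_fixed(uint32_t leaf1_ecx, uint32_t leafa_eax)
{
	(void)leafa_eax;	/* the PMU version is irrelevant here */
	return cpuid_has_pdcm(leaf1_ecx);
}
```

A guest that sees PerfMon version 2 with PDCM clear passes the buggy check and goes on to read an MSR it was never told exists; the fixed check skips the read. On real hardware the inputs could be obtained with, for example, `__get_cpuid()` from the GCC/Clang `<cpuid.h>` header.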
> 4) newer host linux KVM is not always honoring RDMSR ignore_msrs=1 VETO
>
> My reading of the linux-5.10.194 kernel source identified the following
> questionable handling of the ignore_msrs=1 VETO. The same problem appears to be
> present in the recently released linux-6.5 too, but I have not yet tested this
> with linux-6.5.x host kernels.

While this is arguably a regression, it isn't going to be addressed in KVM.
ignore_msrs is off by default, and is explicitly documented as applying only to
unhandled MSRs. The documentation could certainly do a better job of explaining
the potential pitfalls and long-term consequences of enabling ignore_msrs, but
hack-a-fixing this one MSR to fudge around a guest bug isn't going to happen.
A broad "ignore all RDMSR/WRMSR faults" knob would likely break other guests,
e.g. it would make it impossible to probe for MSR existence, so such a knob
would be unusable.

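To make the probing concern concrete, here is a toy C model of a guest that detects an MSR by attempting a read and checking whether it faulted. Everything in it is hypothetical (the two "hosts", the function names, the choice of which MSR the strict host implements); it only illustrates why swallowing every fault defeats this style of detection.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_TSC               0x10u	/* implemented by the toy host */
#define MSR_IA32_PERF_CAPABILITIES 0x345u	/* not implemented by the toy host */

/* A read attempt: returns true on success, false if the access faulted. */
typedef bool (*rdmsr_fn)(uint32_t msr, uint64_t *val);

/* Toy host that faults on anything it does not implement. */
bool strict_host_rdmsr(uint32_t msr, uint64_t *val)
{
	if (msr != MSR_IA32_TSC)
		return false;
	*val = 0;
	return true;
}

/* Toy host with a hypothetical "ignore all faults" knob: nothing faults. */
bool swallow_all_rdmsr(uint32_t msr, uint64_t *val)
{
	(void)msr;
	*val = 0;
	return true;
}

/* Guest-side probe: the MSR is deemed present iff the read does not fault. */
bool guest_probe_msr(rdmsr_fn try_rdmsr, uint32_t msr)
{
	uint64_t ignored;

	return try_rdmsr(msr, &ignored);
}
```

Against the strict host the probe correctly reports MSR_IA32_PERF_CAPABILITIES as absent; against the swallow-everything host it reports every MSR as present, so a guest relying on probing would go on to use features that do not exist.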
As for working around this in your setup, assuming you don't actually need a
virtual PMU in the guest, the simplest workaround would be to turn off vPMU
support in KVM, i.e. boot with kvm.enable_pmu=0. That _should_ cause QEMU to not
advertise a PMU to the guest. Alternatively, if supported by QEMU, you could try
enumerating a version 1 vPMU to the guest.