public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Pavel Machek <pavel@denx.de>, Sasha Levin <sashal@kernel.org>,
	linux-kernel@vger.kernel.org,  stable@vger.kernel.org,
	Max Grobecker <max@grobecker.info>,
	Ingo Molnar <mingo@kernel.org>,
	 tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com,  x86@kernel.org,
	thomas.lendacky@amd.com, perry.yuan@amd.com,
	 mario.limonciello@amd.com, riel@surriel.com, mjguzik@gmail.com,
	 darwi@linutronix.de, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: CONFIG_X86_HYPERVISOR (was: Re: [PATCH AUTOSEL 5.10 2/6] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine)
Date: Tue, 22 Apr 2025 12:48:44 -0700	[thread overview]
Message-ID: <aAfynEK3wcfQa1qQ@google.com> (raw)
In-Reply-To: <20250422173355.GDaAfTA8GqnGBfDk7G@renoirsky.local>

On Tue, Apr 22, 2025, Borislav Petkov wrote:
> On Tue, Apr 22, 2025 at 10:22:54AM -0700, Sean Christopherson wrote:
> > > Because I really hate wagging the dog and "fixing" the kernel because something
> > > else can't be bothered. I didn't object stronger to that fix because it is
> > > meh, more of those "if I'm a guest" gunk which we sprinkle nowadays and that's
> > > apparently not that awful-ish...
> > 
> > FWIW, I think splattering X86_FEATURE_HYPERVISOR everywhere is quite awful.  There
> > are definitely cases where the kernel needs to know if it's running as a guest,
> > because the behavior of "hardware" fundamentally changes in ways that can't be
> > enumerated otherwise.  E.g. that things like the HPET are fully emulated and thus
> > will be prone to significant jitter.
> > 
> > But when it comes to feature enumeration, IMO sprinkling HYPERVISOR everywhere is
> > unnecessary because it's the hypervisor/VMM's responsibility to present a sane
> > model.  And I also think it's outright dangerous, because everywhere the kernel
> > does X for bare metal and Y for guest results in reduced test coverage.
> > 
> > E.g. things like syzkaller and other bots will largely be testing the HYPERVISOR
> > code, while humans will largely be testing and using the bare metal code.
> 
> All valid points...
> 
> At least one case justifies the X86_FEATURE_HYPERVISOR check: microcode loading
> and we've chewed that topic back then with Xen ad nauseam.

Yeah, from my perspective, ucode loading falls into the "fundamentally different"
bucket.

> 
> But I'd love to whack as many of such checks as possible.
> 
> $ git grep X86_FEATURE_HYPERVISOR | wc -l
> 60
> 
> I think I should start whacking at those and CC you if I'm not sure.
> It'll be a long-term, low prio thing but it'll be a good cleanup.

I did a quick pass.  Most of the usage is "fine".  E.g. explicit PV code, cases
where checking for HYPERVISOR is the least awful option, etc.

Looks sketchy, might be worth investigating?
--------------------------------------------
  arch/x86/kernel/cpu/amd.c:              if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
  arch/x86/kernel/cpu/amd.c:      if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !cpu_has(c, X86_FEATURE_IBPB_BRTYPE)) {
  arch/x86/kernel/cpu/amd.c:      if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM) && !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/amd.c:      if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/amd.c:      if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/amd.c:      if (cpu_has(c, X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/amd.c:      if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/amd.c:      if (!cpu_has(c, X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/topology_amd.c:             if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) && tscan->c->x86_model <= 0x3) {
  arch/x86/mm/init_64.c:  if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/mm/pat/set_memory.c:   return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR);
  drivers/platform/x86/intel/pmc/pltdrv.c:        if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR) && !xen_initial_domain())
  drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c: if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
--------------------------------------


Could do with some love, but not horrible.
------------------------------------------
Eww.  Optimization to lessen the pain of DR7 interception.  It'd be nice to clean
this up at some point, especially with things like SEV-ES with DebugSwap, where
DR7 is never intercepted.
  arch/x86/include/asm/debugreg.h:        if (static_cpu_has(X86_FEATURE_HYPERVISOR) && !hw_breakpoint_active())
  arch/x86/kernel/hw_breakpoint.c:                 * When in guest (X86_FEATURE_HYPERVISOR), local_db_save()

This usage should be restricted to just the FMS matching, but unfortunately
needs to be kept for that check.
  arch/x86/kernel/cpu/bus_lock.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

Most of these look sane, e.g. are just being transparent about the state of
mitigations when running in a VM.  The use in update_srbds_msr() is the only
one that stands out as somewhat sketchy.
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/bugs.c:     else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
------------------------------------------


Don't bother
------------------------------------------
Most of these look sane, e.g. are just being transparent about the state of
mitigations when running in a VM.  The use in update_srbds_msr() is the only
one that stands out as somewhat sketchy.
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/bugs.c:     else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {

Perf, don't bother.  PMUs are notoriously virtualization-unfriendly, and perf
has had to resort to detecting its running in a VM to avoid crashing the kernel,
and I don't see this being fully solved any time soon.
  arch/x86/events/core.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/events/intel/core.c:   if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/events/intel/core.c:           int assume = 3 * !boot_cpu_has(X86_FEATURE_HYPERVISOR);
  arch/x86/events/intel/cstate.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/events/intel/uncore.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

PV code of one form or another.
  arch/x86/include/asm/acrn.h:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/hyperv/ivm.c:  if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/mshyperv.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/cpu/vmware.c: * If !boot_cpu_has(X86_FEATURE_HYPERVISOR), vmware_hypercall_mode
  arch/x86/kernel/cpu/vmware.c:   if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
  arch/x86/kernel/jailhouse.c:        !boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/kvm.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/paravirt.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
  arch/x86/kernel/tsc.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
  arch/x86/kvm/vmx/vmx.c: if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
  arch/x86/virt/svm/cmdline.c:                    if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) {

Ugh.  Eliding WBINVD when running as a VM.  Probably the least awful option as
there's no sane way to enumerate that WBINVD is a nop, and a "passthrough" setup
can (and should) simply omit HYPERVISOR.
  arch/x86/include/asm/acenv.h:   if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR))       \

Skip sanity check on TSC deadline timer.  Makes sense to keep; either the timer
is emulated and thus not subject to hardware errata, or its passed through, in
which case HYPERVSIOR arguably shouldn't be set.
  arch/x86/kernel/apic/apic.c:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

This "feature" is awful, but getting rid of it may not be feasible.
https://lore.kernel.org/all/20250201005048.657470-1-seanjc@google.com
  arch/x86/kernel/cpu/mtrr/generic.c:     if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR))

Exempting VMs from a gross workaround for old, buggy Intel chipsets.  Fine to keep.
  drivers/acpi/processor_idle.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

More mitigation crud, probably not worth pursuing.
  arch/x86/kernel/cpu/common.c:        boot_cpu_has(X86_FEATURE_HYPERVISOR)))
  arch/x86/kernel/cpu/common.c:   if (cpu_has(c, X86_FEATURE_HYPERVISOR)) {

LOL.  Skip ucode revision check when detecting bad Spectre mitigation.
  arch/x86/kernel/cpu/intel.c:    if (cpu_has(c, X86_FEATURE_HYPERVISOR))
------------------------------------------

  reply	other threads:[~2025-04-22 19:48 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-31 14:37 [PATCH AUTOSEL 5.10 1/6] pm: cpupower: bench: Prevent NULL dereference on malloc failure Sasha Levin
2025-03-31 14:37 ` [PATCH AUTOSEL 5.10 2/6] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine Sasha Levin
2025-04-18 16:54   ` Pavel Machek
2025-04-18 17:19     ` Sean Christopherson
2025-04-18 17:36       ` Borislav Petkov
2025-04-18 18:31         ` Sean Christopherson
2025-04-18 19:12           ` Borislav Petkov
2025-04-22 17:22             ` Sean Christopherson
2025-04-22 17:33               ` CONFIG_X86_HYPERVISOR (was: Re: [PATCH AUTOSEL 5.10 2/6] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine) Borislav Petkov
2025-04-22 19:48                 ` Sean Christopherson [this message]
2025-04-23  7:20                   ` Borislav Petkov
2025-04-23 14:10                     ` Sean Christopherson
2025-04-23 18:43                       ` Borislav Petkov
2025-04-24 19:18                         ` Sean Christopherson
2025-04-24 20:31                           ` Borislav Petkov
2025-04-26  0:08                             ` Sean Christopherson
2025-04-26 11:26                               ` Borislav Petkov
2025-05-06  1:04                                 ` Sean Christopherson
2025-03-31 14:37 ` [PATCH AUTOSEL 5.10 3/6] perf: arm_pmu: Don't disable counter in armpmu_add() Sasha Levin
2025-03-31 14:37 ` [PATCH AUTOSEL 5.10 4/6] arm64: cputype: Add QCOM_CPU_PART_KRYO_3XX_GOLD Sasha Levin
2025-04-18 16:55   ` Pavel Machek
2025-04-18 19:27     ` Doug Anderson
2025-03-31 14:37 ` [PATCH AUTOSEL 5.10 5/6] xen/mcelog: Add __nonstring annotations for unterminated strings Sasha Levin
2025-03-31 14:37 ` [PATCH AUTOSEL 5.10 6/6] x86/mm/ident_map: Fix theoretical virtual address overflow to zero Sasha Levin
2025-04-18 16:52 ` [PATCH AUTOSEL 5.10 1/6] pm: cpupower: bench: Prevent NULL dereference on malloc failure Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aAfynEK3wcfQa1qQ@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=darwi@linutronix.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=max@grobecker.info \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mjguzik@gmail.com \
    --cc=pavel@denx.de \
    --cc=pbonzini@redhat.com \
    --cc=perry.yuan@amd.com \
    --cc=riel@surriel.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox