linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Ard Biesheuvel <ardb+git@google.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	Ard Biesheuvel <ardb@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Brian Gerst <brgerst@gmail.com>
Subject: Re: [PATCH v3 1/7] x86/cpu: Use a new feature flag for 5 level paging
Date: Thu, 15 May 2025 10:22:59 +0200	[thread overview]
Message-ID: <aCWkY1UbhWfSfgZU@gmail.com> (raw)
In-Reply-To: <ct3z5lyozftd2tkzfksc6ylbh7ubeonuww2t77voymuy5egyo2@ocqfhd6gnbti>


* Kirill A. Shutemov <kirill@shutemov.name> wrote:

> On Thu, May 15, 2025 at 09:45:52AM +0200, Ingo Molnar wrote:
> > 
> > * Ard Biesheuvel <ardb+git@google.com> wrote:
> > 
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > 
> > > Currently, the LA57 CPU feature flag is taken to mean two different
> > > things at once:
> > > - whether the CPU implements the LA57 extension, and is therefore
> > >   capable of supporting 5 level paging;
> > > - whether 5 level paging is currently in use.
> > > 
> > > This means the LA57 capability of the hardware is hidden when a LA57
> > > capable CPU is forced to run with 4 levels of paging. It also means the
> > > the ordinary CPU capability detection code will happily set the LA57
> > > capability and it needs to be cleared explicitly afterwards to avoid
> > > inconsistencies.
> > > 
> > > Separate the two so that the CPU hardware capability can be identified
> > > unambigously in all cases.
> > > 
> > > To avoid breaking existing users that might assume that 5 level paging
> > > is being used when the "la57" string is visible in /proc/cpuinfo,
> > > repurpose that string to mean that 5-level paging is in use, and add a
> > > new string la57_capable to indicate that the CPU feature is implemented
> > > by the hardware.
> > > 
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > >  arch/x86/include/asm/cpufeatures.h               |  3 ++-
> > >  arch/x86/include/asm/page_64.h                   |  2 +-
> > >  arch/x86/include/asm/pgtable_64_types.h          |  2 +-
> > >  arch/x86/kernel/cpu/common.c                     | 16 ++--------------
> > >  arch/x86/kvm/x86.h                               |  4 ++--
> > >  drivers/iommu/amd/init.c                         |  4 ++--
> > >  drivers/iommu/intel/svm.c                        |  4 ++--
> > >  tools/testing/selftests/kvm/x86/set_sregs_test.c |  2 +-
> > >  8 files changed, 13 insertions(+), 24 deletions(-)
> > > 
> > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > index f67a93fc9391..d59bee5907e7 100644
> > > --- a/arch/x86/include/asm/cpufeatures.h
> > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > @@ -395,7 +395,7 @@
> > >  #define X86_FEATURE_AVX512_BITALG	(16*32+12) /* "avx512_bitalg" Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
> > >  #define X86_FEATURE_TME			(16*32+13) /* "tme" Intel Total Memory Encryption */
> > >  #define X86_FEATURE_AVX512_VPOPCNTDQ	(16*32+14) /* "avx512_vpopcntdq" POPCNT for vectors of DW/QW */
> > > -#define X86_FEATURE_LA57		(16*32+16) /* "la57" 5-level page tables */
> > > +#define X86_FEATURE_LA57		(16*32+16) /* "la57_hw" 5-level page tables */
> > >  #define X86_FEATURE_RDPID		(16*32+22) /* "rdpid" RDPID instruction */
> > >  #define X86_FEATURE_BUS_LOCK_DETECT	(16*32+24) /* "bus_lock_detect" Bus Lock detect */
> > >  #define X86_FEATURE_CLDEMOTE		(16*32+25) /* "cldemote" CLDEMOTE instruction */
> > > @@ -483,6 +483,7 @@
> > >  #define X86_FEATURE_PREFER_YMM		(21*32+ 8) /* Avoid ZMM registers due to downclocking */
> > >  #define X86_FEATURE_APX			(21*32+ 9) /* Advanced Performance Extensions */
> > >  #define X86_FEATURE_INDIRECT_THUNK_ITS	(21*32+10) /* Use thunk for indirect branches in lower half of cacheline */
> > > +#define X86_FEATURE_5LEVEL_PAGING	(21*32+11) /* "la57" Whether 5 levels of page tables are in use */
> > 
> > So there's a new complication here, KVM doesn't like the use of 
> > synthethic CPU flags, for understandable reasons:
> > 
> >   inlined from ‘intel_pmu_set_msr’ at arch/x86/kvm/vmx/pmu_intel.c:369:7:
> >   ...
> >   ./arch/x86/kvm/reverse_cpuid.h:102:9: note: in expansion of macro ‘BUILD_BUG_ON’
> >     102 |         BUILD_BUG_ON(x86_leaf == CPUID_LNX_5);
> >         |         ^~~~~~~~~~~~
> > 
> > (See x86-64 allmodconfig)
> > 
> > Even though previously X86_FEATURE_LA57 was effectively a synthethic 
> > CPU flag (it got artificially turned off by the Linux kernel if 5-level 
> > paging was disabled) ...
> > 
> > So I think the most straightforward solution would be to do the change 
> > below, and pass through LA57 flag if 5-level paging is enabled in the 
> > host kernel. This is similar to as if the firmware turned off LA57, and 
> > it doesn't bring in all the early-boot complications bare metal has. It 
> > should also match the previous behavior I think.
> > 
> > Thoughts?
> > 
> > Thanks,
> > 
> > 	Ingo
> > 
> > =================>
> > 
> >  arch/x86/kvm/cpuid.c | 6 ++++++
> >  arch/x86/kvm/x86.h   | 4 ++--
> >  2 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 571c906ffcbf..d951d71aea3b 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -1221,6 +1221,12 @@ void kvm_set_cpu_caps(void)
> >  		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
> >  		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
> >  	}
> > +	/*
> > +	 * Clear the LA57 flag in the guest if the host kernel
> > +	 * does not have 5-level paging support:
> > +	 */
> > +	if (kvm_cpu_cap_has(X86_FEATURE_LA57) && !pgtable_l5_enabled())
> 
> X86_FEATURE_LA57 check seems redundant.

Possibly. I wanted to limit the clearing of X86_FEATURE_LA57 only to 
precisely the case where it was set to begin with. Ie. don't clear it 
when it's not present to begin with.

It shouldn't matter, as this is KVM-module-init-only code.

Thanks,

	Ingo

  reply	other threads:[~2025-05-15  8:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-14 10:42 [PATCH v3 0/7] x86: Robustify pgtable_l5_enabled() Ard Biesheuvel
2025-05-14 10:42 ` [PATCH v3 1/7] x86/cpu: Use a new feature flag for 5 level paging Ard Biesheuvel
2025-05-15  7:06   ` Ingo Molnar
2025-05-15  7:45   ` Ingo Molnar
2025-05-15  8:07     ` Kirill A. Shutemov
2025-05-15  8:22       ` Ingo Molnar [this message]
2025-05-15 10:12     ` Ard Biesheuvel
2025-05-15 23:24       ` Sean Christopherson
2025-05-16  8:31         ` Ard Biesheuvel
2025-05-15  9:51   ` Borislav Petkov
2025-05-15 10:17     ` Ard Biesheuvel
2025-05-15 10:39       ` Borislav Petkov
2025-05-15 10:57         ` Ard Biesheuvel
2025-05-15 13:11   ` Borislav Petkov
2025-05-15 13:33     ` Ard Biesheuvel
2025-05-17 16:59       ` David Laight
2025-05-15 18:20     ` Shivank Garg
2025-05-15 19:11       ` Borislav Petkov
2025-05-16  9:17         ` Kirill A. Shutemov
2025-05-14 10:42 ` [PATCH v3 2/7] x86/cpu: Allow caps to be set arbitrarily early Ard Biesheuvel
2025-05-15  6:56   ` Ingo Molnar
2025-05-15  7:50     ` Ingo Molnar
2025-05-15  7:55     ` Kirill A. Shutemov
2025-05-15  8:18       ` Ingo Molnar
2025-05-15  9:45         ` Ard Biesheuvel
2025-05-15 12:08           ` Ingo Molnar
2025-05-14 10:42 ` [PATCH v3 3/7] x86/asm-offsets: Export struct cpuinfo_x86 layout for asm use Ard Biesheuvel
2025-05-15  7:10   ` Ingo Molnar
2025-05-15  7:58   ` [tip: x86/core] x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too tip-bot2 for Ard Biesheuvel
2025-05-14 10:42 ` [PATCH v3 4/7] x86/boot: Set 5-level paging CPU cap before entering C code Ard Biesheuvel
2025-05-15  8:00   ` Kirill A. Shutemov
2025-05-15  9:43     ` Ard Biesheuvel
2025-05-15 11:05       ` Kirill A. Shutemov
2025-05-14 10:42 ` [PATCH v3 5/7] x86/boot: Drop the early variant of pgtable_l5_enabled() Ard Biesheuvel
2025-05-14 10:42 ` [PATCH v3 6/7] x86/boot: Drop 5-level paging related variables and early updates Ard Biesheuvel
2025-05-14 10:42 ` [PATCH v3 7/7] x86/cpu: Make CPU capability overrides __ro_after_init Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aCWkY1UbhWfSfgZU@gmail.com \
    --to=mingo@kernel.org \
    --cc=ardb+git@google.com \
    --cc=ardb@kernel.org \
    --cc=brgerst@gmail.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).