From: mlevitsk@redhat.com
To: Chao Gao <chao.gao@intel.com>
Cc: kvm@vger.kernel.org, Dave Hansen <dave.hansen@linux.intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	 x86@kernel.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/2] KVM: x86: relax canonical checks for some x86 architectural msrs
Date: Fri, 26 Jul 2024 11:08:36 -0400
Message-ID: <91c7727f66afa7c1f424fb08958579dfa3dc708c.camel@redhat.com>
In-Reply-To: <ZqNHGBZyiHKvQKj1@chao-email>

On Fri, 2024-07-26 at 14:50 +0800, Chao Gao wrote:
> On Thu, Jul 25, 2024 at 11:01:09AM -0400, Maxim Levitsky wrote:
> > Several architectural msrs (e.g MSR_KERNEL_GS_BASE) must contain
> > a canonical address, and according to Intel PRM, this is enforced
> > by #GP on a MSR write.
> > 
> > However with the introduction of the LA57 the definition of
> > what is a canonical address became blurred.
> > 
> > Few tests done on Sapphire Rapids CPU and on Zen4 CPU,
> > reveal:
> > 
> > 1. These CPUs do allow full 57-bit wide non canonical values
> > to be written to MSR_GS_BASE, MSR_FS_BASE, MSR_KERNEL_GS_BASE,
> > regardless of the state of CR4.LA57.
> > Zen4 in addition to that even allows such writes to
> > MSR_CSTAR and MSR_LSTAR.
> 
> This actually is documented/implied at least in ISE [1]. In Chapter 6.4
> "CANONICALITY CHECKING FOR DATA ADDRESSES WRITTEN TO CONTROL REGISTERS AND
> MSRS"
> 
>   In Processors that support LAM continue to require the addresses written to
>   control registers or MSRs to be 57-bit canonical if the processor _supports_
>   5-level paging or 48-bit canonical if it supports only 4-level paging
> 
> [1]: https://cdrdv2.intel.com/v1/dl/getContent/671368

I haven't found this in the actual PRM, but my copy is relatively old
(from September 2023; I didn't bother to update it because 5-level paging is
quite an old feature).
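
For reference, a minimal userspace sketch of the canonicality rule quoted above:
an address is N-bit canonical when bits 63 down to N-1 are all equal. The helper
name is_canonical() is made up for this illustration and is not the kernel's
__is_canonical_address(); the values are examples, not the ones from my tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not the kernel's
 * __is_canonical_address()): an address is canonical for a given
 * virtual-address width when bits 63..(bits - 1) are all zeros or all ones.
 * Only meaningful for widths such as 48 and 57.
 */
bool is_canonical(uint64_t addr, unsigned int bits)
{
	uint64_t upper = addr >> (bits - 1);
	uint64_t all_ones = UINT64_MAX >> (bits - 1);

	return upper == 0 || upper == all_ones;
}

/* Illustrative values only; not taken from the tests described in the patch. */
void canonical_demo(void)
{
	/* Bit 47 set, upper bits clear: fine with 57 bits, not with 48. */
	assert(is_canonical(0x0000800000000000ULL, 57));
	assert(!is_canonical(0x0000800000000000ULL, 48));

	/* Sign-extended from bit 47: canonical for both widths. */
	assert(is_canonical(0xffff800000000000ULL, 48));
	assert(is_canonical(0xffff800000000000ULL, 57));
}
```

The first value is the interesting case here: it passes the 57-bit check but
fails the 48-bit one, so whether a write is accepted depends on which width
the check uses.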

> 
> > 
> > 2. These CPUs don't prevent the user from switching back to 4 level
> > paging with values that will be non canonical in 4 level paging,
> > and instead just allow the msrs to contain these values.
> > 
> > Since these MSRS are all passed through to the guest, and microcode
> > allows the non canonical values to get into these msrs,
> > KVM has to tolerate such values and avoid crashing the guest.
> > 
> > To do so, always allow the host initiated values regardless of
> > the state of CR4.LA57, instead only gate this by the actual hardware
> > support for 5 level paging.
> > 
> > To be on the safe side leave the check for guest writes as is.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> > arch/x86/kvm/x86.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a6968eadd418..c599deff916e 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1844,7 +1844,36 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
> >         case MSR_KERNEL_GS_BASE:
> >         case MSR_CSTAR:
> >         case MSR_LSTAR:
> > -               if (is_noncanonical_address(data, vcpu))
> > +
> > +               /*
> > +                * Both AMD and Intel cpus tend to allow values which
> > +                * are canonical in the 5 level paging mode but are not
> > +                * canonical in the 4 level paging mode to be written
> > +                * to the above msrs, regardless of the state of the CR4.LA57.
> > +                *
> > +                * Intel CPUs do honour CR4.LA57 for the MSR_CSTAR/MSR_LSTAR,
> > +                * AMD cpus don't even do that.
> > +                *
> > +                * Both CPUs also allow non canonical values to remain in
> > +                * these MSRs if the CPU was in 5 level paging mode and was
> > +                * switched back to 4 level paging, and tolerate these values
> > +                * both in native MSRs and in vmcs/vmcb fields.
> > +                *
> > +                * To avoid crashing a guest, which manages using one of the above
> > +                * tricks to get non canonical value to one of
> > +                * these MSRs, and later migrates, allow the host initiated
> > +                * writes regardless of the state of CR4.LA57.
> > +                *
> > +                * To be on the safe side, don't allow the guest initiated
> > +                * writes to bypass the canonical check (e.g be more strict
> > +                * than what the actual ucode usually does).
> 
> I may think guest-initiated writes should be allowed as well because this is
> the architectural behavior.

Note though that for MSR_CSTAR/MSR_LSTAR I did get a #GP, depending on CR4.LA57.
Ah, I see: KVM intercepts these MSRs on VMX (but not on SVM). I was under
the impression that it doesn't, and that is why I got the #GP depending on CR4.LA57.

I do wonder why we intercept these msrs on VMX and not on SVM.

It all makes sense now, thanks a lot for the explanation!

> 
> > +                */
> > +
> > +               if (!host_initiated && is_noncanonical_address(data, vcpu))
> > +                       return 1;
> > +
> > +               if (!__is_canonical_address(data,
> > +                       boot_cpu_has(X86_FEATURE_LA57) ? 57 : 48))
> 
> boot_cpu_has(X86_FEATURE_LA57)=1 means LA57 is enabled. Right?
> 
> With this change, host-initiated writes must be 48-bit canonical if LA57 isn't
> enabled on the host, even if it is enabled in the guest. (note that KVM can
> expose LA57 to guests even if LA57 is disabled on the host, see
> kvm_set_cpu_caps()).


Sorry about this. We do indeed need to use kvm_cpu_cap_has(X86_FEATURE_LA57), because
KVM forces that capability based on the host's raw CPUID.

I remember I wanted to do exactly this but forgot somehow.
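
Roughly, the intended logic could be modeled as below. This is only a userspace
sketch under stated assumptions, not the v2 patch: struct vcpu_model,
canonical_for_width() and set_base_msr_model() are made-up names, and the
kvm_cap_la57 flag stands in for the result of kvm_cpu_cap_has(X86_FEATURE_LA57).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vCPU state that matters here. */
struct vcpu_model {
	bool cr4_la57;	/* guest CR4.LA57 */
};

/* Canonical when bits 63..(bits - 1) are all zeros or all ones. */
static bool canonical_for_width(uint64_t addr, unsigned int bits)
{
	uint64_t upper = addr >> (bits - 1);

	return upper == 0 || upper == (UINT64_MAX >> (bits - 1));
}

/*
 * Model of the MSR write check: returns 0 if the value is accepted and
 * 1 if the write would be rejected (the convention used in the quoted
 * __kvm_set_msr() hunk). kvm_cap_la57 is an assumption standing in for
 * kvm_cpu_cap_has(X86_FEATURE_LA57).
 */
int set_base_msr_model(const struct vcpu_model *vcpu, bool kvm_cap_la57,
		       uint64_t data, bool host_initiated)
{
	/* Guest writes stay strict: checked against the guest's CR4.LA57. */
	if (!host_initiated &&
	    !canonical_for_width(data, vcpu->cr4_la57 ? 57 : 48))
		return 1;

	/* All writes: checked against the widest width KVM can expose. */
	if (!canonical_for_width(data, kvm_cap_la57 ? 57 : 48))
		return 1;

	return 0;
}
```

With this shape, a host-initiated write of a 57-bit canonical value is accepted
whenever the capability is present, even if the guest currently runs with
CR4.LA57 clear, while the same value written by the guest is still rejected.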

Also I need to update this for nested VMX - these msrs are also checked there based
on CR4.LA57.

Thanks for the clarification, it all makes sense. I'll send v2 with all of this
when I get back from vacation (next Wednesday).


Best regards,
	Maxim Levitsky


> 
> >                         return 1;
> 


Thread overview: 5+ messages
2024-07-25 15:01 [PATCH 0/2] Relax canonical checks on some arch msrs Maxim Levitsky
2024-07-25 15:01 ` [PATCH 1/2] KVM: x86: relax canonical checks for some x86 architectural msrs Maxim Levitsky
2024-07-26  6:50   ` Chao Gao
2024-07-26 15:08     ` mlevitsk [this message]
2024-07-25 15:01 ` [PATCH 2/2] KVM: SVM: fix emulation of msr reads/writes of MSR_FS_BASE and MSR_GS_BASE Maxim Levitsky
