From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kvm@vger.kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
x86@kernel.org, "Andy Lutomirski" <luto@kernel.org>,
"Dmitry V . Levin" <ldv@altlinux.org>,
"Masatake YAMATO" <yamato@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks
Date: Fri, 13 Jul 2018 09:46:50 -0700 [thread overview]
Message-ID: <20180713164650.GA14830@linux.intel.com> (raw)
In-Reply-To: <20180711173718.8850-1-vkuznets@redhat.com>
On Wed, Jul 11, 2018 at 07:37:18PM +0200, Vitaly Kuznetsov wrote:
> When we switched from doing rdmsr() to reading FS/GS base values from
> current->thread we completely forgot about legacy 32-bit userspaces which
> we still support in KVM (why?). task->thread.{fsbase,gsbase} are only
> synced for 64-bit processes, calling save_fsgs_for_kvm() and using
> its result from current is illegal for legacy processes.
>
> There's no ARCH_SET_FS/GS prctls for legacy applications. Base MSRs are,
> however, not always equal to zero. Intel's manual says (3.4.4 Segment
> Loading Instructions in IA-32e Mode):
>
> "In order to set up compatibility mode for an application, segment-load
> instructions (MOV to Sreg, POP Sreg) work normally in 64-bit mode. An
> entry is read from the system descriptor table (GDT or LDT) and is loaded
> in the hidden portion of the segment register.
> ...
> The hidden descriptor register fields for FS.base and GS.base are
> physically mapped to MSRs in order to load all address bits supported by
> a 64-bit implementation.
> "
>
> The issue was found by strace test suite where 32-bit ioctl_kvm_run test
> started segfaulting.
>
> Reported-by: Dmitry V. Levin <ldv@altlinux.org>
> Bisected-by: Masatake YAMATO <yamato@redhat.com>
> Fixes: 42b933b59721 ("x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> arch/x86/kvm/vmx.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 559a12b6184d..65968649b365 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2560,6 +2560,7 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> #ifdef CONFIG_X86_64
> int cpu = raw_smp_processor_id();
> + unsigned long fsbase, kernel_gsbase;
Because bikeshedding is fun, what do you think about using fs_base and
kernel_gs_base for these names? I have a series that touches this
code and also adds local variables for {FS,GS}.base and {FS,GS}.sel.
I used {fs,gs}_base and {fs,gs}_sel to be consistent with the
vmx->host_state nomenclature (the local variables are used to update
the associated vmx->host_state variables), but I'll change my patches
if you have a strong preference for omitting the underscore.
> #endif
> int i;
>
> @@ -2575,12 +2576,20 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
> vmx->host_state.gs_ldt_reload_needed = vmx->host_state.ldt_sel;
>
> #ifdef CONFIG_X86_64
> - save_fsgs_for_kvm();
> - vmx->host_state.fs_sel = current->thread.fsindex;
> - vmx->host_state.gs_sel = current->thread.gsindex;
> -#else
> - savesegment(fs, vmx->host_state.fs_sel);
> - savesegment(gs, vmx->host_state.gs_sel);
> + if (likely(is_64bit_mm(current->mm))) {
> + save_fsgs_for_kvm();
> + vmx->host_state.fs_sel = current->thread.fsindex;
> + vmx->host_state.gs_sel = current->thread.gsindex;
> + fsbase = current->thread.fsbase;
> + kernel_gsbase = current->thread.gsbase;
> + } else {
> +#endif
> + savesegment(fs, vmx->host_state.fs_sel);
> + savesegment(gs, vmx->host_state.gs_sel);
> +#ifdef CONFIG_X86_64
> + fsbase = read_msr(MSR_FS_BASE);
> + kernel_gsbase = read_msr(MSR_KERNEL_GS_BASE);
> + }
> #endif
> if (!(vmx->host_state.fs_sel & 7)) {
> vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
> @@ -2600,10 +2609,10 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
> savesegment(ds, vmx->host_state.ds_sel);
> savesegment(es, vmx->host_state.es_sel);
>
> - vmcs_writel(HOST_FS_BASE, current->thread.fsbase);
> + vmcs_writel(HOST_FS_BASE, fsbase);
> vmcs_writel(HOST_GS_BASE, cpu_kernelmode_gs_base(cpu));
>
> - vmx->msr_host_kernel_gs_base = current->thread.gsbase;
> + vmx->msr_host_kernel_gs_base = kernel_gsbase;
> if (is_long_mode(&vmx->vcpu))
> wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> #else
> --
> 2.14.4
>
next prev parent reply other threads:[~2018-07-13 16:46 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-11 17:37 [PATCH] x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks Vitaly Kuznetsov
2018-07-12 1:39 ` Wanpeng Li
2018-07-12 9:31 ` Vitaly Kuznetsov
2018-07-12 11:23 ` Vitaly Kuznetsov
2018-07-13 16:46 ` Sean Christopherson [this message]
2018-07-13 17:10 ` Vitaly Kuznetsov
2018-07-15 14:27 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180713164650.GA14830@linux.intel.com \
--to=sean.j.christopherson@intel.com \
--cc=kvm@vger.kernel.org \
--cc=ldv@altlinux.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=vkuznets@redhat.com \
--cc=x86@kernel.org \
--cc=yamato@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.