From: KarimAllah Ahmed <karahmed@amazon.com>
To: Jim Mattson <jmattson@google.com>, David Woodhouse <dwmw2@infradead.org>
Cc: KarimAllah Ahmed <karahmed@amazon.de>,
kvm list <kvm@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Asit Mallick <asit.k.mallick@intel.com>,
Arjan Van De Ven <arjan.van.de.ven@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
Andi Kleen <ak@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Tim Chen" <tim.c.chen@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Dan Williams" <dan.j.williams@intel.com>,
Jun Nakajima <jun.nakajima@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Greg KH <gregkh@linuxfoundation.org>,
Andy Lutomirski <luto@kernel.org>,
Ashok Raj <ashok.raj@intel.com>
Subject: Re: [PATCH] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
Date: Mon, 29 Jan 2018 20:10:02 +0100 [thread overview]
Message-ID: <c2b6a17f-e81a-42d4-40e8-535dee52487e@amazon.com> (raw)
In-Reply-To: <CALMp9eTvwVMF0GQKMXBhOyuHFWmJOq4BYTY4D1tEtfp=QMf-fA@mail.gmail.com>
On 01/29/2018 08:04 PM, Jim Mattson wrote:
> Can I assume you'll send out a new version with the fixes?
Yes, I am currently doing some tests and once I am done I will send a
new round.
... and the typo is already fixed in 'ibpb-wip' :)
>
> On Mon, Jan 29, 2018 at 11:01 AM, David Woodhouse <dwmw2@infradead.org> wrote:
>>
>> (Top-posting; sorry.)
>>
>> Much of that is already fixed during our day, in
>> http://git.infradead.org/linux-retpoline.git/shortlog/refs/heads/ibpb
>>
>> I forgot to fix up the wrong-MSR typo though, and we do still need to address reset.
>>
>> On Mon, 2018-01-29 at 10:43 -0800, Jim Mattson wrote:
>>> On Sun, Jan 28, 2018 at 11:29 AM, KarimAllah Ahmed wrote:
>>>>
>>>> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests
>>>> that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a
>>>> retpoline+IBPB based approach.
>>>>
>>>> To avoid the overhead of atomically saving and restoring the MSR_IA32_SPEC_CTRL
>>>> for guests that do not actually use the MSR, only add_atomic_switch_msr when a
>>>> non-zero is written to it.
>>>>
>>>> Cc: Asit Mallick <asit.k.mallick@intel.com>
>>>> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
>>>> Cc: Dave Hansen <dave.hansen@intel.com>
>>>> Cc: Andi Kleen <ak@linux.intel.com>
>>>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>>>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>>>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>>> Cc: Andy Lutomirski <luto@kernel.org>
>>>> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>>>> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>>>> ---
>>>> arch/x86/kvm/cpuid.c | 4 +++-
>>>> arch/x86/kvm/cpuid.h | 1 +
>>>> arch/x86/kvm/vmx.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 67 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>>> index 0099e10..dc78095 100644
>>>> --- a/arch/x86/kvm/cpuid.c
>>>> +++ b/arch/x86/kvm/cpuid.c
>>>> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>>>> /* These are scattered features in cpufeatures.h. */
>>>> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
>>>> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
>>>> +#define KVM_CPUID_BIT_SPEC_CTRL 26
>>>> #define KF(x) bit(KVM_CPUID_BIT_##x)
>>>>
>>>> int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>>>> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>>
>>>> /* cpuid 7.0.edx*/
>>>> const u32 kvm_cpuid_7_0_edx_x86_features =
>>>> - KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
>>>> + KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>>>> + (boot_cpu_has(X86_FEATURE_SPEC_CTRL) ? KF(SPEC_CTRL) : 0);
>>> Isn't 'boot_cpu_has()' superflous here? And aren't there two bits to
>>> pass through for existing CPUs (26 and 27)?
>>>
>>>>
>>>>
>>>> /* all calls to cpuid_count() should be made on the same cpu */
>>>> get_cpu();
>>>> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
>>>> index cdc70a3..dcfe227 100644
>>>> --- a/arch/x86/kvm/cpuid.h
>>>> +++ b/arch/x86/kvm/cpuid.h
>>>> @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>>>> [CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
>>>> [CPUID_7_ECX] = { 7, 0, CPUID_ECX},
>>>> [CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
>>>> + [CPUID_7_EDX] = { 7, 0, CPUID_EDX},
>>>> };
>>>>
>>>> static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index aa8638a..1b743a0 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -920,6 +920,9 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>>>> static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>>>> u16 error_code);
>>>> static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
>>>> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>>>> + u32 msr, int type);
>>>> +
>>>>
>>>> static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>>>> static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>>>> @@ -2007,6 +2010,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>>>> m->host[i].value = host_val;
>>>> }
>>>>
>>>> +/* do not touch guest_val and host_val if the msr is not found */
>>>> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>>>> + u64 *guest_val, u64 *host_val)
>>>> +{
>>>> + unsigned i;
>>>> + struct msr_autoload *m = &vmx->msr_autoload;
>>>> +
>>>> + for (i = 0; i < m->nr; ++i)
>>>> + if (m->guest[i].index == msr)
>>>> + break;
>>>> +
>>>> + if (i == m->nr)
>>>> + return 1;
>>>> +
>>>> + if (guest_val)
>>>> + *guest_val = m->guest[i].value;
>>>> + if (host_val)
>>>> + *host_val = m->host[i].value;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>>>> {
>>>> u64 guest_efer = vmx->vcpu.arch.efer;
>>>> @@ -3203,7 +3228,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>>>> */
>>>> static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>> {
>>>> + u64 spec_ctrl = 0;
>>>> struct shared_msr_entry *msr;
>>>> + struct vcpu_vmx *vmx = to_vmx(vcpu);
>>>>
>>>> switch (msr_info->index) {
>>>> #ifdef CONFIG_X86_64
>>>> @@ -3223,6 +3250,19 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>> case MSR_IA32_TSC:
>>>> msr_info->data = guest_read_tsc(vcpu);
>>>> break;
>>>> + case MSR_IA32_SPEC_CTRL:
>>>> + if (!msr_info->host_initiated &&
>>>> + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>>> Shouldn't this conjunct be:
>>> !(guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
>>> guest_cpuid_has(vcpu, X86_FEATURE_STIBP))?
>>>
>>>>
>>>> + return 1;
>>> What if !boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
>>> !boot_cpu_has(X86_FEATURE_STIBP)? That should also return 1, I think.
>>>
>>>>
>>>> +
>>>> + /*
>>>> + * If the MSR is not in the atomic list yet, then it was never
>>>> + * written to. So the MSR value will be '0'.
>>>> + */
>>>> + read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
>>> Why not just add msr_ia32_spec_ctrl to struct vcpu_vmx, so that you
>>> don't have to search the atomic switch list?
>>>
>>>>
>>>> +
>>>> + msr_info->data = spec_ctrl;
>>>> + break;
>>>> case MSR_IA32_SYSENTER_CS:
>>>> msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>>>> break;
>>>> @@ -3289,6 +3329,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>> int ret = 0;
>>>> u32 msr_index = msr_info->index;
>>>> u64 data = msr_info->data;
>>>> + unsigned long *msr_bitmap;
>>>> +
>>>> + /*
>>>> + * IBRS is not used (yet) to protect the host. Once it does, this
>>>> + * variable needs to be a bit smarter.
>>>> + */
>>>> + u64 host_spec_ctrl = 0;
>>>>
>>>> switch (msr_index) {
>>>> case MSR_EFER:
>>>> @@ -3330,6 +3377,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>> case MSR_IA32_TSC:
>>>> kvm_write_tsc(vcpu, msr_info);
>>>> break;
>>>> + case MSR_IA32_SPEC_CTRL:
>>>> + if (!msr_info->host_initiated &&
>>>> + !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>>>> + return 1;
>>> This looks incomplete. As above, what if
>>> !boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
>>> !boot_cpu_has(X86_FEATURE_STIBP)?
>>> If the host doesn't support MSR_IA32_SPEC_CTRL, you'll get a VMX-abort
>>> on loading the host MSRs from the VM-exit MSR load list.
>>>
>>> Also, what if the value being written is illegal?
>>>
>>> /*
>>> * Processors that support IBRS but not STIBP
>>> * (CPUID.(EAX=07H, ECX=0):EDX[27:26] = 01b) will
>>> * ignore attempts to set STIBP instead of causing an
>>> * exception due to setting that reserved bit.
>>> */
>>> if ((data & ~(u64)(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP)) ||
>>> ((data & SPEC_CTRL_IBRS) &&
>>> !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
>>> return 1;
>>>
>>>>
>>>> +
>>>> + /*
>>>> + * Now we know that the guest is actually using the MSR, so
>>>> + * atomically load and save the SPEC_CTRL MSR and pass it
>>>> + * through to the guest.
>>>> + */
>>>> + add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data,
>>>> + host_spec_ctrl);
>>>> + msr_bitmap = vmx->vmcs01.msr_bitmap;
>>>> + vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>>> I assume you mean MSR_IA32_SPEC_CTRL rather than MSR_FS_BASE.
>>>
>>> Also, what if the host and the guest support a different set of bits
>>> in MSR_IA32_SPEC_CTRL, due to a userspace modification of the guest's
>>> CPUID info?
>>>
>>>>
>>>> +
>>>> + break;
>>>> case MSR_IA32_CR_PAT:
>>>> if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>>>> if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
>>>> --
>>>> 2.7.4
>>>>
>>> Where do you preserve the guest's MSR_IA32_SPEC_CTRL value on VM-exit,
>>> if the guest has been given permission to write the MSR?
>>>
>>> You also have to clear the guest's MSR_IA32_SPEC_CTRL on
>>> vmx_vcpu_reset, don't you?
>>>
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
next prev parent reply other threads:[~2018-01-29 19:11 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-28 19:29 [PATCH] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL KarimAllah Ahmed
2018-01-28 20:21 ` Konrad Rzeszutek Wilk
2018-01-28 20:39 ` David Woodhouse
2018-01-28 20:40 ` Andy Lutomirski
2018-01-28 20:44 ` David Woodhouse
2018-01-28 20:53 ` Andy Lutomirski
2018-01-28 20:56 ` David Woodhouse
2018-01-28 21:41 ` Van De Ven, Arjan
2018-01-28 21:47 ` David Woodhouse
2018-01-29 1:06 ` KarimAllah Ahmed
2018-01-29 18:43 ` Jim Mattson
2018-01-29 19:01 ` David Woodhouse
2018-01-29 19:04 ` Jim Mattson
2018-01-29 19:10 ` KarimAllah Ahmed [this message]
2018-01-29 19:16 ` Konrad Rzeszutek Wilk
2018-01-29 19:27 ` Jim Mattson
-- strict thread matches above, loose matches on Subject: below --
2018-01-29 0:39 Liran Alon
2018-01-29 8:46 ` David Woodhouse
2018-01-29 9:43 ` KarimAllah Ahmed
2018-01-29 10:37 ` David Woodhouse
2018-01-29 10:55 ` Paolo Bonzini
2018-01-29 17:27 ` Konrad Rzeszutek Wilk
2018-01-29 10:47 ` Paolo Bonzini
2018-01-29 17:31 ` Konrad Rzeszutek Wilk
2018-01-29 21:49 ` Daniel Kiper
2018-01-29 23:01 ` Jim Mattson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c2b6a17f-e81a-42d4-40e8-535dee52487e@amazon.com \
--to=karahmed@amazon.com \
--cc=aarcange@redhat.com \
--cc=ak@linux.intel.com \
--cc=arjan.van.de.ven@intel.com \
--cc=ashok.raj@intel.com \
--cc=asit.k.mallick@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@intel.com \
--cc=dwmw2@infradead.org \
--cc=gregkh@linuxfoundation.org \
--cc=jmattson@google.com \
--cc=jun.nakajima@intel.com \
--cc=karahmed@amazon.de \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=tim.c.chen@linux.intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox