From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.9 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDD86C32771 for ; Wed, 15 Jan 2020 22:52:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 93D312081E for ; Wed, 15 Jan 2020 22:52:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="khW1yH2T" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729868AbgAOWwA (ORCPT ); Wed, 15 Jan 2020 17:52:00 -0500 Received: from mail-pf1-f194.google.com ([209.85.210.194]:43394 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729157AbgAOWwA (ORCPT ); Wed, 15 Jan 2020 17:52:00 -0500 Received: by mail-pf1-f194.google.com with SMTP id x6so9199921pfo.10 for ; Wed, 15 Jan 2020 14:52:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=Bcb8NrqLvdb4N3pyKtd/Z3CmcoX5YLqo/yeEHZr9n2Q=; b=khW1yH2T4r3C5bFftOqzSLh2dJnj0JCekZwuNbf6Um151wzyTmrLGT2btH9iaQt8eS NRBMmUZ0g7dXjk3CfcU2RtIB2F2fbsKiYDvi+ZNXwHDB7Olsg3B1t4rDX3vK5JOkc2zO j2ejfA597NwVW9i9+pHRsZmipMlLK3CNtX99vFDzGtiF9baW9ax0/LhQlqCJOTeNH9NS RDzofwAgI8Rajb+jSIxX3deyiQ15wmiwqgJZ3JtiD4ZKNDiMDWDmU2Cfa5T+qP17RNMg D8CyfS6x9T+MVXIDCwLLJNpPYB/jyzhrMMJdHQfo3YlvVSliGoDDduIVtTZESSVVImrE SJyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=Bcb8NrqLvdb4N3pyKtd/Z3CmcoX5YLqo/yeEHZr9n2Q=; b=FOEYJeYy9Z/Dp0K4i1q39GRjkOXPUiI4Fl2cucyY3LPYg19ImYwbg5K0n7Pv4Aq9W8 qDKGYd8zKjOxrfbDN2cxaHeuIkWGjgd8KlkrrofwYHnP7mV+Z79slziGCFfxc2/WxmSo Cp7P9lr0itMwrsPdS1Wk80YZgemoIuqRcfrIGWtuYElEY2tmaUmW34SSd0hzYXW5vJgC qSgyIaoDKCgCGd6L8YdIeKr0WYX7+zlfLEcUqtaK5I/b5p/ITDTrokZ7F6a03WidqNU9 jfXtV86F2TYTJGSneazfdwYY3BAcDpeMUVPtj/n/S3NyndvLTE5N7yr/rjzet4NXKznu wFvA== X-Gm-Message-State: APjAAAU0GpbajaBLW+nG3DbAi72uswaGitfTNhHUPAqEgWPEo4bNe3oc Q8xzFF1EFp9QlO/FvdLXRs09bw== X-Google-Smtp-Source: APXvYqxitOKpSwBRtvhIsru0DXUW9vedirmg5d3W8U9qjxs3nHJ6V6Ex+HR3t2SslgIiIWVVHTahmg== X-Received: by 2002:a62:2a49:: with SMTP id q70mr32758231pfq.170.1579128719465; Wed, 15 Jan 2020 14:51:59 -0800 (PST) Received: from google.com ([2620:15c:100:202:d78:d09d:ec00:5fa7]) by smtp.gmail.com with ESMTPSA id z4sm22576162pfn.42.2020.01.15.14.51.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Jan 2020 14:51:58 -0800 (PST) Date: Wed, 15 Jan 2020 14:51:54 -0800 From: Oliver Upton To: Sean Christopherson Cc: kvm@vger.kernel.org, Paolo Bonzini , Peter Shier , Jim Mattson Subject: Re: [PATCH 2/3] KVM: x86: Emulate MTF when performing instruction emulation Message-ID: <20200115225154.GA63061@google.com> References: <20200113221053.22053-1-oupton@google.com> <20200113221053.22053-3-oupton@google.com> <20200114000517.GC14928@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200114000517.GC14928@linux.intel.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Mon, Jan 13, 2020 at 04:05:18PM -0800, Sean Christopherson wrote: > On Mon, Jan 13, 2020 at 02:10:52PM -0800, Oliver Upton wrote: > > Since commit 5f3d45e7f282 ("kvm/x86: add support for > > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap > > flag processor-based execution control for its L2 guest. KVM simply > > forwards any MTF VM-exits to the L1 guest, which works for normal > > instruction execution. > > > > However, when KVM needs to emulate an instruction on the behalf of an L2 > > guest, the monitor trap flag is not emulated. Add the necessary logic to > > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon > > instruction emulation for L2. > > This only fixes the flows where KVM is doing "fast" emulation, or whatever > you want to call it. The full emulation paths are still missing proper MTF > support. Dunno if it makes sense to fix it all at once, e.g. if the full > emulation is stupid complex, but at a minimum the gaps should be called out > with a brief explanation of why they're not being addressed. Good point. I'm instead inclined to call the hook to emulation_complete (will rename as appropriate) from kvm_vcpu_do_singlestep(), as it appears this is how x86_emulate_instruction processes the trap flag. Need to take a deeper look + test to ensure this change will fix MTF for full instruction emulation. > > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG") > > > > Signed-off-by: Oliver Upton > > Reviewed-by: Peter Shier > > Reviewed-by: Jim Mattson > > --- > > arch/x86/include/asm/kvm_host.h | 1 + > > arch/x86/kvm/svm.c | 5 +++++ > > arch/x86/kvm/vmx/nested.c | 2 +- > > arch/x86/kvm/vmx/nested.h | 5 +++++ > > arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++ > > arch/x86/kvm/x86.c | 6 ++++++ > > 6 files changed, 37 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index 4739ca11885d..89dcdc7201ae 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -1092,6 +1092,7 @@ struct kvm_x86_ops { > > void (*run)(struct kvm_vcpu *vcpu); > > int (*handle_exit)(struct kvm_vcpu *vcpu); > > int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu); > > + void (*emulation_complete)(struct kvm_vcpu *vcpu); > > void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask); > > u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu); > > void (*patch_hypercall)(struct kvm_vcpu *vcpu, > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > > index 16ded16af997..f21eec4443d5 100644 > > --- a/arch/x86/kvm/svm.c > > +++ b/arch/x86/kvm/svm.c > > @@ -802,6 +802,10 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu) > > return 1; > > } > > > > +static void svm_emulation_complete(struct kvm_vcpu *vcpu) > > +{ > > +} > > + > > static void svm_queue_exception(struct kvm_vcpu *vcpu) > > { > > struct vcpu_svm *svm = to_svm(vcpu); > > @@ -7320,6 +7324,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = { > > .run = svm_vcpu_run, > > .handle_exit = handle_exit, > > .skip_emulated_instruction = skip_emulated_instruction, > > + .emulation_complete = svm_emulation_complete, > > .set_interrupt_shadow = svm_set_interrupt_shadow, > > .get_interrupt_shadow = svm_get_interrupt_shadow, > > .patch_hypercall = svm_patch_hypercall, > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > index 4aea7d304beb..ee26f2d10a09 100644 > > --- a/arch/x86/kvm/vmx/nested.c > > +++ b/arch/x86/kvm/vmx/nested.c > > @@ -5578,7 +5578,7 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason) > > case EXIT_REASON_MWAIT_INSTRUCTION: > > return nested_cpu_has(vmcs12, CPU_BASED_MWAIT_EXITING); > > case EXIT_REASON_MONITOR_TRAP_FLAG: > > - return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG); > > + return nested_cpu_has_mtf(vmcs12); > > case EXIT_REASON_MONITOR_INSTRUCTION: > > return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_EXITING); > > case EXIT_REASON_PAUSE_INSTRUCTION: > > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h > > index fc874d4ead0f..901d2745bc93 100644 > > --- a/arch/x86/kvm/vmx/nested.h > > +++ b/arch/x86/kvm/vmx/nested.h > > @@ -238,6 +238,11 @@ static inline bool nested_cpu_has_save_preemption_timer(struct vmcs12 *vmcs12) > > VM_EXIT_SAVE_VMX_PREEMPTION_TIMER; > > } > > > > +static inline bool nested_cpu_has_mtf(struct vmcs12 *vmcs12) > > +{ > > + return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG); > > +} > > + > > /* > > * In nested virtualization, check if L1 asked to exit on external interrupts. > > * For most existing hypervisors, this will always return true. > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index 148696199c88..8d3b693c3d3a 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -1595,6 +1595,24 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu) > > return 1; > > } > > > > +static void vmx_emulation_complete(struct kvm_vcpu *vcpu) > > +{ > > + if (!(is_guest_mode(vcpu) && > > + nested_cpu_has_mtf(get_vmcs12(vcpu)))) > > + return; > > + > > + /* > > + * Per the SDM, MTF takes priority over debug-trap instructions. As > > Except for T-bit #DBs. Thankfully KVM doesn't emulate them. :-D > > > + * instruction emulation is completed (i.e. at the instruction > > + * boundary), any #DB exception must be a trap. Emulate an MTF VM-exit > > + * into L1 should there be a debug-trap exception pending or no > > + * exception pending. > > + */ > > + if (!vcpu->arch.exception.pending || > > + vcpu->arch.exception.nr == DB_VECTOR) > > + nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0); > > Doing a direct nested_vmx_vmexit() in kvm_skip_emulated_instruction() feels > like a hack. It might work for now, i.e. while only the "fast" emulation > paths deal with MTF, but I have a feeling that we'll want MTF handling to > live in vmx_check_nested_events(), or possibly in a custom callback that is > invoked from similar call sites. I'm convinced now that perhaps kvm_vcpu_do_singlestep() is the right place to call a hook that determines if an MTF VM-exit is pending, as x86_emulate_instruction() also uses this function to handle the trap flag. > Actually, I thought of a case it breaks. If HLT is passed through from > L1->L2, this approach will process the VM-Exit prior to handling the HLT > in L0. KVM will record the wrong activity state in vmcs12 ("active" > instead of "halted") and then incorrectly put L1 into KVM_MP_STATE_HALTED. Great catch! I'm doing some experimentation with HLT to figure out how to correctly handle this. > Another case, which may or may not be possible, is if INIT is recognized > on the same instruction, in which case it takes priority over MTF. SMI > might also be an issue. I've changed this locally to defer delivery of the MTF VM-exit to vmx_check_nested_events(). Tests look fine, seems this would resolve INIT signal priority. > > +} > > + > > static void vmx_clear_hlt(struct kvm_vcpu *vcpu) > > { > > /* > > @@ -7831,6 +7849,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { > > .run = vmx_vcpu_run, > > .handle_exit = vmx_handle_exit, > > .skip_emulated_instruction = skip_emulated_instruction, > > + .emulation_complete = vmx_emulation_complete, > > .set_interrupt_shadow = vmx_set_interrupt_shadow, > > .get_interrupt_shadow = vmx_get_interrupt_shadow, > > .patch_hypercall = vmx_patch_hypercall, > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index c14174c033e4..d3af7a8a3c4b 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -6546,6 +6546,12 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu) > > */ > > if (unlikely(rflags & X86_EFLAGS_TF)) > > r = kvm_vcpu_do_singlestep(vcpu); > > + /* > > + * Allow for vendor-specific handling of completed emulation before > > + * returning. > > + */ > > + if (r) > > + kvm_x86_ops->emulation_complete(vcpu); > > return r; > > } > > EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction); > > -- > > 2.25.0.rc1.283.g88dfdc4193-goog > >