From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E99A6C2BA19 for ; Sat, 18 Apr 2020 04:21:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BE4FF221EA for ; Sat, 18 Apr 2020 04:21:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725796AbgDREVJ (ORCPT ); Sat, 18 Apr 2020 00:21:09 -0400 Received: from mga09.intel.com ([134.134.136.24]:28927 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725440AbgDREVJ (ORCPT ); Sat, 18 Apr 2020 00:21:09 -0400 IronPort-SDR: CyjN/EZd1nSOUolKGpZkQsBAbPUZyVsIAKSIWCr6lc5vbW2up3TSy2Y3m85bhIlkPMPLTyc453 fhD81+TwI6Ug== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2020 21:21:08 -0700 IronPort-SDR: EeDMeTxG/9QI+Cx3T0w0VnOhBXrELvbzxnj47ajg2x3hW7cF6VCQpeOBpfrAqFH3smV9hpU/fp moUUP01Wus9Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,397,1580803200"; d="scan'208";a="257786037" Received: from sjchrist-coffee.jf.intel.com (HELO linux.intel.com) ([10.54.74.202]) by orsmga006.jf.intel.com with ESMTP; 17 Apr 2020 21:21:08 -0700 Date: Fri, 17 Apr 2020 21:21:08 -0700 From: Sean Christopherson To: Jim Mattson Cc: kvm list , Oliver Upton , Peter Shier Subject: Re: [PATCH 2/2] kvm: nVMX: Single-step traps trump expired VMX-preemption timer Message-ID: <20200418042108.GF15609@linux.intel.com> References: <20200414000946.47396-1-jmattson@google.com> <20200414000946.47396-2-jmattson@google.com> <20200414031705.GP21204@linux.intel.com> <20200415001212.GA12547@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Wed, Apr 15, 2020 at 04:33:31PM -0700, Jim Mattson wrote: > On Tue, Apr 14, 2020 at 5:12 PM Sean Christopherson > wrote: > > > > On Tue, Apr 14, 2020 at 09:47:53AM -0700, Jim Mattson wrote: > > > Regarding -EBUSY, I'm in complete agreement. However, I'm not sure > > > what the potential confusion is regarding the event. Are you > > > suggesting that one might think that we have a #DB to deliver to L1 > > > while we're in guest mode? IIRC, that can happen under SVM, but I > > > don't believe it can happen under VMX. > > > > The potential confusion is that vcpu->arch.exception.pending was already > > checked, twice. It makes one wonder why it needs to be checked a third > > time. And actually, I think that's probably a good indicator that singling > > out single-step #DB isn't the correct fix, it just happens to be the only > > case that's been encountered thus far, e.g. a #PF when fetching the instr > > for emulation should also get priority over the preemption timer. On real > > hardware, expiration of the preemption timer while vectoring a #PF wouldn't > > wouldn't get recognized until the next instruction boundary, i.e. at the > > start of the first instruction of the #PF handler. Dropping the #PF isn't > > a problem in most cases, because unlike the single-step #DB, it will be > > re-encountered when L1 resumes L2. But, dropping the #PF is still wrong. > > Yes, it's wrong in the abstract, but with respect to faults and the > VMX-preemption timer expiration, is there any way for either L1 or L2 > to *know* that the virtual CPU has done something wrong? I don't think so? But how is that relevant, i.e. if we can fix KVM instead of fudging the result, why wouldn't we fix KVM? > Isn't it generally true that if you have an exception queued when you > transition from L2 to L1, then you've done something wrong? I wonder > if the call to kvm_clear_exception_queue() in prepare_vmcs12() just > serves to sweep a whole collection of problems under the rug. More than likely, yes. > > In general, interception of an event doesn't change the priority of events, > > e.g. INTR shouldn't get priority over NMI just because if L1 wants to > > intercept INTR but not NMI. > > Yes, but that's a different problem altogether. But isn't the fix the same? Stop processing events if a higher priority event is pending, regardless of whether the event exits to L1. > > TL;DR: I think the fix should instead be: > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > index c868c64770e0..042d7a9037be 100644 > > --- a/arch/x86/kvm/vmx/nested.c > > +++ b/arch/x86/kvm/vmx/nested.c > > @@ -3724,9 +3724,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > /* > > * Process any exceptions that are not debug traps before MTF. > > */ > > - if (vcpu->arch.exception.pending && > > - !vmx_pending_dbg_trap(vcpu) && > > - nested_vmx_check_exception(vcpu, &exit_qual)) { > > + if (vcpu->arch.exception.pending && !vmx_pending_dbg_trap(vcpu)) > > + if (!nested_vmx_check_exception(vcpu, &exit_qual)) > > + return 0; > > + > > if (block_nested_events) > > return -EBUSY; > > nested_vmx_inject_exception_vmexit(vcpu, exit_qual); > > @@ -3741,8 +3742,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > return 0; > > } > > > > - if (vcpu->arch.exception.pending && > > - nested_vmx_check_exception(vcpu, &exit_qual)) { > > + if (vcpu->arch.exception.pending) { > > + if (!nested_vmx_check_exception(vcpu, &exit_qual)) > > + return 0; > > + > > if (block_nested_events) > > return -EBUSY; > > nested_vmx_inject_exception_vmexit(vcpu, exit_qual); > > @@ -3757,7 +3760,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > return 0; > > } > > > > - if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) { > > + if (vcpu->arch.nmi_pending) { > > + if (!nested_exit_on_nmi(vcpu)) > > + return 0; > > + > > if (block_nested_events) > > return -EBUSY; > > nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, > > @@ -3772,7 +3778,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > return 0; > > } > > > > - if (kvm_cpu_has_interrupt(vcpu) && nested_exit_on_intr(vcpu)) { > > + if (kvm_cpu_has_interrupt(vcpu) { > > + if (!nested_exit_on_intr(vcpu)) > > + return 0; > > + > > if (block_nested_events) > > return -EBUSY; > > nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0); > >