Date: Wed, 6 Apr 2022 20:52:31 +0000
From: Sean Christopherson
To: "Maciej S. Szmigiero"
Cc: Maxim Levitsky, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction
References: <20220402010903.727604-6-seanjc@google.com>
    <7caee33a-da0f-00be-3195-82c3d1cd4cb4@maciej.szmigiero.name>
    <5135b502-ce2e-babb-7812-4d4c431a5252@maciej.szmigiero.name>

On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
> On 6.04.2022 21:48, Sean Christopherson wrote:
> > On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
> > > On 6.04.2022 19:10, Sean Christopherson wrote:
> > > > On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
> > > And what if it's L0 that is trying to inject a NMI into L2?
> > > In this case is_guest_mode() is true, but the full NMI injection machinery
> > > should be used.
> > 
> > Gah, you're right, I got misled by a benign bug in nested_vmx_l1_wants_exit() and
> > was thinking that NMIs always exit.  The "L1 wants" part should be conditioned on
> > NMI exiting being enabled.  It's benign because KVM always wants "real" NMIs, and
> > so the path is never encountered.
> > 
> > @@ -5980,7 +6005,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
> >         switch ((u16)exit_reason.basic) {
> >         case EXIT_REASON_EXCEPTION_NMI:
> >                 intr_info = vmx_get_intr_info(vcpu);
> > -               if (is_nmi(intr_info))
> > +               if (is_nmi(intr_info) && nested_cpu_has_nmi_exiting(vmcs12))
> >                         return true;
> >                 else if (is_page_fault(intr_info))
> >                         return true;
> > 
> I guess you mean "benign" when having KVM as L1, since other hypervisors may
> let their L2s handle NMIs themselves.

No, this one's truly benign.  The nVMX exit processing is:

	if (nested_vmx_l0_wants_exit())
		handle in L0 / KVM;

	if (nested_vmx_l1_wants_exit())
		handle in L1

	handle in L0 / KVM

Since this is for actual hardware NMIs, the first "L0 wants" check always returns
true for NMIs, so the fact that KVM screws up L1's wants is a non-issue.

> > > It is also incorrect to block L1 -> L2 NMI injection because either L1
> > > or L2 is currently under NMI blocking: the first case is obvious,
> > > the second because it's L1 that is supposed to take care of proper NMI
> > > blocking for L2 when injecting an NMI there.
> > 
> > Yep, but I don't think there's a bug here.  At least not for nVMX.
> 
> I agree this scenario should currently work (including on nSVM) - mentioned
> it just as a constraint on solution space.
> > > > > With the code in my previous patch set I planned to use
> > > > > exit_during_event_injection() to detect such case, but if we implement
> > > > > VMCB12 EVENTINJ parsing we can simply add a flag that the relevant event
> > > > > comes from L1, so its normal injection side-effects should be skipped.
> > > > 
> > > > Do we still need a flag based on the above?  Honest question... I've been
> > > > staring at all this for the better part of an hour and may have lost track
> > > > of things.
> > > 
> > > If checking just is_guest_mode() is not enough due to reasons I described
> > > above then we need to somehow determine in the NMI / IRQ injection handler
> > > whether the event to be injected into L2 comes from L0 or L1.
> > > For this (assuming we do VMCB12 EVENTINJ parsing) I think we need an extra flag.
> > 
> > Yes :-(  And I believe the extra flag would need to be handled by
> > KVM_{G,S}ET_VCPU_EVENTS.
> 
> Another option for saving and restoring a VM would be to add it to
> KVM_{GET,SET}_NESTED_STATE somewhere (maybe as a part of the saved VMCB12
> control area?).

Ooh.  What if we keep nested_run_pending=true until the injection completes?  Then
we don't even need an extra flag because nested_run_pending effectively says that
any and all injected events are for L1=>L2.  In KVM_GET_NESTED_STATE, shove the
to-be-injected event into the normal vmc*12 injection field, and ignore all
to-be-injected events in KVM_GET_VCPU_EVENTS if nested_run_pending=true.

That should work even for migrating to an older KVM, as keeping nested_run_pending
will cause the target to reprocess the event injection as if it were from nested
VM-Enter, which it technically is.

We could probably get away with completely dropping the intermediate event as the
vmc*12 should still have the original event, but that technically could result in
architecturally incorrect behavior, e.g. if vectoring up to the point of
interception sets A/D bits in the guest.
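To make the nested_run_pending idea concrete, here is a minimal user-space model
of the proposed save/restore semantics.  This is illustrative only, not actual
KVM code: the struct layout and function names are made up, standing in for the
real vcpu state and the KVM_GET_VCPU_EVENTS / KVM_GET_NESTED_STATE paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for KVM's real structures. */
struct event {
	bool	valid;
	uint8_t	vector;
};

struct vcpu {
	bool		nested_run_pending;	/* stays true until injection completes */
	struct event	inject;			/* event queued for L1=>L2 injection */
	struct event	vmcs12_entry_intr;	/* vmc*12 event-injection field */
};

/* KVM_GET_VCPU_EVENTS: ignore events still owned by the L1=>L2 VM-Enter. */
static struct event get_vcpu_events(const struct vcpu *v)
{
	if (v->nested_run_pending)
		return (struct event){ .valid = false };
	return v->inject;
}

/*
 * KVM_GET_NESTED_STATE: report the event via the vmc*12 injection field so
 * that the target reprocesses it as part of nested VM-Enter, which also
 * keeps migration to an older KVM working.
 */
static struct event get_nested_state_event(const struct vcpu *v)
{
	if (v->nested_run_pending && v->inject.valid)
		return v->inject;	/* shove back into vmc*12 */
	return v->vmcs12_entry_intr;
}
```

With this split, while nested_run_pending is set the event surfaces exactly once,
through the nested-state path; once injection completes it surfaces through the
normal vcpu-events path instead.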