From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36475C71145 for ; Thu, 24 Aug 2023 14:23:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241549AbjHXOXW (ORCPT ); Thu, 24 Aug 2023 10:23:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236682AbjHXOWx (ORCPT ); Thu, 24 Aug 2023 10:22:53 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6065511F for ; Thu, 24 Aug 2023 07:22:51 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-59285f1e267so10298887b3.0 for ; Thu, 24 Aug 2023 07:22:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692886970; x=1693491770; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=LUqkh+MsV5Vrt4MTvDlgnnSg6OPpYXOoL7cfHlTtb8M=; b=UWvBCdin9fzOcW//Ix69NYEO26eQR2dYIfmBO7ycAwvPBVNQzetffaZ9xtH1qn+6Wj SX7zr6smqNFsXI687Ckf/On8Uc2/qRwaEQycIbFakkEKjEAOAf3SsJ7DPYDKNawW+Tbs TuL71VlDi5BNXVpf6Nkqxtn1knirtV/yhYEU5lqSEhCdAgsXxsxiuuDyAU0cRfzmatd0 pEQFFXCyBH64J6dnRDGZJW5zfzGY1TShoOCir92rQrhFmZLbKILQXSF2OqVCjrFYoxJv XdBNagJOV96y0wP+bNRDwE23pnd6xpZPFrYOewULejQ+4wdJdza9ZEHvpaC74Qk+TmGL AM+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692886970; x=1693491770; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=LUqkh+MsV5Vrt4MTvDlgnnSg6OPpYXOoL7cfHlTtb8M=; b=gMOGJerIVO+pzSexAnYmu6uMsR15Z7y+ylsSFL/EyC/aH5BEKGDcH0Vm50D3IQBft0 k5TW3l6/+ZWKmGuGcDnk3Rz7gZXWF/xB41h6HD72toDOYaFdW6n01AeGBHe571zvP+7I +65uNjqQ8B+6X4nMtPvyHDKo1SV6GGiqV13o6BudlnMjr37ZMBFyVgXGx6WedbMhAHx0 qzwJkU5PSI7oZxLGTTKYtUTFtoN2fvnWULEL4sJi7EKCH5p4HGkWUIqqZzFa0R8o/w/j lRelfRx7CfNQ74KPrfwNmII8ZE3B8WPpgbRb0C6Zj4+PxyakVZMKsaFnyOMIqC67Ydlf xb9g== X-Gm-Message-State: AOJu0YySdMQr9DtRgXOprpP7Jo+88YdAE01N/xHfJtBr12CcEpyGoTQg luJJ4uoR3ef3Rn3FDtBiZulHKoe3Dio= X-Google-Smtp-Source: AGHT+IEcl3qGP1JrvIn63TSnfzzvUSShyoMV4j6547DJrdRfSZAgVP7pWcLKEyUAru3TUAMu0cVhLEDsybQ= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:af24:0:b0:586:e2b5:f364 with SMTP id n36-20020a81af24000000b00586e2b5f364mr320156ywh.4.1692886970397; Thu, 24 Aug 2023 07:22:50 -0700 (PDT) Date: Thu, 24 Aug 2023 14:22:48 +0000 In-Reply-To: <174aa0da-0b05-a2dc-7884-4f7b57abcc37@amd.com> Mime-Version: 1.0 References: <67fba65c-ba2a-4681-a9bc-2a6e8f0bcb92.chenpeihong.cph@alibaba-inc.com> <174aa0da-0b05-a2dc-7884-4f7b57abcc37@amd.com> Message-ID: Subject: Re: Question about AMD SVM's virtual NMI support in Linux kernel mainline From: Sean Christopherson To: Santosh Shukla Cc: "=?utf-8?B?6ZmI5Z+56bi/KOS5mOm4vyk=?=" , mlevitsk , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org +kvm and lkml, didn't realize they weren't Cc'd in the original mail. On Thu, Aug 24, 2023, Santosh Shukla wrote: > Hi Sean, >=20 > On 8/23/2023 8:33 PM, Sean Christopherson wrote: > > On Fri, Aug 18, 2023, =E9=99=88=E5=9F=B9=E9=B8=BF(=E4=B9=98=E9=B8=BF) w= rote: > >> According to the results, I found that in the case of concurrent NMIs,= some > >> NMIs are still injected through eventinj instead of vNMI. > >=20 > > Key word is "some", having two NMIs arrive "simultaneously" is uncommon= . In quotes > > because if a vCPU is scheduled out or otherwise delayed, two NMIs can b= e recognized > > by KVM at the same time even if there was a sizeable delay between when= they were > > sent. > >=20 > >> Based on the above explanations, I summarize my questions as follows: > >> 1. According to the specification of AMD SVM vNMI, with vNMI enabled, = will > >> some NMIs be injected through eventinj under the condition of concurre= nt > >> NMIs? > >=20 > > Yes. > >=20 > >> 2. If yes, what is the role of vNMI? Is it just an addition to eventin= j? What > >> benefits is it designed to expect? Is there any benchmark data support= ? > >=20 > > Manually injecting NMIs isn't problematic from a performance perspectiv= e. KVM > > takes control of the vCPU, i.e. forces a VM-Exit, to pend a virtual NMI= , so there's > > no extra VM-Exit. > >=20 > > The value added by vNMI support is that KVM doesn't need to manually tr= ack/detect > > when NMIs become unblocked in the guest. SVM doesn't provide a hardwar= e-supported > > NMI-window exiting, so KVM has to intercept _and_ single-step IRET, whi= ch adds two > > VM-Exits for _every_ NMI when vNMI isn't available (and it's a complete= mess for > > things like SEV-ES). > >=20 > >> 3. If not, does it mean that the current SVM's vNMI support scheme in = the > >> Linux mainline code is flawed? How should it be fixed? > >=20 > > The approach as a whole isn't flawed, it's the best KVM can do given SV= M's vNMI > > architecture and KVM's ABI with respect to "concurrent" NMIs. > >=20 > > Hrm, though there does appear to be a bug in the injecting path. KVM d= oesn't > > manually set V_NMI_BLOCKING_MASK, and will unnecessarily enable IRET in= terception > > when manually injecting a vNMI. Intercepting IRET should be unnecessar= y because > > hardware will automatically accept the pending vNMI when NMIs become un= blocked. > > And I don't see anything in the APM that suggests hardware will set V_N= MI_BLOCKING_MASK > > when software directly injects an NMI. > >=20 > > So I think we need: > >=20 > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > > index d381ad424554..c956a9f500a2 100644 > > --- a/arch/x86/kvm/svm/svm.c > > +++ b/arch/x86/kvm/svm/svm.c > > @@ -3476,6 +3476,11 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu= ) > > if (svm->nmi_l1_to_l2) > > return; > > =20 > > + if (is_vnmi_enabled(svm)) { > > + svm->vmcb->control.int_ctl |=3D V_NMI_BLOCKING_MASK; > > + return; > > + } > > + > > svm->nmi_masked =3D true; > > svm_set_iret_intercept(svm); > > ++vcpu->stat.nmi_injections; > > -- > >=20 > > or if hardware does set V_NMI_BLOCKING_MASK in this case, just: > >=20 >=20 > Yes, HW does set BLOCKING_MASK when HW takes the pending vNMI event. I'm not asking about the pending vNMI case, which is clearly spelled out in= the APM. I'm asking about directly injecting an NMI via: svm->vmcb->control.event_inj =3D SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI; > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > > index d381ad424554..201a1a33ecd2 100644 > > --- a/arch/x86/kvm/svm/svm.c > > +++ b/arch/x86/kvm/svm/svm.c > > @@ -3473,7 +3473,7 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu) > > =20 > > svm->vmcb->control.event_inj =3D SVM_EVTINJ_VALID | SVM_EVTINJ_= TYPE_NMI; > > =20 > > - if (svm->nmi_l1_to_l2) > > + if (svm->nmi_l1_to_l2 || is_vnmi_enabled(svm)) > > return; > > =20 > > svm->nmi_masked =3D true; > > -- > > >=20 > Above proposal make sense to me, I was reviewing source code flow > for a scenarios when two consecutive need to me delivered to Guest. > Example, process_nmi will pend the first NMI and then second NMI will > be injected through EVTINJ, as because (kvm_x86_inject_nmi) > will get called and that will set the _iret_intercept.=20 >=20 > With your proposal inject_nmi will be set the env_inj NMI w/o the IRET, > I think we could check for "is_vnmi_enabled" before the programming > the "evt_inj"? No, because the whole point of this path is to directly inject an NMI when = NMIs are NOT blocked in the guest AND there is already a pending vNMI.