Date: Wed, 6 Apr 2022 23:03:20 +0000
From: Sean Christopherson
To: "Maciej S. Szmigiero"
Cc: Maxim Levitsky, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction
In-Reply-To: <7e8f558d-c00a-7170-f671-bd10c0a56557@maciej.szmigiero.name>

On Thu, Apr 07, 2022, Maciej S. Szmigiero wrote:
> On 6.04.2022 22:52, Sean Christopherson wrote:
> > On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
> > > Another option for saving and restoring a VM would be to add it to
> > > KVM_{GET,SET}_NESTED_STATE somewhere (maybe as a part of the saved VMCB12
> > > control area?).
> >
> > Ooh.
> > What if we keep nested_run_pending=true until the injection completes? Then
> > we don't even need an extra flag because nested_run_pending effectively says that
> > any and all injected events are for L1=>L2. In KVM_GET_NESTED_STATE, shove the
> > to-be-injected event into the normal vmc*12 injection field, and ignore all
> > to-be-injected events in KVM_GET_VCPU_EVENTS if nested_run_pending=true.
> >
> > That should work even for migrating to an older KVM, as keeping nested_run_pending
> > will cause the target to reprocess the event injection as if it were from nested
> > VM-Enter, which it technically is.
>
> I guess here by "ignore all to-be-injected events in KVM_GET_VCPU_EVENTS" you
> mean *moving* back the L1 -> L2 event to be injected from KVM internal data
> structures like arch.nmi_injected (and so on) to the KVM_GET_NESTED_STATE-returned
> VMCB12 EVENTINJ field (or its VMX equivalent).
>
> But then the VMM will need to first call KVM_GET_NESTED_STATE (which will do
> the moving), only then KVM_GET_VCPU_EVENTS (which will then no longer show
> these events as pending).
> And their setters in the opposite order when restoring the VM.

I wasn't thinking of actually moving things in the source VM, only ignoring events
in KVM_GET_VCPU_EVENTS. Getting state shouldn't be destructive, e.g. the source VM
should still be able to continue running.

Ahahahaha, and actually looking at the code, there's this gem in KVM_GET_VCPU_EVENTS

	/*
	 * The API doesn't provide the instruction length for software
	 * exceptions, so don't report them. As long as the guest RIP
	 * isn't advanced, we should expect to encounter the exception
	 * again.
	 */
	if (kvm_exception_is_soft(vcpu->arch.exception.nr)) {
		events->exception.injected = 0;
		events->exception.pending = 0;
	}

and again for soft interrupts

	events->interrupt.injected =
		vcpu->arch.interrupt.injected && !vcpu->arch.interrupt.soft;

so through KVM's own incompetency, it's already doing half the work.

This is roughly what I had in mind.
It will "require" moving nested_run_pending to kvm_vcpu_arch, but I've been
itching for an excuse to do that anyways.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eb71727acecb..62c48f6a0815 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4846,6 +4846,8 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu,
 static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 					       struct kvm_vcpu_events *events)
 {
+	bool drop_injected_events = vcpu->arch.nested_run_pending;
+
 	process_nmi(vcpu);
 
 	if (kvm_check_request(KVM_REQ_SMI, vcpu))
@@ -4872,7 +4874,8 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 	 * isn't advanced, we should expect to encounter the exception
 	 * again.
 	 */
-	if (kvm_exception_is_soft(vcpu->arch.exception.nr)) {
+	if (drop_injected_events ||
+	    kvm_exception_is_soft(vcpu->arch.exception.nr)) {
 		events->exception.injected = 0;
 		events->exception.pending = 0;
 	} else {
@@ -4893,13 +4896,14 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 	events->exception_has_payload = vcpu->arch.exception.has_payload;
 	events->exception_payload = vcpu->arch.exception.payload;
 
-	events->interrupt.injected =
-		vcpu->arch.interrupt.injected && !vcpu->arch.interrupt.soft;
+	events->interrupt.injected = vcpu->arch.interrupt.injected &&
+				     !vcpu->arch.interrupt.soft &&
+				     !drop_injected_events;
 	events->interrupt.nr = vcpu->arch.interrupt.nr;
 	events->interrupt.soft = 0;
 	events->interrupt.shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu);
 
-	events->nmi.injected = vcpu->arch.nmi_injected;
+	events->nmi.injected = vcpu->arch.nmi_injected && !drop_injected_events;
 	events->nmi.pending = vcpu->arch.nmi_pending != 0;
 	events->nmi.masked = static_call(kvm_x86_get_nmi_mask)(vcpu);
 	events->nmi.pad = 0;
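
For anyone who wants to poke at the filtering without building a kernel, here is
a rough userspace model of what the diff does. It is only a sketch: the
model_vcpu/model_events structs and model_get_vcpu_events() are made-up names,
not KVM types, and the non-dropped exception case is simplified to a straight
copy (the real getter has more logic there). The vCPU is taken as const to
reflect that getting state is not supposed to be destructive.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the relevant vcpu->arch state. */
struct model_vcpu {
	bool nested_run_pending;
	bool exception_injected;
	bool exception_pending;
	bool exception_soft;
	bool interrupt_injected;
	bool interrupt_soft;
	bool nmi_injected;
};

/* Hypothetical stand-in for the fields of struct kvm_vcpu_events touched here. */
struct model_events {
	bool exception_injected;
	bool exception_pending;
	bool interrupt_injected;
	bool nmi_injected;
};

/*
 * Model of the proposed getter: while nested_run_pending is set, report no
 * injected events at all, on the assumption that they get re-created from
 * the vmc*12 injection field on the target.  The source state is read-only.
 */
static struct model_events model_get_vcpu_events(const struct model_vcpu *vcpu)
{
	bool drop_injected_events = vcpu->nested_run_pending;
	struct model_events events;

	/* Soft exceptions are already hidden today; the flag extends that. */
	if (drop_injected_events || vcpu->exception_soft) {
		events.exception_injected = false;
		events.exception_pending = false;
	} else {
		/* Simplification: the real code does more than a plain copy. */
		events.exception_injected = vcpu->exception_injected;
		events.exception_pending = vcpu->exception_pending;
	}

	events.interrupt_injected = vcpu->interrupt_injected &&
				    !vcpu->interrupt_soft &&
				    !drop_injected_events;
	events.nmi_injected = vcpu->nmi_injected && !drop_injected_events;

	return events;
}
```

With nested_run_pending set, an injected NMI or hard interrupt is reported as
not injected while the model_vcpu itself is left untouched; with it clear, the
existing behavior (only soft events hidden) is unchanged.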