From: Sean Christopherson <seanjc@google.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Jim Mattson <jmattson@google.com>,
Maxim Levitsky <mlevitsk@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Joerg Roedel <joro@8bytes.org>, kvm <kvm@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Oliver Upton <oupton@google.com>
Subject: Re: [PATCH 1/5] KVM: x86: Get vmcs12 pages before checking pending interrupts
Date: Thu, 25 Aug 2022 17:53:36 +0000 [thread overview]
Message-ID: <Ywe3IC7OlF/jYU1X@google.com> (raw)
In-Reply-To: <CAL715WK4eqxX9EUHzwqT4o-OX4S_1-WcTr5UuGnc-KEb7pk6EQ@mail.gmail.com>
On Thu, Aug 25, 2022, Mingwei Zhang wrote:
> On Thu, Aug 25, 2022 at 7:41 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Aug 24, 2022, Jim Mattson wrote:
> > > On Wed, Aug 24, 2022 at 5:11 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > > @google folks, what would it take for us to mark KVM_REQ_GET_NESTED_STATE_PAGES
> > > > as deprecated in upstream and stop accepting patches/fixes? IIUC, when we eventually
> > > > move to userfaultfd, all this goes away, i.e. we do want to ditch this at some point.
> > >
> > > Userfaultfd is a red herring. There were two reasons that we needed
> > > this when nested live migration was implemented:
> > > 1) our netlink socket mechanism for funneling remote page requests to
> > > a userspace listener was broken.
> > > 2) we were not necessarily prepared to deal with remote page requests
> > > during VM setup.
> > >
> > > (1) has long since been fixed. Though our preference is to exit from
> > > KVM_RUN and get the vCPU thread to request the remote page itself, we
> > > are now capable of queuing a remote page request with a separate
> > > listener thread and blocking in the kernel until the page is received.
> > > I believe that mechanism is functionally equivalent to userfaultfd,
> > > though not as elegant.
> > > I don't know about (2). I'm not sure when the listener thread is set
> > > up, relative to all of the other setup steps. Eliminating
> > > KVM_REQ_GET_NESTED_STATE_PAGES means that userspace must be prepared
> > > to fetch a remote page by the first call to KVM_SET_NESTED_STATE. The
> > > same is true when using userfaultfd.
> > >
> > > These new ordering constraints represent a UAPI breakage, but we don't
> > > seem to be as concerned about that as we once were. Maybe that's a
> > > good thing. Can we get rid of all of the superseded ioctls, like
> > > KVM_SET_CPUID, while we're at it?
> >
> > I view KVM_REQ_GET_NESTED_STATE_PAGES as a special case. We are likely the only
> > users, we can (eventually) wean ourselves off the feature, and we can carry
> > internal patches (which we are obviously already carrying) until we transition
> > away. And unlike KVM_SET_CPUID and other ancient ioctls() that are largely
> > forgotten, this feature is likely to be a maintenance burden as long as it exists.
>
> KVM_REQ_GET_NESTED_STATE_PAGES has been uniformly used in
> KVM_SET_NESTED_STATE ioctl in VMX (including eVMCS) and SVM, it is
> basically a two-step setting up of a nested state mechanism.
>
> We can change that, but this may have side effects and I think this
> usage case has nothing to do with demand paging.
There are two uses of KVM_REQ_GET_NESTED_STATE_PAGES:
1. Defer loads when leaving SMM.
2: Defer loads for KVM_SET_NESTED_STATE.
#1 is fully solvable without a request, e.g. split ->leave_smm() into two helpers,
one that restores whatever metadata is needed before restoring from SMRAM, and
a second to load guest virtualization state that _after_ restoring all other guest
state from SMRAM.
#2 is done because of the reasons Jim listed above, which are specific to demand
paging (including userfaultfd). There might be some interactions with other
ioctls() (KVM_SET_SREGS?) that are papered over by the request, but that can be
solved without a full request since only the first KVM_RUN after KVM_SET_NESTED_STATE
needs to refresh things (though ideally we'd avoid that).
In other words, if the demand paging use case goes away, then KVM can get rid of
KVM_REQ_GET_NESTED_STATE_PAGES.
> KVM_SET_NESTED_STATE in VMX, while in SVM implementation, it is simply
> just a kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
svm_set_nested_state() very rougly open codes enter_svm_guest_mode(). VMX could
do the same, but that may or may not be a net positive.
> hmm... so is the nested_vmx_enter_non_root_mode() call in vmx
> KVM_SET_NESTED_STATE ioctl() still necessary? I am thinking that
> because the same function is called again in nested_vmx_run().
nested_vmx_run() is used only to emulate VMLAUNCH/VMRESUME and wont' be invoked
if the vCPU is already running L2 at the time of migration.
next prev parent reply other threads:[~2022-08-25 17:53 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-02 23:07 [PATCH 0/5] Fix a race between posted interrupt delivery and migration in a nested VM Mingwei Zhang
2022-08-02 23:07 ` [PATCH 1/5] KVM: x86: Get vmcs12 pages before checking pending interrupts Mingwei Zhang
2022-08-03 10:08 ` Maxim Levitsky
2022-08-03 16:45 ` Mingwei Zhang
2022-08-03 17:00 ` Mingwei Zhang
2022-08-03 18:36 ` Oliver Upton
2022-08-03 17:18 ` Paolo Bonzini
2022-08-03 17:51 ` Mingwei Zhang
2022-08-03 19:34 ` Maxim Levitsky
2022-08-25 0:11 ` Sean Christopherson
2022-08-25 2:51 ` Jim Mattson
2022-08-25 14:40 ` Sean Christopherson
2022-08-25 17:16 ` Mingwei Zhang
2022-08-25 17:53 ` Sean Christopherson [this message]
2022-08-25 20:35 ` Mingwei Zhang
2022-08-25 20:37 ` Mingwei Zhang
2022-08-25 23:21 ` Jim Mattson
2022-08-02 23:07 ` [PATCH 2/5] selftests: KVM/x86: Fix vcpu_{save,load}_state() by adding APIC state into kvm_x86_state Mingwei Zhang
2022-08-03 18:44 ` Oliver Upton
2022-08-03 19:21 ` Sean Christopherson
2022-08-03 23:55 ` Oliver Upton
2022-08-02 23:07 ` [PATCH 3/5] selftests: KVM: Introduce vcpu_run_interruptable() Mingwei Zhang
2022-08-25 0:15 ` Sean Christopherson
2022-08-28 19:21 ` Mingwei Zhang
2022-08-02 23:07 ` [PATCH 4/5] selftests: KVM: Add support for posted interrupt handling in L2 Mingwei Zhang
2022-08-25 0:16 ` Sean Christopherson
2022-08-28 19:29 ` Mingwei Zhang
2022-08-02 23:07 ` [PATCH 5/5] selftests: KVM: Test if posted interrupt delivery race with migration Mingwei Zhang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ywe3IC7OlF/jYU1X@google.com \
--to=seanjc@google.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mizhang@google.com \
--cc=mlevitsk@redhat.com \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.