Re: [WIP Patch v2 04/14] KVM: x86: Add KVM_CAP_X86_MEMORY_FAULT_EXIT and associated kvm_run field

From: Sean Christopherson <seanjc@google.com>
To: Anish Moorthy <amoorthy@google.com>
Cc: Isaku Yamahata <isaku.yamahata@gmail.com>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	jthoughton@google.com, kvm@vger.kernel.org
Subject: Re: [WIP Patch v2 04/14] KVM: x86: Add KVM_CAP_X86_MEMORY_FAULT_EXIT and associated kvm_run field
Date: Wed, 22 Mar 2023 16:17:24 -0700
Message-ID: <ZBuMhA8eOPC8HzkC@google.com>
In-Reply-To: <CAF7b7mpWBCa9Y4xuNLbmgh=EQWOzU4bpSDxGjmRnpH3UEZkB3g@mail.gmail.com>

On Wed, Mar 22, 2023, Anish Moorthy wrote:
> On Tue, Mar 21, 2023 at 12:43 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Mar 21, 2023, Anish Moorthy wrote:
> > > > FWIW, I completely agree that filling KVM_EXIT_MEMORY_FAULT without guaranteeing
> > > > that KVM "immediately" exits to userspace isn't ideal, but given the amount of
> > > > historical code that we need to deal with, it seems like the lesser of all evils.
> > > > Unless I'm misunderstanding the use cases, unnecessarily filling kvm_run is a far
> > > > better failure mode than KVM not filling kvm_run when it should, i.e. false
> > > > positives are ok, false negatives are fatal.
> > >
> > > Don't you have this in reverse?
> >
> > No, I don't think so.
> >
> > > False negatives will just result in userspace not having useful extra
> > > information for the -EFAULT it receives from KVM_RUN, in which case userspace
> > > can do what you mentioned all VMMs do today and just terminate the VM.
> >
> > And that is _really_ bad behavior if we have any hope of userspace actually being
> > able to rely on this functionality.  E.g. any false negative when userspace is
> > trying to do postcopy demand paging will be fatal to the VM.
> 
> But -EFAULTs from KVM_RUN today are already fatal, so there's no
> new failure introduced by an -EFAULT w/o a populated memory_fault
> field, right?

Yes, but it's a bit of a moot point since the goal of the feature is to avoid
killing the VM.
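
To make the asymmetry concrete, here is a rough userspace-side sketch of the
decision a VMM would face. It is a model under stated assumptions, not real
uapi: struct vmm_run, the VMM_* constants, and vmm_classify() are all
hypothetical names, and the return value is assumed to be a negative errno
from some wrapper around KVM_RUN.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_run and its exit reasons; not real uapi. */
#define VMM_EXIT_UNKNOWN	0u
#define VMM_EXIT_MEMORY_FAULT	1u

struct vmm_run {
	uint32_t exit_reason;
	struct {
		uint64_t gpa;
		uint64_t len;
	} memory_fault;
};

enum vmm_action {
	VMM_HANDLE_NORMALLY,	 /* not an -EFAULT, outside the scope of this sketch */
	VMM_RESOLVE_AND_REENTER, /* fault info is present: fix it up and run again */
	VMM_KILL_VM,		 /* -EFAULT with no fault info: nothing to act on */
};

/* Classify the result of one (hypothetical) KVM_RUN wrapper call. */
static enum vmm_action vmm_classify(int ret, const struct vmm_run *run)
{
	if (ret != -EFAULT)
		return VMM_HANDLE_NORMALLY;

	/* False negative: KVM failed but did not describe the fault. */
	if (run->exit_reason != VMM_EXIT_MEMORY_FAULT)
		return VMM_KILL_VM;

	/* Fault described: userspace can page in [gpa, gpa + len) and retry. */
	return VMM_RESOLVE_AND_REENTER;
}
```

In this model a populated-but-unused memory_fault never reaches
vmm_classify() at all, while a missing one always ends in VMM_KILL_VM.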

> Obviously that's of no real use to userspace, but that seems like part of the
> point of starting with a partial conversion: to allow for filling holes in
> the implementation in the future.

Yes, but I want a forcing function to reveal any holes we missed sooner rather
than later, otherwise the feature will languish since it won't be useful beyond
the fast-gup-only use case.

> It seems like what you're really concerned about here is the interaction with
> the memslot fast-gup-only flag. Obviously, failing to populate
> kvm_run.memory_fault for new userspace-visible -EFAULTs caused by that flag
> would cause new fatal failures for the guest, which would make the feature
> actually harmful. But as far as I know (and please lmk if I'm wrong), the
> memslot flag only needs to be used by the kvm_handle_error_pfn (x86) and
> user_mem_abort (arm64) functions, meaning that those are the only places
> where we need to check/populate kvm_run.memory_fault for new
> userspace-visible -EFAULTs.

No.  As you point out, the fast-gup-only case should be pretty easy to get correct,
i.e. this should all work just fine for _GCE's current_ use case.  I'm more concerned
with setting KVM up for success when future use cases come along that might not be ok
with unhandled faults in random guest accesses killing the VM.

To be clear, I do not expect us to get this 100% correct on the first attempt,
but I do want to have mechanisms in place that will detect any bugs/misses so
that we can fix the issues _before_ a use case comes along that needs 100%
accuracy.

> > > Whereas a false positive might cause a double-write to the KVM_RUN struct,
> > > either putting incorrect information in kvm_run.memory_fault or
> >
> > Recording unused information on -EFAULT in kvm_run doesn't make the information
> > incorrect.
> >
> > > corrupting another member of the union.
> >
> > Only if KVM accesses guest memory after initiating an exit to userspace, which
> > would be a KVM bug irrespective of kvm_run.memory_fault.
> 
> Ah good: I was concerned that this was a valid set of code paths in
> KVM. Although I'm assuming that "initiating an exit to userspace"
> includes the "returning -EFAULT from KVM_RUN" cases, because we
> wouldn't want EFAULTs to stomp on each other as well (the
> kvm_mmu_do_page_fault usages were supposed to be one such example,
> though I'm glad to know that they're not a problem).

This one gets into a bit of a grey area.  The "rule" is really about the intent,
i.e. once KVM intends to exit to userspace, it's a bug if KVM encounters something
else and runs into the weeds.

In no small part because of the myriad paths where KVM ignores what would be
fatal errors in most flows, e.g. record_steal_time(), simply returning -EFAULT
from some low level helper doesn't necessarily signal an intent to exit all the
way to userspace.

To be honest, I don't have a clear idea of how difficult it will be to detect bugs.
In most cases, failure to exit to userspace leads to a fatal error fairly quickly.
With userspace faults, it's entirely possible that an exit could be missed and
nothing bad would happen.

Hmm, one idea would be to have the initial -EFAULT detection fill kvm_run.memory_fault,
but set kvm_run.exit_reason to some magic number, e.g. zero it out.  Then KVM could
WARN if something tries to overwrite kvm_run.exit_reason.  The WARN would need to
be buried by a Kconfig or something since kvm_run can be modified by userspace,
but other than that I think it would work.
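
Very roughly, and purely as a standalone model rather than actual KVM code,
the idea could look like the sketch below. Every name in it is made up for
illustration (struct dbg_run, the DBG_* values, the dbg_*() helpers), the
sentinel value is arbitrary, fprintf() stands in for WARN, and the
DBG_WARN_ON_OVERWRITE macro stands in for the Kconfig option.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* All names here are hypothetical; this only models the proposed check. */
enum {
	DBG_EXIT_NONE = 0,
	DBG_EXIT_MEMORY_FAULT = 1,
	DBG_EXIT_IO = 2,
	DBG_EXIT_FAULT_PENDING = 0xfa17, /* arbitrary "magic" sentinel */
};

/* Stand-in for a Kconfig: userspace can scribble on kvm_run, so opt-in only. */
#ifndef DBG_WARN_ON_OVERWRITE
#define DBG_WARN_ON_OVERWRITE 1
#endif

struct dbg_run {
	uint32_t exit_reason;
	struct {
		uint64_t gpa;
		uint64_t len;
	} memory_fault;
};

static unsigned int dbg_warn_count;

/* Called on entry to the run loop. */
static void dbg_begin_run(struct dbg_run *run)
{
	run->exit_reason = DBG_EXIT_NONE;
}

/* Initial -EFAULT detection: record the fault and arm the sentinel. */
static void dbg_record_fault(struct dbg_run *run, uint64_t gpa, uint64_t len)
{
	run->memory_fault.gpa = gpa;
	run->memory_fault.len = len;
	run->exit_reason = DBG_EXIT_FAULT_PENDING;
}

/* Every other writer of exit_reason goes through here. */
static void dbg_set_exit_reason(struct dbg_run *run, uint32_t reason)
{
	if (DBG_WARN_ON_OVERWRITE && run->exit_reason == DBG_EXIT_FAULT_PENDING) {
		dbg_warn_count++;
		fprintf(stderr, "WARN: exit_reason %u clobbers a pending fault\n",
			(unsigned int)reason);
	}
	run->exit_reason = reason;
}

/* Final exit path: publish the fault if it is still pending. */
static bool dbg_finish_fault_exit(struct dbg_run *run)
{
	if (run->exit_reason != DBG_EXIT_FAULT_PENDING)
		return false;
	run->exit_reason = DBG_EXIT_MEMORY_FAULT;
	return true;
}
```

A recorded fault that is followed by dbg_set_exit_reason() before
dbg_finish_fault_exit() trips the warning, which is the "KVM kept going after
it intended to exit" case; a fault that is recorded and then published does
not.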


