Re: [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Anish Moorthy <amoorthy@google.com>
Cc: oliver.upton@linux.dev, kvm@vger.kernel.org,
	kvmarm@lists.linux.dev, pbonzini@redhat.com, maz@kernel.org,
	robert.hoo.linux@gmail.com, jthoughton@google.com,
	bgardon@google.com, dmatlack@google.com, ricarkol@google.com,
	axelrasmussen@google.com, peterx@redhat.com,
	nadav.amit@gmail.com, isaku.yamahata@gmail.com
Subject: Re: [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO
Date: Mon, 14 Aug 2023 11:01:46 -0700	[thread overview]
Message-ID: <ZNpsCngiSjISMG5j@google.com> (raw)
In-Reply-To: <CAF7b7mqfkLYtWBJ=u0MK7hhARHrahQXHza9VnaughyNz5_tNug@mail.gmail.com>

On Fri, Aug 11, 2023, Anish Moorthy wrote:
> On Wed, Jun 14, 2023 at 10:35 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > +inline void kvm_populate_efault_info(struct kvm_vcpu *vcpu,
> >
> > Tagging a globally visible, non-static function as "inline" is odd, to say the
> > least.
> 
> I think my eyes glaze over whenever I read the words "translation
> unit" (my brain certainly does) so I'll have to take your word for it.
> IIRC last time I tried to mark this function "static" the compiler
> yelled at me, so removing the "inline" it is.

What is/was the error?  It's probably worth digging into; "static inline" should
work just fine, so there might be something funky elsewhere that you're papering
over.

> > I got a bit (ok, way more than a bit) lost in all of the (A) (B) (C) madness.  I
> > think this what you intended for each case?
> >
> >   (A) if there are any existing paths in KVM_RUN which cause a vCPU
> >       to (1) populate the kvm_run struct then (2) fail a vCPU guest memory
> >       access but ignore the failure and then (3) complete the exit to
> >       userspace set up in (1), then the contents of the kvm_run struct written
> >       in (1) will be corrupted.
> >
> >   (B) if KVM_RUN fails a guest memory access for which the EFAULT is annotated
> >       but does not return the EFAULT to userspace, then later returns an *un*annotated
> >       EFAULT to userspace, then userspace will receive incorrect information.
> >
> >   (C) an annotated EFAULT which is ignored/suppressed followed by one which is
> >       propagated to userspace. Here the exit-reason-unset check will prevent the
> >       second annotation from being written, so userspace sees an annotation with
> >       bad contents, If we believe that case (A) is a weird sequence of events
> >       that shouldn't be happening in the first place, then case (C) seems more
> >       important to ensure correctness in. But I don't know anything about how often
> >       (A) happens in KVM, which is why I want others' opinions.
> 
> Yeah, I got lost in the weeds: you've gotten the important points though
> 
> > (A) does sadly happen.  I wouldn't call it a "pattern" though, it's an unfortunate
> > side effect of deficiencies in KVM's uAPI.
> >
> > (B) is the trickiest to defend against in the kernel, but as I mentioned in earlier
> > versions of this series, userspace needs to guard against a vCPU getting stuck in
> > an infinite fault anyways, so I'm not _that_ concerned with figuring out a way to
> > address this in the kernel.  KVM's documentation should strongly encourage userspace
> > to take action if KVM repeatedly exits with the same info over and over, but beyond
> > that I think anything else is nice to have, not mandatory.
> >
> > (C) should simply not be possible.  (A) is very much a "this shouldn't happen,
> > but it does" case.  KVM provides no meaningful guarantees if (A) does happen, so
> > yes, prioritizing correctness for (C) is far more important.
> >
> > That said, prioritizing (C) doesn't mean we can't also do our best to play nice
> > with (A).  None of the existing exits use anywhere near the exit info union's 256
> > bytes, i.e. there is tons of space to play with.  So rather than put memory_fault
> > in with all the others, what if we split the union in two, and place memory_fault
> > in the high half (doesn't have to literally be half, but you get the idea).  It'd
> > kinda be similar to x86's contributory vs. benign faults; exits that can't be
> > "nested" or "speculative" go in the low half, and things like memory_fault go in
> > the high half.
> >
> > That way, if (A) does occur, the original information will be preserved when KVM
> > fills memory_fault.  And my suggestion to WARN-and-continue limits the problematic
> > scenarios to just fields in the second union, i.e. just memory_fault for now.
> > At the very least, not clobbering would likely make it easier for us to debug when
> > things go sideways.
> >
> > And rather than use kvm_run.exit_reason as the canary, we should carve out a
> > kernel-only, i.e. non-ABI, field from the union.  That would allow setting the
> > canary in common KVM code, which can't be done for kvm_run.exit_reason because
> > some architectures, e.g. s390 (and x86 IIRC), consume the exit_reason early on
> > in KVM_RUN.
> 
> I think this is a good idea :D I was going to suggest something
> similar a while back, but I thought it would be off the table- whoops.
> 
> My one concern is that if/when other features eventually also use the
> "speculative" portion, then they're going to run into the same issues
> as we're trying to avoid here.

I think it's worth the risk.  We could mitigate potential future problems to some
degree by maintaining the last N "speculative" user exits since KVM_RUN, e.g. with
a ring buffer, but (a) that's more than a bit crazy and (b) I don't think the
extra data would be actionable for userspace unless userspace somehow had a priori
knowledge of the "failing" sequence.

> But fixing *that* (probably by propagating these exits through return
> values/the call stack) would be a really big refactor, and C doesn't really
> have the type system for it in the first place :(

Yeah, lack of a clean and easy way to return a tuple makes it all but impossible
to handle this without resorting to evil shenanigans.

next prev parent reply	other threads:[~2023-08-14 18:02 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 16:19 [PATCH v4 00/16] Improve scalability of KVM + userfaultfd live migration via annotated memory faults Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 01/16] KVM: Allow hva_pfn_fast() to resolve read-only faults Anish Moorthy
2023-06-14 14:39   ` Sean Christopherson
2023-06-14 16:57     ` Anish Moorthy
2023-08-10 19:54       ` Anish Moorthy
2023-08-10 23:48         ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 02/16] KVM: x86: Set vCPU exit reason to KVM_EXIT_UNKNOWN at the start of KVM_RUN Anish Moorthy
2023-06-02 20:30   ` Isaku Yamahata
2023-06-05 16:41     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO Anish Moorthy
2023-06-03 16:58   ` Isaku Yamahata
2023-06-05 16:37     ` Anish Moorthy
2023-06-14 14:55       ` Sean Christopherson
2023-06-05 17:46   ` Anish Moorthy
2023-06-14 17:35   ` Sean Christopherson
2023-06-20 21:13     ` Anish Moorthy
2023-07-07 11:50     ` Kautuk Consul
2023-07-10 15:00       ` Anish Moorthy
2023-07-11  3:54         ` Kautuk Consul
2023-07-11 14:25           ` Sean Christopherson
2023-08-11 22:12     ` Anish Moorthy
2023-08-14 18:01       ` Sean Christopherson [this message]
2023-08-15  0:06         ` Anish Moorthy
2023-08-15  0:43           ` Sean Christopherson
2023-08-15 17:01             ` Anish Moorthy
2023-08-16 15:58               ` Sean Christopherson
2023-08-16 21:28                 ` Anish Moorthy
2023-08-17 23:58                   ` Sean Christopherson
2023-08-18 17:32                     ` Anish Moorthy
2023-08-23 22:20                       ` Sean Christopherson
2023-08-23 23:38                         ` Anish Moorthy
2023-08-24 17:24                           ` Sean Christopherson
2023-08-17 22:55     ` Anish Moorthy
2023-07-05  8:21   ` Kautuk Consul
2023-06-02 16:19 ` [PATCH v4 04/16] KVM: Add docstrings to __kvm_write_guest_page() and __kvm_read_guest_page() Anish Moorthy
2023-06-15  2:41   ` Robert Hoo
2023-08-14 22:51     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 05/16] KVM: Annotate -EFAULTs from kvm_vcpu_write_guest_page() Anish Moorthy
2023-06-14 19:10   ` Sean Christopherson
2023-07-06 22:51     ` Anish Moorthy
2023-07-12 14:08       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 06/16] KVM: Annotate -EFAULTs from kvm_vcpu_read_guest_page() Anish Moorthy
2023-06-14 19:22   ` Sean Christopherson
2023-07-07 17:35     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 07/16] KVM: Simplify error handling in __gfn_to_pfn_memslot() Anish Moorthy
2023-06-14 19:26   ` Sean Christopherson
2023-07-07 17:33     ` Anish Moorthy
2023-07-10 17:40       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 08/16] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() Anish Moorthy
2023-06-14 20:03   ` Sean Christopherson
2023-07-07 18:05     ` Anish Moorthy
2023-06-15  2:43   ` Robert Hoo
2023-06-15 14:40     ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 09/16] KVM: Introduce KVM_CAP_NOWAIT_ON_FAULT without implementation Anish Moorthy
2023-06-14 20:11   ` Sean Christopherson
2023-07-06 19:04     ` Anish Moorthy
2023-06-14 21:20   ` Sean Christopherson
2023-06-14 21:23     ` Sean Christopherson
2023-08-23 21:17       ` Anish Moorthy
2023-06-15  3:55     ` Wang, Wei W
2023-06-15 14:56       ` Sean Christopherson
2023-06-16 12:08         ` Wang, Wei W
2023-07-07 18:13     ` Anish Moorthy
2023-07-07 20:07       ` Anish Moorthy
2023-07-11 15:29         ` Sean Christopherson
2023-08-25  0:15           ` Anish Moorthy
2023-08-29 22:41             ` Sean Christopherson
2023-08-30 16:21               ` Anish Moorthy
2023-09-07 21:17                 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 10/16] KVM: x86: Implement KVM_CAP_NOWAIT_ON_FAULT Anish Moorthy
2023-06-14 20:25   ` Sean Christopherson
2023-07-07 17:41     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 11/16] KVM: arm64: " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 12/16] KVM: selftests: Report per-vcpu demand paging rate from demand paging test Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 13/16] KVM: selftests: Allow many vCPUs and reader threads per UFFD in " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 14/16] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 15/16] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 16/16] KVM: selftests: Handle memory fault exits in demand_paging_test Anish Moorthy
2023-06-20  2:44   ` Robert Hoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZNpsCngiSjISMG5j@google.com \
    --to=seanjc@google.com \
    --cc=amoorthy@google.com \
    --cc=axelrasmussen@google.com \
    --cc=bgardon@google.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=jthoughton@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=ricarkol@google.com \
    --cc=robert.hoo.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).