Re: [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Anish Moorthy <amoorthy@google.com>
Cc: oliver.upton@linux.dev, kvm@vger.kernel.org,
	kvmarm@lists.linux.dev,  pbonzini@redhat.com, maz@kernel.org,
	robert.hoo.linux@gmail.com,  jthoughton@google.com,
	bgardon@google.com, dmatlack@google.com,  ricarkol@google.com,
	axelrasmussen@google.com, peterx@redhat.com,
	 nadav.amit@gmail.com, isaku.yamahata@gmail.com
Subject: Re: [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO
Date: Mon, 14 Aug 2023 11:01:46 -0700	[thread overview]
Message-ID: <ZNpsCngiSjISMG5j@google.com> (raw)
In-Reply-To: <CAF7b7mqfkLYtWBJ=u0MK7hhARHrahQXHza9VnaughyNz5_tNug@mail.gmail.com>

On Fri, Aug 11, 2023, Anish Moorthy wrote:
> On Wed, Jun 14, 2023 at 10:35 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > +inline void kvm_populate_efault_info(struct kvm_vcpu *vcpu,
> >
> > Tagging a globally visible, non-static function as "inline" is odd, to say the
> > least.
> 
> I think my eyes glaze over whenever I read the words "translation
> unit" (my brain certainly does) so I'll have to take your word for it.
> IIRC last time I tried to mark this function "static" the compiler
> yelled at me, so removing the "inline" it is.

What is/was the error?  It's probably worth digging into; "static inline" should
work just fine, so there might be something funky elsewhere that you're papering
over.

> > I got a bit (ok, way more than a bit) lost in all of the (A) (B) (C) madness.  I
> > think this what you intended for each case?
> >
> >   (A) if there are any existing paths in KVM_RUN which cause a vCPU
> >       to (1) populate the kvm_run struct then (2) fail a vCPU guest memory
> >       access but ignore the failure and then (3) complete the exit to
> >       userspace set up in (1), then the contents of the kvm_run struct written
> >       in (1) will be corrupted.
> >
> >   (B) if KVM_RUN fails a guest memory access for which the EFAULT is annotated
> >       but does not return the EFAULT to userspace, then later returns an *un*annotated
> >       EFAULT to userspace, then userspace will receive incorrect information.
> >
> >   (C) an annotated EFAULT which is ignored/suppressed followed by one which is
> >       propagated to userspace. Here the exit-reason-unset check will prevent the
> >       second annotation from being written, so userspace sees an annotation with
> >       bad contents, If we believe that case (A) is a weird sequence of events
> >       that shouldn't be happening in the first place, then case (C) seems more
> >       important to ensure correctness in. But I don't know anything about how often
> >       (A) happens in KVM, which is why I want others' opinions.
> 
> Yeah, I got lost in the weeds: you've gotten the important points though
> 
> > (A) does sadly happen.  I wouldn't call it a "pattern" though, it's an unfortunate
> > side effect of deficiencies in KVM's uAPI.
> >
> > (B) is the trickiest to defend against in the kernel, but as I mentioned in earlier
> > versions of this series, userspace needs to guard against a vCPU getting stuck in
> > an infinite fault anyways, so I'm not _that_ concerned with figuring out a way to
> > address this in the kernel.  KVM's documentation should strongly encourage userspace
> > to take action if KVM repeatedly exits with the same info over and over, but beyond
> > that I think anything else is nice to have, not mandatory.
> >
> > (C) should simply not be possible.  (A) is very much a "this shouldn't happen,
> > but it does" case.  KVM provides no meaningful guarantees if (A) does happen, so
> > yes, prioritizing correctness for (C) is far more important.
> >
> > That said, prioritizing (C) doesn't mean we can't also do our best to play nice
> > with (A).  None of the existing exits use anywhere near the exit info union's 256
> > bytes, i.e. there is tons of space to play with.  So rather than put memory_fault
> > in with all the others, what if we split the union in two, and place memory_fault
> > in the high half (doesn't have to literally be half, but you get the idea).  It'd
> > kinda be similar to x86's contributory vs. benign faults; exits that can't be
> > "nested" or "speculative" go in the low half, and things like memory_fault go in
> > the high half.
> >
> > That way, if (A) does occur, the original information will be preserved when KVM
> > fills memory_fault.  And my suggestion to WARN-and-continue limits the problematic
> > scenarios to just fields in the second union, i.e. just memory_fault for now.
> > At the very least, not clobbering would likely make it easier for us to debug when
> > things go sideways.
> >
> > And rather than use kvm_run.exit_reason as the canary, we should carve out a
> > kernel-only, i.e. non-ABI, field from the union.  That would allow setting the
> > canary in common KVM code, which can't be done for kvm_run.exit_reason because
> > some architectures, e.g. s390 (and x86 IIRC), consume the exit_reason early on
> > in KVM_RUN.
> 
> I think this is a good idea :D I was going to suggest something
> similar a while back, but I thought it would be off the table- whoops.
> 
> My one concern is that if/when other features eventually also use the
> "speculative" portion, then they're going to run into the same issues
> as we're trying to avoid here.

I think it's worth the risk.  We could mitigate potential future problems to some
degree by maintaining the last N "speculative" user exits since KVM_RUN, e.g. with
a ring buffer, but (a) that's more than a bit crazy and (b) I don't think the
extra data would be actionable for userspace unless userspace somehow had a priori
knowledge of the "failing" sequence.

> But fixing *that* (probably by propagating these exits through return
> values/the call stack) would be a really big refactor, and C doesn't really
> have the type system for it in the first place :(

Yeah, lack of a clean and easy way to return a tuple makes it all but impossible
to handle this without resorting to evil shenanigans.

next prev parent reply	other threads:[~2023-08-14 18:01 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 16:19 [PATCH v4 00/16] Improve scalability of KVM + userfaultfd live migration via annotated memory faults Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 01/16] KVM: Allow hva_pfn_fast() to resolve read-only faults Anish Moorthy
2023-06-14 14:39   ` Sean Christopherson
2023-06-14 16:57     ` Anish Moorthy
2023-08-10 19:54       ` Anish Moorthy
2023-08-10 23:48         ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 02/16] KVM: x86: Set vCPU exit reason to KVM_EXIT_UNKNOWN at the start of KVM_RUN Anish Moorthy
2023-06-02 20:30   ` Isaku Yamahata
2023-06-05 16:41     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 03/16] KVM: Add KVM_CAP_MEMORY_FAULT_INFO Anish Moorthy
2023-06-03 16:58   ` Isaku Yamahata
2023-06-05 16:37     ` Anish Moorthy
2023-06-14 14:55       ` Sean Christopherson
2023-06-05 17:46   ` Anish Moorthy
2023-06-14 17:35   ` Sean Christopherson
2023-06-20 21:13     ` Anish Moorthy
2023-07-07 11:50     ` Kautuk Consul
2023-07-10 15:00       ` Anish Moorthy
2023-07-11  3:54         ` Kautuk Consul
2023-07-11 14:25           ` Sean Christopherson
2023-08-11 22:12     ` Anish Moorthy
2023-08-14 18:01       ` Sean Christopherson [this message]
2023-08-15  0:06         ` Anish Moorthy
2023-08-15  0:43           ` Sean Christopherson
2023-08-15 17:01             ` Anish Moorthy
2023-08-16 15:58               ` Sean Christopherson
2023-08-16 21:28                 ` Anish Moorthy
2023-08-17 23:58                   ` Sean Christopherson
2023-08-18 17:32                     ` Anish Moorthy
2023-08-23 22:20                       ` Sean Christopherson
2023-08-23 23:38                         ` Anish Moorthy
2023-08-24 17:24                           ` Sean Christopherson
2023-08-17 22:55     ` Anish Moorthy
2023-07-05  8:21   ` Kautuk Consul
2023-06-02 16:19 ` [PATCH v4 04/16] KVM: Add docstrings to __kvm_write_guest_page() and __kvm_read_guest_page() Anish Moorthy
2023-06-15  2:41   ` Robert Hoo
2023-08-14 22:51     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 05/16] KVM: Annotate -EFAULTs from kvm_vcpu_write_guest_page() Anish Moorthy
2023-06-14 19:10   ` Sean Christopherson
2023-07-06 22:51     ` Anish Moorthy
2023-07-12 14:08       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 06/16] KVM: Annotate -EFAULTs from kvm_vcpu_read_guest_page() Anish Moorthy
2023-06-14 19:22   ` Sean Christopherson
2023-07-07 17:35     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 07/16] KVM: Simplify error handling in __gfn_to_pfn_memslot() Anish Moorthy
2023-06-14 19:26   ` Sean Christopherson
2023-07-07 17:33     ` Anish Moorthy
2023-07-10 17:40       ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 08/16] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() Anish Moorthy
2023-06-14 20:03   ` Sean Christopherson
2023-07-07 18:05     ` Anish Moorthy
2023-06-15  2:43   ` Robert Hoo
2023-06-15 14:40     ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 09/16] KVM: Introduce KVM_CAP_NOWAIT_ON_FAULT without implementation Anish Moorthy
2023-06-14 20:11   ` Sean Christopherson
2023-07-06 19:04     ` Anish Moorthy
2023-06-14 21:20   ` Sean Christopherson
2023-06-14 21:23     ` Sean Christopherson
2023-08-23 21:17       ` Anish Moorthy
2023-06-15  3:55     ` Wang, Wei W
2023-06-15 14:56       ` Sean Christopherson
2023-06-16 12:08         ` Wang, Wei W
2023-07-07 18:13     ` Anish Moorthy
2023-07-07 20:07       ` Anish Moorthy
2023-07-11 15:29         ` Sean Christopherson
2023-08-25  0:15           ` Anish Moorthy
2023-08-29 22:41             ` Sean Christopherson
2023-08-30 16:21               ` Anish Moorthy
2023-09-07 21:17                 ` Sean Christopherson
2023-06-02 16:19 ` [PATCH v4 10/16] KVM: x86: Implement KVM_CAP_NOWAIT_ON_FAULT Anish Moorthy
2023-06-14 20:25   ` Sean Christopherson
2023-07-07 17:41     ` Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 11/16] KVM: arm64: " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 12/16] KVM: selftests: Report per-vcpu demand paging rate from demand paging test Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 13/16] KVM: selftests: Allow many vCPUs and reader threads per UFFD in " Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 14/16] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 15/16] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() Anish Moorthy
2023-06-02 16:19 ` [PATCH v4 16/16] KVM: selftests: Handle memory fault exits in demand_paging_test Anish Moorthy
2023-06-20  2:44   ` Robert Hoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZNpsCngiSjISMG5j@google.com \
    --to=seanjc@google.com \
    --cc=amoorthy@google.com \
    --cc=axelrasmussen@google.com \
    --cc=bgardon@google.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=jthoughton@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=ricarkol@google.com \
    --cc=robert.hoo.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.