From: Sean Christopherson <seanjc@google.com>
To: David Matlack <dmatlack@google.com>
Cc: Anish Moorthy <amoorthy@google.com>,
oliver.upton@linux.dev, kvm@vger.kernel.org,
kvmarm@lists.linux.dev, pbonzini@redhat.com, maz@kernel.org,
robert.hoo.linux@gmail.com, jthoughton@google.com,
ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com,
nadav.amit@gmail.com, isaku.yamahata@gmail.com,
kconsul@linux.vnet.ibm.com
Subject: Re: [PATCH v5 04/17] KVM: Add KVM_CAP_MEMORY_FAULT_INFO
Date: Mon, 16 Oct 2023 12:14:33 -0700
Message-ID: <ZS2LmY4BnOM8vP2C@google.com>
In-Reply-To: <CALzav=fX+cCXQBXhxvRx0KZvHP=GdbP88Kvk9pnx=Ndsf9awEw@mail.gmail.com>

On Mon, Oct 16, 2023, David Matlack wrote:
> On Tue, Oct 10, 2023 at 4:40 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Oct 10, 2023, David Matlack wrote:
> > > On Fri, Sep 8, 2023 at 3:30 PM Anish Moorthy <amoorthy@google.com> wrote:
> > > >
> > > > +::
> > > > + union {
> > > > + /* KVM_SPEC_EXIT_MEMORY_FAULT */
> > > > + struct {
> > > > + __u64 flags;
> > > > + __u64 gpa;
> > > > + __u64 len; /* in bytes */
> > >
> > > I wonder if `gpa` and `len` should just be replaced with `gfn`.
> > >
> > > - We don't seem to care about returning an exact `gpa` out to
> > > userspace since this series just returns gpa = gfn * PAGE_SIZE out to
> > > userspace.
> > > - The len we return seems kind of arbitrary. PAGE_SIZE on x86 and
> > > vma_pagesize on ARM64. But at the end of the day we're not asking the
> > > kernel to fault in any specific length of mapping. We're just asking
> > > for gfn-to-pfn for a specific gfn.
> > > - I'm not sure userspace will want to do anything with this information.
> >
> > Extending ABI is tricky. E.g. if a use case comes along that needs/wants to
> > return a range, then we'd need to add a flag and also update userspace to actually
> > do the right thing.
> >
> > The page fault path doesn't need such information because hardware gives a very
> > precise faulting address. But if we ever get to a point where KVM provides info
> > for uaccess failures, then we'll likely want to provide the range. E.g. if a
> > uaccess splits a page, on x86, we'd either need to register our own exception
> > fixup and use custom uaccess macros (eww), or convince the world that
> > ex_handler_uaccess() and all of the uaccess macros need to be extended to
> > provide the exact address that failed.
>
> I wonder if userspace might need a precise fault address in some
> situations? e.g. If KVM returns -EHWPOISON for an access that spans a
> page boundary, userspace won't know which is poisoned.

As things currently stand, the -EHWPOISON case is guaranteed to be precise because
uaccess failures only ever return -EFAULT. The resulting BUS_MCEERR_AR from the
kernel's #MC handler will provide the necessary precision to userspace.

Though even if -EHWPOISON were imprecise, userspace should be able to figure out
which page is poisoned, e.g. by probing each possible page (gross, but doable).
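
To illustrate the probing idea, here is a rough sketch (not code from this
series) of how userspace could turn a reported gpa+len into the set of pages it
would have to probe. The struct and helper names are hypothetical, and 4 KiB
guest pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB guest pages, as on x86. */
#define GUEST_PAGE_SHIFT 12

/* Hypothetical mirror of the gpa/len pair in the quoted exit struct. */
struct fault_range {
	uint64_t gpa;
	uint64_t len; /* in bytes */
};

/*
 * Compute the first guest frame number touched by [gpa, gpa + len) and the
 * number of pages the range spans. Returns false for an empty range or one
 * that wraps around the 64-bit address space.
 */
static bool fault_range_pages(const struct fault_range *r,
			      uint64_t *first_gfn, uint64_t *nr_pages)
{
	uint64_t last;

	if (r->len == 0 || r->gpa + r->len - 1 < r->gpa)
		return false;

	last = r->gpa + r->len - 1;
	*first_gfn = r->gpa >> GUEST_PAGE_SHIFT;
	*nr_pages = (last >> GUEST_PAGE_SHIFT) - *first_gfn + 1;
	assert(*nr_pages >= 1);
	return true;
}
```

E.g. a 16-byte access at gpa 0x1ff8 yields first_gfn = 1 and nr_pages = 2, so
userspace would have two candidate pages to probe.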

Ah, and a much more concrete reason to report gpa+len is that it's possible that
KVM may someday support faults at sub-page granularity, e.g. if something like
HEKI[*] wants to use Intel's Sub-Page Write Permissions to make a minimal amount
of guest code writable when the guest kernel is doing code patching.

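
As a purely illustrative sketch of why a bare gfn couldn't express that, the
helper below maps a gpa+len pair onto the sub-pages it touches within one page.
The helper name is made up, and the 128-byte sub-page size and 4 KiB page size
are assumptions for the example, not something this series defines.

```c
#include <stdint.h>

#define GUEST_PAGE_SIZE UINT64_C(4096) /* assumed guest page size */
#define SUBPAGE_SIZE    UINT64_C(128)  /* assumed sub-page granularity */

/*
 * Return a bitmask of the 128-byte sub-pages within a single 4 KiB page that
 * [gpa, gpa + len) touches; bit i covers bytes [i * 128, (i + 1) * 128).
 * Returns 0 if the range is empty or is not contained in one page.
 */
static uint32_t subpage_mask(uint64_t gpa, uint64_t len)
{
	uint64_t off = gpa % GUEST_PAGE_SIZE;
	uint64_t first, last, i;
	uint32_t mask = 0;

	if (len == 0 || len > GUEST_PAGE_SIZE - off)
		return 0;

	first = off / SUBPAGE_SIZE;
	last = (off + len - 1) / SUBPAGE_SIZE;
	for (i = first; i <= last; i++)
		mask |= UINT32_C(1) << i;
	return mask;
}
```

With only a gfn, every fault in the page would collapse to the same value; with
gpa+len, a 256-byte fault at offset 128 is distinguishable from a full-page one.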
> Maybe SNP/TDX need precise fault addresses as well? I don't know enough about
> how SNP and TDX plan to use this UAPI.

FWIW, SNP and TDX usage are limited to the KVM page fault path, i.e. always do
precise, single-page reporting.

[*] https://lore.kernel.org/all/20230505152046.6575-1-mic@digikod.net