From: Sean Christopherson <seanjc@google.com>
To: Vishal Annapurve <vannapurve@google.com>
Cc: Ackerley Tng <ackerleytng@google.com>,
David Hildenbrand <david@redhat.com>,
Patrick Roy <patrick.roy@linux.dev>,
Fuad Tabba <tabba@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Nikita Kalyazin <kalyazin@amazon.co.uk>,
shivankg@amd.com
Subject: Re: [PATCH 1/6] KVM: guest_memfd: Add DEFAULT_SHARED flag, reject user page faults if not set
Date: Wed, 1 Oct 2025 09:15:32 -0700
Message-ID: <aN1TgRpde5hq_FPn@google.com>
In-Reply-To: <CAGtprH_JgWfr2wPGpJg_mY5Sxf6E0dp5r-_4aVLi96To2pugXA@mail.gmail.com>

On Wed, Oct 01, 2025, Vishal Annapurve wrote:
> On Mon, Sep 29, 2025 at 5:15 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Oh! This got me looking at kvm_arch_supports_gmem_mmap() and thus
> > KVM_CAP_GUEST_MEMFD_MMAP. Two things:
> >
> > 1. We should change KVM_CAP_GUEST_MEMFD_MMAP into KVM_CAP_GUEST_MEMFD_FLAGS so
> > that we don't need to add a capability every time a new flag comes along,
> > and so that userspace can gather all flags in a single ioctl. If gmem ever
> > supports more than 32 flags, we'll need KVM_CAP_GUEST_MEMFD_FLAGS2, but
> > that's a non-issue relatively speaking.
> >
>
> Guest_memfd capabilities don't necessarily translate into flags, so ideally:
> 1) There should be two caps, KVM_CAP_GUEST_MEMFD_FLAGS and
> KVM_CAP_GUEST_MEMFD_CAPS.

I'm not saying we can't have another GUEST_MEMFD capability or three; all I'm
saying is that for enumerating what flags can be passed to KVM_CREATE_GUEST_MEMFD,
KVM_CAP_GUEST_MEMFD_FLAGS is a better fit than a one-off KVM_CAP_GUEST_MEMFD_MMAP.

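For illustration only (this is not code from the series): a minimal sketch of
how userspace might consume such a capability, assuming the value reported for
KVM_CAP_GUEST_MEMFD_FLAGS is a bitmask with one bit per flag accepted by
KVM_CREATE_GUEST_MEMFD. The SKETCH_* bit assignments and the helper name are
hypothetical, and the actual query would go through ioctl(KVM_CHECK_EXTENSION).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define SKETCH_GMEM_FLAG_MMAP           (1ull << 0)
#define SKETCH_GMEM_FLAG_DEFAULT_SHARED (1ull << 1)
#define SKETCH_GMEM_KNOWN_FLAGS \
	(SKETCH_GMEM_FLAG_MMAP | SKETCH_GMEM_FLAG_DEFAULT_SHARED)

/* A single 32-bit capability value can only describe flags 0..31. */
static_assert(SKETCH_GMEM_KNOWN_FLAGS <= UINT32_MAX,
	      "known flags must fit in one 32-bit capability value");

/*
 * 'supported' stands in for the bitmask that a KVM_CHECK_EXTENSION query
 * on KVM_CAP_GUEST_MEMFD_FLAGS would report (assumed: one bit per flag).
 * Returns true if every bit in 'wanted' is advertised as supported.
 */
bool gmem_flags_supported(uint32_t supported, uint64_t wanted)
{
	return (wanted & ~(uint64_t)supported) == 0;
}
```

With supported = 0x1 (only the first flag advertised), a request that includes
SKETCH_GMEM_FLAG_DEFAULT_SHARED would be rejected in userspace before it ever
reaches KVM_CREATE_GUEST_MEMFD.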
> 2) IMO they should both support namespace of 64 values at least from the get go.

The 32-bit ceiling comes from KVM_CHECK_EXTENSION, and from all of KVM's
plumbing for ioctls.  Because KVM still supports 32-bit architectures, direct
returns from ioctls are forced to fit in 32-bit values to avoid unintentionally
creating different ABI for 32-bit vs. 64-bit kernels.

We could add KVM_CHECK_EXTENSION2 or KVM_CHECK_EXTENSION64 or something, but I
honestly don't see the point.  The odds of guest_memfd supporting >32 flags are
small, and the odds of that happening in the next ~5 years are basically zero.
All so that userspace can make one syscall instead of two for a path that isn't
remotely performance critical.

So while I agree that being able to enumerate 64 flags from the get-go would be
nice to have, it's simply not worth the effort (unless someone has a clever idea).

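To make the "two syscalls instead of one" cost concrete, here is a rough sketch
of what userspace would do if a second capability ever became necessary.  It
assumes each query reports 32 flag bits; 'lo' stands in for the result of
KVM_CAP_GUEST_MEMFD_FLAGS and 'hi' for a KVM_CAP_GUEST_MEMFD_FLAGS2 that does
not exist today.  The helper name and macro are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of the flag mask reported by one capability query. */
#define SKETCH_BITS_PER_QUERY 32

static_assert(2 * SKETCH_BITS_PER_QUERY == 64,
	      "two queries cover a 64-bit flag space");

/*
 * Combine the results of two hypothetical 32-bit queries into one 64-bit
 * mask: 'lo' describes flags 0..31, 'hi' describes flags 32..63.
 */
uint64_t gmem_combine_flag_caps(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << SKETCH_BITS_PER_QUERY) | lo;
}
```

The extra query happens once, at setup time, which is why the second syscall
is not a meaningful cost.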
> 3) The reservation scheme for upstream should ideally be LSB's first
> for the new caps/flags.

We're getting way ahead of ourselves.  Nothing needs KVM_CAP_GUEST_MEMFD_CAPS at
this time, so there's nothing to discuss.

> guest_memfd will achieve multiple features in future, both upstream
> and in out-of-tree versions to deploy features before they make their

When it comes to upstream uAPI and uABI, out-of-tree kernel code is irrelevant.

> way upstream. Generally the scheme followed by out-of-tree versions is
> to define a custom UAPI that won't conflict with upstream UAPIs in
> near future. Having a namespace of 32 values gives little space to
> avoid the conflict, e.g. features like hugetlb support will have to
> eat up at least 5 bits from the flags [1].

Why on earth would out-of-tree code use KVM_CAP_GUEST_MEMFD_FLAGS?  Providing
infrastructure to support an infinite (quite literally) number of out-of-tree
capabilities and sub-ioctls, with practically zero chance of conflict, is not
difficult.  See internal b/378111418.

But as above, this is not upstream's problem to solve.

> [1] https://elixir.bootlin.com/linux/v6.17/source/include/uapi/asm-generic/hugetlb_encode.h#L20