linux-fsdevel.vger.kernel.org archive mirror
From: Fuad Tabba <tabba@google.com>
To: Michael Roth <michael.roth@amd.com>
Cc: Vishal Annapurve <vannapurve@google.com>,
	Ackerley Tng <ackerleytng@google.com>,
	kvm@vger.kernel.org,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	 linux-fsdevel@vger.kernel.org, aik@amd.com,
	ajones@ventanamicro.com,  akpm@linux-foundation.org,
	amoorthy@google.com, anthony.yznaga@oracle.com,
	 anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com,
	 binbin.wu@linux.intel.com, brauner@kernel.org,
	catalin.marinas@arm.com,  chao.p.peng@intel.com,
	chenhuacai@kernel.org, dave.hansen@intel.com,  david@redhat.com,
	dmatlack@google.com, dwmw@amazon.co.uk,  erdemaktas@google.com,
	fan.du@intel.com, fvdl@google.com, graf@amazon.com,
	 haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
	ira.weiny@intel.com,  isaku.yamahata@intel.com, jack@suse.cz,
	james.morse@arm.com,  jarkko@kernel.org, jgg@ziepe.ca,
	jgowans@amazon.com, jhubbard@nvidia.com,  jroedel@suse.de,
	jthoughton@google.com, jun.miao@intel.com,  kai.huang@intel.com,
	keirf@google.com, kent.overstreet@linux.dev,
	 kirill.shutemov@intel.com, liam.merwick@oracle.com,
	 maciej.wieczor-retman@intel.com, mail@maciej.szmigiero.name,
	maz@kernel.org,  mic@digikod.net, mpe@ellerman.id.au,
	muchun.song@linux.dev, nikunj@amd.com,  nsaenz@amazon.es,
	oliver.upton@linux.dev, palmer@dabbelt.com,
	 pankaj.gupta@amd.com, paul.walmsley@sifive.com,
	pbonzini@redhat.com,  pdurrant@amazon.co.uk, peterx@redhat.com,
	pgonda@google.com, pvorel@suse.cz,  qperret@google.com,
	quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
	 quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
	quic_pheragu@quicinc.com,  quic_svaddagi@quicinc.com,
	quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
	 rick.p.edgecombe@intel.com, rientjes@google.com,
	roypat@amazon.co.uk,  rppt@kernel.org, seanjc@google.com,
	shuah@kernel.org, steven.price@arm.com,
	 steven.sistare@oracle.com, suzuki.poulose@arm.com,
	thomas.lendacky@amd.com,  usama.arif@bytedance.com,
	vbabka@suse.cz, viro@zeniv.linux.org.uk,  vkuznets@redhat.com,
	wei.w.wang@intel.com, will@kernel.org,  willy@infradead.org,
	xiaoyao.li@intel.com, yan.y.zhao@intel.com,  yilun.xu@intel.com,
	yuzenghui@huawei.com, zhiquan1.li@intel.com
Subject: Re: [RFC PATCH v2 02/51] KVM: guest_memfd: Introduce and use shareability to guard faulting
Date: Tue, 12 Aug 2025 09:23:52 +0100	[thread overview]
Message-ID: <CA+EHjTxZO-1nvDhxM7oBdpgrVq2NcgKrGvrCoiPqX4NPWGvt4w@mail.gmail.com> (raw)
In-Reply-To: <20250703041210.uc4ygp4clqw2h6yd@amd.com>

Hi,

On Thu, 3 Jul 2025 at 05:12, Michael Roth <michael.roth@amd.com> wrote:
>
> On Wed, Jul 02, 2025 at 05:46:23PM -0700, Vishal Annapurve wrote:
> > On Wed, Jul 2, 2025 at 4:25 PM Michael Roth <michael.roth@amd.com> wrote:
> > >
> > > On Wed, Jun 11, 2025 at 02:51:38PM -0700, Ackerley Tng wrote:
> > > > Michael Roth <michael.roth@amd.com> writes:
> > > >
> > > > > On Wed, May 14, 2025 at 04:41:41PM -0700, Ackerley Tng wrote:
> > > > >> Track guest_memfd memory's shareability status within the inode as
> > > > >> opposed to the file, since it is property of the guest_memfd's memory
> > > > >> contents.
> > > > >>
> > > > >> Shareability is a property of the memory and is indexed using the
> > > > >> page's index in the inode. Because shareability is the memory's
> > > > >> property, it is stored within guest_memfd instead of within KVM, like
> > > > >> in kvm->mem_attr_array.
> > > > >>
> > > > >> KVM_MEMORY_ATTRIBUTE_PRIVATE in kvm->mem_attr_array must still be
> > > > >> retained to allow VMs to only use guest_memfd for private memory and
> > > > >> some other memory for shared memory.
> > > > >>
> > > > >> Not all use cases require guest_memfd() to be shared with the host
> > > > >> when first created. Add a new flag, GUEST_MEMFD_FLAG_INIT_PRIVATE,
> > > > >> which when set on KVM_CREATE_GUEST_MEMFD, initializes the memory as
> > > > >> private to the guest, and therefore not mappable by the
> > > > >> host. Otherwise, memory is shared until explicitly converted to
> > > > >> private.
> > > > >>
> > > > >> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> > > > >> Co-developed-by: Vishal Annapurve <vannapurve@google.com>
> > > > >> Signed-off-by: Vishal Annapurve <vannapurve@google.com>
> > > > >> Co-developed-by: Fuad Tabba <tabba@google.com>
> > > > >> Signed-off-by: Fuad Tabba <tabba@google.com>
> > > > >> Change-Id: If03609cbab3ad1564685c85bdba6dcbb6b240c0f
> > > > >> ---
> > > > >>  Documentation/virt/kvm/api.rst |   5 ++
> > > > >>  include/uapi/linux/kvm.h       |   2 +
> > > > >>  virt/kvm/guest_memfd.c         | 124 ++++++++++++++++++++++++++++++++-
> > > > >>  3 files changed, 129 insertions(+), 2 deletions(-)
> > > > >>
> > > > >> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > > >> index 86f74ce7f12a..f609337ae1c2 100644
> > > > >> --- a/Documentation/virt/kvm/api.rst
> > > > >> +++ b/Documentation/virt/kvm/api.rst
> > > > >> @@ -6408,6 +6408,11 @@ belonging to the slot via its userspace_addr.
> > > > >>  The use of GUEST_MEMFD_FLAG_SUPPORT_SHARED will not be allowed for CoCo VMs.
> > > > >>  This is validated when the guest_memfd instance is bound to the VM.
> > > > >>
> > > > >> +If the capability KVM_CAP_GMEM_CONVERSIONS is supported, then the 'flags' field
> > > > >> +supports GUEST_MEMFD_FLAG_INIT_PRIVATE.  Setting GUEST_MEMFD_FLAG_INIT_PRIVATE
> > > > >> +will initialize the memory for the guest_memfd as guest-only and not faultable
> > > > >> +by the host.
> > > > >> +
> > > > >
> > > > > KVM_CAP_GMEM_CONVERSION doesn't get introduced until later, so it seems
> > > > > like this flag should be deferred until that patch is in place. Is it
> > > > > really needed at that point though? Userspace would be able to set the
> > > > > initial state via KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls.
> > > > >
> > > >
> > > > I can move this change to the later patch. Thanks! Will fix in the next
> > > > revision.
> > > >
> > > > > The mtree contents seem to get stored in the same manner in either case so
> > > > > performance-wise only the overhead of a few userspace<->kernel switches
> > > > > would be saved. Are there any other reasons?
> > > > >
> > > > > Otherwise, maybe just settle on SHARED as a documented default (since at
> > > > > least non-CoCo VMs would be able to reliably benefit) and let
> > > > > CoCo/GUEST_MEMFD_FLAG_SUPPORT_SHARED VMs set PRIVATE at whatever
> > > > > granularity makes sense for the architecture/guest configuration.
> > > > >
> > > >
> > > > Because shared pages are split once any memory is allocated, having a
> > > > way to INIT_PRIVATE could avoid the split and then merge on
> > > > conversion. I feel that is enough value to justify this config flag;
> > > > what do you think?
> > > >
> > > > I guess we could also have userspace be careful not to do any allocation
> > > > before converting.
>
> (Re-visiting this with the assumption that we *don't* intend to use mmap() to
> populate memory (in which case you can pretty much ignore my previous
> response))
>
> I'm still not sure where the INIT_PRIVATE flag comes into play. For SNP,
> userspace already defaults to marking everything private pretty close to
> guest_memfd creation time, so the potential for allocations to occur
> in-between seems small, but worth confirming.
>
> But I know in the past there was a desire to ensure TDX/SNP could
> support pre-allocating guest_memfd memory (and even pre-faulting via
> KVM_PRE_FAULT_MEMORY), but I think that could still work right? The
> fallocate() handling could still avoid the split if the whole hugepage
> is private, though there is a bit more potential for that fallocate()
> to happen before userspace does the "manual" shared->private
> conversion. I'll double-check on that aspect, but otherwise, is there
> still any other need for it?

It's not just about performance. I think the need is more a matter of
having a consistent API across the hypervisors that guest_memfd is
going to support. Memory in guest_memfd is shared by default, but in
pKVM for example, it's private by default. Therefore, it would be good
to have a way to ensure that all guest_memfd allocations can be made
private from the get-go.
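As a rough illustration (a sketch based on the flag values in this patch;
the struct and flag names may well change before merge), userspace wanting
private-from-creation memory would build its arguments something like:

```c
#include <assert.h>
#include <stdint.h>

/* Flag values taken from the patch under review; subject to change. */
#define GUEST_MEMFD_FLAG_SUPPORT_SHARED   (1UL << 0)
#define GUEST_MEMFD_FLAG_INIT_PRIVATE     (1UL << 1)

/* Stand-in for struct kvm_create_guest_memfd from <linux/kvm.h>. */
struct kvm_create_guest_memfd_args {
	uint64_t size;
	uint64_t flags;
};

/* Build the arguments for a guest_memfd whose memory starts private. */
static struct kvm_create_guest_memfd_args
make_private_gmem_args(uint64_t size)
{
	struct kvm_create_guest_memfd_args args = {
		.size  = size,
		.flags = GUEST_MEMFD_FLAG_SUPPORT_SHARED |
			 GUEST_MEMFD_FLAG_INIT_PRIVATE,
	};
	return args;
}
```

In a real VMM this would then go to ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, ...);
note that per the kvm_gmem_create() hunk quoted further down, INIT_PRIVATE is
only accepted when SUPPORT_SHARED is also set.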

Cheers,
/fuad

> > >
> > > I assume we do want to support things like preallocating guest memory, so
> > > I'm not sure this approach is feasible for avoiding splits.
> > >
> > > But I feel like we might be working around a deeper issue here, which is
> > > that we are pre-emptively splitting anything that *could* be mapped into
> > > userspace (i.e. allocated+shared/mixed), rather than splitting only when
> > > necessary.
> > >
> > > I know that was the plan laid out in the guest_memfd calls, but I've run
> > > into a couple instances that have me thinking we should revisit this.
> > >
> > > 1) Some of the recent guest_memfd work seems to be gravitating towards
> > >    having userspace populate/initialize the guest memory payload prior
> > >    to boot via mmap()'ing the shared guest_memfd pages so things work
> > >    the same as they would for a normal VM's memory payload (rather
> > >    than relying on back-channels in the kernel to copy user data into
> > >    guest_memfd pages).
> > >
> > >    When you do this though, for an SNP guest at least, that memory
> > >    acceptance is done in chunks of 4MB (with accept_memory=lazy), and
> > >    because that will put each 1GB page into an allocated+mixed state,
> >
> > I would like your help in understanding why we need to start
> > guest_memfd ranges as shared for SNP guests. guest_memfd ranges being
> > private should simply mean that certain ranges are not faultable by
> > userspace.
>
> It's seeming like I probably misremembered, but I thought there was a
> discussion on the guest_memfd call a month (or so?) ago about whether to
> continue to use backchannels to populate guest_memfd pages prior to
> launch. It was in the context of whether to keep using kvm_gmem_populate()
> for populating guest_memfd pages by copying them in from a separate
> userspace buffer vs. simply populating them directly from userspace.
> I thought we were leaning toward the latter since it was simpler all-around,
> which is great for SNP since that is already how it populates memory: by
> writing to it from userspace, which kvm_gmem_populate() then copies into
> guest_memfd pages. With shared gmem support, we just skip the latter now
> in the kernel rather than needing changes to how userspace handles things in
> that regard. But maybe that was just wishful thinking :)
>
> But you raise some very compelling points on why this might not be a
> good idea even if that was how that discussion went.
>
> >
> > Will following work?
> > 1) Userspace starts all guest_memfd ranges as private.
> > 2) During early guest boot it starts issuing PSC requests for
> > converting memory from shared to private
> >     -> KVM forwards this request to userspace
> >     -> Userspace checks that the pages are already private and simply
> > does nothing.
> > 3) Pvalidate from guest on that memory will result in guest_memfd
> > offset query which will cause the RMP table entries to actually get
> > populated.
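Step 2 above, where userspace treats a shared->private request for an
already-private range as a no-op, could be modeled in miniature like this
(all names here are hypothetical illustration, not actual VMM code):

```c
#include <assert.h>
#include <stddef.h>

enum range_state { STATE_SHARED, STATE_PRIVATE };

/* Toy per-page state table standing in for the VMM's tracking. */
struct gmem_tracker {
	enum range_state *state;
	size_t npages;
};

/*
 * Handle a guest PSC request to make pages [start, start+n) private.
 * Returns the number of pages whose state actually changed; if the
 * whole range is already private (step 2 in the flow above), that is
 * 0 and the request is effectively a no-op.
 */
static size_t convert_to_private(struct gmem_tracker *t, size_t start, size_t n)
{
	size_t changed = 0;

	for (size_t i = start; i < start + n && i < t->npages; i++) {
		if (t->state[i] != STATE_PRIVATE) {
			t->state[i] = STATE_PRIVATE;
			changed++;
		}
	}
	return changed;
}
```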
>
> That would work, but there will need to be changes on userspace to deal
> with how SNP populates memory pre-boot just like normal VMs do. We will
> instead need to copy that data into separate buffers, and pass those in
> as the buffer hva instead of the shared hva corresponding to that GPA.
>
> But that seems reasonable if it avoids so many other problems.
>
> >
> > >    we end up splitting every 1GB to 4K and the guest can't even
> > >    accept/PVALIDATE it 2MB at that point even if userspace doesn't touch
> > >    anything in the range. At some point the guest will convert/accept
> > >    the entire range, at which point we could merge, but for SNP we'd
> > >    need guest cooperation to actually use higher granularity in stage2
> > >    page tables at that point since RMP entries are effectively all split
> > >    to 4K.
> > >
> > >    I understand the intent is to default to private where this wouldn't
> > >    be an issue, and we could punt to userspace to deal with it, but it
> > >    feels like an artificial restriction to place on userspace. And if we
> > >    do want to allow/expect guest_memfd contents to be initialized pre-boot
> > >    just like normal memory, then userspace would need to jump through
> > >    some hoops:
> > >
> > >    - if defaulting to private: add hooks to convert each range that's being
> > >      modified to a shared state prior to writing to it
> >
> > Why is that a problem?
>
> These were only problems if we went the above-mentioned way of
> populating memory pre-boot via mmap() instead of other backchannels. If
> we don't do that, then both these things cease to be problems. Sounds good
> to me. :)
>
> >
> > >    - if defaulting to shared: initialize memory in-place, then convert
> > >      everything else to private to avoid unnecessarily splitting folios
> > >      at run-time
> > >
> > >    It feels like implementation details are bleeding out into the API
> > >    to some degree here (e.g. we'd probably at least need to document
> > >    this so users know how to take proper advantage of hugepage support).
> >
> > Does it make sense to keep the default behavior as INIT_PRIVATE for
> > SNP VMs always, even without using hugepages?
>
> Yes!
>
> Though, revisiting discussion around INIT_PRIVATE (without the baggage
> of potentially relying on mmap() to populate memory), I'm still not sure why
> it's needed. I responded in the context of Ackerley's initial reply
> above.
>
> >
> > >
> > > 2) There are some use-cases for HugeTLB + CoCo that have come to my
> > >    attention recently that put a lot of weight on still being able to
> > >    maximize mapping/hugepage size when accessing shared mem from userspace,
> > >    e.g. for certain DPDK workloads that access shared guest buffers
> > >    from host userspace. We don't really have a story for this, and I
> > >    wouldn't expect us to at this stage, but I think it ties into #1 so
> > >    might be worth considering in that context.
> >
> > Major problem I see here is that if anything in the kernel does a GUP
> > on shared memory ranges (which is very likely to happen), it would be
> > difficult to get them to let go of the whole hugepage before it can be
> > split safely.
> >
> > Another problem is that guest_memfd today doesn't support management of
> > large userspace page table mappings; supporting this can turn out to be
> > significant work, comparable to hugetlb's page table management
> > logic.
>
> Yah that was more line-of-sight that might be possible by going this
> route, but the refcount'ing issue above is a showstopper as always. I'd
> somehow convinced myself that supporting fine-grained splitting somehow
> worked around it, but you still have no idea what page you need to avoid
> converting and fancy splitting doesn't get you past that. More wishful
> thinking. =\
>
> Thanks,
>
> Mike
>
> >
> > >
> > > I'm still fine with the current approach as a starting point, but I'm
> > > wondering if improving both #1/#2 might not be so bad and maybe even
> > > give us some more flexibility (for instance, Sean had mentioned leaving
> > > open the option of tracking more than just shareability/mappability, and
> > > if there is split/merge logic associated with those transitions then
> > > re-scanning each of these attributes for a 1G range seems like it could
> > > benefit from some sort of intermediate data structure to help determine
> > > things like what mapping granularity is available for guest/userspace
> > > for a particular range.)
> > >
> > > One approach I was thinking of was that we introduce a data structure
> > > similar to KVM's memslot->arch.lpage_info() where we store information
> > > about what 1G/2M ranges are shared/private/mixed, and then instead of
> > > splitting ahead of time we just record that state into this data
> > > structure (using the same write lock as with the
> > > shareability/mappability state), and then at *fault* time we split the
> > > folio if our lpage_info-like data structure says the range is mixed.
> > >
> > > Then, if the guest converts a 2M/4M range to private while lazily accepting
> > > (for instance), we can still keep the folio intact as 1GB, but mark
> > > the 1G range in the lpage_info-like data structure as mixed so that we
> > > still inform KVM/etc. they need to map it as 2MB or lower in stage2
> > > page tables. In that case, even at guest fault-time, we can leave the
> > > folio unsplit until userspace tries to touch it (though in most cases
> > > it never will and we can keep most of the guest's 1G intact for the
> > > duration of its lifetime).
> > >
> > > On the userspace side, another nice thing there is if we see 1G is in a
> > > mixed state, but 2M is all-shared, then we can still leave the folio as 2M,
> > > and I think the refcount'ing logic would still work for the most part,
> > > which makes #2 a bit easier to implement as well.
> > >
> > > And of course, we wouldn't need the INIT_PRIVATE then since we are only
> > > splitting when necessary.
> > >
> > > But I guess this all comes down to how much extra pain there is in
> > > tracking a 1G folio that's been split into a mix of 2MB/4K regions,
> > > but I think we'd get a lot more mileage out of getting that working and
> > > just completely stripping out all of the merging logic for initial
> > > implementation (other than at cleanup time), so maybe complexity-wise
> > > it balances out a bit?
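To make the lpage_info-like idea a bit more concrete, here's a minimal
userspace model of the mixed-state tracking (hypothetical names and a
simplified state model; the real structure would live in guest_memfd
alongside the shareability tree and follow KVM's lpage_info conventions):

```c
#include <assert.h>
#include <stdbool.h>

#define GRANULES_PER_1G 512 /* 512 x 2M granules = 1G */

enum granule_state { G_SHARED, G_PRIVATE };

/* Per-1G record of the shareability of each 2M granule. */
struct lpage_like_info {
	enum granule_state state[GRANULES_PER_1G];
};

/* A 1G range is "mixed" if its 2M granules disagree on shareability. */
static bool range_is_mixed(const struct lpage_like_info *info)
{
	for (int i = 1; i < GRANULES_PER_1G; i++)
		if (info->state[i] != info->state[0])
			return true;
	return false;
}

/*
 * Largest mapping granularity for the range: 1G if the whole range
 * agrees, else 2M. Actually splitting the backing folio down to 4K
 * would be deferred to fault time, when userspace touches a shared
 * granule.
 */
static unsigned long max_map_size(const struct lpage_like_info *info)
{
	return range_is_mixed(info) ? (2UL << 20) : (1UL << 30);
}
```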
> > >
> > > Thanks,
> > >
> > > Mike
> > >
> > > >
> > > > >>  See KVM_SET_USER_MEMORY_REGION2 for additional details.
> > > > >>
> > > > >>  4.143 KVM_PRE_FAULT_MEMORY
> > > > >> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > > > >> index 4cc824a3a7c9..d7df312479aa 100644
> > > > >> --- a/include/uapi/linux/kvm.h
> > > > >> +++ b/include/uapi/linux/kvm.h
> > > > >> @@ -1567,7 +1567,9 @@ struct kvm_memory_attributes {
> > > > >>  #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
> > > > >>
> > > > >>  #define KVM_CREATE_GUEST_MEMFD    _IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
> > > > >> +
> > > > >>  #define GUEST_MEMFD_FLAG_SUPPORT_SHARED   (1UL << 0)
> > > > >> +#define GUEST_MEMFD_FLAG_INIT_PRIVATE     (1UL << 1)
> > > > >>
> > > > >>  struct kvm_create_guest_memfd {
> > > > >>    __u64 size;
> > > > >> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > > > >> index 239d0f13dcc1..590932499eba 100644
> > > > >> --- a/virt/kvm/guest_memfd.c
> > > > >> +++ b/virt/kvm/guest_memfd.c
> > > > >> @@ -4,6 +4,7 @@
> > > > >>  #include <linux/falloc.h>
> > > > >>  #include <linux/fs.h>
> > > > >>  #include <linux/kvm_host.h>
> > > > >> +#include <linux/maple_tree.h>
> > > > >>  #include <linux/pseudo_fs.h>
> > > > >>  #include <linux/pagemap.h>
> > > > >>
> > > > >> @@ -17,6 +18,24 @@ struct kvm_gmem {
> > > > >>    struct list_head entry;
> > > > >>  };
> > > > >>
> > > > >> +struct kvm_gmem_inode_private {
> > > > >> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > > > >> +  struct maple_tree shareability;
> > > > >> +#endif
> > > > >> +};
> > > > >> +
> > > > >> +enum shareability {
> > > > >> +  SHAREABILITY_GUEST = 1, /* Only the guest can map (fault) folios in this range. */
> > > > >> +  SHAREABILITY_ALL = 2,   /* Both guest and host can fault folios in this range. */
> > > > >> +};
> > > > >> +
> > > > >> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index);
> > > > >> +
> > > > >> +static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
> > > > >> +{
> > > > >> +  return inode->i_mapping->i_private_data;
> > > > >> +}
> > > > >> +
> > > > >>  /**
> > > > >>   * folio_file_pfn - like folio_file_page, but return a pfn.
> > > > >>   * @folio: The folio which contains this index.
> > > > >> @@ -29,6 +48,58 @@ static inline kvm_pfn_t folio_file_pfn(struct folio *folio, pgoff_t index)
> > > > >>    return folio_pfn(folio) + (index & (folio_nr_pages(folio) - 1));
> > > > >>  }
> > > > >>
> > > > >> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > > > >> +
> > > > >> +static int kvm_gmem_shareability_setup(struct kvm_gmem_inode_private *private,
> > > > >> +                                loff_t size, u64 flags)
> > > > >> +{
> > > > >> +  enum shareability m;
> > > > >> +  pgoff_t last;
> > > > >> +
> > > > >> +  last = (size >> PAGE_SHIFT) - 1;
> > > > >> +  m = flags & GUEST_MEMFD_FLAG_INIT_PRIVATE ? SHAREABILITY_GUEST :
> > > > >> +                                              SHAREABILITY_ALL;
> > > > >> +  return mtree_store_range(&private->shareability, 0, last, xa_mk_value(m),
> > > > >> +                           GFP_KERNEL);
> > > > >
> > > > > One really nice thing about using a maple tree is that it should get rid
> > > > > of a fairly significant startup delay for SNP/TDX when the entire xarray gets
> > > > > initialized with private attribute entries via KVM_SET_MEMORY_ATTRIBUTES
> > > > > (which is the current QEMU default behavior).
> > > > >
> > > > > I'd originally advocated for sticking with the xarray implementation Fuad was
> > > > > using until we'd determined we really need it for HugeTLB support, but I'm
> > > > > sort of thinking it's already justified just based on the above.
> > > > >
> > > > > Maybe it would make sense for KVM memory attributes too?
> > > > >
> > > > >> +}
> > > > >> +
> > > > >> +static enum shareability kvm_gmem_shareability_get(struct inode *inode,
> > > > >> +                                           pgoff_t index)
> > > > >> +{
> > > > >> +  struct maple_tree *mt;
> > > > >> +  void *entry;
> > > > >> +
> > > > >> +  mt = &kvm_gmem_private(inode)->shareability;
> > > > >> +  entry = mtree_load(mt, index);
> > > > >> +  WARN(!entry,
> > > > >> +       "Shareability should always be defined for all indices in inode.");
> > > > >> +
> > > > >> +  return xa_to_value(entry);
> > > > >> +}
> > > > >> +
> > > > >> +static struct folio *kvm_gmem_get_shared_folio(struct inode *inode, pgoff_t index)
> > > > >> +{
> > > > >> +  if (kvm_gmem_shareability_get(inode, index) != SHAREABILITY_ALL)
> > > > >> +          return ERR_PTR(-EACCES);
> > > > >> +
> > > > >> +  return kvm_gmem_get_folio(inode, index);
> > > > >> +}
> > > > >> +
> > > > >> +#else
> > > > >> +
> > > > >> +static int kvm_gmem_shareability_setup(struct maple_tree *mt, loff_t size, u64 flags)
> > > > >> +{
> > > > >> +  return 0;
> > > > >> +}
> > > > >> +
> > > > >> +static inline struct folio *kvm_gmem_get_shared_folio(struct inode *inode, pgoff_t index)
> > > > >> +{
> > > > >> +  WARN_ONCE("Unexpected call to get shared folio.")
> > > > >> +  return NULL;
> > > > >> +}
> > > > >> +
> > > > >> +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> > > > >> +
> > > > >>  static int __kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot,
> > > > >>                                pgoff_t index, struct folio *folio)
> > > > >>  {
> > > > >> @@ -333,7 +404,7 @@ static vm_fault_t kvm_gmem_fault_shared(struct vm_fault *vmf)
> > > > >>
> > > > >>    filemap_invalidate_lock_shared(inode->i_mapping);
> > > > >>
> > > > >> -  folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > > > >> +  folio = kvm_gmem_get_shared_folio(inode, vmf->pgoff);
> > > > >>    if (IS_ERR(folio)) {
> > > > >>            int err = PTR_ERR(folio);
> > > > >>
> > > > >> @@ -420,8 +491,33 @@ static struct file_operations kvm_gmem_fops = {
> > > > >>    .fallocate      = kvm_gmem_fallocate,
> > > > >>  };
> > > > >>
> > > > >> +static void kvm_gmem_free_inode(struct inode *inode)
> > > > >> +{
> > > > >> +  struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
> > > > >> +
> > > > >> +  kfree(private);
> > > > >> +
> > > > >> +  free_inode_nonrcu(inode);
> > > > >> +}
> > > > >> +
> > > > >> +static void kvm_gmem_destroy_inode(struct inode *inode)
> > > > >> +{
> > > > >> +  struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
> > > > >> +
> > > > >> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > > > >> +  /*
> > > > >> +   * mtree_destroy() can't be used within rcu callback, hence can't be
> > > > >> +   * done in ->free_inode().
> > > > >> +   */
> > > > >> +  if (private)
> > > > >> +          mtree_destroy(&private->shareability);
> > > > >> +#endif
> > > > >> +}
> > > > >> +
> > > > >>  static const struct super_operations kvm_gmem_super_operations = {
> > > > >>    .statfs         = simple_statfs,
> > > > >> +  .destroy_inode  = kvm_gmem_destroy_inode,
> > > > >> +  .free_inode     = kvm_gmem_free_inode,
> > > > >>  };
> > > > >>
> > > > >>  static int kvm_gmem_init_fs_context(struct fs_context *fc)
> > > > >> @@ -549,12 +645,26 @@ static const struct inode_operations kvm_gmem_iops = {
> > > > >>  static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
> > > > >>                                                  loff_t size, u64 flags)
> > > > >>  {
> > > > >> +  struct kvm_gmem_inode_private *private;
> > > > >>    struct inode *inode;
> > > > >> +  int err;
> > > > >>
> > > > >>    inode = alloc_anon_secure_inode(kvm_gmem_mnt->mnt_sb, name);
> > > > >>    if (IS_ERR(inode))
> > > > >>            return inode;
> > > > >>
> > > > >> +  err = -ENOMEM;
> > > > >> +  private = kzalloc(sizeof(*private), GFP_KERNEL);
> > > > >> +  if (!private)
> > > > >> +          goto out;
> > > > >> +
> > > > >> +  mt_init(&private->shareability);
> > > > >> +  inode->i_mapping->i_private_data = private;
> > > > >> +
> > > > >> +  err = kvm_gmem_shareability_setup(private, size, flags);
> > > > >> +  if (err)
> > > > >> +          goto out;
> > > > >> +
> > > > >>    inode->i_private = (void *)(unsigned long)flags;
> > > > >>    inode->i_op = &kvm_gmem_iops;
> > > > >>    inode->i_mapping->a_ops = &kvm_gmem_aops;
> > > > >> @@ -566,6 +676,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
> > > > >>    WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
> > > > >>
> > > > >>    return inode;
> > > > >> +
> > > > >> +out:
> > > > >> +  iput(inode);
> > > > >> +
> > > > >> +  return ERR_PTR(err);
> > > > >>  }
> > > > >>
> > > > >>  static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
> > > > >> @@ -654,6 +769,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> > > > >>    if (kvm_arch_vm_supports_gmem_shared_mem(kvm))
> > > > >>            valid_flags |= GUEST_MEMFD_FLAG_SUPPORT_SHARED;
> > > > >>
> > > > >> +  if (flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED)
> > > > >> +          valid_flags |= GUEST_MEMFD_FLAG_INIT_PRIVATE;
> > > > >> +
> > > > >>    if (flags & ~valid_flags)
> > > > >>            return -EINVAL;
> > > > >>
> > > > >> @@ -842,6 +960,8 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> > > > >>    if (!file)
> > > > >>            return -EFAULT;
> > > > >>
> > > > >> +  filemap_invalidate_lock_shared(file_inode(file)->i_mapping);
> > > > >> +
> > > > >
> > > > > I like the idea of using a write-lock/read-lock to protect write/read access
> > > > > to shareability state (though maybe not necessarily re-using filemap's
> > > > > invalidate lock), it's simple and still allows concurrent faulting in of gmem
> > > > > pages. One issue on the SNP side (which also came up in one of the gmem calls)
> > > > > is that if we introduce support for tracking preparedness as discussed (e.g. via a
> > > > > new SHAREABILITY_GUEST_PREPARED state) the
> > > > > SHAREABILITY_GUEST->SHAREABILITY_GUEST_PREPARED transition would occur at
> > > > > fault-time, and so would need to take the write-lock and no longer allow for
> > > > > concurrent fault-handling.
> > > > >
> > > > > I was originally planning on introducing a new rw_semaphore with similar
> > > > > semantics to the rw_lock that Fuad previously had in his restricted mmap
> > > > > series[1] (and similar semantics to the filemap invalidate lock here). The main
> > > > > difference, to handle setting SHAREABILITY_GUEST_PREPARED within fault paths,
> > > > > was that in the case of a folio being present for an index, the folio lock would
> > > > > also need to be held in order to update the shareability state. Because
> > > > > of that, fault paths (which will always either have or allocate a folio
> > > > > basically) can rely on the folio lock to guard shareability state in a more
> > > > > granular way and so can avoid a global write lock.
> > > > >
> > > > > They would still need to hold the read lock to access the tree however.
> > > > > Or more specifically, any paths that could allocate a folio need to take
> > > > > a read lock so there isn't a TOCTOU situation where shareability is
> > > > > being updated for an index for which a folio hasn't been allocated, but
> > > > > then just afterward the folio gets faulted in/allocated while the
> > > > > shareability state is being updated under the assumption that
> > > > > there was no folio around that needed locking.
> > > > >
> > > > > I had a branch with in-place conversion support for SNP[2] that added this
> > > > > lock reworking on top of Fuad's series along with preparation tracking,
> > > > > but I'm now planning to rebase that on top of the patches from this
> > > > > series that Sean mentioned[3] earlier:
> > > > >
> > > > >   KVM: guest_memfd: Add CAP KVM_CAP_GMEM_CONVERSION
> > > > >   KVM: Query guest_memfd for private/shared status
> > > > >   KVM: guest_memfd: Skip LRU for guest_memfd folios
> > > > >   KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
> > > > >   KVM: guest_memfd: Introduce and use shareability to guard faulting
> > > > >   KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes
> > > > >
> > > > > but figured I'd mention it here in case there are other things to consider on
> > > > > the locking front.
> > > > >
> > > > > Definitely agree with Sean though that it would be nice to start identifying a
> > > > > common base of patches for the in-place conversion enablement for SNP, TDX, and
> > > > > pKVM so the APIs/interfaces for hugepages can be handled separately.
> > > > >
> > > > > -Mike
> > > > >
> > > > > [1] https://lore.kernel.org/kvm/20250328153133.3504118-1-tabba@google.com/
> > > > > [2] https://github.com/mdroth/linux/commits/mmap-swprot-v10-snp0-wip2/
> > > > > [3] https://lore.kernel.org/kvm/aC86OsU2HSFZkJP6@google.com/
> > > > >
> > > > >>    folio = __kvm_gmem_get_pfn(file, slot, index, pfn, &is_prepared, max_order);
> > > > >>    if (IS_ERR(folio)) {
> > > > >>            r = PTR_ERR(folio);
> > > > >> @@ -857,8 +977,8 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> > > > >>            *page = folio_file_page(folio, index);
> > > > >>    else
> > > > >>            folio_put(folio);
> > > > >> -
> > > > >>  out:
> > > > >> +  filemap_invalidate_unlock_shared(file_inode(file)->i_mapping);
> > > > >>    fput(file);
> > > > >>    return r;
> > > > >>  }
> > > > >> --
> > > > >> 2.49.0.1045.g170613ef41-goog
> > > > >>
> > > >


Thread overview: 231+ messages
2025-05-14 23:41 [RFC PATCH v2 00/51] 1G page support for guest_memfd Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 01/51] KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 02/51] KVM: guest_memfd: Introduce and use shareability to guard faulting Ackerley Tng
2025-05-27  3:54   ` Yan Zhao
2025-05-29 18:20     ` Ackerley Tng
2025-05-30  8:53     ` Fuad Tabba
2025-05-30 18:32       ` Ackerley Tng
2025-06-02  9:43         ` Fuad Tabba
2025-05-27  8:25   ` Binbin Wu
2025-05-27  8:43     ` Binbin Wu
2025-05-29 18:26     ` Ackerley Tng
2025-05-29 20:37       ` Ackerley Tng
2025-05-29  5:42   ` Michael Roth
2025-06-11 21:51     ` Ackerley Tng
2025-07-02 23:25       ` Michael Roth
2025-07-03  0:46         ` Vishal Annapurve
2025-07-03  0:52           ` Vishal Annapurve
2025-07-03  4:12           ` Michael Roth
2025-07-03  5:10             ` Vishal Annapurve
2025-07-03 20:39               ` Michael Roth
2025-07-07 14:55                 ` Vishal Annapurve
2025-07-12  0:10                   ` Michael Roth
2025-07-12 17:53                     ` Vishal Annapurve
2025-08-12  8:23             ` Fuad Tabba [this message]
2025-08-13 17:11               ` Ira Weiny
2025-06-11 22:10     ` Ackerley Tng
2025-08-01  0:01   ` Yan Zhao
2025-08-14 21:35     ` Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 03/51] KVM: selftests: Update guest_memfd_test for INIT_PRIVATE flag Ackerley Tng
2025-05-15 13:49   ` Ira Weiny
2025-05-16 17:42     ` Ackerley Tng
2025-05-16 19:31       ` Ira Weiny
2025-05-27  8:53       ` Binbin Wu
2025-05-30 19:59         ` Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls Ackerley Tng
2025-05-15 14:50   ` Ira Weiny
2025-05-16 17:53     ` Ackerley Tng
2025-05-20  9:22   ` Fuad Tabba
2025-05-20 13:02     ` Vishal Annapurve
2025-05-20 13:44       ` Fuad Tabba
2025-05-20 14:11         ` Vishal Annapurve
2025-05-20 14:33           ` Fuad Tabba
2025-05-20 16:02             ` Vishal Annapurve
2025-05-20 18:05               ` Fuad Tabba
2025-05-20 19:40                 ` Ackerley Tng
2025-05-21 12:36                   ` Fuad Tabba
2025-05-21 14:42                     ` Vishal Annapurve
2025-05-21 15:21                       ` Fuad Tabba
2025-05-21 15:51                         ` Vishal Annapurve
2025-05-21 18:27                           ` Fuad Tabba
2025-05-22 14:52                             ` Sean Christopherson
2025-05-22 15:07                               ` Fuad Tabba
2025-05-22 16:26                                 ` Sean Christopherson
2025-05-23 10:12                                   ` Fuad Tabba
2025-06-24  8:23           ` Alexey Kardashevskiy
2025-06-24 13:08             ` Jason Gunthorpe
2025-06-24 14:10               ` Vishal Annapurve
2025-06-27  4:49                 ` Alexey Kardashevskiy
2025-06-27 15:17                   ` Vishal Annapurve
2025-06-30  0:19                     ` Alexey Kardashevskiy
2025-06-30 14:19                       ` Vishal Annapurve
2025-07-10  6:57                         ` Alexey Kardashevskiy
2025-07-10 17:58                           ` Jason Gunthorpe
2025-07-02  8:35                 ` Yan Zhao
2025-07-02 13:54                   ` Vishal Annapurve
2025-07-02 14:13                     ` Jason Gunthorpe
2025-07-02 14:32                       ` Vishal Annapurve
2025-07-10 10:50                         ` Xu Yilun
2025-07-10 17:54                           ` Jason Gunthorpe
2025-07-11  4:31                             ` Xu Yilun
2025-07-11  9:33                               ` Xu Yilun
2025-07-16 22:22                   ` Ackerley Tng
2025-07-17  9:32                     ` Xu Yilun
2025-07-17 16:56                       ` Ackerley Tng
2025-07-18  2:48                         ` Xu Yilun
2025-07-18 14:15                           ` Jason Gunthorpe
2025-07-21 14:18                             ` Xu Yilun
2025-07-18 15:13                           ` Ira Weiny
2025-07-21  9:58                             ` Xu Yilun
2025-07-22 18:17                               ` Ackerley Tng
2025-07-22 19:25                                 ` Edgecombe, Rick P
2025-05-28  3:16   ` Binbin Wu
2025-05-30 20:10     ` Ackerley Tng
2025-06-03  0:54       ` Binbin Wu
2025-05-14 23:41 ` [RFC PATCH v2 05/51] KVM: guest_memfd: Skip LRU for guest_memfd folios Ackerley Tng
2025-05-28  7:01   ` Binbin Wu
2025-05-30 20:32     ` Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 06/51] KVM: Query guest_memfd for private/shared status Ackerley Tng
2025-05-27  3:55   ` Yan Zhao
2025-05-28  8:08     ` Binbin Wu
2025-05-28  9:55       ` Yan Zhao
2025-05-14 23:41 ` [RFC PATCH v2 07/51] KVM: guest_memfd: Add CAP KVM_CAP_GMEM_CONVERSION Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 08/51] KVM: selftests: Test flag validity after guest_memfd supports conversions Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 09/51] KVM: selftests: Test faulting with respect to GUEST_MEMFD_FLAG_INIT_PRIVATE Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 10/51] KVM: selftests: Refactor vm_mem_add to be more flexible Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 11/51] KVM: selftests: Allow cleanup of ucall_pool from host Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 12/51] KVM: selftests: Test conversion flows for guest_memfd Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 13/51] KVM: selftests: Add script to exercise private_mem_conversions_test Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 14/51] KVM: selftests: Update private_mem_conversions_test to mmap guest_memfd Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 15/51] KVM: selftests: Update script to map shared memory from guest_memfd Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 16/51] mm: hugetlb: Consolidate interpretation of gbl_chg within alloc_hugetlb_folio() Ackerley Tng
2025-05-15  2:09   ` Matthew Wilcox
2025-05-28  8:55   ` Binbin Wu
2025-07-07 18:27   ` James Houghton
2025-05-14 23:41 ` [RFC PATCH v2 17/51] mm: hugetlb: Cleanup interpretation of gbl_chg in alloc_hugetlb_folio() Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 18/51] mm: hugetlb: Cleanup interpretation of map_chg_state within alloc_hugetlb_folio() Ackerley Tng
2025-07-07 18:08   ` James Houghton
2025-05-14 23:41 ` [RFC PATCH v2 19/51] mm: hugetlb: Rename alloc_surplus_hugetlb_folio Ackerley Tng
2025-05-14 23:41 ` [RFC PATCH v2 20/51] mm: mempolicy: Refactor out policy_node_nodemask() Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 21/51] mm: hugetlb: Inline huge_node() into callers Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 22/51] mm: hugetlb: Refactor hugetlb allocation functions Ackerley Tng
2025-05-31 23:45   ` Ira Weiny
2025-06-13 22:03     ` Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 23/51] mm: hugetlb: Refactor out hugetlb_alloc_folio() Ackerley Tng
2025-06-01  0:38   ` Ira Weiny
2025-06-13 22:07     ` Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 24/51] mm: hugetlb: Add option to create new subpool without using surplus Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 25/51] mm: truncate: Expose preparation steps for truncate_inode_pages_final Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 26/51] mm: Consolidate freeing of typed folios on final folio_put() Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 27/51] mm: hugetlb: Expose hugetlb_subpool_{get,put}_pages() Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 28/51] mm: Introduce guestmem_hugetlb to support folio_put() handling of guestmem pages Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 29/51] mm: guestmem_hugetlb: Wrap HugeTLB as an allocator for guest_memfd Ackerley Tng
2025-05-16 14:07   ` Ackerley Tng
2025-05-16 20:33     ` Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 30/51] mm: truncate: Expose truncate_inode_folio() Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 31/51] KVM: x86: Set disallow_lpage on base_gfn and guest_memfd pgoff misalignment Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 32/51] KVM: guest_memfd: Support guestmem_hugetlb as custom allocator Ackerley Tng
2025-05-23 10:47   ` Yan Zhao
2025-08-12  9:13   ` Tony Lindgren
2025-05-14 23:42 ` [RFC PATCH v2 33/51] KVM: guest_memfd: Allocate and truncate from " Ackerley Tng
2025-05-21 18:05   ` Vishal Annapurve
2025-05-22 23:12   ` Edgecombe, Rick P
2025-05-28 10:58   ` Yan Zhao
2025-06-03  7:43   ` Binbin Wu
2025-07-16 22:13     ` Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 34/51] mm: hugetlb: Add functions to add/delete folio from hugetlb lists Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 35/51] mm: guestmem_hugetlb: Add support for splitting and merging pages Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 36/51] mm: Convert split_folio() macro to function Ackerley Tng
2025-05-21 16:40   ` Edgecombe, Rick P
2025-05-14 23:42 ` [RFC PATCH v2 37/51] filemap: Pass address_space mapping to ->free_folio() Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 38/51] KVM: guest_memfd: Split allocator pages for guest_memfd use Ackerley Tng
2025-05-22 22:19   ` Edgecombe, Rick P
2025-06-05 17:15     ` Ackerley Tng
2025-06-05 17:53       ` Edgecombe, Rick P
2025-06-05 17:15     ` Ackerley Tng
2025-06-05 17:16     ` Ackerley Tng
2025-06-05 17:16     ` Ackerley Tng
2025-06-05 17:16     ` Ackerley Tng
2025-05-27  4:30   ` Yan Zhao
2025-05-27  4:38     ` Yan Zhao
2025-06-05 17:50     ` Ackerley Tng
2025-05-27  8:45   ` Yan Zhao
2025-06-05 19:10     ` Ackerley Tng
2025-06-16 11:15       ` Yan Zhao
2025-06-05  5:24   ` Binbin Wu
2025-06-05 19:16     ` Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 39/51] KVM: guest_memfd: Merge and truncate on fallocate(PUNCH_HOLE) Ackerley Tng
2025-05-28 11:00   ` Yan Zhao
2025-05-28 16:39     ` Ackerley Tng
2025-05-29  3:26       ` Yan Zhao
2025-05-14 23:42 ` [RFC PATCH v2 40/51] KVM: guest_memfd: Update kvm_gmem_mapping_order to account for page status Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 41/51] KVM: Add CAP to indicate support for HugeTLB as custom allocator Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 42/51] KVM: selftests: Add basic selftests for hugetlb-backed guest_memfd Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 43/51] KVM: selftests: Update conversion flows test for HugeTLB Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 44/51] KVM: selftests: Test truncation paths of guest_memfd Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 45/51] KVM: selftests: Test allocation and conversion of subfolios Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 46/51] KVM: selftests: Test that guest_memfd usage is reported via hugetlb Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 47/51] KVM: selftests: Support various types of backing sources for private memory Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 48/51] KVM: selftests: Update test for various private memory backing source types Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 49/51] KVM: selftests: Update private_mem_conversions_test.sh to test with HugeTLB pages Ackerley Tng
2025-05-14 23:42 ` [RFC PATCH v2 50/51] KVM: selftests: Add script to test HugeTLB statistics Ackerley Tng
2025-05-15 18:03 ` [RFC PATCH v2 00/51] 1G page support for guest_memfd Edgecombe, Rick P
2025-05-15 18:42   ` Vishal Annapurve
2025-05-15 23:35     ` Edgecombe, Rick P
2025-05-16  0:57       ` Sean Christopherson
2025-05-16  2:12         ` Edgecombe, Rick P
2025-05-16 13:11           ` Vishal Annapurve
2025-05-16 16:45             ` Edgecombe, Rick P
2025-05-16 17:51               ` Sean Christopherson
2025-05-16 19:14                 ` Edgecombe, Rick P
2025-05-16 20:25                   ` Dave Hansen
2025-05-16 21:42                     ` Edgecombe, Rick P
2025-05-16 17:45             ` Sean Christopherson
2025-05-16 13:09         ` Jason Gunthorpe
2025-05-16 17:04           ` Edgecombe, Rick P
2025-05-16  0:22 ` [RFC PATCH v2 51/51] KVM: selftests: Test guest_memfd for accuracy of st_blocks Ackerley Tng
2025-05-16 19:48 ` [RFC PATCH v2 00/51] 1G page support for guest_memfd Ira Weiny
2025-05-16 19:59   ` Ira Weiny
2025-05-16 20:26     ` Ackerley Tng
2025-05-16 22:43 ` Ackerley Tng
2025-06-19  8:13 ` Yan Zhao
2025-06-19  8:59   ` Xiaoyao Li
2025-06-19  9:18     ` Xiaoyao Li
2025-06-19  9:28       ` Yan Zhao
2025-06-19  9:45         ` Xiaoyao Li
2025-06-19  9:49           ` Xiaoyao Li
2025-06-29 18:28     ` Vishal Annapurve
2025-06-30  3:14       ` Yan Zhao
2025-06-30 14:14         ` Vishal Annapurve
2025-07-01  5:23           ` Yan Zhao
2025-07-01 19:48             ` Vishal Annapurve
2025-07-07 23:25               ` Sean Christopherson
2025-07-08  0:14                 ` Vishal Annapurve
2025-07-08  1:08                   ` Edgecombe, Rick P
2025-07-08 14:20                     ` Sean Christopherson
2025-07-08 14:52                       ` Edgecombe, Rick P
2025-07-08 15:07                         ` Vishal Annapurve
2025-07-08 15:31                           ` Edgecombe, Rick P
2025-07-08 17:16                             ` Vishal Annapurve
2025-07-08 17:39                               ` Edgecombe, Rick P
2025-07-08 18:03                                 ` Sean Christopherson
2025-07-08 18:13                                   ` Edgecombe, Rick P
2025-07-08 18:55                                     ` Sean Christopherson
2025-07-08 21:23                                       ` Edgecombe, Rick P
2025-07-09 14:28                                       ` Vishal Annapurve
2025-07-09 15:00                                         ` Sean Christopherson
2025-07-10  1:30                                           ` Vishal Annapurve
2025-07-10 23:33                                             ` Sean Christopherson
2025-07-11 21:18                                             ` Vishal Annapurve
2025-07-12 17:33                                               ` Vishal Annapurve
2025-07-09 15:17                                         ` Edgecombe, Rick P
2025-07-10  3:39                                           ` Vishal Annapurve
2025-07-08 19:28                                   ` Vishal Annapurve
2025-07-08 19:58                                     ` Sean Christopherson
2025-07-08 22:54                                       ` Vishal Annapurve
2025-07-08 15:38                           ` Sean Christopherson
2025-07-08 16:22                             ` Fuad Tabba
2025-07-08 17:25                               ` Sean Christopherson
2025-07-08 18:37                                 ` Fuad Tabba
2025-07-16 23:06                                   ` Ackerley Tng
2025-06-26 23:19 ` Ackerley Tng
