From: Ackerley Tng <ackerleytng@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: kvm@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-fsdevel@vger.kernel.org, aik@amd.com,
ajones@ventanamicro.com, akpm@linux-foundation.org,
amoorthy@google.com, anthony.yznaga@oracle.com,
anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com,
binbin.wu@linux.intel.com, brauner@kernel.org,
catalin.marinas@arm.com, chao.p.peng@intel.com,
chenhuacai@kernel.org, dave.hansen@intel.com, david@redhat.com,
dmatlack@google.com, dwmw@amazon.co.uk, erdemaktas@google.com,
fan.du@intel.com, fvdl@google.com, graf@amazon.com,
haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca,
jgowans@amazon.com, jhubbard@nvidia.com, jroedel@suse.de,
jthoughton@google.com, jun.miao@intel.com, kai.huang@intel.com,
keirf@google.com, kent.overstreet@linux.dev,
kirill.shutemov@intel.com, liam.merwick@oracle.com,
maciej.wieczor-retman@intel.com, mail@maciej.szmigiero.name,
maz@kernel.org, mic@digikod.net, michael.roth@amd.com,
mpe@ellerman.id.au, muchun.song@linux.dev, nikunj@amd.com,
nsaenz@amazon.es, oliver.upton@linux.dev, palmer@dabbelt.com,
pankaj.gupta@amd.com, paul.walmsley@sifive.com,
pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
pgonda@google.com, pvorel@suse.cz, qperret@google.com,
quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
rick.p.edgecombe@intel.com, rientjes@google.com,
roypat@amazon.co.uk, rppt@kernel.org, seanjc@google.com,
shuah@kernel.org, steven.price@arm.com,
steven.sistare@oracle.com, suzuki.poulose@arm.com,
tabba@google.com, thomas.lendacky@amd.com,
usama.arif@bytedance.com, vannapurve@google.com, vbabka@suse.cz,
viro@zeniv.linux.org.uk, vkuznets@redhat.com,
wei.w.wang@intel.com, will@kernel.org, willy@infradead.org,
xiaoyao.li@intel.com, yilun.xu@intel.com, yuzenghui@huawei.com,
zhiquan1.li@intel.com
Subject: Re: [RFC PATCH v2 38/51] KVM: guest_memfd: Split allocator pages for guest_memfd use
Date: Thu, 05 Jun 2025 12:10:08 -0700
Message-ID: <diqztt4uhunj.fsf@ackerleytng-ctop.c.googlers.com>
In-Reply-To: <aDV7vZ72S+uJDgmn@yzhao56-desk.sh.intel.com>
Yan Zhao <yan.y.zhao@intel.com> writes:
> On Wed, May 14, 2025 at 04:42:17PM -0700, Ackerley Tng wrote:
>> +static int kvm_gmem_convert_execute_work(struct inode *inode,
>> + struct conversion_work *work,
>> + bool to_shared)
>> +{
>> + enum shareability m;
>> + int ret;
>> +
>> + m = to_shared ? SHAREABILITY_ALL : SHAREABILITY_GUEST;
>> + ret = kvm_gmem_shareability_apply(inode, work, m);
>> + if (ret)
>> + return ret;
>> + /*
>> + * Apply shareability first so split/merge can operate on new
>> + * shareability state.
>> + */
>> + ret = kvm_gmem_restructure_folios_in_range(
>> + inode, work->start, work->nr_pages, to_shared);
>> +
>> + return ret;
>> +}
>> +
Hi Yan,
Thanks for your thorough reviews and your alternative suggestion in the
other discussion at [1]! I'll try to bring the conversion-related parts
of that discussion over here.
>> static int kvm_gmem_convert_range(struct file *file, pgoff_t start,
>> size_t nr_pages, bool shared,
>> pgoff_t *error_index)
The guiding principles I was using for the conversion ioctls are:
* Have the shareability updates and any necessary page restructuring
(aka splitting/merging) either fully complete or not done at all by the
time the conversion ioctl returns.
* Any unmapping (from host or guest page tables) will not be re-mapped
on errors.
* Rollback undoes changes if conversion failed, and in those cases any
errors are turned into WARNings.
The rationale is that we want page sizes to be in sync with shareability
so that any faults after the (successful or failed) conversion will not
wrongly map in a larger page than allowed and crash the host.
We considered 3 places where the memory can be mapped for conversions:
1. Host page tables
2. Guest page tables
3. IOMMU page tables
Unmapping from host page tables is the simplest case. We unmap any
shared ranges from the host page tables. Any accesses after the failed
conversion would just fault the memory back in and proceed as usual.
guest_memfd memory is not unmapped from IOMMUs in conversions. This case
is handled because IOMMU mappings hold refcounts: after unmapping from
the host, we check for unexpected refcounts and fail the conversion if
any are found.
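To make the "unexpected refcounts" check concrete, here is a minimal
sketch (the helper name is made up, not code from this series): with
the range unmapped from the host, the filemap is expected to hold
folio_nr_pages() references on each folio, so anything above that
(e.g. an IOMMU pin) fails the conversion.

static bool gmem_folio_has_unexpected_refcounts(struct folio *folio)
{
        /* The filemap holds one reference per constituent page. */
        return folio_ref_count(folio) > folio_nr_pages(folio);
}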
We also unmap from guest page tables. Considering failed conversions, if
the pages are shared, we're good since the next time the guest accesses
the page, the page will be faulted in as before.
If the pages are private, on the next guest access, the pages will be
faulted in again as well. This is fine for software-protected VMs IIUC.
For TDX (and SNP) IIUC the memory would have been cleared, and the
memory would also need to be re-accepted. I was thinking that this is by
design, since when a TDX guest requests a conversion it knows that the
contents will not be used again.
The userspace VMM is obligated to keep retrying the conversion, and if
it gives up, it should inform the guest that the conversion failed. The
guest should handle conversion failures too and not assume that
conversion always succeeds.
Putting TDX aside for a moment, so far, there are a few ways this
conversion could fail:
a. Unexpected refcounts. Userspace should clear up the unexpected
refcounts and report failure to the guest if it can't for whatever
reason.
b. ENOMEM, because (i) we ran out of memory updating the shareability
maple_tree, or (ii) splitting involves allocating more memory for
struct pages and we ran out of memory there. In this case the
userspace VMM gets -ENOMEM and can make more memory available and
then retry, or if it can't, also report failure to the guest.
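As a rough illustration of that retry policy from the userspace VMM's
side, here is a sketch. Everything in it is a placeholder:
GMEM_CONVERT_REQ, struct gmem_convert_range and vmm_release_resources()
are stand-ins, not the uAPI defined by this series.

#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

#define GMEM_CONVERT_REQ 0              /* placeholder ioctl request number */

struct gmem_convert_range {             /* placeholder argument layout */
        uint64_t start;
        uint64_t nr_pages;
};

/* Placeholder: free memory / drop extra refcounts; true if anything was freed. */
static bool vmm_release_resources(void)
{
        return false;
}

/* Returns 0 on success, otherwise the errno to report back to the guest. */
static int vmm_convert_range(int gmem_fd, struct gmem_convert_range *range)
{
        while (ioctl(gmem_fd, GMEM_CONVERT_REQ, range) < 0) {
                if (errno == ENOMEM && vmm_release_resources())
                        continue;       /* retryable: more memory is now available */
                return errno;           /* give up; the guest must handle the failure */
        }
        return 0;
}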
TDX introduces TDX-specific conversion failures (see discussion at
[1]), which this series doesn't handle, but I think we still have a line
of sight to handle new errors.
In the other thread [1], I was proposing to have guest_memfd decide what
to do on errors, but I think that might be baking more TDX-specific
details into guest_memfd/KVM, and perhaps this is better:
We could return the errors to userspace and let userspace determine what
to do. For retryable errors (as determined by userspace), it should do
what it needs to do, and retry. For errors like TDX being unable to
reclaim the memory, it could tell guest_memfd to leak that memory.
If userspace gives up, it should report conversion failure to the guest
if userspace thinks the guest can continue (to a clean shutdown or
otherwise). If something terrible happened during conversion, then
userspace might have to exit itself or shut down the host.
In [2], for TDX-specific conversion failures, you proposed prepping to
eliminate errors and exiting early on failure, then actually
unmapping. I think that could work too.
I'm a little concerned that prepping could be complicated, since the
nature of conversion depends on the current state of shareability, and
there's a lot to prepare, everything from counting the memory required
for maple_tree allocation (and merging ranges in the maple_tree) to
counting the number of pages required for undoing vmemmap optimization
in the case of splitting...
And even after doing all the prep to eliminate errors, the unmapping
could fail in TDX-specific cases anyway, which still needs to be
handled.
Hence I'm hoping you'll consider letting TDX-specific failures be
built in and handled alongside other failures by getting help from the
userspace VMM, and in the worst case letting the guest know the
conversion failed.
I'd also appreciate comments or suggestions from anyone else!
[1] https://lore.kernel.org/all/diqzfrgfp95d.fsf@ackerleytng-ctop.c.googlers.com/
[2] https://lore.kernel.org/all/aEEEJbTzlncbRaRA@yzhao56-desk.sh.intel.com/
>> @@ -371,18 +539,21 @@ static int kvm_gmem_convert_range(struct file *file, pgoff_t start,
>>
>> list_for_each_entry(work, &work_list, list) {
>> rollback_stop_item = work;
>> - ret = kvm_gmem_shareability_apply(inode, work, m);
>> +
>> + ret = kvm_gmem_convert_execute_work(inode, work, shared);
>> if (ret)
>> break;
>> }
>>
>> if (ret) {
>> - m = shared ? SHAREABILITY_GUEST : SHAREABILITY_ALL;
>> list_for_each_entry(work, &work_list, list) {
>> + int r;
>> +
>> + r = kvm_gmem_convert_execute_work(inode, work, !shared);
>> + WARN_ON(r);
>> +
>> if (work == rollback_stop_item)
>> break;
>> -
>> - WARN_ON(kvm_gmem_shareability_apply(inode, work, m));
> Could kvm_gmem_shareability_apply() fail here?
>
Yes, it could. If shareability cannot be updated, then we probably ran
out of memory. The userspace VMM will probably get -ENOMEM set on some
earlier ret and should handle that accordingly.
On -ENOMEM in a rollback, the host is in a very tough spot anyway, and a
clean guest shutdown may be the only way out, hence this is a WARN and
not returned to userspace.
>> }
>> }
>>
>> @@ -434,6 +605,277 @@ static int kvm_gmem_ioctl_convert_range(struct file *file,
>> return ret;
>> }
>>
>> +#ifdef CONFIG_KVM_GMEM_HUGETLB
>> +
>> +static inline void __filemap_remove_folio_for_restructuring(struct folio *folio)
>> +{
>> + struct address_space *mapping = folio->mapping;
>> +
>> + spin_lock(&mapping->host->i_lock);
>> + xa_lock_irq(&mapping->i_pages);
>> +
>> + __filemap_remove_folio(folio, NULL);
>> +
>> + xa_unlock_irq(&mapping->i_pages);
>> + spin_unlock(&mapping->host->i_lock);
>> +}
>> +
>> +/**
>> + * filemap_remove_folio_for_restructuring() - Remove @folio from filemap for
>> + * split/merge.
>> + *
>> + * @folio: the folio to be removed.
>> + *
>> + * Similar to filemap_remove_folio(), but skips LRU-related calls (meaningless
>> + * for guest_memfd), and skips call to ->free_folio() to maintain folio flags.
>> + *
>> + * Context: Expects only the filemap's refcounts to be left on the folio. Will
>> + * freeze these refcounts away so that no other users will interfere
>> + * with restructuring.
>> + */
>> +static inline void filemap_remove_folio_for_restructuring(struct folio *folio)
>> +{
>> + int filemap_refcount;
>> +
>> + filemap_refcount = folio_nr_pages(folio);
>> + while (!folio_ref_freeze(folio, filemap_refcount)) {
>> + /*
>> + * At this point only filemap refcounts are expected, hence okay
>> + * to spin until speculative refcounts go away.
>> + */
>> + WARN_ONCE(1, "Spinning on folio=%p refcount=%d", folio, folio_ref_count(folio));
>> + }
>> +
>> + folio_lock(folio);
>> + __filemap_remove_folio_for_restructuring(folio);
>> + folio_unlock(folio);
>> +}
>> +
>> +/**
>> + * kvm_gmem_split_folio_in_filemap() - Split @folio within filemap in @inode.
>> + *
>> + * @inode: inode containing the folio.
>> + * @folio: folio to be split.
>> + *
>> + * Split a folio into folios of size PAGE_SIZE. Will clean up folio from filemap
>> + * and add back the split folios.
>> + *
>> + * Context: Expects that before this call, folio's refcount is just the
>> + * filemap's refcounts. After this function returns, the split folios'
>> + * refcounts will also be filemap's refcounts.
>> + * Return: 0 on success or negative error otherwise.
>> + */
>> +static int kvm_gmem_split_folio_in_filemap(struct inode *inode, struct folio *folio)
>> +{
>> + size_t orig_nr_pages;
>> + pgoff_t orig_index;
>> + size_t i, j;
>> + int ret;
>> +
>> + orig_nr_pages = folio_nr_pages(folio);
>> + if (orig_nr_pages == 1)
>> + return 0;
>> +
>> + orig_index = folio->index;
>> +
>> + filemap_remove_folio_for_restructuring(folio);
>> +
>> + ret = kvm_gmem_allocator_ops(inode)->split_folio(folio);
>> + if (ret)
>> + goto err;
>> +
>> + for (i = 0; i < orig_nr_pages; ++i) {
>> + struct folio *f = page_folio(folio_page(folio, i));
>> +
>> + ret = __kvm_gmem_filemap_add_folio(inode->i_mapping, f,
>> + orig_index + i);
> Why does the failure of __kvm_gmem_filemap_add_folio() here lead to rollback,
> while the failure of the one under rollback only triggers WARN_ON()?
>
Mostly because I don't really have a choice on rollback. On rollback we
try to restore the merged folio back into the filemap, and if we
can't, the host is probably in rather bad shape in terms of memory
availability and there may not be many options for the userspace VMM.
Perhaps the different possible errors from
__kvm_gmem_filemap_add_folio() in both should be handled differently. Do
you have any suggestions on that?
>> + if (ret)
>> + goto rollback;
>> + }
>> +
>> + return ret;
>> +
>> +rollback:
>> + for (j = 0; j < i; ++j) {
>> + struct folio *f = page_folio(folio_page(folio, j));
>> +
>> + filemap_remove_folio_for_restructuring(f);
>> + }
>> +
>> + kvm_gmem_allocator_ops(inode)->merge_folio(folio);
>> +err:
>> + WARN_ON(__kvm_gmem_filemap_add_folio(inode->i_mapping, folio, orig_index));
>> +
>> + return ret;
>> +}
>> +
>> +static inline int kvm_gmem_try_split_folio_in_filemap(struct inode *inode,
>> + struct folio *folio)
>> +{
>> + size_t to_nr_pages;
>> + void *priv;
>> +
>> + if (!kvm_gmem_has_custom_allocator(inode))
>> + return 0;
>> +
>> + priv = kvm_gmem_allocator_private(inode);
>> + to_nr_pages = kvm_gmem_allocator_ops(inode)->nr_pages_in_page(priv);
>> +
>> + if (kvm_gmem_has_some_shared(inode, folio->index, to_nr_pages))
>> + return kvm_gmem_split_folio_in_filemap(inode, folio);
>> +
>> + return 0;
>> +}
>> +
>> +/**
>> + * kvm_gmem_merge_folio_in_filemap() - Merge @first_folio within filemap in
>> + * @inode.
>> + *
>> + * @inode: inode containing the folio.
>> + * @first_folio: first folio among folios to be merged.
>> + *
>> + * Will clean up subfolios from filemap and add back the merged folio.
>> + *
>> + * Context: Expects that before this call, all subfolios only have filemap
>> + * refcounts. After this function returns, the merged folio will only
>> + * have filemap refcounts.
>> + * Return: 0 on success or negative error otherwise.
>> + */
>> +static int kvm_gmem_merge_folio_in_filemap(struct inode *inode,
>> + struct folio *first_folio)
>> +{
>> + size_t to_nr_pages;
>> + pgoff_t index;
>> + void *priv;
>> + size_t i;
>> + int ret;
>> +
>> + index = first_folio->index;
>> +
>> + priv = kvm_gmem_allocator_private(inode);
>> + to_nr_pages = kvm_gmem_allocator_ops(inode)->nr_pages_in_folio(priv);
>> + if (folio_nr_pages(first_folio) == to_nr_pages)
>> + return 0;
>> +
>> + for (i = 0; i < to_nr_pages; ++i) {
>> + struct folio *f = page_folio(folio_page(first_folio, i));
>> +
>> + filemap_remove_folio_for_restructuring(f);
>> + }
>> +
>> + kvm_gmem_allocator_ops(inode)->merge_folio(first_folio);
>> +
>> + ret = __kvm_gmem_filemap_add_folio(inode->i_mapping, first_folio, index);
>> + if (ret)
>> + goto err_split;
>> +
>> + return ret;
>> +
>> +err_split:
>> + WARN_ON(kvm_gmem_allocator_ops(inode)->split_folio(first_folio));
> guestmem_hugetlb_split_folio() is possible to fail. e.g.
> After the stash is freed by guestmem_hugetlb_unstash_free_metadata() in
> guestmem_hugetlb_merge_folio(), it's possible to get -ENOMEM for the stash
> allocation in guestmem_hugetlb_stash_metadata() in
> guestmem_hugetlb_split_folio().
>
>
Yes. This is also on the error path. In line with all the other error
and rollback paths, I don't really have other options at this point,
since on error, I probably ran out of memory, so I try my best to
restore the original state but give up with a WARN otherwise.
>> + for (i = 0; i < to_nr_pages; ++i) {
>> + struct folio *f = page_folio(folio_page(first_folio, i));
>> +
>> + WARN_ON(__kvm_gmem_filemap_add_folio(inode->i_mapping, f, index + i));
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static inline int kvm_gmem_try_merge_folio_in_filemap(struct inode *inode,
>> + struct folio *first_folio)
>> +{
>> + size_t to_nr_pages;
>> + void *priv;
>> +
>> + priv = kvm_gmem_allocator_private(inode);
>> + to_nr_pages = kvm_gmem_allocator_ops(inode)->nr_pages_in_folio(priv);
>> +
>> + if (kvm_gmem_has_some_shared(inode, first_folio->index, to_nr_pages))
>> + return 0;
>> +
>> + return kvm_gmem_merge_folio_in_filemap(inode, first_folio);
>> +}
>> +
>> +static int kvm_gmem_restructure_folios_in_range(struct inode *inode,
>> + pgoff_t start, size_t nr_pages,
>> + bool is_split_operation)
>> +{
>> + size_t to_nr_pages;
>> + pgoff_t index;
>> + pgoff_t end;
>> + void *priv;
>> + int ret;
>> +
>> + if (!kvm_gmem_has_custom_allocator(inode))
>> + return 0;
>> +
>> + end = start + nr_pages;
>> +
>> + /* Round to allocator page size, to check all (huge) pages in range. */
>> + priv = kvm_gmem_allocator_private(inode);
>> + to_nr_pages = kvm_gmem_allocator_ops(inode)->nr_pages_in_folio(priv);
>> +
>> + start = round_down(start, to_nr_pages);
>> + end = round_up(end, to_nr_pages);
>> +
>> + for (index = start; index < end; index += to_nr_pages) {
>> + struct folio *f;
>> +
>> + f = filemap_get_folio(inode->i_mapping, index);
>> + if (IS_ERR(f))
>> + continue;
>> +
>> + /* Leave just filemap's refcounts on the folio. */
>> + folio_put(f);
>> +
>> + if (is_split_operation)
>> + ret = kvm_gmem_split_folio_in_filemap(inode, f);
> kvm_gmem_try_split_folio_in_filemap()?
>
Here we know for sure that this was a private-to-shared
conversion. Hence, we know that there are at least some shared parts in
this huge page and we can skip checking that.
>> + else
>> + ret = kvm_gmem_try_merge_folio_in_filemap(inode, f);
>> +
For merge, we don't know whether the huge page might still contain
some other shared subpages, hence we "try" to merge by first checking
against shareability for any remaining shared subpages.
>> + if (ret)
>> + goto rollback;
>> + }
>> + return ret;
>> +
>> +rollback:
>> + for (index -= to_nr_pages; index >= start; index -= to_nr_pages) {
Note to self: the first index -= to_nr_pages was meant to skip the index
that caused the failure, but this could cause an underflow if index = 0
when entering rollback. Need to fix this in the next revision.
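One possible shape for that fix, as a sketch only (not the actual
change planned for the next revision): record where the failure
happened and walk the already-restructured prefix forwards, so the
unsigned pgoff_t never wraps when start == 0.

	pgoff_t failed_index = index;

	for (index = start; index < failed_index; index += to_nr_pages) {
		/* undo the split/merge for this folio, as in the quoted loop below */
	}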
>> + struct folio *f;
>> +
>> + f = filemap_get_folio(inode->i_mapping, index);
>> + if (IS_ERR(f))
>> + continue;
>> +
>> + /* Leave just filemap's refcounts on the folio. */
>> + folio_put(f);
>> +
>> + if (is_split_operation)
>> + WARN_ON(kvm_gmem_merge_folio_in_filemap(inode, f));
>> + else
>> + WARN_ON(kvm_gmem_split_folio_in_filemap(inode, f));
> Is it safe to just leave WARN_ON()s in the rollback case?
>
Same as above. I don't think we have much of a choice.
> Besides, are the kvm_gmem_merge_folio_in_filemap() and
> kvm_gmem_split_folio_in_filemap() here duplicated with the
> kvm_gmem_split_folio_in_filemap() and kvm_gmem_try_merge_folio_in_filemap() in
> the following "r = kvm_gmem_convert_execute_work(inode, work, !shared)"?
>
This handles the case where some pages in the range [start, start +
nr_pages) were split and the failure was halfway through. I could call
kvm_gmem_convert_execute_work() with !shared but that would go over all
the folios again from the start.
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +#else
>> +
>> +static inline int kvm_gmem_try_split_folio_in_filemap(struct inode *inode,
>> + struct folio *folio)
>> +{
>> + return 0;
>> +}
>> +
>> +static int kvm_gmem_restructure_folios_in_range(struct inode *inode,
>> + pgoff_t start, size_t nr_pages,
>> + bool is_split_operation)
>> +{
>> + return 0;
>> +}
>> +
>> +#endif
>> +
>