From: Ackerley Tng via B4 Relay <devnull+ackerleytng.google.com@kernel.org>
To: aik@amd.com, andrew.jones@linux.dev, binbin.wu@linux.intel.com,
brauner@kernel.org, chao.p.peng@linux.intel.com,
david@kernel.org, ira.weiny@intel.com, jmattson@google.com,
jthoughton@google.com, michael.roth@amd.com, oupton@kernel.org,
pankaj.gupta@amd.com, qperret@google.com,
rick.p.edgecombe@intel.com, rientjes@google.com,
shivankg@amd.com, steven.price@arm.com, tabba@google.com,
willy@infradead.org, wyihan@google.com, yan.y.zhao@intel.com,
forkloop@google.com, pratyush@kernel.org,
suzuki.poulose@arm.com, aneesh.kumar@kernel.org,
liam@infradead.org, Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>,
Vishal Annapurve <vannapurve@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Youngjun Park <youngjun.park@lge.com>,
Qi Zheng <qi.zheng@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Kiryl Shutsemau <kas@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
Vlastimil Babka <vbabka@kernel.org>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
linux-coco@lists.linux.dev,
Ackerley Tng <ackerleytng@google.com>
Subject: [PATCH v6 11/43] KVM: guest_memfd: Ensure pages are not in use before conversion
Date: Thu, 07 May 2026 13:22:30 -0700 [thread overview]
Message-ID: <20260507-gmem-inplace-conversion-v6-11-91ab5a8b19a4@google.com> (raw)
In-Reply-To: <20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4@google.com>
From: Ackerley Tng <ackerleytng@google.com>
When converting memory to private in guest_memfd, it is necessary to ensure
that the pages are not currently being accessed by any other part of the
kernel or userspace to avoid any current user writing to guest private
memory.
guest_memfd checks for unexpected refcounts to determine whether a page is
still in use. The only expected refcounts after unmapping the range
requested for conversion are those that are held by guest_memfd itself.
Update the kvm_memory_attributes2 structure to include an error_offset
field. This allows KVM to report the exact offset where a conversion
failed to userspace. If the safety check fails, return -EAGAIN and copy
the error_offset back to userspace so that it can potentially retry the
operation or handle the failure gracefully.
Suggested-by: David Hildenbrand <david@kernel.org>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
include/uapi/linux/kvm.h | 3 ++-
virt/kvm/guest_memfd.c | 65 ++++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 62 insertions(+), 6 deletions(-)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e6bbf68a83813..0b55258573d3d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1658,7 +1658,8 @@ struct kvm_memory_attributes2 {
__u64 size;
__u64 attributes;
__u64 flags;
- __u64 reserved[12];
+ __u64 error_offset;
+ __u64 reserved[11];
};
#define KVM_MEMORY_ATTRIBUTE_PRIVATE (1ULL << 3)
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 91e89b188f583..9d82642a025e9 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -572,9 +572,42 @@ static int kvm_gmem_mas_preallocate(struct ma_state *mas, u64 attributes,
return mas_preallocate(mas, xa_mk_value(attributes), GFP_KERNEL);
}
+static bool kvm_gmem_is_safe_for_conversion(struct inode *inode, pgoff_t start,
+ size_t nr_pages, pgoff_t *err_index)
+{
+ struct address_space *mapping = inode->i_mapping;
+ const int filemap_get_folios_refcount = 1;
+ pgoff_t last = start + nr_pages - 1;
+ struct folio_batch fbatch;
+ bool safe = true;
+ int i;
+
+ folio_batch_init(&fbatch);
+ while (safe && filemap_get_folios(mapping, &start, last, &fbatch)) {
+
+ for (i = 0; i < folio_batch_count(&fbatch); ++i) {
+ struct folio *folio = fbatch.folios[i];
+
+ if (folio_ref_count(folio) !=
+ folio_nr_pages(folio) + filemap_get_folios_refcount) {
+ safe = false;
+ *err_index = folio->index;
+ break;
+ }
+ }
+
+ folio_batch_release(&fbatch);
+ cond_resched();
+ }
+
+ return safe;
+}
+
static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
- size_t nr_pages, uint64_t attrs)
+ size_t nr_pages, uint64_t attrs,
+ pgoff_t *err_index)
{
+ bool to_private = attrs & KVM_MEMORY_ATTRIBUTE_PRIVATE;
struct address_space *mapping = inode->i_mapping;
struct gmem_inode *gi = GMEM_I(inode);
pgoff_t end = start + nr_pages;
@@ -588,8 +621,21 @@ static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
mas_init(&mas, mt, start);
r = kvm_gmem_mas_preallocate(&mas, attrs, start, nr_pages);
- if (r)
+ if (r) {
+ *err_index = start;
goto out;
+ }
+
+ if (to_private) {
+ unmap_mapping_pages(mapping, start, nr_pages, false);
+
+ if (!kvm_gmem_is_safe_for_conversion(inode, start, nr_pages,
+ err_index)) {
+ mas_destroy(&mas);
+ r = -EAGAIN;
+ goto out;
+ }
+ }
/*
* From this point on guest_memfd has performed necessary
@@ -609,9 +655,10 @@ static long kvm_gmem_set_attributes(struct file *file, void __user *argp)
struct gmem_file *f = file->private_data;
struct inode *inode = file_inode(file);
struct kvm_memory_attributes2 attrs;
+ pgoff_t err_index;
size_t nr_pages;
pgoff_t index;
- int i;
+ int i, r;
if (copy_from_user(&attrs, argp, sizeof(attrs)))
return -EFAULT;
@@ -635,8 +682,16 @@ static long kvm_gmem_set_attributes(struct file *file, void __user *argp)
nr_pages = attrs.size >> PAGE_SHIFT;
index = attrs.offset >> PAGE_SHIFT;
- return __kvm_gmem_set_attributes(inode, index, nr_pages,
- attrs.attributes);
+ r = __kvm_gmem_set_attributes(inode, index, nr_pages, attrs.attributes,
+ &err_index);
+ if (r) {
+ attrs.error_offset = ((uint64_t)err_index) << PAGE_SHIFT;
+
+ if (copy_to_user(argp, &attrs, sizeof(attrs)))
+ return -EFAULT;
+ }
+
+ return r;
}
static long kvm_gmem_ioctl(struct file *file, unsigned int ioctl,
--
2.54.0.563.g4f69b47b94-goog
next prev parent reply other threads:[~2026-05-07 20:23 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-07 20:22 [PATCH v6 00/43] guest_memfd: In-place conversion support Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 01/43] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings Ackerley Tng via B4 Relay
2026-05-08 23:36 ` Ackerley Tng
2026-05-07 20:22 ` [PATCH v6 02/43] KVM: Rename KVM_GENERIC_MEMORY_ATTRIBUTES to KVM_VM_MEMORY_ATTRIBUTES Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 03/43] KVM: Enumerate support for PRIVATE memory iff kvm_arch_has_private_mem is defined Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 04/43] KVM: Stub in ability to disable per-VM memory attribute tracking Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 05/43] KVM: guest_memfd: Wire up kvm_get_memory_attributes() to per-gmem attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 06/43] KVM: x86/mmu: Bug the VM if gmem attributes are queried to determine max mapping level Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 07/43] KVM: guest_memfd: Update kvm_gmem_populate() to use gmem attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 08/43] KVM: guest_memfd: Only prepare folios for private pages Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 09/43] KVM: Move kvm_supported_mem_attributes() to kvm_host.h Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 10/43] KVM: guest_memfd: Add base support for KVM_SET_MEMORY_ATTRIBUTES2 Ackerley Tng via B4 Relay
2026-05-07 20:22 ` Ackerley Tng via B4 Relay [this message]
2026-05-07 20:22 ` [PATCH v6 12/43] KVM: guest_memfd: Call arch invalidate hooks on conversion Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 13/43] KVM: guest_memfd: Return early if range already has requested attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 14/43] KVM: guest_memfd: Advertise KVM_SET_MEMORY_ATTRIBUTES2 ioctl Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 15/43] KVM: guest_memfd: Handle lru_add fbatch refcounts during conversion safety check Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 16/43] KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release() Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 17/43] KVM: guest_memfd: Determine invalidation filter from memory attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 18/43] KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86 Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 19/43] KVM: Let userspace disable per-VM mem attributes, enable per-gmem attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 20/43] KVM: guest_memfd: Enable INIT_SHARED on guest_memfd for x86 Coco VMs Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 21/43] KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 22/43] KVM: TDX: Make source page optional for KVM_TDX_INIT_MEM_REGION Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 23/43] KVM: selftests: Create gmem fd before "regular" fd when adding memslot Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 24/43] KVM: selftests: Rename guest_memfd{,_offset} to gmem_{fd,offset} Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 25/43] KVM: selftests: Add support for mmap() on guest_memfd in core library Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 26/43] KVM: selftests: Add selftests global for guest memory attributes capability Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 27/43] KVM: selftests: Add helpers for calling ioctls on guest_memfd Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 28/43] KVM: selftests: Test basic single-page conversion flow Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 29/43] KVM: selftests: Test conversion flow when INIT_SHARED Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 30/43] KVM: selftests: Test conversion precision in guest_memfd Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 31/43] KVM: selftests: Test conversion before allocation Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 32/43] KVM: selftests: Convert with allocated folios in different layouts Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 33/43] KVM: selftests: Test that truncation does not change shared/private status Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 34/43] KVM: selftests: Test that shared/private status is consistent across processes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 35/43] KVM: selftests: Test conversion with elevated page refcount Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 36/43] KVM: selftests: Reset shared memory after hole-punching Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 37/43] KVM: selftests: Provide function to look up guest_memfd details from gpa Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 38/43] KVM: selftests: Provide common function to set memory attributes Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 39/43] KVM: selftests: Check fd/flags provided to mmap() when setting up memslot Ackerley Tng via B4 Relay
2026-05-07 20:22 ` [PATCH v6 40/43] KVM: selftests: Make TEST_EXPECT_SIGBUS thread-safe Ackerley Tng via B4 Relay
2026-05-07 20:23 ` [PATCH v6 41/43] KVM: selftests: Update private_mem_conversions_test to mmap() guest_memfd Ackerley Tng via B4 Relay
2026-05-07 20:23 ` [PATCH v6 42/43] KVM: selftests: Add script to exercise private_mem_conversions_test Ackerley Tng via B4 Relay
2026-05-07 20:23 ` [PATCH v6 43/43] KVM: selftests: Update private memory exits test to work with per-gmem attributes Ackerley Tng via B4 Relay
2026-05-07 20:34 ` [POC PATCH 0/5] guest_memfd in-place conversion selftests for SNP Ackerley Tng
2026-05-07 20:34 ` [POC PATCH 1/5] KVM: selftests: Initialize guest_memfd with INIT_SHARED Ackerley Tng
2026-05-07 20:34 ` [POC PATCH 2/5] KVM: selftests: Use guest_memfd memory contents in-place for SNP launch update Ackerley Tng
2026-05-07 20:34 ` [POC PATCH 3/5] KVM: selftests: Make guest_code_xsave more friendly Ackerley Tng
2026-05-07 20:34 ` [POC PATCH 4/5] KVM: selftests: Allow specifying CoCo-privateness while mapping a page Ackerley Tng
2026-05-07 20:34 ` [POC PATCH 5/5] KVM: selftests: Test conversions for SNP Ackerley Tng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260507-gmem-inplace-conversion-v6-11-91ab5a8b19a4@google.com \
--to=devnull+ackerleytng.google.com@kernel.org \
--cc=ackerleytng@google.com \
--cc=aik@amd.com \
--cc=akpm@linux-foundation.org \
--cc=andrew.jones@linux.dev \
--cc=aneesh.kumar@kernel.org \
--cc=axelrasmussen@google.com \
--cc=baohua@kernel.org \
--cc=bhe@redhat.com \
--cc=binbin.wu@linux.intel.com \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=chao.p.peng@linux.intel.com \
--cc=chrisl@kernel.org \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=david@kernel.org \
--cc=forkloop@google.com \
--cc=hpa@zytor.com \
--cc=ira.weiny@intel.com \
--cc=jgg@ziepe.ca \
--cc=jmattson@google.com \
--cc=jthoughton@google.com \
--cc=kas@kernel.org \
--cc=kasong@tencent.com \
--cc=kvm@vger.kernel.org \
--cc=liam@infradead.org \
--cc=linux-coco@lists.linux.dev \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=michael.roth@amd.com \
--cc=mingo@redhat.com \
--cc=nphamcs@gmail.com \
--cc=oupton@kernel.org \
--cc=pankaj.gupta@amd.com \
--cc=pbonzini@redhat.com \
--cc=pratyush@kernel.org \
--cc=qi.zheng@linux.dev \
--cc=qperret@google.com \
--cc=rick.p.edgecombe@intel.com \
--cc=rientjes@google.com \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=shakeel.butt@linux.dev \
--cc=shikemeng@huaweicloud.com \
--cc=shivankg@amd.com \
--cc=shuah@kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=steven.price@arm.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=tglx@kernel.org \
--cc=vannapurve@google.com \
--cc=vbabka@kernel.org \
--cc=weixugc@google.com \
--cc=willy@infradead.org \
--cc=wyihan@google.com \
--cc=x86@kernel.org \
--cc=yan.y.zhao@intel.com \
--cc=youngjun.park@lge.com \
--cc=yuanchu@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox