Kernel KVM virtualization development
 help / color / mirror / Atom feed
* [PATCH v3 0/3] KVM: guest_memfd: Fix signed offset+size goof
@ 2026-05-28  2:11 Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sean Christopherson @ 2026-05-28  2:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Ackerley Tng, Sean Christopherson

Fix a bug where KVM fails to reject a comically large offset into guest_memfd
if offset+size results in a signed, negative value.  Add a testcase to prove
the bug, and to serve as a regression test.

Note, v1 and v2 was part of larger series.

v3:
 - Use uoff_t, not u64. [the combined might of Sean and Ackerley]
 - Explaining exactly what is broken. [Ackerley]
 - Add a regression test.

v2: https://lore.kernel.org/all/20260522-fix-sev-gmem-post-populate-v2-0-3f196bfad5a1@google.com
v1: https://lore.kernel.org/r/20260522-fix-sev-gmem-post-populate-v1-0-9fc8d6437b65@google.com

Sean Christopherson (3):
  KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  KVM: selftests: Expand the guest_memfd test macros to allow passing
    the VM
  KVM: selftests: Add guest_memfd regression test signed offset+size bug

 .../testing/selftests/kvm/guest_memfd_test.c  | 24 +++++++++++++++++--
 virt/kvm/guest_memfd.c                        |  8 +++----
 virt/kvm/kvm_mm.h                             |  7 ++++--
 3 files changed, 31 insertions(+), 8 deletions(-)


base-commit: 9f2a49c511cb05b85745e1578e4fd425bff87f58
-- 
2.54.0.794.g4f17f83d09-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28  2:11 [PATCH v3 0/3] KVM: guest_memfd: Fix signed offset+size goof Sean Christopherson
@ 2026-05-28  2:11 ` Sean Christopherson
  2026-05-28  3:00   ` sashiko-bot
                     ` (2 more replies)
  2026-05-28  2:11 ` [PATCH v3 2/3] KVM: selftests: Expand the guest_memfd test macros to allow passing the VM Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug Sean Christopherson
  2 siblings, 3 replies; 10+ messages in thread
From: Sean Christopherson @ 2026-05-28  2:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Ackerley Tng, Sean Christopherson

When binding a memslot to a guest_memfd file, treat the offset and size as
unsigned values to fix a bug where the sum of the two can result in a false
negative when checking for overflow against the size of the file.  Passing
unsigned values also avoids relying on somewhat obscure checks in other
flows for safety, and tracks the offset and size as they are intended to be
tracked, as unsigned values.

On 64-bit kernels, the number of pages a memslot contains and thus the size
(and offset) of its guest_memfd binding are unsigned 64-bit values.  Taking
the offset+size as an loff_t instead of a uoff_t inadvertently converts
the unsigned value to a signed value if the offset and/or size is massive.

Locally storing the offset and size as signed values is benign in and of
itself (though even that is *extremely* difficult to discern), but
operating on their sum is not.

For the offset, KVM explicitly checks against a negative value, which might
seem like a bug as KVM could incorrectly reject a legitimate binding, but
that's not actually the case as KVM_CREATE_GUEST_MEMFD takes a signed value
for its size, i.e. a would-be-negative offset is also greater than the
maximum possible size of any guest_memfd file.

Regarding the size, while KVM lacks an explicit check for a negative value,
i.e. seemingly has a flawed overflow check, KVM restricts the number of
pages in a single memslot to the largest positive signed 32-bit value:

        if (id < KVM_USER_MEM_SLOTS &&
            (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
                return -EINVAL;

and so that maximum "size" will ever be is 0x7fffffff000.

The sum of the two is, however, problematic.  While the size is restricted
by KVM's memslot logic, the offset is not, i.e. the offset is completely
unchecked until the "offset + size > i_size_read(inode)" check.  If the
offset is the (nearly) largest possible _positive_ value, then adding size
to the offset can result in a signed, negative 64-bit value.  When compared
against the size of the file (guaranteed to be positive), the negative sum
is always smaller, and KVM incorrectly allows the absurd offset.

Opportunistically add missing includes in kvm_mm.h (instead of relying on
its parents).

Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
Cc: stable@vger.kernel.org
Cc: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/guest_memfd.c | 8 ++++----
 virt/kvm/kvm_mm.h      | 7 +++++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index bf9659a7b0f6..a1cb72e66288 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 }
 
 int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
-		  unsigned int fd, loff_t offset)
+		  unsigned int fd, uoff_t offset)
 {
-	loff_t size = slot->npages << PAGE_SHIFT;
+	uoff_t size = slot->npages << PAGE_SHIFT;
 	unsigned long start, end;
 	struct gmem_file *f;
 	struct inode *inode;
 	struct file *file;
 	int r = -EINVAL;
 
+	BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
 	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
 
 	file = fget(fd);
@@ -664,8 +665,7 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 
 	inode = file_inode(file);
 
-	if (offset < 0 || !PAGE_ALIGNED(offset) ||
-	    offset + size > i_size_read(inode))
+	if (!PAGE_ALIGNED(offset) || offset + size > i_size_read(inode))
 		goto err;
 
 	filemap_invalidate_lock(inode->i_mapping);
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index 9fcc5d5b7f8d..7510ca915dd1 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -3,6 +3,9 @@
 #ifndef __KVM_MM_H__
 #define __KVM_MM_H__ 1
 
+#include <linux/kvm.h>
+#include <linux/kvm_types.h>
+
 /*
  * Architectures can choose whether to use an rwlock or spinlock
  * for the mmu_lock.  These macros, for use in common code
@@ -72,7 +75,7 @@ int kvm_gmem_init(struct module *module);
 void kvm_gmem_exit(void);
 int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args);
 int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
-		  unsigned int fd, loff_t offset);
+		  unsigned int fd, uoff_t offset);
 void kvm_gmem_unbind(struct kvm_memory_slot *slot);
 #else
 static inline int kvm_gmem_init(struct module *module)
@@ -82,7 +85,7 @@ static inline int kvm_gmem_init(struct module *module)
 static inline void kvm_gmem_exit(void) {};
 static inline int kvm_gmem_bind(struct kvm *kvm,
 					 struct kvm_memory_slot *slot,
-					 unsigned int fd, loff_t offset)
+					 unsigned int fd, uoff_t offset)
 {
 	WARN_ON_ONCE(1);
 	return -EIO;
-- 
2.54.0.794.g4f17f83d09-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/3] KVM: selftests: Expand the guest_memfd test macros to allow passing the VM
  2026-05-28  2:11 [PATCH v3 0/3] KVM: guest_memfd: Fix signed offset+size goof Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
@ 2026-05-28  2:11 ` Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug Sean Christopherson
  2 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2026-05-28  2:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Ackerley Tng, Sean Christopherson

Expand the gmem test macros to allow passing the VM to testcases, without
needing to plumb the VM into _every_ testcase, as the vast majority of
testcases only need the fd and size.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/guest_memfd_test.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 832ef4dfb99f..246bb408ecc0 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -408,17 +408,26 @@ static void test_guest_memfd_flags(struct kvm_vm *vm)
 	}
 }
 
-#define __gmem_test(__test, __vm, __flags, __gmem_size)			\
+#define ____gmem_test(__test, __vm, __flags, __gmem_size, args...)	\
 do {									\
 	int fd = vm_create_guest_memfd(__vm, __gmem_size, __flags);	\
 									\
-	test_##__test(fd, __gmem_size);					\
+	test_##__test(args);						\
 	close(fd);							\
 } while (0)
 
+#define __gmem_test(__test, __vm, __flags, __gmem_size)			\
+	____gmem_test(__test, __vm, __flags, __gmem_size, fd, __gmem_size)
+
 #define gmem_test(__test, __vm, __flags)				\
 	__gmem_test(__test, __vm, __flags, page_size * 4)
 
+#define __gmem_test_vm(__test, __vm, __flags, __gmem_size)		\
+	____gmem_test(__test, __vm, __flags, __gmem_size, __vm, fd, __gmem_size)
+
+#define gmem_test_vm(__test, __vm, __flags)				\
+	__gmem_test_vm(__test, __vm, __flags, page_size * 4)
+
 static void __test_guest_memfd(struct kvm_vm *vm, u64 flags)
 {
 	test_create_guest_memfd_multiple(vm);
-- 
2.54.0.794.g4f17f83d09-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug
  2026-05-28  2:11 [PATCH v3 0/3] KVM: guest_memfd: Fix signed offset+size goof Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
  2026-05-28  2:11 ` [PATCH v3 2/3] KVM: selftests: Expand the guest_memfd test macros to allow passing the VM Sean Christopherson
@ 2026-05-28  2:11 ` Sean Christopherson
  2026-05-28  3:29   ` sashiko-bot
  2 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2026-05-28  2:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Ackerley Tng, Sean Christopherson

Add a regression (and proof-of-bug) testcase to ensure KVM rejects an
offset+size that would result in a negative value when computed as a signed
64-bit value.  KVM had a flaw where it would allow binding a memslot to a
guest_memfd instance even with a wildly out-of-range offset, if the offset
and size were both positive values, but the combined offset+size was
negative.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/guest_memfd_test.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 246bb408ecc0..95a6ddfd8023 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -345,6 +345,16 @@ static void test_invalid_punch_hole(int fd, size_t total_size)
 	}
 }
 
+static void test_invalid_binding(struct kvm_vm *vm, int fd, size_t size)
+{
+	int r;
+
+	r = __vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, 0, size, 0,
+					 fd, 0x7ffffffffffff000ull);
+	TEST_ASSERT(r && errno == EINVAL,
+		    "Memslot with out-of-range offset+size should fail");
+}
+
 static void test_create_guest_memfd_invalid_sizes(struct kvm_vm *vm,
 						  u64 guest_memfd_flags)
 {
@@ -456,6 +466,7 @@ static void __test_guest_memfd(struct kvm_vm *vm, u64 flags)
 	gmem_test(file_size, vm, flags);
 	gmem_test(fallocate, vm, flags);
 	gmem_test(invalid_punch_hole, vm, flags);
+	gmem_test_vm(invalid_binding, vm, flags);
 }
 
 static void test_guest_memfd(unsigned long vm_type)
-- 
2.54.0.794.g4f17f83d09-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
@ 2026-05-28  3:00   ` sashiko-bot
  2026-05-28 17:24   ` Ackerley Tng
  2026-05-28 23:25   ` Michael Roth
  2 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-05-28  3:00 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [Critical] The return value of `xa_store_range()` is ignored in `kvm_gmem_bind()`, leading to a silent failure to record the memslot binding and causing a potential Use-After-Free.
--

commit 501192c5cb0ca04a3b338baa0236457046c2b566
Author: Sean Christopherson <seanjc@google.com>

KVM: guest_memfd: Treat memslot binding offset+size as unsigned values

This commit changes the offset and size values when binding a memslot to a
guest_memfd file to be unsigned. This fixes an overflow bug that could
result in a false negative when checking the bounds against the file size.

> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
[ ... ]
> @@ -664,8 +665,7 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
>  
>  	inode = file_inode(file);
>  
> -	if (offset < 0 || !PAGE_ALIGNED(offset) ||
> -	    offset + size > i_size_read(inode))
> +	if (!PAGE_ALIGNED(offset) || offset + size > i_size_read(inode))
>  		goto err;
>  
>  	filemap_invalidate_lock(inode->i_mapping);
[ ... ]
>  	WRITE_ONCE(slot->gmem.file, file);
>  	slot->gmem.pgoff = start;
>  	if (kvm_gmem_supports_mmap(inode))
>  		slot->flags |= KVM_MEMSLOT_GMEM_ONLY;
>  
>  	xa_store_range(&f->bindings, start, end - 1, slot, GFP_KERNEL);

[Severity: Critical]
This isn't a bug introduced by this patch, but does this ignore the return
value of xa_store_range()?

If memory allocation fails internally within xa_store_range(), it returns an
error pointer instead of storing the node. Because the error is ignored,
kvm_gmem_bind() proceeds to return success.

At this point, slot->gmem.file has already been set via WRITE_ONCE(), but the
memslot is missing from the f->bindings array.

If the user later closes the guest_memfd file descriptor, kvm_gmem_release()
runs and iterates over f->bindings to nullify all slot->gmem.file pointers.
Since the failed memslot is not in the array, its pointer is never cleared.

When the VM is subsequently destroyed or the memslot is unbound,
kvm_gmem_unbind() will try to dereference this dangling slot->gmem.file
pointer, potentially triggering a use-after-free on a file that has already
been freed.

Could we check the return value here and handle the error appropriately?

>  	filemap_invalidate_unlock(inode->i_mapping);
>  
>  	/*
>  	 * Drop the reference to the file, even on success.  The file pins KVM,
>  	 * not the other way 'round.  Active bindings are invalidated if the
>  	 * file is closed before memslots are destroyed.
>  	 */
>  	r = 0;
> err:
>  	fput(file);
>  	return r;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528021117.107984-1-seanjc@google.com?part=1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug
  2026-05-28  2:11 ` [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug Sean Christopherson
@ 2026-05-28  3:29   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-05-28  3:29 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The hardcoded guest_memfd offset literal is 68 bits long and only 4KB-aligned, causing the test to pass for the wrong reasons and failing to verify the actual signed offset+size bug.
--

commit 418cb333b0d34d015bff4f1526de77942a583c81
Author: Sean Christopherson <seanjc@google.com>

KVM: selftests: Add guest_memfd regression test signed offset+size bug

This commit adds a regression testcase to ensure KVM rejects an offset and
size that would result in a negative value when computed as a signed 64-bit
value.

> diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
> index 246bb408ecc0c..95a6ddfd80237 100644
> --- a/tools/testing/selftests/kvm/guest_memfd_test.c
> +++ b/tools/testing/selftests/kvm/guest_memfd_test.c

[ ... ]

> @@ -345,6 +345,16 @@ static void test_invalid_punch_hole(int fd, size_t total_size)
>  	}
>  }
>  
> +static void test_invalid_binding(struct kvm_vm *vm, int fd, size_t size)
> +{
> +	int r;
> +
> +	r = __vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, 0, size, 0,
> +					 fd, 0x7ffffffffffff000ull);

[Severity: Medium]
Is this hardcoded offset value correct?

The literal 0x7ffffffffffff000ull contains 17 hex digits (68 bits), which
will be truncated to 64 bits (0xfffffffffffff000ull) by the compiler. Because
this truncated value is very close to the maximum 64-bit unsigned integer,
adding the size to it will wrap around zero.

Will this trigger KVM's existing unsigned overflow check instead of the signed
offset plus size bug this test is meant to verify, causing the test to pass
for the wrong reason?

Also, since the trailing 000 makes the offset strictly 4KB-aligned, will
this fail immediately during the page alignment check on architectures with
larger page sizes (like 64KB on ARM64)?

Should the offset be dynamically calculated at runtime based on the page size
to ensure proper alignment and accurately test the signed boundary?

> +	TEST_ASSERT(r && errno == EINVAL,
> +		    "Memslot with out-of-range offset+size should fail");
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528021117.107984-1-seanjc@google.com?part=3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
  2026-05-28  3:00   ` sashiko-bot
@ 2026-05-28 17:24   ` Ackerley Tng
  2026-05-28 17:57     ` Sean Christopherson
  2026-05-28 23:25   ` Michael Roth
  2 siblings, 1 reply; 10+ messages in thread
From: Ackerley Tng @ 2026-05-28 17:24 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

>
> [...snip...]
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index bf9659a7b0f6..a1cb72e66288 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
>  }
>
>  int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> -		  unsigned int fd, loff_t offset)
> +		  unsigned int fd, uoff_t offset)
>  {
> -	loff_t size = slot->npages << PAGE_SHIFT;
> +	uoff_t size = slot->npages << PAGE_SHIFT;

I get why uoff_t is chosen for byte offset, but why is uoff_t also
picked for size, over size_t?

Not trying to nitpick, guest_memfd deals with lots of sizes, byte and
page offsets so I'd like to derive a mental model for using these types
in functions.

>  	unsigned long start, end;
>  	struct gmem_file *f;
>  	struct inode *inode;
>  	struct file *file;
>  	int r = -EINVAL;
>
> +	BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
>  	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
>

I never knew this was meant to show the concept/typing relationship
between gpa_t and byte offset, gfn_t and page offset. Can we add a
comment to explain the presence of BUILD_BUG_ON()?

Also, what's the rationale for picking BUILD_BUG_ON() over
static_assert()? static_assert() could be placed outside block scope
which makes this relationship declaration more general/global. Putting
it in the function makes it seem local (seems to suggest trying to
assert some guard for just this function).

>
> [...snip...]
>

Reviewed-by: Ackerley Tng <ackerleytng@google.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28 17:24   ` Ackerley Tng
@ 2026-05-28 17:57     ` Sean Christopherson
  2026-05-28 20:42       ` Ackerley Tng
  0 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2026-05-28 17:57 UTC (permalink / raw)
  To: Ackerley Tng; +Cc: Paolo Bonzini, kvm, linux-kernel

On Thu, May 28, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> >
> > [...snip...]
> >
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index bf9659a7b0f6..a1cb72e66288 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> >  }
> >
> >  int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> > -		  unsigned int fd, loff_t offset)
> > +		  unsigned int fd, uoff_t offset)
> >  {
> > -	loff_t size = slot->npages << PAGE_SHIFT;
> > +	uoff_t size = slot->npages << PAGE_SHIFT;
> 
> I get why uoff_t is chosen for byte offset, but why is uoff_t also
> picked for size, over size_t?

Because size describes the max offset.

> Not trying to nitpick, guest_memfd deals with lots of sizes, byte and
> page offsets so I'd like to derive a mental model for using these types
> in functions.

Earlier you asked what I meant by "same domain", and I forgot to respond.  What
I meant by "domain" is that both "size" and "offset" operate in the "number of
bytes in the file" domain, and so should use the same type.

E.g. size_t isn't literally just for sizes, it's also used when comparing against
a size, e.g. 

  $ git grep "size_t i;" | wc -l
  476

Changing all of those to uoff_t because their effectively offsets, not sizes,
would be nonsensical.

> >  	unsigned long start, end;
> >  	struct gmem_file *f;
> >  	struct inode *inode;
> >  	struct file *file;
> >  	int r = -EINVAL;
> >
> > +	BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
> >  	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
> >
> 
> I never knew this was meant to show the concept/typing relationship
> between gpa_t and byte offset, gfn_t and page offset. Can we add a
> comment to explain the presence of BUILD_BUG_ON()?

Eh, I'd honestly rather not.  I agree that on the whole, the kernel tends to be
far too stingy with comments.  However, at some point a baseline of knowledge is
required.  For me, understanding trunctation falls well below that line.  It
does require the reader to know that guest_memfd bindings are gpa:offset, but
that too is core/basic knowledge that the reader needs to have to make any sense
of this code.

> Also, what's the rationale for picking BUILD_BUG_ON() over
> static_assert()? 

https://lore.kernel.org/all/aI05DvQlMWJXewUi@google.com

> static_assert() could be placed outside block scope which makes this
> relationship declaration more general/global. Putting it in the function
> makes it seem local (seems to suggest trying to assert some guard for just
> this function).

Well, yeah, because it is local.  _This_ code relies on the sizes being the same,
otherwise we'd have to deal with potential truncation, whereas overall KVM does
NOT.  The purpose of adding asserts like this is to document an assumption and
guard against future breakage.

Just because we _can_ put the assertion in a global location doesn't mean we
should.

KVM does use static_assert().  But without exception (at least in x86, haven't
checked other architectures), every usage is still co-located with the exact code
that relies on the assumption that's guarded by the assertion.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28 17:57     ` Sean Christopherson
@ 2026-05-28 20:42       ` Ackerley Tng
  0 siblings, 0 replies; 10+ messages in thread
From: Ackerley Tng @ 2026-05-28 20:42 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, May 28, 2026, Ackerley Tng wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>>
>> >
>> > [...snip...]
>> >
>> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> > index bf9659a7b0f6..a1cb72e66288 100644
>> > --- a/virt/kvm/guest_memfd.c
>> > +++ b/virt/kvm/guest_memfd.c
>> > @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
>> >  }
>> >
>> >  int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
>> > -		  unsigned int fd, loff_t offset)
>> > +		  unsigned int fd, uoff_t offset)
>> >  {
>> > -	loff_t size = slot->npages << PAGE_SHIFT;
>> > +	uoff_t size = slot->npages << PAGE_SHIFT;
>>
>> I get why uoff_t is chosen for byte offset, but why is uoff_t also
>> picked for size, over size_t?
>
> Because size describes the max offset.
>
>> Not trying to nitpick, guest_memfd deals with lots of sizes, byte and
>> page offsets so I'd like to derive a mental model for using these types
>> in functions.
>
> Earlier you asked what I meant by "same domain", and I forgot to respond.  What
> I meant by "domain" is that both "size" and "offset" operate in the "number of
> bytes in the file" domain, and so should use the same type.
>
> E.g. size_t isn't literally just for sizes, it's also used when comparing against
> a size, e.g.
>
>   $ git grep "size_t i;" | wc -l
>   476
>
> Changing all of those to uoff_t because their effectively offsets, not sizes,
> would be nonsensical.
>

Got it, so the takeaway is that local variables' type (especially if
only used in one case) should mostly align with what it's being compared
with.

>> >  	unsigned long start, end;
>> >  	struct gmem_file *f;
>> >  	struct inode *inode;
>> >  	struct file *file;
>> >  	int r = -EINVAL;
>> >
>> > +	BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
>> >  	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
>> >
>>
>> I never knew this was meant to show the concept/typing relationship
>> between gpa_t and byte offset, gfn_t and page offset. Can we add a
>> comment to explain the presence of BUILD_BUG_ON()?
>
> Eh, I'd honestly rather not.  I agree that on the whole, the kernel tends to be
> far too stingy with comments.  However, at some point a baseline of knowledge is
> required.  For me, understanding trunctation falls well below that line.  It
> does require the reader to know that guest_memfd bindings are gpa:offset, but
> that too is core/basic knowledge that the reader needs to have to make any sense
> of this code.
>

I agree that both truncation and knowing that guest_memfd bindings are
gpa:offset are both below the line, the part that's new is using
BUILD_BUG_ON() to remind people of the binding. Is this BUILD_BUG_ON()
also the part that strictly requires guest_memfd to be 64-bit only?

>> Also, what's the rationale for picking BUILD_BUG_ON() over
>> static_assert()?
>
> https://lore.kernel.org/all/aI05DvQlMWJXewUi@google.com
>

This is helpful, thanks!

>> static_assert() could be placed outside block scope which makes this
>> relationship declaration more general/global. Putting it in the function
>> makes it seem local (seems to suggest trying to assert some guard for just
>> this function).
>
> Well, yeah, because it is local.  _This_ code relies on the sizes being the same,
> otherwise we'd have to deal with potential truncation, whereas overall KVM does
> NOT.  The purpose of adding asserts like this is to document an assumption and
> guard against future breakage.
>

I saw the gpa:offset and gfn:page_offset relationships as local to
guest_memfd as a whole rather than just this function.

Either way good to know for next time!

> Just because we _can_ put the assertion in a global location doesn't mean we
> should.
>
> KVM does use static_assert().  But without exception (at least in x86, haven't
> checked other architectures), every usage is still co-located with the exact code
> that relies on the assumption that's guarded by the assertion.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
  2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
  2026-05-28  3:00   ` sashiko-bot
  2026-05-28 17:24   ` Ackerley Tng
@ 2026-05-28 23:25   ` Michael Roth
  2 siblings, 0 replies; 10+ messages in thread
From: Michael Roth @ 2026-05-28 23:25 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Ackerley Tng

On Wed, May 27, 2026 at 07:11:15PM -0700, Sean Christopherson wrote:
> When binding a memslot to a guest_memfd file, treat the offset and size as
> unsigned values to fix a bug where the sum of the two can result in a false
> negative when checking for overflow against the size of the file.  Passing
> unsigned values also avoids relying on somewhat obscure checks in other
> flows for safety, and tracks the offset and size as they are intended to be
> tracked, as unsigned values.
> 
> On 64-bit kernels, the number of pages a memslot contains and thus the size
> (and offset) of its guest_memfd binding are unsigned 64-bit values.  Taking
> the offset+size as an loff_t instead of a uoff_t inadvertently converts
> the unsigned value to a signed value if the offset and/or size is massive.
> 
> Locally storing the offset and size as signed values is benign in and of
> itself (though even that is *extremely* difficult to discern), but
> operating on their sum is not.
> 
> For the offset, KVM explicitly checks against a negative value, which might
> seem like a bug as KVM could incorrectly reject a legitimate binding, but
> that's not actually the case as KVM_CREATE_GUEST_MEMFD takes a signed value
> for its size, i.e. a would-be-negative offset is also greater than the
> maximum possible size of any guest_memfd file.
> 
> Regarding the size, while KVM lacks an explicit check for a negative value,
> i.e. seemingly has a flawed overflow check, KVM restricts the number of
> pages in a single memslot to the largest positive signed 32-bit value:
> 
>         if (id < KVM_USER_MEM_SLOTS &&
>             (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
>                 return -EINVAL;
> 
> and so that maximum "size" will ever be is 0x7fffffff000.
> 
> The sum of the two is, however, problematic.  While the size is restricted
> by KVM's memslot logic, the offset is not, i.e. the offset is completely
> unchecked until the "offset + size > i_size_read(inode)" check.  If the
> offset is the (nearly) largest possible _positive_ value, then adding size
> to the offset can result in a signed, negative 64-bit value.  When compared
> against the size of the file (guaranteed to be positive), the negative sum
> is always smaller, and KVM incorrectly allows the absurd offset.
> 
> Opportunistically add missing includes in kvm_mm.h (instead of relying on
> its parents).
> 
> Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
> Cc: stable@vger.kernel.org
> Cc: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-05-28 23:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-28  2:11 [PATCH v3 0/3] KVM: guest_memfd: Fix signed offset+size goof Sean Christopherson
2026-05-28  2:11 ` [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values Sean Christopherson
2026-05-28  3:00   ` sashiko-bot
2026-05-28 17:24   ` Ackerley Tng
2026-05-28 17:57     ` Sean Christopherson
2026-05-28 20:42       ` Ackerley Tng
2026-05-28 23:25   ` Michael Roth
2026-05-28  2:11 ` [PATCH v3 2/3] KVM: selftests: Expand the guest_memfd test macros to allow passing the VM Sean Christopherson
2026-05-28  2:11 ` [PATCH v3 3/3] KVM: selftests: Add guest_memfd regression test signed offset+size bug Sean Christopherson
2026-05-28  3:29   ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox