From: Sean Christopherson <seanjc@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Kiryl Shutsemau <kas@kernel.org>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Vishal Annapurve <vannapurve@google.com>,
Yan Zhao <yan.y.zhao@intel.com>,
Michael Roth <michael.roth@amd.com>,
Isaku Yamahata <isaku.yamahata@intel.com>,
Chao Peng <chao.p.peng@linux.intel.com>,
Xiaoyao Li <xiaoyao.li@intel.com>,
Zongyao Chen <ZongYao.Chen@linux.alibaba.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev,
Yu Zhang <yu.c.zhang@linux.intel.com>,
Fuad Tabba <tabba@google.com>
Subject: Re: [PATCH v2 2/5] KVM: guest_memfd: Fix possible signed integer overflow
Date: Wed, 27 May 2026 12:26:03 -0700
Message-ID: <ahdFSxXnWhhsjhKe@google.com>
In-Reply-To: <CAEvNRgEN6syKVLjvFX-s=xU=r3CBZ3KtmeKvVYOC09uvFGXSFg@mail.gmail.com>

On Wed, May 27, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > For shortlogs (and changelogs), when possible, describe the _change_ itself, not
> > its impact. Sometimes "Fix xyz" is the best shortlog, e.g. when fixing build
> > failures, but here, I would go with:
> >
> > KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
> >
> > for two reasons. First, it provides a lot more context for future readers, versus
> > "Fix possible signed integer overflow" which doesn't even capture what flow is
> > affected, how the overflow is being fixed, etc. Second, if the fix is wrong,
> > incomplete, etc., we don't end up with a follow-up patch that starts with "Really
> > fix ...".
> >
>
> Thanks for explaining!
>
> > Oh, actually, three reasons. This doesn't only affect the overflow check. The
> > check on a negative offset is flawed, as it means KVM would incorrectly reject
> > bindings with (comically) large offsets.
> >
>
> Makes sense.
>
> > LOL, four. There is no bug. The maximum size of a memslot is ((1UL << 31) - 1)
> > pages, i.e. 0x7FF_FFFFF000 bytes:
> >
> > 	if (id < KVM_USER_MEM_SLOTS &&
> > 	    (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
> > 		return -EINVAL;
> >
> > and so "loff_t size" can never be negative.
>
> I think the bug was that the sum of offset + size in kvm_gmem_bind()
> when interpreted as signed integers could be smaller than
> i_size_read(inode) and allow binding.
>
> So IIUC even if size is small (and not negative), nothing catches a
> large enough offset where offset + size (interpreted as unsigned
> integers) doesn't overflow, but offset + size (interpreted as signed
> integers) overflows.

Oooh, duh, if @offset is positive, but @offset+size is negative. Yes, that's a
real bug, confirmed via selftest. I'll send a fix along with a selftest testcase.
Thanks much!

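For anyone following along, here is a rough userspace sketch of the failure
mode, not the actual kvm_gmem_bind() code. All sketch_* names are made up for
illustration, the signed add is emulated with a wrapping helper because signed
overflow is undefined in ISO C, and the "unsigned" variant is just one
overflow-safe way to write the check, not necessarily what the eventual fix
will look like.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12

/* Largest memslot: ((1UL << 31) - 1) pages, i.e. 0x7FF_FFFFF000 bytes. */
#define SKETCH_MAX_SLOT_BYTES	((((uint64_t)1 << 31) - 1) << SKETCH_PAGE_SHIFT)

/*
 * Two's-complement wrapping add.  The add is done on unsigned values and
 * converted back; that conversion is implementation-defined before C23,
 * but wraps on gcc and clang.
 */
static int64_t sketch_wrapping_add(int64_t a, int64_t b)
{
	return (int64_t)((uint64_t)a + (uint64_t)b);
}

/* Signed model of the check: returns true if the binding is rejected. */
static bool sketch_reject_signed(int64_t offset, int64_t size, int64_t i_size)
{
	return offset < 0 || sketch_wrapping_add(offset, size) > i_size;
}

/* Unsigned model, written so that no intermediate sum can wrap. */
static bool sketch_reject_unsigned(uint64_t offset, uint64_t size,
				   uint64_t i_size)
{
	return offset > i_size || size > i_size - offset;
}
```

With a 64KiB file, offset = 0x7FFFFFFFFFFFF000 and size = 0x2000, the signed
sum wraps to a negative value, so the signed model accepts the binding even
though the offset is far beyond the end of the file. The unsigned model
rejects it. Note that size never has to be anywhere near its maximum for this
to happen.
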
> >> Fixes: a7800aa80ea4d ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
> >> Signed-off-by: Sean Christopherson <seanjc@google.com>
> >> [Use size_t for size instead of u64]
> >
> > Why? Oh, right, because kvm_memory_slot.npages is an "unsigned long". The
> > discrepancy between a u64 for the offset and a size_t for the size is confusing,
> > as they are both conceptually in the same "domain".
> >
> > Rather than u64 and size_t, we should use pgoff_t, which is what KVM already uses
> > as the storage for kvm_memory_slot.gmem.pgoff.
> >
>
> I picked size_t more because I thought it was semantically correct to
> use the size type for a size. size_t may have different sizes (64 vs
> 32), but in the comparison offset + size > i_size_read(inode), size is
> promoted to 64 bits, and signed inode size is cast to unsigned for
> comparison, so I think that works.
>
> pgoff_t is also unsigned, but I think that should be reserved for page
> offsets/indices?

Just to avoid confusion over the definition of an offset/index:

  * The type of an index into the pagecache.

I.e. it's not the 12-bit offset into a 4KiB page. Which I'm pretty sure you were
saying as well, just want to ensure we're on the same page.

I like pgoff_t more than size_t because, for KVM, it's really all about addressing
memory, thanks to the offset into guest_memfd being associated 1:1 with a GPA.
It's not perfect, because GPAs are tracked as 64-bit values, whereas the kernel
restricts itself to "unsigned long". But that's a non-issue in practice since
guest_memfd is 64-bit only.

But conceptually, I like tracking the gmem offset as a pgoff_t to tie it back
to using GPAs to offset/index into gmem. And for all intents and purposes, gmem
is nothing more than a glorified pagecache :-)
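
To make the index vs. in-page offset distinction concrete, here is a small
userspace sketch. The sketch_* types and helpers are hypothetical stand-ins
(sketch_pgoff_t mirrors pgoff_t being an "unsigned long", and 4KiB pages are
assumed); the GPA-to-index math is only meant to illustrate the 1:1
relationship described above, not to reproduce KVM's code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	((uint64_t)1 << SKETCH_PAGE_SHIFT)

typedef unsigned long sketch_pgoff_t;	/* stand-in for pgoff_t */
typedef uint64_t sketch_gfn_t;		/* stand-in for a guest frame number */

/* Hypothetical, trimmed-down view of a memslot bound to guest_memfd. */
struct sketch_slot {
	sketch_gfn_t base_gfn;		/* first guest frame covered by the slot */
	sketch_pgoff_t gmem_pgoff;	/* page index of the binding in the file */
};

/* Byte offset into the file -> index into the pagecache. */
static sketch_pgoff_t sketch_bytes_to_pgoff(uint64_t byte_offset)
{
	return (sketch_pgoff_t)(byte_offset >> SKETCH_PAGE_SHIFT);
}

/* The 12-bit offset within a 4KiB page, i.e. what a pgoff_t is NOT. */
static uint64_t sketch_offset_in_page(uint64_t byte_offset)
{
	return byte_offset & (SKETCH_PAGE_SIZE - 1);
}

/* GPA -> page index in the file, relative to the start of the binding. */
static sketch_pgoff_t sketch_gpa_to_index(const struct sketch_slot *slot,
					  uint64_t gpa)
{
	sketch_gfn_t gfn = gpa >> SKETCH_PAGE_SHIFT;

	return (sketch_pgoff_t)(gfn - slot->base_gfn) + slot->gmem_pgoff;
}
```

E.g. a slot with base_gfn = 0x100 bound at page index 5 maps GPA 0x102abc to
index 7, and the low 12 bits (0xabc) never enter into it.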