All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	 Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H. Peter Anvin" <hpa@zytor.com>,
	Kiryl Shutsemau <kas@kernel.org>,
	 Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Vishal Annapurve <vannapurve@google.com>,
	 Yan Zhao <yan.y.zhao@intel.com>,
	Michael Roth <michael.roth@amd.com>,
	 Isaku Yamahata <isaku.yamahata@intel.com>,
	Chao Peng <chao.p.peng@linux.intel.com>,
	 Xiaoyao Li <xiaoyao.li@intel.com>,
	Zongyao Chen <ZongYao.Chen@linux.alibaba.com>,
	 kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-coco@lists.linux.dev,
	 Yu Zhang <yu.c.zhang@linux.intel.com>,
	Fuad Tabba <tabba@google.com>
Subject: Re: [PATCH v2 2/5] KVM: guest_memfd: Fix possible signed integer overflow
Date: Wed, 27 May 2026 12:26:03 -0700	[thread overview]
Message-ID: <ahdFSxXnWhhsjhKe@google.com> (raw)
In-Reply-To: <CAEvNRgEN6syKVLjvFX-s=xU=r3CBZ3KtmeKvVYOC09uvFGXSFg@mail.gmail.com>

On Wed, May 27, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > For shortlogs (and changeloges), when possible, describe the _change_ itself, not
> > its impact is.	Sometimes "Fix xyz" is the best shortlog, e.g. when fixing build
> > failures, but here, I would go with:
> >
> >   KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
> >
> > for two reasons.  First, it provides a lot more context for future readers, versus
> > "Fix possible signed integer overflow" which doesn't even capture what flow is
> > affected, how the overflow is being fixed, etc.  Second, if the fix is wrong,
> > incomplete, etc., we don't end up with a follow-up patch that start with "Really
> > fix ...".
> >
> 
> Thanks for explaining!
> 
> > Oh, actually, three reasons.  This doesn't only affect the overflow check.  The
> > check on a negative offset is flawed, as it means KVM would incorrectly reject
> > bindings with (comically) large offsets.
> >
> 
> Makes sense.
> 
> > LOL, four.  There is no bug.  The size of the memslot is ((1UL << 31) - 1)
> > pages, i.e. 0x7FF_FFFFF000:
> >
> > 	if (id < KVM_USER_MEM_SLOTS &&
> > 	    (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
> > 		return -EINVAL;
> >
> > and so "loff_t size" can never be negative.
> 
> I think the bug was that the sum of offset + size in kvm_gmem_bind()
> when interpreted as signed integers could be smaller than
> i_size_read(inode) and allow binding.
> 
> So IIUC even if size is small (and not negative), nothing catches a
> large enough offset where offset + size (interpreted as unsigned
> integers) doesn't overflow, but offset + size (interpreted as signed
> integers) overflows.

Oooh, duh, if @offset is positive, but @offset+size is negative.  Yes, that's a
real bug, confirmed via selftest.  I'll send a fix along with a selftest testcase.

Thanks much!

> >> Fixes: a7800aa80ea4d ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
> >> Signed-off-by: Sean Christopherson <seanjc@google.com>
> >> [Use size_t for size instead of u64]
> >
> > Why?  Oh, right, because kvm_memory_slot.npages is an "unsigned long".  The
> > discrepancy between a u64 for the offset and a size_t for the size is confusing,
> > as they are both conceptually in the same "domain".
> >
> > Rather than u64 and size_t, we should use pgoff_t, which is what KVM already uses
> > as the storage for kvm_memory_slot.gmem.pgoff.
> >
> 
> I picked size_t more because I thought it was semantically correct to
> use the size type for a size. size_t may have different sizes (64 vs
> 32), but in the comparison offset + size > i_size_read(inode), size is
> promoted to 64 bits, and signed inode size is cast to unsigned for
> comparison, so I think that works.
> 
> pgoff_t is also unsigned, but I think that should be reserved for page
> offsets/indices?

Just to avoid confusion over the definition of an offset/idnex:
  
  * The type of an index into the pagecache.

I.e. it's not the 12-bit offset into a 4KiB page.  Which I'm pretty sure you were
saying as well, just want to ensure we're on the same page.

I like pgoff_t more than size_t because, for KVM, it's really all about addressing
memory, thanks to the offset into guest_memfd being associated 1:1 with a GPA.
It's not perfect, because GPAs are tracked as 64-bit values, whereas the kernel
restricts itself to "unsigned long".  But that's a non-issue in practice since
guest_memfd is 64-bit only.

But conceptually, I like tracking the gmem offset as a pgoff_t to tie it back
to using GPAs to offset/index into gmem.  And for all intents and purposes, gmem
is nothing more than a glorified pagecache :-)

  reply	other threads:[~2026-05-27 19:26 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-22 22:46 [PATCH v2 0/5] guest_memfd fixes for bind and populate Ackerley Tng via B4 Relay
2026-05-22 22:46 ` Ackerley Tng
2026-05-22 22:46 ` [PATCH v2 1/5] KVM: guest_memfd: Use write permissions when GUP-ing source pages Ackerley Tng via B4 Relay
2026-05-22 22:46   ` Ackerley Tng
2026-05-22 23:47   ` sashiko-bot
2026-05-26 16:13   ` Sean Christopherson
2026-05-22 22:46 ` [PATCH v2 2/5] KVM: guest_memfd: Fix possible signed integer overflow Ackerley Tng via B4 Relay
2026-05-22 22:46   ` Ackerley Tng
2026-05-26 15:53   ` Sean Christopherson
2026-05-27 18:26     ` Ackerley Tng
2026-05-27 19:26       ` Sean Christopherson [this message]
2026-05-27 20:17         ` Ackerley Tng
2026-05-27 22:08           ` Sean Christopherson
2026-05-22 22:46 ` [PATCH v2 3/5] KVM: guest_memfd: Handle errors from xa_store_range() when binding Ackerley Tng via B4 Relay
2026-05-22 22:46   ` Ackerley Tng
2026-05-26 16:39   ` Sean Christopherson
2026-05-27 19:11     ` Ackerley Tng
2026-05-27 22:43       ` Sean Christopherson
2026-05-22 22:46 ` [PATCH v2 4/5] KVM: SNP: Fix kunmap_local() unmapping order Ackerley Tng via B4 Relay
2026-05-22 22:46   ` Ackerley Tng
2026-05-26 15:55   ` Sean Christopherson
2026-05-22 22:46 ` [PATCH v2 5/5] KVM: SNP: Mark source page dirty in sev_gmem_post_populate Ackerley Tng via B4 Relay
2026-05-22 22:46   ` Ackerley Tng
2026-05-26 16:47   ` Sean Christopherson
2026-05-27 19:14     ` Ackerley Tng
2026-05-26 16:55 ` [PATCH v2 0/5] guest_memfd fixes for bind and populate Sean Christopherson
2026-05-27 18:19 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahdFSxXnWhhsjhKe@google.com \
    --to=seanjc@google.com \
    --cc=ZongYao.Chen@linux.alibaba.com \
    --cc=ackerleytng@google.com \
    --cc=bp@alien8.de \
    --cc=chao.p.peng@linux.intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tabba@google.com \
    --cc=tglx@kernel.org \
    --cc=vannapurve@google.com \
    --cc=x86@kernel.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.