From: Yosry Ahmed <yosry@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Peter Fang <peter.fang@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Nicholas Piggin <npiggin@gmail.com>,
Ritesh Harjani <ritesh.list@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/3] KVM: Take gpa_t in kvm_vcpu_map[_readonly]()
Date: Wed, 22 Apr 2026 20:19:45 +0000 [thread overview]
Message-ID: <aekrJrSRoclZoBo9@google.com> (raw)
In-Reply-To: <aegV3ELp20DlC5Ge@google.com>
> > > To elaborate a bit, I'm all for adding WARNs in flows where something bad is all
> > > but guaranteed to happen if an assumption is violated, or in APIs where there's
> > > a history of goofs and/or subtlety in how the API behaves.
> > >
> > > What I'm against is adding WARNs because someone could write bad code in the
> > > future, or because KVM doesn't do XYZ at this time. Such WARNs usually just add
> > > noise, and can even be actively harmful. E.g. in this case, ignoring the PID
> > > usage, a reader might look at the WARN and think it's _wrong_ to map a page in
> > > order to access a subset of the page, which is just not true.
> >
> > Yeah I agree with most/all of your objections to my suggestions, it's
> > usually that I don't have enough context to understand how the WARN
> > could be harmful (like here), or am just being too paranoid or
> > defending against bad code as you mentioned. I was mentioning your
> > objections semi-sarcastically and intentionally bringing up the WARN
> > in case it's actually useful.
> >
> > Taking a step back, what I really want to clarify and/or detect misuse
> > of, is that kvm_vcpu_map() will map exactly one page, the one that the
> > GPA lies in. For example, there's nothing protecting against the PID
> > address being the last byte of the page,
>
> Well, technically there is:
>
> if (CC(!kvm_vcpu_is_legal_aligned_gpa(vcpu, vmcs12->posted_intr_desc_addr, 64)))
> return -EINVAL;
>
> But I'm pretty sure what you're saying is that "nothing in the common helper code
> protects against a stupid caller".
Yes. Now I'm actually glad I brought up the WARN and you elaborated
your thoughts, because it made me think through and spell out my actual
concern (which my brain initially translated into just a WARN): we need
bounds checking.
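To make the concern concrete, here is a tiny self-contained sketch (hypothetical
helper names, not KVM code) of the invariant that's currently unchecked: a
@len-byte access at @gpa is only covered by the single mapped page if it does
not cross the page boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SIZE 4096u

static uint64_t offset_in_page(uint64_t gpa)
{
	return gpa & (PAGE_SIZE - 1);
}

/*
 * Returns true if a @len-byte access at @gpa stays within the one page
 * that a kvm_vcpu_map()-style helper would map.
 */
static bool access_fits_in_page(uint64_t gpa, uint64_t len)
{
	return offset_in_page(gpa) + len <= PAGE_SIZE;
}
```

E.g. a 64-byte PI descriptor at offset 0xfc0 fits exactly, but at 0xfc1 it
spills onto the next (unmapped) page, which is the class of bug the WARN was
meant to flag.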
>
> > in which case accessing all of it would be wrong as it spans the mapped page
> > boundary. This is difficult to hit if you are passing in a GFN, as it's more
> > obvious that KVM is mapping one physical page.
>
> Eh, that just leads to a different class of bugs (and possibly even *worse* bugs).
> E.g. caller passes in a GFN, accesses the kernel mapping beyond the page, and
> reads/write arbitrary kernel memory (pfn_valid() case using the direct map), or
> hits a !PRESENT #PF (remap case).
>
> > Perhaps we just need to rename the functions (e.g.
> > kvm_vcpu_map_page()), or more intrusively pass in a size and do bounds
> > checking.
>
> Definitely the latter. Or both I guess, but probably just the latter.
I think both. I think renaming to kvm_vcpu_map_page() (and similar for
others) would further clarify things, especially with the introduction
of kvm_vcpu_map_ptr() below.
>
> Commit 025dde582bbf ("KVM: Harden guest memory APIs against out-of-bounds accesses")
> added that type of hardening for the "slow" APIs, exactly because of the type of
> OOB bug you're describing: commit f559b2e9c5c5 ("KVM: nSVM: Ignore nCR3[4:0] when
> loading PDPTEs from memory").
>
> Actually, that's useful feedback for patch 3. __kvm_vcpu_map() should
> do the GPA=>GFN conversion, not its caller.
Yes.
>
> Anyways, back to the hardening. We can do it with minimal additional churn. After
> patch 3 (passing a @gpa to __kvm_vcpu_map(), not a @gfn), do the below over a few
> patches (completely untested). This way the common case of mapping and accessing
> an entire page Just Works, and flows like the PI descriptor handling don't have
> to manually provide the length (which also can be error prone).
Yeah probably this (maybe not in the same order):
- Convert map->pfn to map->hpa.
- Pass size to __kvm_vcpu_map() and do bounds checking.
- Rename kvm_vcpu_map() and __kvm_vcpu_map() to kvm_vcpu_map_page() and
  __kvm_vcpu_map_page().
- Introduce kvm_vcpu_map_ptr() wrapper and simplify the nested PID call
site.
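To sanity-check the last step, here's a self-contained sketch of the
statement-expression pattern behind the proposed kvm_vcpu_map_ptr(); stub_map()
and map_ptr() are stand-ins I made up for __kvm_vcpu_map() and the wrapper, not
the real KVM API:

```c
#include <assert.h>
#include <stddef.h>

struct host_map { void *hva; };

static char fake_page[4096];

/*
 * Stub standing in for __kvm_vcpu_map(): "maps" @len bytes at @gpa and
 * fails if the access would cross the single mapped page's boundary.
 */
static int stub_map(unsigned long gpa, size_t len, struct host_map *map)
{
	unsigned long off = gpa & 4095;

	if (off + len > 4096)
		return -1;
	map->hva = fake_page + off;
	return 0;
}

/* GNU C statement expression mirroring the proposed kvm_vcpu_map_ptr(). */
#define map_ptr(__gpa, __ptr, __map)					\
({									\
	int __r = stub_map(__gpa, sizeof(*(__ptr)), __map);		\
	(__ptr) = !__r ? (__map)->hva : NULL;				\
	__r;								\
})
```

The statement expression evaluates to the return code while also initializing
the caller's pointer (to the mapped offset on success, NULL on failure), which
is what lets the nested PID call site collapse to a single call.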
Generally looks good with a small nit/question below. Peter, would you
be interested in extending the series to do this? If not, I can send a
follow-up on top of your series once it's hashed out.
[..]
> @@ -1893,22 +1894,31 @@ static inline hpa_t pfn_to_hpa(kvm_pfn_t pfn)
> return (hpa_t)pfn << PAGE_SHIFT;
> }
>
> -int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map,
> - bool writable);
> +int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa, gpa_t len,
> + struct kvm_host_map *map, bool writable);
> void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map);
>
> static inline int kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa,
> struct kvm_host_map *map)
> {
> - return __kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), map, true);
> + return __kvm_vcpu_map(vcpu, gpa, PAGE_SIZE, map, true);
> }
>
> static inline int kvm_vcpu_map_readonly(struct kvm_vcpu *vcpu, gpa_t gpa,
> struct kvm_host_map *map)
> {
> - return __kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), map, false);
> + return __kvm_vcpu_map(vcpu, gpa, PAGE_SIZE, map, false);
> }
>
> +#define kvm_vcpu_map_ptr(__vcpu, __gpa, __ptr, __map) \
> +({ \
> + int r; \
> + \
> + r = __kvm_vcpu_map(__vcpu, __gpa, sizeof(*(__ptr)), __map, true); \
> + __ptr = !r ? (__map)->hva : NULL; \
> + r; \
> +})
> +
> static inline void kvm_vcpu_map_mark_dirty(struct kvm_vcpu *vcpu,
> struct kvm_host_map *map)
> {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9093251beb39..e8d2e98b0068 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3114,9 +3114,10 @@ struct page *__gfn_to_page(struct kvm *kvm, gfn_t gfn, bool write)
> }
> EXPORT_SYMBOL_FOR_KVM_INTERNAL(__gfn_to_page);
>
> -int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map,
> - bool writable)
> +int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa, gpa_t len,
> + struct kvm_host_map *map, bool writable)
> {
> + gfn_t gfn = gpa_to_gfn(gpa);
> struct kvm_follow_pfn kfp = {
> .slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn),
> .gfn = gfn,
> @@ -3124,6 +3125,10 @@ int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map,
> .refcounted_page = &map->pinned_page,
> .pin = true,
> };
> + kvm_pfn_t pfn;
> +
> + if (WARN_ON_ONCE(offset_in_page(gpa) + len > PAGE_SIZE))
> + return -EINVAL;
Maybe do the bounds checking after initializing 'map'? Then
kvm_vcpu_map_ptr() wouldn't need to explicitly set the pointer to NULL
on failure. There is already a possibility of failure after
initialization anyway.
>
> map->pinned_page = NULL;
> map->page = NULL;
> @@ -3131,20 +3136,25 @@ int __kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map,
> map->gfn = gfn;
> map->writable = writable;
>
> - map->pfn = kvm_follow_pfn(&kfp);
> - if (is_error_noslot_pfn(map->pfn))
> + pfn = kvm_follow_pfn(&kfp);
> + if (is_error_noslot_pfn(pfn))
> return -EINVAL;
>
> - if (pfn_valid(map->pfn)) {
> - map->page = pfn_to_page(map->pfn);
> + map->hpa = pfn_to_hpa(pfn);
> + if (pfn_valid(pfn)) {
> + map->page = pfn_to_page(pfn);
> map->hva = kmap(map->page);
> #ifdef CONFIG_HAS_IOMEM
> } else {
> - map->hva = memremap(pfn_to_hpa(map->pfn), PAGE_SIZE, MEMREMAP_WB);
> + map->hva = memremap(map->hpa, PAGE_SIZE, MEMREMAP_WB);
> + if (!map->hva)
> + return -EFAULT;
> #endif
> }
>
> - return map->hva ? 0 : -EFAULT;
> + map->hpa += offset_in_page(gpa);
> + map->hva += offset_in_page(gpa);
> + return 0;
> }
> EXPORT_SYMBOL_FOR_KVM_INTERNAL(__kvm_vcpu_map);
[..]
2026-04-08 0:11 [PATCH v2 0/3] KVM: Fix and clean up kvm_vcpu_map[_readonly]() usages Peter Fang
2026-04-08 0:11 ` [PATCH v2 1/3] KVM: Fix kvm_vcpu_map[_readonly]() function prototypes Peter Fang
2026-04-21 23:05 ` Yosry Ahmed
2026-04-08 0:11 ` [PATCH v2 2/3] KVM: Move page mapping/unmapping APIs in kvm_host.h Peter Fang
2026-04-21 23:06 ` Yosry Ahmed
2026-04-08 0:11 ` [PATCH v2 3/3] KVM: Take gpa_t in kvm_vcpu_map[_readonly]() Peter Fang
2026-04-21 23:08 ` Yosry Ahmed
2026-04-21 23:19 ` Sean Christopherson
2026-04-21 23:25 ` Yosry Ahmed
2026-04-21 23:29 ` Sean Christopherson
2026-04-21 23:41 ` Yosry Ahmed
2026-04-22 0:27 ` Sean Christopherson
2026-04-22 20:19 ` Yosry Ahmed [this message]
2026-04-22 20:34 ` Sean Christopherson
2026-04-22 21:44 ` Yosry Ahmed
2026-04-22 22:17 ` Sean Christopherson
2026-04-22 22:19 ` Yosry Ahmed