From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Takahiro Itazuri <itazur@amazon.com>,
kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Fuad Tabba <tabba@google.com>,
Brendan Jackman <jackmanb@google.com>,
David Woodhouse <dwmw2@infradead.org>,
Paul Durrant <pdurrant@amazon.com>,
Nikita Kalyazin <kalyazin@amazon.com>,
Patrick Roy <patrick.roy@campus.lmu.de>,
Takahiro Itazuri <zulinx86@gmail.com>
Subject: Re: [RFC PATCH 1/2] KVM: pfncache: Use kvm_gmem_get_pfn() for guest_memfd-backed memslots
Date: Mon, 19 Jan 2026 13:34:07 +0100
Message-ID: <67b7e428-5ed3-4794-b8c4-dcebf724972d@kernel.org>
In-Reply-To: <20251203144159.6131-2-itazur@amazon.com>
On 12/3/25 15:41, Takahiro Itazuri wrote:
> gfn_to_pfn_cache currently relies on hva_to_pfn(), which resolves PFNs
> through GUP. GUP assumes that the page has a valid direct-map PTE,
> which is not true for guest_memfd created with
> GUEST_MEMFD_FLAG_NO_DIRECT_MAP, because their direct-map PTEs are
> explicitly invalidated via set_direct_map_valid_noflush().
>
> Introduce a helper function, gpc_to_pfn(), that routes PFN lookup to
> kvm_gmem_get_pfn() for guest_memfd-backed memslots (regardless of
> whether GUEST_MEMFD_FLAG_NO_DIRECT_MAP is set), and otherwise falls
> back to the existing hva_to_pfn() path. Rename hva_to_pfn_retry() to
> gpc_to_pfn_retry() accordingly.
Let's look into some details:
The pfncache looks up a page from the page tables through GUP.
To make sure that the looked-up PFN can be used safely, it must be very
careful: after it looked up the page through hva_to_pfn(), it marks the
entry as "valid" and drops the folio reference obtained through
hva_to_pfn().
At this point, nothing stops the page from getting unmapped from the
page tables, freed, etc.
Of course, that sounds very dangerous.
That's why the pfncache uses the (KVM) mmu_notifier framework to get
notified when the page gets unmapped from the KVM MMU while it is
preparing the cache entry (see mmu_notifier_retry_cache()).
But it also has to deal with the page getting removed (+possibly freed)
from the KVM MMU later, after we already have a valid entry in the cache.
For this reason, gfn_to_pfn_cache_invalidate_start() is used to
invalidate any entries as they get unmapped from page tables.
Now the big question: how is this supposed to work with gmem? I would
have expected that we would need similar invalidations etc. from the gmem code.
Imagine ftruncate() targets the gmem folio we just looked up: would
we get an appropriate invalidation notification?
>
> Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
> ---
> virt/kvm/pfncache.c | 34 +++++++++++++++++++++++-----------
> 1 file changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 728d2c1b488a..bf8d6090e283 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -152,22 +152,34 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s
> return kvm->mmu_invalidate_seq != mmu_seq;
> }
>
> -static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
> +static kvm_pfn_t gpc_to_pfn(struct gfn_to_pfn_cache *gpc, struct page **page)
> {
> - /* Note, the new page offset may be different than the old! */
> - void *old_khva = (void *)PAGE_ALIGN_DOWN((uintptr_t)gpc->khva);
> - kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
> - void *new_khva = NULL;
> - unsigned long mmu_seq;
> - struct page *page;
> + if (kvm_slot_has_gmem(gpc->memslot)) {
> + kvm_pfn_t pfn;
> +
> + kvm_gmem_get_pfn(gpc->kvm, gpc->memslot, gpa_to_gfn(gpc->gpa),
> + &pfn, page, NULL);
> + return pfn;
> + }
>
> struct kvm_follow_pfn kfp = {
> .slot = gpc->memslot,
> .gfn = gpa_to_gfn(gpc->gpa),
> .flags = FOLL_WRITE,
> .hva = gpc->uhva,
> - .refcounted_page = &page,
> + .refcounted_page = page,
> };
> + return hva_to_pfn(&kfp);
> +}
> +
> +static kvm_pfn_t gpc_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
> +{
> + /* Note, the new page offset may be different than the old! */
> + void *old_khva = (void *)PAGE_ALIGN_DOWN((uintptr_t)gpc->khva);
> + kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
> + void *new_khva = NULL;
> + unsigned long mmu_seq;
> + struct page *page;
>
> lockdep_assert_held(&gpc->refresh_lock);
>
> @@ -206,7 +218,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
> cond_resched();
> }
>
> - new_pfn = hva_to_pfn(&kfp);
> + new_pfn = gpc_to_pfn(gpc, &page);
> if (is_error_noslot_pfn(new_pfn))
> goto out_error;
>
> @@ -319,7 +331,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned l
> }
> }
>
> - /* Note: the offset must be correct before calling hva_to_pfn_retry() */
> + /* Note: the offset must be correct before calling gpc_to_pfn_retry() */
> gpc->uhva += page_offset;
>
> /*
> @@ -327,7 +339,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned l
> * drop the lock and do the HVA to PFN lookup again.
> */
> if (!gpc->valid || hva_change) {
> - ret = hva_to_pfn_retry(gpc);
> + ret = gpc_to_pfn_retry(gpc);
> } else {
> /*
> * If the HVA→PFN mapping was already valid, don't unmap it.
--
Cheers
David