From: Leonardo Bras <leo.bras@arm.com>
To: Bradley Morgan <include@grrlz.net>
Cc: Leonardo Bras <leo.bras@arm.com>, Marc Zyngier <maz@kernel.org>,
Oliver Upton <oupton@kernel.org>, Fuad Tabba <tabba@google.com>,
Joey Gouly <joey.gouly@arm.com>,
Steffen Eiden <seiden@linux.ibm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Quentin Perret <qperret@google.com>,
Vincent Donnefort <vdonnefort@google.com>,
Gavin Shan <gshan@redhat.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/3] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
Date: Thu, 2 Jul 2026 11:43:06 +0100 [thread overview]
Message-ID: <akZAuqOk7ocNFhcH@LeoBrasDK> (raw)
In-Reply-To: <BCFEB62C-FA80-474B-A9F5-9CD6CA87E857@grrlz.net>
On Wed, Jul 01, 2026 at 05:57:48PM +0100, Bradley Morgan wrote:
> On July 1, 2026 5:56:37 PM GMT+01:00, Leonardo Bras <leo.bras@arm.com>
> wrote:
> >On Wed, Jul 01, 2026 at 05:54:40PM +0100, Bradley Morgan wrote:
> >> On July 1, 2026 5:53:34 PM GMT+01:00, Leonardo Bras <leo.bras@arm.com>
> >> wrote:
> >> >On Wed, Jul 01, 2026 at 05:40:46PM +0100, Bradley Morgan wrote:
> >> >> On July 1, 2026 5:05:53 PM GMT+01:00, Leonardo Bras
> ><leo.bras@arm.com>
> >> >> wrote:
> >> >> >On Wed, Jun 24, 2026 at 04:00:26PM +0000, Bradley Morgan wrote:
> >> >> >> pKVM keeps its own mapping list for stage 2 operations. Its flush
> >> >path
> >> >> >> uses that list directly, so it lost the PTE attribute check done
> >by
> >> >the
> >> >> >> generic stage 2 walker.
> >> >> >>
> >> >> >> Record whether a mapping is cacheable and skip cache maintenance
> >for
> >> >> >> mappings that are not cacheable.
> >> >> >>
> >> >> >> Fixes: e912efed485a ("KVM: arm64: Introduce the EL1 pKVM MMU")
> >> >> >> Signed-off-by: Bradley Morgan <include@grrlz.net>
> >> >> >> ---
> >> >> >> arch/arm64/kvm/pkvm.c | 51
> >> >++++++++++++++++++++++++++++++++++---------
> >> >> >> 1 file changed, 41 insertions(+), 10 deletions(-)
> >> >> >>
> >> >> >> diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
> >> >> >> index 428723b1b0f5..ca6e823028c2 100644
> >> >> >> --- a/arch/arm64/kvm/pkvm.c
> >> >> >> +++ b/arch/arm64/kvm/pkvm.c
> >> >> >> @@ -302,9 +302,32 @@ static u64 __pkvm_mapping_start(struct
> >> >pkvm_mapping
> >> >> >*m)
> >> >> >> return m->gfn * PAGE_SIZE;
> >> >> >> }
> >> >> >>
> >> >> >> +#define PKVM_MAPPING_NR_PAGES_MASK GENMASK_ULL(47, 0)
> >> >> >> +#define PKVM_MAPPING_CACHEABLE BIT_ULL(48)
> >> >> >
> >> >> >Out of curiosity here, why do you choose to use bit 48 here instead
> >of,
> >> >> >let's say, bit 63?
> >> >> >
> >> >> >(I know it makes absolutely no difference to inner working here, as
> >> >there
> >> >> >should probably not be 2^48 pages in one mapping.)
> >> >> >
> >> >> >Thanks!
> >> >> >Leo
> >> >>
> >> >>
> >> >> sup Leo, here's a quote from maz
> >> >
> >> >Hi Bradley,
> >> >
> >> >>
> >> >> "This thing is already big enough, let's not add a bool right in the
> >> >> middle (use pahole to find out why this is bad).
> >> >
> >> >I suppose you proposed to add a bool into a struct, maybe?
> >> >It would screw the struct alignment.
> >>
> >> yep, crappy old me
> >>
> >
> >Hah, you were probably focused on the big picture.
> >
> >>
> >> >> Given that nr_pages
> >> >> is for a range, and that the minimum page size uses 12 bits, the
> >> >> largest number of pages you can have here is 56-12=48 bit wide.
> >That's
> >> >> another 16 bits worth of flags you can use."
> >> >
> >> >Humm, makes sense.
> >> >And since he mentions 16 bits worth of flags, you start by using the
> >48th
> >> >bit. Ok, got your rationale.
> >> >
> >> >(I would possibly start with the 63, though, but that's more on
> >personal
> >> >taste)
> >>
> >> 48 won't make the world blow up :)
> >
> >yeap,
>
>
> Would you like to be CCed on v4 or nahhhh?
Whatever you feel like :)
Thanks!
Leo
>
> >>
> >> >>
> >> >> this should just clarify things, any questions, feel more than free
> >to
> >> >ask!
> >> >>
> >> >> (btw V4 is coming soon)
> >> >
> >> >Thanks!
> >> >Leo
> >> >
> >> >>
> >> >> >> +
> >> >> >> +static u64 pkvm_mapping_nr_pages(struct pkvm_mapping *m)
> >> >> >> +{
> >> >> >> + return m->nr_pages & PKVM_MAPPING_NR_PAGES_MASK;
> >> >> >> +}
> >> >> >> +
> >> >> >> +static bool pkvm_mapping_is_cacheable(struct pkvm_mapping *m)
> >> >> >> +{
> >> >> >> + return m->nr_pages & PKVM_MAPPING_CACHEABLE;
> >> >> >> +}
> >> >> >> +
> >> >> >> +static void pkvm_mapping_set_nr_pages(struct pkvm_mapping *m, u64
> >> >> >nr_pages,
> >> >> >> + bool cacheable)
> >> >> >> +{
> >> >> >> + WARN_ON_ONCE(nr_pages & ~PKVM_MAPPING_NR_PAGES_MASK);
> >> >> >> +
> >> >> >> + m->nr_pages = nr_pages & PKVM_MAPPING_NR_PAGES_MASK;
> >> >> >> + if (cacheable)
> >> >> >> + m->nr_pages |= PKVM_MAPPING_CACHEABLE;
> >> >> >> +}
> >> >> >> +
> >> >> >> static u64 __pkvm_mapping_end(struct pkvm_mapping *m)
> >> >> >> {
> >> >> >> - return (m->gfn + m->nr_pages) * PAGE_SIZE - 1;
> >> >> >> + return (m->gfn + pkvm_mapping_nr_pages(m)) * PAGE_SIZE - 1;
> >> >> >> }
> >> >> >>
> >> >> >> INTERVAL_TREE_DEFINE(struct pkvm_mapping, node, u64,
> >__subtree_last,
> >> >> >> @@ -350,7 +373,7 @@ static int
> >__pkvm_pgtable_stage2_reclaim(struct
> >> >> >kvm_pgtable *pgt, u64 start, u64
> >> >> >> continue;
> >> >> >>
> >> >> >> page = pfn_to_page(mapping->pfn);
> >> >> >> - WARN_ON_ONCE(mapping->nr_pages != 1);
> >> >> >> + WARN_ON_ONCE(pkvm_mapping_nr_pages(mapping) != 1);
> >> >> >> unpin_user_pages_dirty_lock(&page, 1, true);
> >> >> >> account_locked_vm(kvm->mm, 1, false);
> >> >> >> pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
> >> >> >> @@ -369,7 +392,7 @@ static int
> >__pkvm_pgtable_stage2_unshare(struct
> >> >> >kvm_pgtable *pgt, u64 start, u64
> >> >> >>
> >> >> >> for_each_mapping_in_range_safe(pgt, start, end, mapping) {
> >> >> >> ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_guest, handle, mapping->gfn,
> >> >> >> - mapping->nr_pages);
> >> >> >> + pkvm_mapping_nr_pages(mapping));
> >> >> >> if (WARN_ON(ret))
> >> >> >> return ret;
> >> >> >> pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
> >> >> >> @@ -448,7 +471,7 @@ int pkvm_pgtable_stage2_map(struct kvm_pgtable
> >> >*pgt,
> >> >> >u64 addr, u64 size,
> >> >> >> * permission faults are handled in the relax_perms() path.
> >> >> >> */
> >> >> >> if (mapping) {
> >> >> >> - if (size == (mapping->nr_pages * PAGE_SIZE))
> >> >> >> + if (size == (pkvm_mapping_nr_pages(mapping) * PAGE_SIZE))
> >> >> >> return -EAGAIN;
> >> >> >>
> >> >> >> /*
> >> >> >> @@ -472,7 +495,9 @@ int pkvm_pgtable_stage2_map(struct kvm_pgtable
> >> >*pgt,
> >> >> >u64 addr, u64 size,
> >> >> >> swap(mapping, cache->mapping);
> >> >> >> mapping->gfn = gfn;
> >> >> >> mapping->pfn = pfn;
> >> >> >> - mapping->nr_pages = size / PAGE_SIZE;
> >> >> >> + pkvm_mapping_set_nr_pages(mapping, size / PAGE_SIZE,
> >> >> >> + !(prot & (KVM_PGTABLE_PROT_DEVICE |
> >> >> >> + KVM_PGTABLE_PROT_NORMAL_NC)));
> >> >> >> pkvm_mapping_insert(mapping, &pgt->pkvm_mappings);
> >> >> >>
> >> >> >> return ret;
> >> >> >> @@ -503,7 +528,7 @@ int pkvm_pgtable_stage2_wrprotect(struct
> >> >kvm_pgtable
> >> >> >*pgt, u64 addr, u64 size)
> >> >> >> lockdep_assert_held(&kvm->mmu_lock);
> >> >> >> for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping) {
> >> >> >> ret = kvm_call_hyp_nvhe(__pkvm_host_wrprotect_guest, handle, mapping->gfn,
> >> >> >> - mapping->nr_pages);
> >> >> >> + pkvm_mapping_nr_pages(mapping));
> >> >> >> if (WARN_ON(ret))
> >> >> >> break;
> >> >> >> }
> >> >> >> @@ -517,9 +542,13 @@ int pkvm_pgtable_stage2_flush(struct
> >kvm_pgtable
> >> >> >*pgt, u64 addr, u64 size)
> >> >> >> struct pkvm_mapping *mapping;
> >> >> >>
> >> >> >> lockdep_assert_held(&kvm->mmu_lock);
> >> >> >> - for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping)
> >> >> >> + for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping) {
> >> >> >> + if (!pkvm_mapping_is_cacheable(mapping))
> >> >> >> + continue;
> >> >> >> +
> >> >> >> __clean_dcache_guest_page(pfn_to_kaddr(mapping->pfn),
> >> >> >> - PAGE_SIZE * mapping->nr_pages);
> >> >> >> + PAGE_SIZE * pkvm_mapping_nr_pages(mapping));
> >> >> >> + }
> >> >> >>
> >> >> >> return 0;
> >> >> >> }
> >> >> >> @@ -536,8 +565,10 @@ bool
> >pkvm_pgtable_stage2_test_clear_young(struct
> >> >> >kvm_pgtable *pgt, u64 addr, u64
> >> >> >>
> >> >> >> lockdep_assert_held(&kvm->mmu_lock);
> >> >> >> for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping)
> >> >> >> - young |= kvm_call_hyp_nvhe(__pkvm_host_test_clear_young_guest, handle, mapping->gfn,
> >> >> >> - mapping->nr_pages, mkold);
> >> >> >> + young |= kvm_call_hyp_nvhe(__pkvm_host_test_clear_young_guest,
> >> >> >> + handle, mapping->gfn,
> >> >> >> + pkvm_mapping_nr_pages(mapping),
> >> >> >> + mkold);
> >> >> >>
> >> >> >> return young;
> >> >> >> }
> >> >> >> --
> >> >> >> 2.53.0
> >> >> >>
> >> >> >
> >> >>
> >> >> Thanks!
> >> >
> >>
> >> Thanks!
> >
> >Thanks!
> >
>
> Thanks!
next prev parent reply other threads:[~2026-07-02 10:43 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-24 16:00 [PATCH v3 0/3] KVM: arm64: fix pKVM mapping cache corner cases Bradley Morgan
2026-06-24 16:00 ` [PATCH v3 1/3] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Bradley Morgan
2026-07-01 13:31 ` Vincent Donnefort
2026-07-01 16:05 ` Leonardo Bras
2026-07-01 16:40 ` Bradley Morgan
2026-07-01 16:53 ` Leonardo Bras
2026-07-01 16:54 ` Bradley Morgan
2026-07-01 16:56 ` Leonardo Bras
2026-07-01 16:57 ` Bradley Morgan
2026-07-02 10:43 ` Leonardo Bras [this message]
2026-06-24 16:00 ` [PATCH v3 2/3] KVM: arm64: top up pKVM mapping cache for permission faults Bradley Morgan
2026-06-24 16:00 ` [PATCH v3 3/3] KVM: arm64: top up stage 2 memcache for dirty logging faults Bradley Morgan
2026-06-24 17:39 ` Bradley Morgan
2026-06-24 17:46 ` Bradley Morgan
2026-06-24 18:25 ` Marc Zyngier
2026-06-24 18:37 ` Bradley Morgan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=akZAuqOk7ocNFhcH@LeoBrasDK \
--to=leo.bras@arm.com \
--cc=alexandru.elisei@arm.com \
--cc=catalin.marinas@arm.com \
--cc=gshan@redhat.com \
--cc=include@grrlz.net \
--cc=joey.gouly@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=oupton@kernel.org \
--cc=qperret@google.com \
--cc=seiden@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=vdonnefort@google.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox