Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
@ 2026-06-23 16:03 Bradley Morgan
  2026-06-23 16:37 ` [PATCH v2 1/2] " Bradley Morgan
  0 siblings, 1 reply; 6+ messages in thread
From: Bradley Morgan @ 2026-06-23 16:03 UTC
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: Fuad Tabba, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Quentin Perret,
	linux-arm-kernel, linux-kernel, Bradley Morgan, stable

pKVM keeps its own mapping list for stage 2 operations. Its flush path
uses that list directly, so it lost the PTE attribute check done by the
generic stage 2 walker.

Record whether a mapping is cacheable and skip cache maintenance for
mappings that are not cacheable.

Fixes: e912efed485a ("KVM: arm64: Introduce the EL1 pKVM MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
 arch/arm64/include/asm/kvm_pkvm.h | 1 +
 arch/arm64/kvm/pkvm.c             | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 74fedd9c5ff0..d9dd8239910d 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -196,6 +196,7 @@ struct pkvm_mapping {
 	u64 gfn;
 	u64 pfn;
 	u64 nr_pages;
+	bool cacheable;
 	u64 __subtree_last;	/* Internal member for interval tree */
 };
 
diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index 428723b1b0f5..105ab1258066 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -473,6 +473,8 @@ int pkvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
 	mapping->gfn = gfn;
 	mapping->pfn = pfn;
 	mapping->nr_pages = size / PAGE_SIZE;
+	mapping->cacheable = !(prot & (KVM_PGTABLE_PROT_DEVICE |
+				       KVM_PGTABLE_PROT_NORMAL_NC));
 	pkvm_mapping_insert(mapping, &pgt->pkvm_mappings);
 
 	return ret;
@@ -517,9 +519,13 @@ int pkvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
 	struct pkvm_mapping *mapping;
 
 	lockdep_assert_held(&kvm->mmu_lock);
-	for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping)
+	for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping) {
+		if (!mapping->cacheable)
+			continue;
+
 		__clean_dcache_guest_page(pfn_to_kaddr(mapping->pfn),
 					  PAGE_SIZE * mapping->nr_pages);
+	}
 
 	return 0;
 }
-- 
2.53.0




* [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
  2026-06-23 16:03 [PATCH] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Bradley Morgan
@ 2026-06-23 16:37 ` Bradley Morgan
  2026-06-23 16:37   ` [PATCH v2 2/2] KVM: arm64: top up pKVM mapping cache for permission faults Bradley Morgan
  2026-06-23 17:02   ` [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Marc Zyngier
  0 siblings, 2 replies; 6+ messages in thread
From: Bradley Morgan @ 2026-06-23 16:37 UTC
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: Fuad Tabba, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Quentin Perret,
	Vincent Donnefort, linux-arm-kernel, linux-kernel, Bradley Morgan,
	stable

pKVM keeps its own mapping list for stage 2 operations. Its flush path
uses that list directly, so it lost the PTE attribute check done by the
generic stage 2 walker.

Record whether a mapping is cacheable and skip cache maintenance for
mappings that are not cacheable.

Fixes: e912efed485a ("KVM: arm64: Introduce the EL1 pKVM MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
Changes in v2:
- Add patch 2 for the pKVM permission fault mapping cache bug.

 arch/arm64/include/asm/kvm_pkvm.h | 1 +
 arch/arm64/kvm/pkvm.c             | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 74fedd9c5ff0..d9dd8239910d 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -196,6 +196,7 @@ struct pkvm_mapping {
 	u64 gfn;
 	u64 pfn;
 	u64 nr_pages;
+	bool cacheable;
 	u64 __subtree_last;	/* Internal member for interval tree */
 };
 
diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index 428723b1b0f5..105ab1258066 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -473,6 +473,8 @@ int pkvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
 	mapping->gfn = gfn;
 	mapping->pfn = pfn;
 	mapping->nr_pages = size / PAGE_SIZE;
+	mapping->cacheable = !(prot & (KVM_PGTABLE_PROT_DEVICE |
+				       KVM_PGTABLE_PROT_NORMAL_NC));
 	pkvm_mapping_insert(mapping, &pgt->pkvm_mappings);
 
 	return ret;
@@ -517,9 +519,13 @@ int pkvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
 	struct pkvm_mapping *mapping;
 
 	lockdep_assert_held(&kvm->mmu_lock);
-	for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping)
+	for_each_mapping_in_range_safe(pgt, addr, addr + size, mapping) {
+		if (!mapping->cacheable)
+			continue;
+
 		__clean_dcache_guest_page(pfn_to_kaddr(mapping->pfn),
 					  PAGE_SIZE * mapping->nr_pages);
+	}
 
 	return 0;
 }
-- 
2.53.0



* [PATCH v2 2/2] KVM: arm64: top up pKVM mapping cache for permission faults
  2026-06-23 16:37 ` [PATCH v2 1/2] " Bradley Morgan
@ 2026-06-23 16:37   ` Bradley Morgan
  2026-06-23 17:02   ` [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Marc Zyngier
  1 sibling, 0 replies; 6+ messages in thread
From: Bradley Morgan @ 2026-06-23 16:37 UTC
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: Fuad Tabba, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Quentin Perret,
	Vincent Donnefort, linux-arm-kernel, linux-kernel, Bradley Morgan,
	stable

Permission faults normally only relax an existing leaf, so the fault path
does not top up the memcache.

With pKVM, a permission fault can also replace page mappings with a
PMD mapping. That path needs a fresh pkvm_mapping object, and can
dereference a NULL cache->mapping if the cache was not topped up.

Allocate just that object for pKVM permission faults.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/all/20260623161545.EA08E1F000E9@smtp.kernel.org/ [1]

Fixes: db14091d8f75 ("KVM: arm64: Stage-2 huge mappings for np-guests")
Cc: stable@vger.kernel.org
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
Changes in v2:
- New patch.

 arch/arm64/kvm/mmu.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6c941aaa10c6..3f57f6825a33 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1177,17 +1177,26 @@ void free_hyp_memcache(struct kvm_hyp_memcache *mc)
 	__free_hyp_memcache(mc, hyp_mc_free_fn, kvm_host_va, mc);
 }
 
+static int topup_hyp_memcache_mapping(struct kvm_hyp_memcache *mc)
+{
+	if (mc->mapping)
+		return 0;
+
+	mc->mapping = kzalloc_obj(struct pkvm_mapping,
+				  GFP_KERNEL_ACCOUNT);
+	return mc->mapping ? 0 : -ENOMEM;
+}
+
 int topup_hyp_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages)
 {
+	int ret;
+
 	if (!is_protected_kvm_enabled())
 		return 0;
 
-	if (!mc->mapping) {
-		mc->mapping = kzalloc_obj(struct pkvm_mapping,
-					  GFP_KERNEL_ACCOUNT);
-		if (!mc->mapping)
-			return -ENOMEM;
-	}
+	ret = topup_hyp_memcache_mapping(mc);
+	if (ret)
+		return ret;
 
 	return __topup_hyp_memcache(mc, min_pages, hyp_mc_alloc_fn,
 				    kvm_host_pa, mc);
@@ -2113,7 +2122,9 @@ static int user_mem_abort(const struct kvm_s2_fault_desc *s2fd)
 	 * Permission faults just need to update the existing leaf entry,
 	 * and so normally don't require allocations from the memcache. The
 	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
+	 * and a write fault needs to collapse a block entry into a table. With
+	 * pKVM, they may still need a fresh mapping object if the fault turns
+	 * page entries into a block entry.
 	 */
 	memcache = get_mmu_memcache(s2fd->vcpu);
 	if (!perm_fault || (memslot_is_logging(s2fd->memslot) &&
@@ -2121,6 +2132,10 @@ static int user_mem_abort(const struct kvm_s2_fault_desc *s2fd)
 		ret = topup_mmu_memcache(s2fd->vcpu, memcache);
 		if (ret)
 			return ret;
+	} else if (is_protected_kvm_enabled()) {
+		ret = topup_hyp_memcache_mapping(memcache);
+		if (ret)
+			return ret;
 	}
 
 	/*
-- 
2.53.0



* Re: [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
  2026-06-23 16:37 ` [PATCH v2 1/2] " Bradley Morgan
  2026-06-23 16:37   ` [PATCH v2 2/2] KVM: arm64: top up pKVM mapping cache for permission faults Bradley Morgan
@ 2026-06-23 17:02   ` Marc Zyngier
  2026-06-23 17:04     ` Bradley Morgan
  1 sibling, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2026-06-23 17:02 UTC
  To: Bradley Morgan
  Cc: Oliver Upton, kvmarm, Fuad Tabba, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Quentin Perret, Vincent Donnefort, linux-arm-kernel, linux-kernel,
	stable

Bradley,

Just a few things to keep in mind for your next contributions:

- If you are sending more than a single patch, add a cover letter.

- Don't send a v2 in reply to a v1. It messes up the threading we
  rely on, and makes it hard to ignore replies to an older version.
  Always send a new series standalone.

- Don't immediately send a v2, even if (especially if!) a bot is
  pestering you. 34 minutes between versions is way too short (at
  least a few days is the norm).

On Tue, 23 Jun 2026 17:37:55 +0100,
Bradley Morgan <include@grrlz.net> wrote:
> 
> pKVM keeps its own mapping list for stage 2 operations. Its flush path
> uses that list directly, so it lost the PTE attribute check done by the
> generic stage 2 walker.
> 
> Record whether a mapping is cacheable and skip cache maintenance for
> mappings that are not cacheable.
> 
> Fixes: e912efed485a ("KVM: arm64: Introduce the EL1 pKVM MMU")
> Cc: stable@vger.kernel.org

What device memory gets mapped in an upstream pKVM guest that would
require a backport to stable?

> Signed-off-by: Bradley Morgan <include@grrlz.net>
> ---
> Changes in v2:
> - Add patch 2 for the pKVM permission fault mapping cache bug.

This is the sort of information that goes in the cover letter.

> 
>  arch/arm64/include/asm/kvm_pkvm.h | 1 +
>  arch/arm64/kvm/pkvm.c             | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
> index 74fedd9c5ff0..d9dd8239910d 100644
> --- a/arch/arm64/include/asm/kvm_pkvm.h
> +++ b/arch/arm64/include/asm/kvm_pkvm.h
> @@ -196,6 +196,7 @@ struct pkvm_mapping {
>  	u64 gfn;
>  	u64 pfn;
>  	u64 nr_pages;
> +	bool cacheable;

Errr, no. That's a terrible idea.

This thing is already big enough, let's not add a bool right in the
middle (use pahole to find out why this is bad). Given that nr_pages
is for a range, and that the minimum page size uses 12 bits, the
largest number of pages you can have here is 56-12=48 bit wide. That's
another 16 bits worth of flags you can use.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
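
The layout concern raised in the review above can be illustrated with a minimal sketch. It is not the actual kernel change: the two structs below model only the fields visible in the diff context, and every name (mapping_with_bool, mapping_packed, MAPPING_FLAG_CACHEABLE, the helpers) is hypothetical. The 48-bit page count and 16 flag bits are taken from the review as given. The sketch shows that a bool placed before a u64 member forces padding, while a flag stored in the unused high bits of the page count costs no extra space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified stand-ins for struct pkvm_mapping. Only the fields visible in
 * the diff context are modelled; this is an assumption made for the sketch.
 */
struct mapping_with_bool {
	uint64_t gfn;
	uint64_t pfn;
	uint64_t nr_pages;
	bool cacheable;		/* padding follows so the next u64 is aligned */
	uint64_t subtree_last;
};

struct mapping_packed {
	uint64_t gfn;
	uint64_t pfn;
	uint64_t nr_pages_flags;	/* low 48 bits: count, high 16 bits: flags */
	uint64_t subtree_last;
};

/* Hypothetical names, chosen for this sketch only. */
#define MAPPING_NR_PAGES_BITS	48
#define MAPPING_NR_PAGES_MASK	((UINT64_C(1) << MAPPING_NR_PAGES_BITS) - 1)
#define MAPPING_FLAG_CACHEABLE	(UINT64_C(1) << MAPPING_NR_PAGES_BITS)

/* On arm64 (LP64) these sizes are 40 and 32 bytes respectively. */
_Static_assert(sizeof(struct mapping_packed) == 4 * sizeof(uint64_t),
	       "packed variant has no padding");
_Static_assert(sizeof(struct mapping_with_bool) > sizeof(struct mapping_packed),
	       "the bool member makes the struct larger");

static inline void mapping_init(struct mapping_packed *m, uint64_t nr_pages,
				bool cacheable)
{
	/* The count must fit below the flag bits. */
	assert(nr_pages <= MAPPING_NR_PAGES_MASK);
	m->nr_pages_flags = nr_pages | (cacheable ? MAPPING_FLAG_CACHEABLE : 0);
}

static inline uint64_t mapping_nr_pages(const struct mapping_packed *m)
{
	return m->nr_pages_flags & MAPPING_NR_PAGES_MASK;
}

static inline bool mapping_cacheable(const struct mapping_packed *m)
{
	return m->nr_pages_flags & MAPPING_FLAG_CACHEABLE;
}
```

With this kind of encoding, every reader of the page count has to go through an accessor such as mapping_nr_pages(), since the raw field also carries flag bits. Running pahole on the real struct would show the actual holes and total size.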



* Re: [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
  2026-06-23 17:02   ` [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Marc Zyngier
@ 2026-06-23 17:04     ` Bradley Morgan
  2026-06-23 17:13       ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Bradley Morgan @ 2026-06-23 17:04 UTC
  To: Marc Zyngier
  Cc: Oliver Upton, kvmarm, Fuad Tabba, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Quentin Perret, Vincent Donnefort, linux-arm-kernel, linux-kernel,
	stable

On June 23, 2026 6:02:40 PM GMT+01:00, Marc Zyngier <maz@kernel.org> wrote:
>Bradley,
>
>Just a few things to keep in mind for your next contributions:
>
>- If you are sending more than a single patch, add a cover letter.
>
>- Don't send a v2 in reply to a v1. It messes up the threading we
>  rely on, and makes it hard to ignore replies to an older version.
>  Always send a new series standalone.
>
>- Don't immediately send a v2, even if (especially if!) a bot is
>  pestering you. 34 minutes between versions is way too short (at
>  least a few days is the norm).
>
>On Tue, 23 Jun 2026 17:37:55 +0100,
>Bradley Morgan <include@grrlz.net> wrote:
>> 
>> pKVM keeps its own mapping list for stage 2 operations. Its flush path
>> uses that list directly, so it lost the PTE attribute check done by the
>> generic stage 2 walker.
>> 
>> Record whether a mapping is cacheable and skip cache maintenance for
>> mappings that are not cacheable.
>> 
>> Fixes: e912efed485a ("KVM: arm64: Introduce the EL1 pKVM MMU")
>> Cc: stable@vger.kernel.org
>
>What device memory gets mapped in an upstream pKVM guest that would
>require a backport to stable?
>
>> Signed-off-by: Bradley Morgan <include@grrlz.net>
>> ---
>> Changes in v2:
>> - Add patch 2 for the pKVM permission fault mapping cache bug.
>
>This is the sort of information that goes in the cover letter.
>
>> 
>>  arch/arm64/include/asm/kvm_pkvm.h | 1 +
>>  arch/arm64/kvm/pkvm.c             | 8 +++++++-
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
>> index 74fedd9c5ff0..d9dd8239910d 100644
>> --- a/arch/arm64/include/asm/kvm_pkvm.h
>> +++ b/arch/arm64/include/asm/kvm_pkvm.h
>> @@ -196,6 +196,7 @@ struct pkvm_mapping {
>>  	u64 gfn;
>>  	u64 pfn;
>>  	u64 nr_pages;
>> +	bool cacheable;
>
>Errr, no. That's a terrible idea.
>
>This thing is already big enough, let's not add a bool right in the
>middle (use pahole to find out why this is bad). Given that nr_pages
>is for a range, and that the minimum page size uses 12 bits, the
>largest number of pages you can have here is 56-12=48 bit wide. That's
>another 16 bits worth of flags you can use.
>
>Thanks,
>
>	M.
>
>

thanks.

I'll go and do V3 with another sashiko suggestion. I'll fix your path too.
I'll park V3 for a bit.

Thanks!



* Re: [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings
  2026-06-23 17:04     ` Bradley Morgan
@ 2026-06-23 17:13       ` Marc Zyngier
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2026-06-23 17:13 UTC
  To: Bradley Morgan
  Cc: Oliver Upton, kvmarm, Fuad Tabba, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Quentin Perret, Vincent Donnefort, linux-arm-kernel, linux-kernel,
	stable

On Tue, 23 Jun 2026 18:04:07 +0100,
Bradley Morgan <include@grrlz.net> wrote:
> 
> I'll go and do V3 with another sashiko suggestion. I'll fix your path too.

Before you do that, please verify that whatever Sashiko spits out
makes any sense. I'm not convinced by its reply on v1 at all.

	M.

-- 
Without deviation from the norm, progress is not possible.



End of thread (last message: 2026-06-23 17:13 UTC)

Thread overview: 6+ messages
2026-06-23 16:03 [PATCH] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Bradley Morgan
2026-06-23 16:37 ` [PATCH v2 1/2] " Bradley Morgan
2026-06-23 16:37   ` [PATCH v2 2/2] KVM: arm64: top up pKVM mapping cache for permission faults Bradley Morgan
2026-06-23 17:02   ` [PATCH v2 1/2] KVM: arm64: skip pKVM cache flushes for non cacheable mappings Marc Zyngier
2026-06-23 17:04     ` Bradley Morgan
2026-06-23 17:13       ` Marc Zyngier
