* [PATCH v3 1/5] KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR
2026-06-18 23:42 [PATCH v3 0/5] KVM: arm64: nv: Even more VNCR fixes Oliver Upton
@ 2026-06-18 23:42 ` Oliver Upton
2026-06-19 0:07 ` sashiko-bot
2026-06-18 23:42 ` [PATCH v3 2/5] KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN Oliver Upton
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2026-06-18 23:42 UTC (permalink / raw)
To: kvmarm
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Wei-Lin Chang, Steffen Eiden, Oliver Upton, stable
KVM currently maps the L1 VNCR into the host stage-1 by relying entirely
on the permissions of the guest stage-1. At the same time, it is
entirely possible that the backing PFN is read-only (e.g. RO memslot),
meaning that the L1 VNCR should use at most a read-only mapping.
Cache the writability of the PFN in the VNCR TLB and use it to constrain
the resulting fixmap permissions. Promote VNCR permission faults to an
SEA in the case where the guest attempts to write to a read-only
endpoint. Conveniently, this also plugs a page leak found by Sashiko [*]
resulting from the early return for a read-only PFN.
Cc: stable@vger.kernel.org
Fixes: 2a359e072596 ("KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2")
Link: https://lore.kernel.org/kvm/20260608082603.16AEC1F00893@smtp.kernel.org/
Signed-off-by: Oliver Upton <oupton@kernel.org>
---
arch/arm64/kvm/nested.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 3a5571c3c114..903ccabca78c 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -24,6 +24,7 @@ struct vncr_tlb {
struct s1_walk_result wr;
u64 hpa;
+ bool hpa_writable;
/* -1 when not mapped on a CPU */
int cpu;
@@ -1401,7 +1402,7 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
if (!*is_gmem) {
pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
&writable, &page);
- if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
+ if (is_error_noslot_pfn(pfn))
return -EFAULT;
} else {
ret = kvm_gmem_get_pfn(vcpu->kvm, memslot, gfn, &pfn, &page, NULL);
@@ -1410,6 +1411,8 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
write_fault, false, false);
return ret;
}
+
+ writable = !(memslot->flags & KVM_MEM_READONLY);
}
scoped_guard(write_lock, &vcpu->kvm->mmu_lock) {
@@ -1420,28 +1423,41 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
vt->gva = va;
vt->hpa = pfn << PAGE_SHIFT;
+ vt->hpa_writable = writable;
vt->valid = true;
vt->cpu = -1;
kvm_make_request(KVM_REQ_MAP_L1_VNCR_EL2, vcpu);
- kvm_release_faultin_page(vcpu->kvm, page, false, vt->wr.pw);
+ kvm_release_faultin_page(vcpu->kvm, page, false, vt->wr.pw && vt->hpa_writable);
}
- if (vt->wr.pw)
+ if (vt->wr.pw && vt->hpa_writable)
mark_page_dirty(vcpu->kvm, gfn);
return 0;
}
-static void inject_vncr_perm(struct kvm_vcpu *vcpu)
+static void handle_vncr_perm(struct kvm_vcpu *vcpu)
{
struct vncr_tlb *vt = vcpu->arch.vncr_tlb;
u64 esr = kvm_vcpu_get_esr(vcpu);
+ u64 fsc;
+
+ /*
+ * Promote to an external abort if the stage-1 permits writes but the
+ * HPA is read-only (e.g. RO memslot).
+ */
+ if (kvm_is_write_fault(vcpu) && vt->wr.pw && !vt->hpa_writable)
+ fsc = ESR_ELx_FSC_EXTABT;
+ /*
+ * Otherwise, inject a permission fault using the guest's translation
+ * level rather than the host's.
+ */
+ else
+ fsc = ESR_ELx_FSC_PERM_L(vt->wr.level);
- /* Adjust the fault level to reflect that of the guest's */
esr &= ~ESR_ELx_FSC;
- esr |= FIELD_PREP(ESR_ELx_FSC,
- ESR_ELx_FSC_PERM_L(vt->wr.level));
+ esr |= FIELD_PREP(ESR_ELx_FSC, fsc);
kvm_inject_nested_sync(vcpu, esr);
}
@@ -1475,7 +1491,7 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
return kvm_handle_guest_sea(vcpu);
if (esr_fsc_is_permission_fault(esr)) {
- inject_vncr_perm(vcpu);
+ handle_vncr_perm(vcpu);
} else if (esr_fsc_is_translation_fault(esr)) {
bool valid, is_gmem = false;
int ret;
@@ -1523,7 +1539,7 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
break;
case -EPERM:
/* Hack to deal with POE until we get kernel support */
- inject_vncr_perm(vcpu);
+ handle_vncr_perm(vcpu);
break;
case 0:
break;
@@ -1567,7 +1583,7 @@ static void kvm_map_l1_vncr(struct kvm_vcpu *vcpu)
vt->cpu = smp_processor_id();
- if (vt->wr.pw && vt->wr.pr)
+ if (vt->hpa_writable && vt->wr.pw && vt->wr.pr)
prot = PAGE_KERNEL;
else if (vt->wr.pr)
prot = PAGE_KERNEL_RO;
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3 1/5] KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR
2026-06-18 23:42 ` [PATCH v3 1/5] KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR Oliver Upton
@ 2026-06-19 0:07 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-19 0:07 UTC (permalink / raw)
To: Oliver Upton; +Cc: Marc Zyngier, kvmarm
Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
Pre-existing issues:
- [High] KVM injects an External Abort instead of a Permission Fault when a guest accesses a read-only memslot via VNCR with guest stage-1 write protections.
--
commit be3537f8a87d3f507b2b54b34084b37b1630155a
Author: Oliver Upton <oupton@kernel.org>
KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR
This commit updates KVM to cache the writability of the PFN in the VNCR
TLB to constrain the resulting fixmap permissions. It also promotes VNCR
permission faults to a Synchronous External Abort (SEA) when the guest
attempts to write to a read-only endpoint.
Link: https://lore.kernel.org/kvm/20260608082603.16AEC1F00893@smtp.kernel.org/
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 3a5571c3c1142..903ccabca78c5 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
[ ... ]
> @@ -1401,7 +1402,7 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
> if (!*is_gmem) {
> pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
> &writable, &page);
> - if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
> + if (is_error_noslot_pfn(pfn))
> return -EFAULT;
[Severity: High]
This is a pre-existing issue, but does this leak host memory configuration
to the guest by incorrectly prioritizing an External Abort over a Permission
Fault?
When a guest takes a VNCR write fault, this code unconditionally requests
FOLL_WRITE from __kvm_faultin_pfn() without checking if the guest's
stage-1 page tables actually permit writes (vt->wr.pw).
If the backing host memslot is read-only, __kvm_faultin_pfn() fails and KVM
returns -EFAULT, which ultimately causes KVM to inject a guest External Abort
(ESR_ELx_FSC_EXTABT). Architecturally, if the guest's stage-1 translation
lacks write permission, the guest should receive a Stage-1 Permission Fault
regardless of the host's physical memory permissions.
Could this cause unexpected guest panics if the guest OS relies on handling
Permission Faults gracefully before the host RO mapping is enforced?
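
To make the ordering question concrete, here is a minimal user-space sketch of
the FSC selection that handle_vncr_perm() performs in this patch. The FSC_*
macros are local copies of the architectural DFSC encodings and
vncr_perm_esr() is a hypothetical helper name; neither is kernel code, and the
real function injects the exception rather than returning a value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the DFSC encodings, for illustration only. */
#define FSC_MASK	0x3FULL
#define FSC_PERM_L(n)	(0x0CULL + (n))	/* permission fault, level n */
#define FSC_EXTABT	0x10ULL		/* synchronous external abort */

/*
 * Hypothetical model of the FSC choice in handle_vncr_perm(): report an
 * external abort only when the guest stage-1 permits the write but the
 * backing HPA is read-only. Every other case is a permission fault at the
 * guest's own translation level.
 */
static uint64_t vncr_perm_esr(uint64_t esr, bool write_fault, bool s1_pw,
			      bool hpa_writable, unsigned int s1_level)
{
	uint64_t fsc;

	if (write_fault && s1_pw && !hpa_writable)
		fsc = FSC_EXTABT;
	else
		fsc = FSC_PERM_L(s1_level);

	/* Replace only the FSC field, leaving the rest of the ESR intact. */
	return (esr & ~FSC_MASK) | fsc;
}
```

In this model a write fault against a stage-1 mapping without write permission
always yields a permission fault. The external abort described above would
have to come from the earlier __kvm_faultin_pfn() failure path, which the
sketch does not cover.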
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618234207.1063941-1-oupton@kernel.org?part=1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v3 2/5] KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN
2026-06-18 23:42 [PATCH v3 0/5] KVM: arm64: nv: Even more VNCR fixes Oliver Upton
2026-06-18 23:42 ` [PATCH v3 1/5] KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR Oliver Upton
@ 2026-06-18 23:42 ` Oliver Upton
2026-06-18 23:57 ` sashiko-bot
2026-06-18 23:42 ` [PATCH v3 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort Oliver Upton
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2026-06-18 23:42 UTC (permalink / raw)
To: kvmarm
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Wei-Lin Chang, Steffen Eiden, Oliver Upton, stable
kvm_handle_vncr_abort() assumes that s1_walk_result conveys an abort
when kvm_translate_vncr() returns -EFAULT. This is not always the case
as it's possible to encounter 'late' failures on the output of S1
translation, e.g. a GFN outside of the memslots.
Fix it by preparing an external abort before returning from
kvm_translate_vncr(). Get rid of the BUG_ON() in the fault injection
path while at it.
Cc: stable@vger.kernel.org
Fixes: 2a359e072596 ("KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2")
Signed-off-by: Oliver Upton <oupton@kernel.org>
---
arch/arm64/include/asm/kvm_nested.h | 8 ++++++++
arch/arm64/kvm/at.c | 8 --------
arch/arm64/kvm/nested.c | 10 ++++++----
3 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
index dc2957662ff2..cbdaaa2a2903 100644
--- a/arch/arm64/include/asm/kvm_nested.h
+++ b/arch/arm64/include/asm/kvm_nested.h
@@ -388,6 +388,14 @@ struct s1_walk_result {
bool failed;
};
+static inline void fail_s1_walk(struct s1_walk_result *wr, u8 fst, bool s1ptw)
+{
+ wr->fst = fst;
+ wr->ptw = s1ptw;
+ wr->s2 = s1ptw;
+ wr->failed = true;
+}
+
int __kvm_translate_va(struct kvm_vcpu *vcpu, struct s1_walk_info *wi,
struct s1_walk_result *wr, u64 va);
int __kvm_find_s1_desc_level(struct kvm_vcpu *vcpu, u64 va, u64 ipa,
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index 30e6fa8ac07c..8263c648207b 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -11,14 +11,6 @@
#include <asm/kvm_mmu.h>
#include <asm/lsui.h>
-static void fail_s1_walk(struct s1_walk_result *wr, u8 fst, bool s1ptw)
-{
- wr->fst = fst;
- wr->ptw = s1ptw;
- wr->s2 = s1ptw;
- wr->failed = true;
-}
-
#define S1_MMU_DISABLED (-127)
static int get_ia_size(struct s1_walk_info *wi)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 903ccabca78c..53dea9c3f14f 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1395,15 +1395,19 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
gfn = vt->wr.pa >> PAGE_SHIFT;
memslot = gfn_to_memslot(vcpu->kvm, gfn);
- if (!memslot)
+ if (!memslot) {
+ fail_s1_walk(&vt->wr, ESR_ELx_FSC_EXTABT, false);
return -EFAULT;
+ }
*is_gmem = kvm_slot_has_gmem(memslot);
if (!*is_gmem) {
pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
&writable, &page);
- if (is_error_noslot_pfn(pfn))
+ if (is_error_noslot_pfn(pfn)) {
+ fail_s1_walk(&vt->wr, ESR_ELx_FSC_EXTABT, false);
return -EFAULT;
+ }
} else {
ret = kvm_gmem_get_pfn(vcpu->kvm, memslot, gfn, &pfn, &page, NULL);
if (ret) {
@@ -1530,8 +1534,6 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
* Translation failed, inject the corresponding
* exception back to EL2.
*/
- BUG_ON(!vt->wr.failed);
-
esr &= ~ESR_ELx_FSC;
esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst);
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3 2/5] KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN
2026-06-18 23:42 ` [PATCH v3 2/5] KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN Oliver Upton
@ 2026-06-18 23:57 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-18 23:57 UTC (permalink / raw)
To: Oliver Upton; +Cc: Marc Zyngier, kvmarm
Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
New issues:
- [High] Unhandled guest_memfd errors ignore prepared memory fault exits and inject malformed guest exceptions.
Pre-existing issues:
- [High] Host-side PFN resolution errors (e.g., SIGPENDING, HWPOISON) are silently converted into guest Synchronous External Aborts (SEA).
--
commit 68bfda7f1617f64f7121c49b477f4cf4e33b2e10
Author: Oliver Upton <oupton@kernel.org>
KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN
This commit updates kvm_translate_vncr() to prepare an external abort
when failing to resolve a PFN instead of returning -EFAULT and relying
on kvm_handle_vncr_abort() to BUG_ON() or correctly guess the failure.
It drops the BUG_ON() in the abort handler to safely inject the SEA.
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 903ccabca78c5..53dea9c3f14f8 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1395,15 +1395,19 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
>
> gfn = vt->wr.pa >> PAGE_SHIFT;
> memslot = gfn_to_memslot(vcpu->kvm, gfn);
> - if (!memslot)
> + if (!memslot) {
> + fail_s1_walk(&vt->wr, ESR_ELx_FSC_EXTABT, false);
> return -EFAULT;
> + }
>
> *is_gmem = kvm_slot_has_gmem(memslot);
> if (!*is_gmem) {
> pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
> &writable, &page);
> - if (is_error_noslot_pfn(pfn))
> + if (is_error_noslot_pfn(pfn)) {
> + fail_s1_walk(&vt->wr, ESR_ELx_FSC_EXTABT, false);
> return -EFAULT;
> + }
[Severity: High]
This is a pre-existing issue, but does this improperly convert host PFN
resolution errors into guest aborts?
If __kvm_faultin_pfn() encounters a pending host signal or poisoned memory,
it can return KVM_PFN_ERR_SIGPENDING or KVM_PFN_ERR_HWPOISON. These are
caught by is_error_noslot_pfn().
By calling fail_s1_walk() and returning -EFAULT here, kvm_handle_vncr_abort()
will inject a Synchronous External Abort (SEA) into the guest instead of
returning to userspace. Will this cause a simple host signal to randomly
crash the VM and break host HWPOISON recovery?
[ ... ]
> @@ -1530,8 +1534,6 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
> * Translation failed, inject the corresponding
> * exception back to EL2.
> */
> - BUG_ON(!vt->wr.failed);
> -
[Severity: High]
Does removing this BUG_ON() unmask an issue when kvm_gmem_get_pfn() fails
with -EINVAL, -ENOENT, or -EACCES?
In kvm_translate_vncr(), if kvm_gmem_get_pfn() fails, it prepares a memory
fault exit for userspace and returns the error code without calling
fail_s1_walk().
Because kvm_handle_vncr_abort() lacks an 'if (is_gmem) return 0;' check for
-EINVAL, -ENOENT, and -EACCES, it will fall through to the exception injection
path below. Since vt->wr.fst is uninitialized (0), won't KVM ignore the
prepared memory fault exit and inject a malformed exception (FSC=0) into the
guest?
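
The fall-through being questioned can be seen in a stand-alone model of the
switch in kvm_handle_vncr_abort(). This is only a sketch: enum vncr_action and
vncr_dispatch() are hypothetical names, and the three outcomes stand in for
"return 1", "return 0" and the kvm_inject_nested_sync() path respectively.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, defined here only if libc lacks it */
#endif

/* Hypothetical labels for the three outcomes of the handler. */
enum vncr_action {
	VNCR_RESUME_GUEST,	/* return 1 without injecting anything */
	VNCR_EXIT_TO_USERSPACE,	/* return 0, e.g. with a memory fault exit */
	VNCR_INJECT_ABORT,	/* inject an abort using vt->wr.fst */
};

/* Mirrors the case layout of the switch on kvm_translate_vncr()'s result. */
static enum vncr_action vncr_dispatch(int ret, bool is_gmem)
{
	switch (ret) {
	case -EAGAIN:
		return VNCR_RESUME_GUEST;
	case -ENOMEM:
		return is_gmem ? VNCR_EXIT_TO_USERSPACE : VNCR_RESUME_GUEST;
	case -EFAULT:
	case -EIO:
	case -EHWPOISON:
		if (is_gmem)
			return VNCR_EXIT_TO_USERSPACE;
		/* fall through */
	case -EINVAL:
	case -ENOENT:
	case -EACCES:
		/* Note: no is_gmem check guards these three codes. */
		return VNCR_INJECT_ABORT;
	default:
		return VNCR_RESUME_GUEST;
	}
}
```

With is_gmem set, -EFAULT exits to userspace in this model while -EINVAL takes
the injection path, which is the asymmetry described above.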
> esr &= ~ESR_ELx_FSC;
> esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst);
>
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618234207.1063941-1-oupton@kernel.org?part=2
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v3 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort
2026-06-18 23:42 [PATCH v3 0/5] KVM: arm64: nv: Even more VNCR fixes Oliver Upton
2026-06-18 23:42 ` [PATCH v3 1/5] KVM: arm64: nv: Respect read-only PFN when mapping L1 VNCR Oliver Upton
2026-06-18 23:42 ` [PATCH v3 2/5] KVM: arm64: nv: Inject SEA if kvm_translate_vncr() can't resolve PFN Oliver Upton
@ 2026-06-18 23:42 ` Oliver Upton
2026-06-19 0:00 ` sashiko-bot
2026-06-18 23:42 ` [PATCH v3 4/5] KVM: arm64: nv: Inject SEA if guest VNCR isn't normal memory Oliver Upton
2026-06-18 23:42 ` [PATCH v3 5/5] KVM: arm64: nv: Mark VM as bugged for unexpected VNCR abort Oliver Upton
4 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2026-06-18 23:42 UTC (permalink / raw)
To: kvmarm
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Wei-Lin Chang, Steffen Eiden, Oliver Upton, stable, Sashiko
KVM faults in the VNCR page with FOLL_WRITE whenever the guest aborts
for a write, similar to how a regular stage-2 mapping is handled. It is
entirely possible that the guest reads from the VNCR before writing to
it, in which case the PFN could only be read-only.
Invalidate the VNCR TLB and re-fetch the translation upon taking a VNCR
abort, allowing the host mapping to be faulted in for write the second
time around. Interestingly enough, this also satisfies the ordering
requirements of FEAT_ETS2/3 between descriptor updates and MMU faults.
Cc: stable@vger.kernel.org
Fixes: 2a359e072596 ("KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Oliver Upton <oupton@kernel.org>
---
arch/arm64/kvm/nested.c | 111 +++++++++++++++-------------------------
1 file changed, 42 insertions(+), 69 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 53dea9c3f14f..7fffd86eee94 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1466,88 +1466,61 @@ static void handle_vncr_perm(struct kvm_vcpu *vcpu)
kvm_inject_nested_sync(vcpu, esr);
}
-static bool kvm_vncr_tlb_lookup(struct kvm_vcpu *vcpu)
-{
- struct vncr_tlb *vt = vcpu->arch.vncr_tlb;
-
- lockdep_assert_held_read(&vcpu->kvm->mmu_lock);
-
- if (!vt->valid)
- return false;
-
- if (read_vncr_el2(vcpu) != vt->gva)
- return false;
-
- if (vt->wr.nG)
- return get_asid_by_regime(vcpu, TR_EL20) == vt->wr.asid;
-
- return true;
-}
-
int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
{
struct vncr_tlb *vt = vcpu->arch.vncr_tlb;
u64 esr = kvm_vcpu_get_esr(vcpu);
+ bool is_gmem = false;
+ bool perm;
+ int ret;
WARN_ON_ONCE(!(esr & ESR_ELx_VNCR));
if (kvm_vcpu_abt_issea(vcpu))
return kvm_handle_guest_sea(vcpu);
- if (esr_fsc_is_permission_fault(esr)) {
- handle_vncr_perm(vcpu);
- } else if (esr_fsc_is_translation_fault(esr)) {
- bool valid, is_gmem = false;
- int ret;
-
- scoped_guard(read_lock, &vcpu->kvm->mmu_lock)
- valid = kvm_vncr_tlb_lookup(vcpu);
-
- if (!valid)
- ret = kvm_translate_vncr(vcpu, &is_gmem);
- else
- ret = -EPERM;
+ if (!esr_fsc_is_translation_fault(esr) && !esr_fsc_is_permission_fault(esr)) {
+ WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr);
+ return 1;
+ }
- switch (ret) {
- case -EAGAIN:
- /* Let's try again... */
- break;
- case -ENOMEM:
- /*
- * For guest_memfd, this indicates that it failed to
- * create a folio to back the memory. Inform userspace.
- */
- if (is_gmem)
- return 0;
- /* Otherwise, let's try again... */
- break;
- case -EFAULT:
- case -EIO:
- case -EHWPOISON:
- if (is_gmem)
- return 0;
- fallthrough;
- case -EINVAL:
- case -ENOENT:
- case -EACCES:
- /*
- * Translation failed, inject the corresponding
- * exception back to EL2.
- */
- esr &= ~ESR_ELx_FSC;
- esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst);
+ ret = kvm_translate_vncr(vcpu, &is_gmem);
+ switch (ret) {
+ case -EAGAIN:
+ /* Let's try again... */
+ return 1;
+ case -ENOMEM:
+ /*
+ * For guest_memfd, this indicates that it failed to
+ * create a folio to back the memory. Inform userspace.
+ */
+ if (is_gmem)
+ return 0;
+ /* Otherwise, let's try again... */
+ break;
+ case -EFAULT:
+ case -EIO:
+ case -EHWPOISON:
+ if (is_gmem)
+ return 0;
+ fallthrough;
+ case -EINVAL:
+ case -ENOENT:
+ case -EACCES:
+ /*
+ * Translation failed, inject the corresponding
+ * exception back to EL2.
+ */
+ esr &= ~ESR_ELx_FSC;
+ esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst);
- kvm_inject_nested_sync(vcpu, esr);
- break;
- case -EPERM:
- /* Hack to deal with POE until we get kernel support */
+ kvm_inject_nested_sync(vcpu, esr);
+ break;
+ case 0:
+ perm = kvm_is_write_fault(vcpu) ? vt->wr.pw && vt->hpa_writable : vt->wr.pr;
+ if (!perm)
handle_vncr_perm(vcpu);
- break;
- case 0:
- break;
- }
- } else {
- WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr);
+ break;
}
return 1;
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort
2026-06-18 23:42 ` [PATCH v3 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort Oliver Upton
@ 2026-06-19 0:00 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-19 0:00 UTC (permalink / raw)
To: Oliver Upton; +Cc: Marc Zyngier, kvmarm
Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:
New issues:
- [High] Guest Stage-1 permission faults on VNCR are incorrectly promoted to External Aborts when backed by a read-only memslot.
- [High] Removing the `kvm_vncr_tlb_lookup()` check introduces an infinite host CPU livelock for Write-Only guest VNCR mappings.
Pre-existing issues:
- [High] If `kvm_gmem_get_pfn()` returns `-EINVAL`, KVM drops the prepared userspace memory fault exit and injects a bogus nested exception.
--
commit 51cca6668ae197c096ae8a70f22546dc1a8163be
Author: Oliver Upton <oupton@kernel.org>
KVM: arm64: nv: Re-translate VNCR before injecting abort
KVM faults in the VNCR page with FOLL_WRITE whenever the guest aborts
for a write, similar to how a regular stage-2 mapping is handled. It is
entirely possible that the guest reads from the VNCR before writing to
it, in which case the PFN could only be read-only.
Invalidate the VNCR TLB and re-fetch the translation upon taking a VNCR
abort, allowing the host mapping to be faulted in for write the second
time around. Interestingly enough, this also satisfies the ordering
requirements of FEAT_ETS2/3 between descriptor updates and MMU faults.
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 53dea9c3f14f8..7fffd86eee944 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
[ ... ]
> @@ -1466,88 +1466,61 @@ static void handle_vncr_perm(struct kvm_vcpu *vcpu)
[ ... ]
> + ret = kvm_translate_vncr(vcpu, &is_gmem);
> + switch (ret) {
> + case -EAGAIN:
[ ... ]
> + case -EFAULT:
> + case -EIO:
> + case -EHWPOISON:
> + if (is_gmem)
> + return 0;
> + fallthrough;
> + case -EINVAL:
> + case -ENOENT:
> + case -EACCES:
> + /*
> + * Translation failed, inject the corresponding
> + * exception back to EL2.
> + */
> + esr &= ~ESR_ELx_FSC;
> + esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst);
>
> - kvm_inject_nested_sync(vcpu, esr);
> - break;
> - case -EPERM:
> - /* Hack to deal with POE until we get kernel support */
> + kvm_inject_nested_sync(vcpu, esr);
> + break;
[Severity: High]
This is a pre-existing issue, but if kvm_gmem_get_pfn() fails and returns
-EINVAL, -ENOENT, or -EACCES, does this branch bypass the userspace exit?
If those errors are returned, they skip the is_gmem check because they bypass
the fallthrough block above. Instead of returning to userspace, KVM handles
them by injecting a nested exception back to the guest using an uninitialized
or stale value in vt->wr.fst. Does this destroy the memory fault exit state
and inject a bogus exception?
[ ... ]
> - handle_vncr_perm(vcpu);
> - break;
> - case 0:
> - break;
> - }
> - } else {
> - WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr);
> + case 0:
> + perm = kvm_is_write_fault(vcpu) ? vt->wr.pw && vt->hpa_writable : vt->wr.pr;
> + if (!perm)
> handle_vncr_perm(vcpu);
> + break;
> }
>
> return 1;
[Severity: High]
Does removing the -EPERM loop breaker introduce a host CPU livelock for
Write-Only guest VNCR mappings?
If the guest configures a Write-Only mapping, kvm_map_l1_vncr() falls back
to mapping it as PAGE_NONE. When the hardware accesses the fixmap, it triggers
a Data Abort. KVM then traps and re-translates the mapping.
Because vt->wr.pw is true, the perm check evaluates to true, and KVM returns 1
to resume the guest instead of injecting a fault. The hardware will then fault
again indefinitely.
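
The suspected mismatch can be reproduced in a small user-space model. This is
a sketch under assumptions, not kernel code: struct vncr_walk, enum
fixmap_prot and the helper names are hypothetical, the protection selection
follows the kvm_map_l1_vncr() hunk from patch 1, and the PAGE_NONE fallback is
taken from the description above rather than from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PAGE_NONE, PAGE_KERNEL_RO and PAGE_KERNEL. */
enum fixmap_prot { FIXMAP_NONE, FIXMAP_RO, FIXMAP_RW };

/* The subset of the cached VNCR translation that matters here. */
struct vncr_walk {
	bool pr;		/* guest stage-1 permits reads */
	bool pw;		/* guest stage-1 permits writes */
	bool hpa_writable;	/* backing PFN is writable */
};

/* Protection chosen for the host fixmap mapping. */
static enum fixmap_prot vncr_fixmap_prot(const struct vncr_walk *w)
{
	if (w->hpa_writable && w->pw && w->pr)
		return FIXMAP_RW;
	if (w->pr)
		return FIXMAP_RO;
	return FIXMAP_NONE;
}

/* The "perm" check from the new "case 0" of the abort handler. */
static bool vncr_access_permitted(const struct vncr_walk *w, bool write_fault)
{
	return write_fault ? (w->pw && w->hpa_writable) : w->pr;
}

/* Whether the fixmap mapping lets the access complete in hardware. */
static bool fixmap_allows(enum fixmap_prot prot, bool write)
{
	if (prot == FIXMAP_RW)
		return true;
	return prot == FIXMAP_RO && !write;
}

/*
 * True when the handler would resume the guest without injecting a fault
 * even though the fixmap mapping still rejects the same access.
 */
static bool vncr_would_refault(const struct vncr_walk *w, bool write_fault)
{
	return vncr_access_permitted(w, write_fault) &&
	       !fixmap_allows(vncr_fixmap_prot(w), write_fault);
}
```

In this model only the write-only combination (pr clear, pw and hpa_writable
set) makes vncr_would_refault() return true for a write.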
[Severity: High]
Also, does calling kvm_translate_vncr() unconditionally request FOLL_WRITE
on a write fault without verifying if the guest's own Stage-1 mapping
actually permits writes?
In kvm_translate_vncr(), __kvm_faultin_pfn() is called with FOLL_WRITE if it's
a write fault. If the host memslot happens to be read-only, __kvm_faultin_pfn()
fails and returns -EFAULT.
This error bubbles up to the switch statement above, which injects an External
Abort into the guest. Should the guest instead receive a Stage-1 Permission
Fault if its own Stage-1 translation does not permit writes, regardless of the
memslot permissions?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618234207.1063941-1-oupton@kernel.org?part=3
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v3 4/5] KVM: arm64: nv: Inject SEA if guest VNCR isn't normal memory
2026-06-18 23:42 [PATCH v3 0/5] KVM: arm64: nv: Even more VNCR fixes Oliver Upton
` (2 preceding siblings ...)
2026-06-18 23:42 ` [PATCH v3 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort Oliver Upton
@ 2026-06-18 23:42 ` Oliver Upton
2026-06-18 23:42 ` [PATCH v3 5/5] KVM: arm64: nv: Mark VM as bugged for unexpected VNCR abort Oliver Upton
4 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2026-06-18 23:42 UTC (permalink / raw)
To: kvmarm
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Wei-Lin Chang, Steffen Eiden, Oliver Upton, stable
When constructing an L1 VNCR mapping, KVM unconditionally uses cacheable
memory attributes, even if the underlying PFN isn't memory. This gets
particularly hairy if the endpoint doesn't support cacheable memory
attributes, potentially throwing an SError on writeback...
While KVM does permit cacheable memory attributes on certain PFNMAP
VMAs, kvm_translate_vncr() isn't currently grabbing the VMA. So do the
simpler thing for now and just reject everything that isn't memory.
Cc: stable@vger.kernel.org
Fixes: 2a359e072596 ("KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2")
Signed-off-by: Oliver Upton <oupton@kernel.org>
---
arch/arm64/kvm/nested.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 7fffd86eee94..d4c9a9b05e3f 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1419,6 +1419,17 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
writable = !(memslot->flags & KVM_MEM_READONLY);
}
+ /*
+ * FIXME: This check is too restrictive as KVM allows cacheable memory
+ * attributes for PFNMAP VMAs that have cacheable attributes in host
+ * stage-1.
+ */
+ if (!pfn_is_map_memory(pfn)) {
+ kvm_release_faultin_page(vcpu->kvm, page, true, false);
+ fail_s1_walk(&vt->wr, ESR_ELx_FSC_EXTABT, false);
+ return -EINVAL;
+ }
+
scoped_guard(write_lock, &vcpu->kvm->mmu_lock) {
if (mmu_invalidate_retry(vcpu->kvm, mmu_seq)) {
kvm_release_faultin_page(vcpu->kvm, page, true, false);
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 5/5] KVM: arm64: nv: Mark VM as bugged for unexpected VNCR abort
2026-06-18 23:42 [PATCH v3 0/5] KVM: arm64: nv: Even more VNCR fixes Oliver Upton
` (3 preceding siblings ...)
2026-06-18 23:42 ` [PATCH v3 4/5] KVM: arm64: nv: Inject SEA if guest VNCR isn't normal memory Oliver Upton
@ 2026-06-18 23:42 ` Oliver Upton
4 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2026-06-18 23:42 UTC (permalink / raw)
To: kvmarm
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Wei-Lin Chang, Steffen Eiden, Oliver Upton
KVM is unlikely to resolve an unexpected VNCR abort, meaning that
returning to the guest will likely leave the vCPU stuck in an abort
loop. Bug the VM and exit to userspace instead.
Signed-off-by: Oliver Upton <oupton@kernel.org>
---
arch/arm64/kvm/nested.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index d4c9a9b05e3f..94df26de6990 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1491,8 +1491,8 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
return kvm_handle_guest_sea(vcpu);
if (!esr_fsc_is_translation_fault(esr) && !esr_fsc_is_permission_fault(esr)) {
- WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr);
- return 1;
+ KVM_BUG(1, vcpu->kvm, "Unhandled VNCR abort, ESR=%llx\n", esr);
+ return -EIO;
}
ret = kvm_translate_vncr(vcpu, &is_gmem);
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread