* [PATCH v2 0/2] KVM: arm64: nv: Fix permission checks for S1PTW faults @ 2026-06-24 20:24 Oliver Upton 2026-06-24 20:24 ` [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set Oliver Upton 2026-06-24 20:24 ` [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially Oliver Upton 0 siblings, 2 replies; 7+ messages in thread From: Oliver Upton @ 2026-06-24 20:24 UTC (permalink / raw) To: kvmarm Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Wei-Lin Chang, Steffen Eiden, Oliver Upton Small series that papers over arch ambiguity around S1PTW permission faults. As kvm_s2_handle_perm_fault() wasn't checking for S1PTW instruction aborts, it was incorrectly evaluating the execute permissions to decide where to send the fault. Fixing that uncovers another issue in that kvm_is_write_fault() assumes any S1PTW permission fault was due to write. Nested screws this up since an L1 hypervisor could use write-only permissions at stage-2. We end up papering over architecture ambiguity by potentially evaluating *both* read and write permissions for S1PTW, assuming any fault with HA set to require write permission (in addition to read). Applies to kvmarm/fixes. v1: https://lore.kernel.org/kvmarm/20260623211310.1529760-1-oupton@kernel.org/ Oliver Upton (2): KVM: arm64: Only consider S1PTW a write fault if HA is set KVM: arm64: nv: Treat S1PTW permission faults specially arch/arm64/include/asm/kvm_emulate.h | 22 +++++---------- arch/arm64/include/asm/kvm_nested.h | 2 ++ arch/arm64/kvm/at.c | 42 +++++++++++++++++++++------- arch/arm64/kvm/nested.c | 20 +++++++++++-- 4 files changed, 58 insertions(+), 28 deletions(-) base-commit: d098bb75d14fde2f12155f1a95ec0168160867ce -- 2.47.3 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set 2026-06-24 20:24 [PATCH v2 0/2] KVM: arm64: nv: Fix permission checks for S1PTW faults Oliver Upton @ 2026-06-24 20:24 ` Oliver Upton 2026-06-24 20:40 ` sashiko-bot 2026-06-24 20:24 ` [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially Oliver Upton 1 sibling, 1 reply; 7+ messages in thread From: Oliver Upton @ 2026-06-24 20:24 UTC (permalink / raw) To: kvmarm Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Wei-Lin Chang, Steffen Eiden, Oliver Upton, Sashiko In yet another example where the architecture is awesome, S1PTW faults may not have a valid ESR_ELx.WnR. kvm_is_write_fault() worked around this by relying on a KVM implementation detail that canonical stage-2 translations must have at least read-only permissions. That assumption no longer holds for nested virt where an L1 hypervisor could construct write-only mappings that propagate to the shadow stage-2. Since there's no exact science to this, assume that the S1PTW fault was for write if HA is enabled at stage-1. Fixes: fd276e71d1e7 ("KVM: arm64: nv: Handle shadow stage 2 page faults") Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/kvmarm/20260623213225.A89CF1F000E9@smtp.kernel.org/ Signed-off-by: Oliver Upton <oupton@kernel.org> --- arch/arm64/include/asm/kvm_emulate.h | 22 +++++---------- arch/arm64/include/asm/kvm_nested.h | 2 ++ arch/arm64/kvm/at.c | 42 +++++++++++++++++++++------- 3 files changed, 41 insertions(+), 25 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 5bf3d7e1d92c..8e208ce2597e 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -479,21 +479,13 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu) { - if (kvm_vcpu_abt_iss1tw(vcpu)) { - /* - * Only a permission fault on a S1PTW should be - * considered as a write. Otherwise, page tables baked - * in a read-only memslot will result in an exception - * being delivered in the guest. - * - * The drawback is that we end-up faulting twice if the - * guest is using any of HW AF/DB: a translation fault - * to map the page containing the PT (read only at - * first), then a permission fault to allow the flags - * to be set. - */ - return kvm_vcpu_trap_is_permission_fault(vcpu); - } + /* + * The architecture sucks; assume that the S1PTW fetched for write if + * HA is enabled at stage-1. Note that hardware updates to dirty state + * and table AF are predicated on HA=1 (DDI0487 M.a D24.2.194; R_SNVTX). + */ + if (kvm_vcpu_abt_iss1tw(vcpu)) + return effective_tcr_ha(vcpu); if (kvm_vcpu_trap_is_iabt(vcpu)) return false; diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index cbdaaa2a2903..e9f48f94a77f 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -417,4 +417,6 @@ u16 get_asid_by_regime(struct kvm_vcpu *vcpu, enum trans_regime regime); int __kvm_at_swap_desc(struct kvm *kvm, gpa_t ipa, u64 old, u64 new); +bool effective_tcr_ha(struct kvm_vcpu *vcpu); + #endif /* __ARM64_KVM_NESTED_H */ diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c index 8263c648207b..91154654210e 100644 --- a/arch/arm64/kvm/at.c +++ b/arch/arm64/kvm/at.c @@ -219,6 +219,36 @@ static unsigned int tcr_tg_pgshift(struct kvm *kvm, u64 tcr, bool upper_range) return shift; } +static bool __effective_tcr_ha(struct kvm_vcpu *vcpu, enum trans_regime regime) +{ + if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR1_EL1, HAFDBS, AF)) + return false; + + switch (regime) { + case TR_EL10: + return vcpu_read_sys_reg(vcpu, TCR_EL1) & TCR_HA; + case TR_EL20: + return vcpu_read_sys_reg(vcpu, TCR_EL2) & TCR_HA; + case TR_EL2: + return vcpu_read_sys_reg(vcpu, TCR_EL2) & TCR_EL2_HA; + default: + BUG(); + } +} + +static enum trans_regime vcpu_trans_regime(struct kvm_vcpu *vcpu) +{ + if (!is_hyp_ctxt(vcpu)) + return TR_EL10; + + return vcpu_el2_e2h_is_set(vcpu) ? TR_EL20 : TR_EL2; +} + +bool effective_tcr_ha(struct kvm_vcpu *vcpu) +{ + return __effective_tcr_ha(vcpu, vcpu_trans_regime(vcpu)); +} + static int setup_s1_walk(struct kvm_vcpu *vcpu, struct s1_walk_info *wi, struct s1_walk_result *wr, u64 va) { @@ -407,12 +437,7 @@ static int setup_s1_walk(struct kvm_vcpu *vcpu, struct s1_walk_info *wi, goto addrsz; wi->baddr &= GENMASK_ULL(wi->max_oa_bits - 1, x); - - wi->ha = kvm_has_feat(vcpu->kvm, ID_AA64MMFR1_EL1, HAFDBS, AF); - wi->ha &= (wi->regime == TR_EL2 ? - FIELD_GET(TCR_EL2_HA, tcr) : - FIELD_GET(TCR_HA, tcr)); - + wi->ha = __effective_tcr_ha(vcpu, wi->regime); return 0; addrsz: @@ -1723,10 +1748,7 @@ int __kvm_find_s1_desc_level(struct kvm_vcpu *vcpu, u64 va, u64 ipa, int *level) struct s1_walk_result wr = {}; int ret; - if (is_hyp_ctxt(vcpu)) - wi.regime = vcpu_el2_e2h_is_set(vcpu) ? TR_EL20 : TR_EL2; - else - wi.regime = TR_EL10; + wi.regime = vcpu_trans_regime(vcpu); ret = setup_s1_walk(vcpu, &wi, &wr, va); if (ret) -- 2.47.3 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set 2026-06-24 20:24 ` [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set Oliver Upton @ 2026-06-24 20:40 ` sashiko-bot 2026-06-24 21:00 ` Oliver Upton 0 siblings, 1 reply; 7+ messages in thread From: sashiko-bot @ 2026-06-24 20:40 UTC (permalink / raw) To: Oliver Upton; +Cc: Marc Zyngier, kvmarm Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider: - [High] Unconditionally treating all S1PTW faults as write faults when HA=1 breaks pre-populated guest page tables in read-only memslots. - [High] `effective_tcr_ha()` incorrectly evaluates the Hardware Access (HA) bit for unprivileged memory accesses (e.g., `LDTR`) at EL2, leading to potential infinite fault loops. -- commit 05ab0c07bd3e6099e1182b29afb0219565246eb7 Author: Oliver Upton <oupton@kernel.org> KVM: arm64: Only consider S1PTW a write fault if HA is set This commit works around the hardware not always setting the WnR bit for stage-1 page table walk (S1PTW) faults. It introduces effective_tcr_ha() to assume an S1PTW fault is a write if the Hardware Access (HA) flag is enabled at stage-1. Link: https://lore.kernel.org/kvmarm/20260623213225.A89CF1F000E9@smtp.kernel.org/ > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > index 5bf3d7e1d92c7..8e208ce2597e5 100644 > --- a/arch/arm64/include/asm/kvm_emulate.h > +++ b/arch/arm64/include/asm/kvm_emulate.h [ ... ] > @@ -479,21 +479,13 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) > > static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu) > { > - if (kvm_vcpu_abt_iss1tw(vcpu)) { > - /* > - * Only a permission fault on a S1PTW should be > - * considered as a write. Otherwise, page tables baked > - * in a read-only memslot will result in an exception > - * being delivered in the guest. > - * > - * The drawback is that we end-up faulting twice if the > - * guest is using any of HW AF/DB: a translation fault > - * to map the page containing the PT (read only at > - * first), then a permission fault to allow the flags > - * to be set. > - */ > - return kvm_vcpu_trap_is_permission_fault(vcpu); > - } > + /* > + * The architecture sucks; assume that the S1PTW fetched for write if > + * HA is enabled at stage-1. Note that hardware updates to dirty state > + * and table AF are predicated on HA=1 (DDI0487 M.a D24.2.194; R_SNVTX). > + */ > + if (kvm_vcpu_abt_iss1tw(vcpu)) > + return effective_tcr_ha(vcpu); [Severity: High] Does unconditionally treating S1PTW faults as writes when HA is enabled break guests that use pre-populated page tables in read-only memslots? If a guest populates its page tables with the Access Flag already set, places them in an RO memslot, and enables HA in its TCR, the hardware only needs read access during a walk. However, if a translation fault occurs, kvm_is_write_fault() will unconditionally evaluate to true because of effective_tcr_ha(). This will cause kvm_handle_guest_abort() to see a write fault on an RO memslot and inject a spurious Synchronous External Abort (SEA) into the guest, even though the hardware only needed to read the page table. Could we retain the conditional check for permission faults to avoid this? > diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c > index 8263c648207b0..91154654210ec 100644 > --- a/arch/arm64/kvm/at.c > +++ b/arch/arm64/kvm/at.c [ ... ] > +static bool __effective_tcr_ha(struct kvm_vcpu *vcpu, enum trans_regime regime) > +{ > + if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR1_EL1, HAFDBS, AF)) > + return false; > + > + switch (regime) { > + case TR_EL10: > + return vcpu_read_sys_reg(vcpu, TCR_EL1) & TCR_HA; > + case TR_EL20: > + return vcpu_read_sys_reg(vcpu, TCR_EL2) & TCR_HA; > + case TR_EL2: > + return vcpu_read_sys_reg(vcpu, TCR_EL2) & TCR_EL2_HA; > + default: > + BUG(); > + } > +} > + > +static enum trans_regime vcpu_trans_regime(struct kvm_vcpu *vcpu) > +{ > + if (!is_hyp_ctxt(vcpu)) > + return TR_EL10; > + > + return vcpu_el2_e2h_is_set(vcpu) ? TR_EL20 : TR_EL2; > +} [Severity: High] Does vcpu_trans_regime() correctly handle unprivileged memory accesses like LDTR/STTR executed from EL2? By returning TR_EL2 based strictly on the vCPU's execution context, KVM disregards that unprivileged instructions at EL2 use the EL1 and 0 translation regime (TCR_EL1) when HCR_EL2.E2H is 0 or TGE is 0. Since KVM cannot distinguish an LDTR fault from an LDR fault via the ESR syndrome, it will read TCR_EL2.HA in __effective_tcr_ha() instead of TCR_EL1.HA. If TCR_EL1.HA is 1 but TCR_EL2.HA is 0, KVM incorrectly evaluates HA as disabled. This causes KVM to treat a legitimate S1PTW permission fault (generated to update the Access Flag) as a read fault, refusing to grant write permissions. The hardware will then fault again on retry, creating an infinite stage-2 fault loop that hangs the guest vCPU. -- Sashiko AI review · https://sashiko.dev/#/patchset/20260624202446.1698535-1-oupton@kernel.org?part=1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set 2026-06-24 20:40 ` sashiko-bot @ 2026-06-24 21:00 ` Oliver Upton 0 siblings, 0 replies; 7+ messages in thread From: Oliver Upton @ 2026-06-24 21:00 UTC (permalink / raw) To: sashiko-reviews; +Cc: Marc Zyngier, kvmarm On Wed, Jun 24, 2026 at 08:40:24PM +0000, sashiko-bot@kernel.org wrote: > > + /* > > + * The architecture sucks; assume that the S1PTW fetched for write if > > + * HA is enabled at stage-1. Note that hardware updates to dirty state > > + * and table AF are predicated on HA=1 (DDI0487 M.a D24.2.194; R_SNVTX). > > + */ > > + if (kvm_vcpu_abt_iss1tw(vcpu)) > > + return effective_tcr_ha(vcpu); > > [Severity: High] > Does unconditionally treating S1PTW faults as writes when HA is enabled break > guests that use pre-populated page tables in read-only memslots? > > If a guest populates its page tables with the Access Flag already set, places > them in an RO memslot, and enables HA in its TCR, the hardware only needs read > access during a walk. However, if a translation fault occurs, > kvm_is_write_fault() will unconditionally evaluate to true because of > effective_tcr_ha(). > > This will cause kvm_handle_guest_abort() to see a write fault on an RO memslot > and inject a spurious Synchronous External Abort (SEA) into the guest, even > though the hardware only needed to read the page table. Could we retain the > conditional check for permission faults to avoid this? Gah, I tested with QEMU+EDK2 for this exact purpose but did so on an implementation without HAFDBS. > > +static enum trans_regime vcpu_trans_regime(struct kvm_vcpu *vcpu) > > +{ > > + if (!is_hyp_ctxt(vcpu)) > > + return TR_EL10; > > + > > + return vcpu_el2_e2h_is_set(vcpu) ? TR_EL20 : TR_EL2; > > +} > > [Severity: High] > Does vcpu_trans_regime() correctly handle unprivileged memory accesses > like LDTR/STTR executed from EL2? > > By returning TR_EL2 based strictly on the vCPU's execution context, KVM > disregards that unprivileged instructions at EL2 use the EL1 and 0 translation > regime (TCR_EL1) when HCR_EL2.E2H is 0 or TGE is 0. > > Since KVM cannot distinguish an LDTR fault from an LDR fault via the ESR > syndrome, it will read TCR_EL2.HA in __effective_tcr_ha() instead of > TCR_EL1.HA. If TCR_EL1.HA is 1 but TCR_EL2.HA is 0, KVM incorrectly evaluates > HA as disabled. > > This causes KVM to treat a legitimate S1PTW permission fault (generated to > update the Access Flag) as a read fault, refusing to grant write permissions. > The hardware will then fault again on retry, creating an infinite stage-2 fault > loop that hangs the guest vCPU. This is wrong, the LDTR instruction does not use an out-of-context translation regime. If HCR_EL2.{E2H,TGE} != {1,1}, LDTR silently upgrades to a privileged access. """ func AArch64_IsUnprivAccessPriv() => boolean begin var privileged : boolean; case PSTATE.EL of when EL0 => privileged = FALSE; when EL1 => privileged = EffectiveHCR_EL2_NVx()[1:0] == '11'; when EL2 => privileged = !ELIsInHost(EL0); when EL3 => privileged = TRUE; end; if IsFeatureImplemented(FEAT_UAO) && PSTATE.UAO == '1' then privileged = PSTATE.EL != EL0; end; return privileged; end; """ ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially 2026-06-24 20:24 [PATCH v2 0/2] KVM: arm64: nv: Fix permission checks for S1PTW faults Oliver Upton 2026-06-24 20:24 ` [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set Oliver Upton @ 2026-06-24 20:24 ` Oliver Upton 2026-06-24 20:35 ` sashiko-bot 1 sibling, 1 reply; 7+ messages in thread From: Oliver Upton @ 2026-06-24 20:24 UTC (permalink / raw) To: kvmarm Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Wei-Lin Chang, Steffen Eiden, Oliver Upton, Sashiko Permission faults reported as instruction aborts with S1PTW set are due to missing read/write permissions for the table walk, not execute permissions on the output of translation. As there's no way to directly tell which of the two permissions failed, evaluate both and forward the fault to the L1 hypervisor if either fails. Fixes: fd276e71d1e7 ("KVM: arm64: nv: Handle shadow stage 2 page faults") Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/kvmarm/20260623190607.7106B1F000E9@smtp.kernel.org/ Signed-off-by: Oliver Upton <oupton@kernel.org> --- arch/arm64/kvm/nested.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 94df26de6990..4c9123cb2e1c 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -911,6 +911,7 @@ void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu) */ int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans) { + bool write_fault = kvm_is_write_fault(vcpu); bool forward_fault = false; trans->esr = 0; @@ -918,14 +919,27 @@ int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans) if (!kvm_vcpu_trap_is_permission_fault(vcpu)) return 0; - if (kvm_vcpu_trap_is_iabt(vcpu)) { + /* + * S1PTW permission faults do not provide sufficient syndrome information + * to determine if the fault was for read or write permissions. Perform a + * read permission check and an optional write permission check, relying + * on the fact that: + * + * - The table walker at minimum requires read permission + * + * - The L1 hypervisor also needs to deal with the architecture and + * cannot directly infer the failing permission from the fault context + */ + if (kvm_vcpu_abt_iss1tw(vcpu)) { + forward_fault = !trans->readable; + if (write_fault) + forward_fault |= !trans->writable; + } else if (kvm_vcpu_trap_is_iabt(vcpu)) { if (vcpu_mode_priv(vcpu)) forward_fault = !kvm_s2_trans_exec_el1(vcpu->kvm, trans); else forward_fault = !kvm_s2_trans_exec_el0(vcpu->kvm, trans); } else { - bool write_fault = kvm_is_write_fault(vcpu); - forward_fault = ((write_fault && !trans->writable) || (!write_fault && !trans->readable)); } -- 2.47.3 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially 2026-06-24 20:24 ` [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially Oliver Upton @ 2026-06-24 20:35 ` sashiko-bot 2026-06-24 21:22 ` Oliver Upton 0 siblings, 1 reply; 7+ messages in thread From: sashiko-bot @ 2026-06-24 20:35 UTC (permalink / raw) To: Oliver Upton; +Cc: kvmarm, Marc Zyngier Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider: Pre-existing issues: - [High] An L1 hypervisor can cause an infinite page fault loop in the L0 hypervisor by creating an L1 S2 PTE with XN=0 (execution allowed) but AP=00 (read/write denied). L0 fails to forward the resulting instruction abort to L1, repeatedly mapping the shadow PTE without read permissions, which the hardware continually rejects as non-executable. -- commit f64f1ef4967d6c0954790535ee60d6e909805163 Author: Oliver Upton <oupton@kernel.org> KVM: arm64: nv: Treat S1PTW permission faults specially Permission faults reported as instruction aborts with S1PTW set are due to missing read/write permissions for the table walk, not execute permissions on the output of translation. This patch adds logic to evaluate both and forward the fault to the L1 hypervisor if either fails. Closes: https://lore.kernel.org/kvmarm/20260623190607.7106B1F000E9@smtp.kernel.org/ > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c > index 94df26de6990c..4c9123cb2e1c5 100644 > --- a/arch/arm64/kvm/nested.c > +++ b/arch/arm64/kvm/nested.c [ ... ] > @@ -918,14 +919,27 @@ int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans) > if (!kvm_vcpu_trap_is_permission_fault(vcpu)) > return 0; > > - if (kvm_vcpu_trap_is_iabt(vcpu)) { > + /* > + * S1PTW permission faults do not provide sufficient syndrome information > + * to determine if the fault was for read or write permissions. Perform a > + * read permission check and an optional write permission check, relying > + * on the fact that: > + * > + * - The table walker at minimum requires read permission > + * > + * - The L1 hypervisor also needs to deal with the architecture and > + * cannot directly infer the failing permission from the fault context > + */ > + if (kvm_vcpu_abt_iss1tw(vcpu)) { > + forward_fault = !trans->readable; > + if (write_fault) > + forward_fault |= !trans->writable; > + } else if (kvm_vcpu_trap_is_iabt(vcpu)) { > if (vcpu_mode_priv(vcpu)) > forward_fault = !kvm_s2_trans_exec_el1(vcpu->kvm, trans); > else > forward_fault = !kvm_s2_trans_exec_el0(vcpu->kvm, trans); [Severity: High] This is a pre-existing issue, but does this logic miss a read-permission dependency check for normal instruction aborts? If an L1 hypervisor maps a page for L2 with Stage-2 PTE AP=00 (no read/write) and XN=0 (executable), and L2 attempts to fetch an instruction, the hardware generates a Stage-2 Instruction Abort (Permission Fault) because execution requires read permission. Here, kvm_s2_handle_perm_fault() intercepts the IABT and checks only the XN bits via kvm_s2_trans_exec_el1() and kvm_s2_trans_exec_el0(). Since they only check the XN bits and ignore trans->readable, forward_fault is set to false. L0 then handles the fault by mapping the shadow S2 PTE. The shadow fault handler will clear the read protection bit, leaving the execute protection bit intact. The hardware will reject this shadow mapping as non-executable due to missing read permissions, triggering another IABT upon guest resumption and causing an infinite page fault loop in the L0 hypervisor. > } else { > - bool write_fault = kvm_is_write_fault(vcpu); > - > forward_fault = ((write_fault && !trans->writable) || > (!write_fault && !trans->readable)); > } -- Sashiko AI review · https://sashiko.dev/#/patchset/20260624202446.1698535-1-oupton@kernel.org?part=2 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially 2026-06-24 20:35 ` sashiko-bot @ 2026-06-24 21:22 ` Oliver Upton 0 siblings, 0 replies; 7+ messages in thread From: Oliver Upton @ 2026-06-24 21:22 UTC (permalink / raw) To: sashiko-reviews; +Cc: kvmarm, Marc Zyngier On Wed, Jun 24, 2026 at 08:35:16PM +0000, sashiko-bot@kernel.org wrote: > > + /* > > + * S1PTW permission faults do not provide sufficient syndrome information > > + * to determine if the fault was for read or write permissions. Perform a > > + * read permission check and an optional write permission check, relying > > + * on the fact that: > > + * > > + * - The table walker at minimum requires read permission > > + * > > + * - The L1 hypervisor also needs to deal with the architecture and > > + * cannot directly infer the failing permission from the fault context > > + */ > > + if (kvm_vcpu_abt_iss1tw(vcpu)) { > > + forward_fault = !trans->readable; > > + if (write_fault) > > + forward_fault |= !trans->writable; > > + } else if (kvm_vcpu_trap_is_iabt(vcpu)) { > > if (vcpu_mode_priv(vcpu)) > > forward_fault = !kvm_s2_trans_exec_el1(vcpu->kvm, trans); > > else > > forward_fault = !kvm_s2_trans_exec_el0(vcpu->kvm, trans); > > [Severity: High] > This is a pre-existing issue, but does this logic miss a read-permission > dependency check for normal instruction aborts? > > If an L1 hypervisor maps a page for L2 with Stage-2 PTE AP=00 (no read/write) > and XN=0 (executable), and L2 attempts to fetch an instruction, the hardware > generates a Stage-2 Instruction Abort (Permission Fault) because execution > requires read permission. This does not match my understanding of stage-2 permissions. S2AP controls data permissions, whereas the XN field controls instruction fetches. The pseudocode also supports this view, where execute permissions are computed separately from S2AP: AArch64.S2DirectBasePermissions() """ S2AccessControls s2perms; bit w; constant bit r = permissions.s2ap<0>; if permissions.s2ap<1> == '1' then w = '1'; // Descriptors marked with DBM set have the effective value of S2AP[1] set. // This implies no Permission faults caused by lack of write permissions are // reported, and the Dirty bit can be set. elsif permissions.dbm == '1' && walkparams.hd == '1' then // An update occurs here, conditional to being able to append to HDBSS if walkparams.hdbss == '1' then w = if CanAppendToHDBSS() then '1' else '0'; else w = '1'; else w = '0'; bit px, ux; case (permissions.s2xn:permissions.s2xnx) of when '00' (px,ux) = ('1','1'); when '01' (px,ux) = ('0','1'); when '10' (px,ux) = ('0','0'); when '11' (px,ux) = ('1','0'); x = if accdesc.el == EL0 then ux else px; s2perms.r = r; s2perms.w = w; s2perms.x = x; s2perms.r_rcw = r; s2perms.w_rcw = w; s2perms.r_mmu = r; s2perms.w_mmu = w; s2perms.toplevel0 = '0'; s2perms.toplevel1 = '0'; s2perms.overlay = FALSE; return s2perms; """ AArch64.S2CheckPermissions() """ elsif accdesc.acctype == AccessType_IFETCH then if s2perms.overlay && s2perms.ox == '0' then fault.statuscode = Fault_Permission; fault.overlay = TRUE; elsif (memtype == MemType_Device && ConstrainUnpredictable(Unpredictable_INSTRDEVICE) == Constraint_FAULT) then fault.statuscode = Fault_Permission; // Prevent execution from Non-secure space by Realm state elsif accdesc.ss == SS_Realm && walkstate.baseaddress.paspace != PAS_Realm then fault.statuscode = Fault_Permission; elsif s2perms.x == '0' then fault.statuscode = Fault_Permission; """ Thanks, Oliver ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-24 21:22 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-24 20:24 [PATCH v2 0/2] KVM: arm64: nv: Fix permission checks for S1PTW faults Oliver Upton 2026-06-24 20:24 ` [PATCH v2 1/2] KVM: arm64: Only consider S1PTW a write fault if HA is set Oliver Upton 2026-06-24 20:40 ` sashiko-bot 2026-06-24 21:00 ` Oliver Upton 2026-06-24 20:24 ` [PATCH v2 2/2] KVM: arm64: nv: Treat S1PTW permission faults specially Oliver Upton 2026-06-24 20:35 ` sashiko-bot 2026-06-24 21:22 ` Oliver Upton
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.