From: Alexandru Elisei <alexandru.elisei@arm.com>
To: maz@kernel.org, oupton@kernel.org, joey.gouly@arm.com,
suzuki.poulose@arm.com, yuzenghui@huawei.com,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
tabba@google.com
Cc: mark.rutland@arm.com
Subject: [PATCH v2] KVM: arm64: Handle permission faults with guest_memfd
Date: Tue, 5 May 2026 10:49:13 +0100 [thread overview]
Message-ID: <20260505094913.75317-1-alexandru.elisei@arm.com> (raw)
gmem_abort() calls kvm_pgtable_stage2_map() to make changes to stage 2. It
does this for both relaxing permissions on an existing mapping and to
install a missing mapping.
kvm_pgtable_stage2_map() doesn't make changes to stage 2 if there is an
existing, valid entry and the new entry modifies only the permissions.
This is checked in:
kvm_pgtable_stage2_map()
stage2_map_walk_leaf()
stage2_map_walker_try_leaf()
stage2_pte_needs_update()
and if only the permissions differ, kvm_pgtable_stage2_map() returns
-EAGAIN and KVM returns to the guest to replay the instruction. The
assumption is that a concurrent fault on a different VCPU already mapped
the faulting IPA, and replaying the instruction will either succeed, or
cause a permission fault, which should be handled with
kvm_pgtable_stage2_relax_perms().
gmem_abort(), on a read or write fault on a system without DIC (instruction
cache invalidation required for data to instruction coherence), installs a
valid entry with read and write permissions, but without executable
permissions. On an execution fault on the same page, gmem_abort() attempts
to relax the permissions to allow execution, but calls
kvm_pgtable_stage2_map() to change the existing, valid, entry.
kvm_pgtable_stage2_map() returns -EAGAIN and KVM resumes execution from the
faulting instruction, which leads to an infinite loop of permission faults
on the same instruction.
Allow the guest to make progress by using kvm_pgtable_stage2_relax_perms()
to relax permissions.
Fixes: a7b57e099592 ("KVM: arm64: Handle guest_memfd-backed guest page faults")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
v1: https://lore.kernel.org/kvmarm/20260430132351.280766-1-alexandru.elisei@arm.com/
Changes from v1:
- Rebased on top of v7.1-rc2.
- Initialised memcache to NULL (Fuad).
- Copied kvm_pgtable_stage2_relax_perms() comment from kvm_s2_fault_map()
(Fuad).
- Fixed KVM_PGT_FN() macro invocation (Fuad).
Same as before, tested booting a linux VM on Orion O6, no pkvm, no nested virt.
arch/arm64/kvm/mmu.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d089c107d9b7..4da9281312eb 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1576,21 +1576,24 @@ struct kvm_s2_fault_desc {
static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
{
bool write_fault, exec_fault;
+ bool perm_fault = kvm_vcpu_trap_is_permission_fault(s2fd->vcpu);
enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt;
unsigned long mmu_seq;
struct page *page;
struct kvm *kvm = s2fd->vcpu->kvm;
- void *memcache;
+ void *memcache = NULL;
kvm_pfn_t pfn;
gfn_t gfn;
int ret;
- memcache = get_mmu_memcache(s2fd->vcpu);
- ret = topup_mmu_memcache(s2fd->vcpu, memcache);
- if (ret)
- return ret;
+ if (!perm_fault) {
+ memcache = get_mmu_memcache(s2fd->vcpu);
+ ret = topup_mmu_memcache(s2fd->vcpu, memcache);
+ if (ret)
+ return ret;
+ }
if (s2fd->nested)
gfn = kvm_s2_trans_output(s2fd->nested) >> PAGE_SHIFT;
@@ -1631,9 +1634,19 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
goto out_unlock;
}
- ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
- __pfn_to_phys(pfn), prot,
- memcache, flags);
+ if (perm_fault) {
+ /*
+ * Drop the SW bits in favour of those stored in the
+ * PTE, which will be preserved.
+ */
+ prot &= ~KVM_NV_GUEST_MAP_SZ;
+ ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, s2fd->fault_ipa,
+ prot, flags);
+ } else {
+ ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
+ __pfn_to_phys(pfn), prot,
+ memcache, flags);
+ }
out_unlock:
kvm_release_faultin_page(kvm, page, !!ret, prot & KVM_PGTABLE_PROT_W);
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
--
2.54.0
next reply other threads:[~2026-05-05 9:49 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-05 9:49 Alexandru Elisei [this message]
2026-05-05 12:00 ` [PATCH v2] KVM: arm64: Handle permission faults with guest_memfd Fuad Tabba
2026-05-06 16:14 ` Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260505094913.75317-1-alexandru.elisei@arm.com \
--to=alexandru.elisei@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=oupton@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox