From: Fuad Tabba <tabba@google.com>
To: maz@kernel.org, oliver.upton@linux.dev
Cc: james.morse@arm.com, suzuki.poulose@arm.com,
yuzenghui@huawei.com, qperret@google.com, vdonnefort@google.com,
tabba@google.com, catalin.marinas@arm.com, will@kernel.org,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 8/8] KVM: arm64: Propagate stage-2 map failure on guest->host unshare
Date: Tue, 28 Apr 2026 11:30:08 +0100
Message-ID: <20260428103008.696141-9-tabba@google.com>
In-Reply-To: <20260428103008.696141-1-tabba@google.com>

__pkvm_guest_unshare_host() re-acquires exclusive guest ownership of
a page by (i) annotating the host stage-2 PTE via
host_stage2_set_owner_metadata_locked(), (ii) mapping the page in
the guest stage-2 as PKVM_PAGE_OWNED via kvm_pgtable_stage2_map(),
and (iii) restoring host ownership via
host_stage2_set_owner_locked(). The map's return value was wrapped
in WARN_ON() and otherwise discarded.

At EL2 in nVHE/pKVM, WARN_ON() is not warn-and-continue: it expands
to a BRK that enters the invalid-host-el2 vector and branches to
hyp_panic(), which is declared __noreturn.

__pkvm_guest_unshare_host() calls get_valid_guest_pte() before the
map, which verifies that a valid last-level (PAGE_SIZE) leaf PTE
already exists for the IPA. Because the leaf and all intermediate
tables are in place, the subsequent kvm_pgtable_stage2_map()
replacing it cannot fail with -ENOMEM: there is no block to split
and no new tables to install. The failure path is not currently
reachable.
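
The -ENOMEM argument can be modelled in miniature (toy types and
names, not the real kvm_pgtable API): rewriting an existing
last-level leaf consumes no page-table pages, so it succeeds even
with an empty memcache, whereas installing a mapping where no leaf
exists may need fresh table pages.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model, not the real kvm_pgtable API: a memcache of spare
 * page-table pages, and a map that only allocates when it must
 * install tables for a previously unmapped IPA. */
struct toy_memcache {
	int nr_pages;
};

static int toy_stage2_map(bool leaf_exists, struct toy_memcache *mc)
{
	if (leaf_exists)
		return 0;	/* in-place leaf rewrite: no allocation */
	if (mc->nr_pages == 0)
		return -ENOMEM;	/* would need fresh table pages */
	mc->nr_pages--;		/* consume a page for the new tables */
	return 0;
}
```

With an empty memcache, replacing an existing leaf still succeeds,
which mirrors why the failure path here is unreachable today.
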

Nevertheless, WARN_ON() on any fallible call is the wrong pattern at
EL2. Capture the return value and propagate it. The unmap() and
host-side rollback are kept as defensive guards for the currently
unreachable failure path. The rollback's
WARN_ON(__host_set_page_state_range()) asserts an impossible state:
the host leaf PTE was just written by
host_stage2_set_owner_metadata_locked(), so the reverse idmap
rewrite cannot require new page-table allocation from host_s2_pool.
This is the correct use of WARN_ON() at EL2: an impossible-state
assertion, not a reachable error being ignored.

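
The resulting shape, reduced to a sketch with hypothetical stand-in
helpers (none of these are the real mem_protect.c symbols): the
fallible map's return value is captured and propagated, the unmap
and rollback run on failure, and the assertion is reserved for the
rollback step that the preceding host stage-2 write makes
infallible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the stage-2 helpers. */
static int map_ret;		/* simulated kvm_pgtable_stage2_map() result */
static bool unmapped;		/* did the failure path unmap the IPA? */
static bool rolled_back;	/* did the failure path restore host state? */

static int fake_stage2_map(void)    { return map_ret; }
static void fake_stage2_unmap(void) { unmapped = true; }
static int fake_rollback(void)      { rolled_back = true; return 0; }

/* Capture-and-propagate: the caller sees the map failure instead of
 * the error being swallowed (or made fatal) by WARN_ON(). */
static int toy_unshare(void)
{
	int ret = fake_stage2_map();

	if (ret) {
		fake_stage2_unmap();		/* restore a clean IPA */
		assert(fake_rollback() == 0);	/* impossible-state check */
	}
	return ret;
}
```
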
Fixes: 246c976c370d ("KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 37 ++++++++++++++++++---------
1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 6fb546af699f..12f3ea7a2d75 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -984,14 +984,10 @@ int __pkvm_guest_share_host(struct pkvm_hyp_vcpu *vcpu, u64 gfn)
&vcpu->vcpu.arch.pkvm_memcache, 0);
if (ret) {
/*
- * Stage-2 map can fail mid-walk (e.g. -ENOMEM from the
- * memcache), leaving partial leaf entries in the guest
- * stage-2 transitioned to PKVM_PAGE_SHARED_OWNED. Tear
- * them down so the host does not see a partially-shared
- * mapping it has not yet acknowledged via the host
- * stage-2 update below. No host bookkeeping needs
- * unwinding here: the only mutation prior to the failed
- * map is the (now-discarded) guest stage-2 update itself.
+ * Defensive: get_valid_guest_pte() guarantees a last-level
+ * leaf PTE already exists, so stage-2 map() cannot currently
+ * fail here. The unmap() restores the IPA to a clean state as
+ * a guard should the precondition ever change.
*/
kvm_pgtable_stage2_unmap(&vm->pgt, ipa, PAGE_SIZE);
goto unlock;
@@ -1024,13 +1020,30 @@ int __pkvm_guest_unshare_host(struct pkvm_hyp_vcpu *vcpu, u64 gfn)
if (__host_check_page_state_range(phys, PAGE_SIZE, PKVM_PAGE_SHARED_BORROWED))
goto unlock;
- ret = 0;
meta = host_stage2_encode_gfn_meta(vm, gfn);
WARN_ON(host_stage2_set_owner_metadata_locked(phys, PAGE_SIZE,
PKVM_ID_GUEST, meta));
- WARN_ON(kvm_pgtable_stage2_map(&vm->pgt, ipa, PAGE_SIZE, phys,
- pkvm_mkstate(KVM_PGTABLE_PROT_RWX, PKVM_PAGE_OWNED),
- &vcpu->vcpu.arch.pkvm_memcache, 0));
+ ret = kvm_pgtable_stage2_map(&vm->pgt, ipa, PAGE_SIZE, phys,
+ pkvm_mkstate(KVM_PGTABLE_PROT_RWX, PKVM_PAGE_OWNED),
+ &vcpu->vcpu.arch.pkvm_memcache, 0);
+ if (ret) {
+ /*
+ * Defensive: get_valid_guest_pte() guarantees a last-level
+ * leaf PTE already exists, so stage-2 map() cannot currently
+ * fail here. The unmap() and host-side rollback below are
+ * kept as guards should the precondition ever change.
+ */
+ kvm_pgtable_stage2_unmap(&vm->pgt, ipa, PAGE_SIZE);
+
+ /*
+ * Roll back the host stage-2 mutation above: the host leaf
+ * PTE was just written by host_stage2_set_owner_metadata_locked(),
+ * so __host_set_page_state_range() rewrites it in-place
+ * without needing fresh page-table pages from host_s2_pool.
+ */
+ WARN_ON(__host_set_page_state_range(phys, PAGE_SIZE,
+ PKVM_PAGE_SHARED_BORROWED));
+ }
unlock:
guest_unlock_component(vm);
host_unlock_component();
--
2.54.0.545.g6539524ca2-goog