* [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes
@ 2026-06-30 12:09 Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 1/6] KVM: arm64: ptdump: Remove shadow ptdump files Wei-Lin Chang
` (6 more replies)
0 siblings, 7 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:09 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Hi,
This is v2 of the series fixing the shadow ptdump debugfs files. Unfortunately
I couldn't make per-mmu ptdump files work after all, mainly because there isn't
a clean way to locate the specific nested mmu for each ptdump file, as the
nested mmus may already have been freed by the time the file gets opened.
Therefore this series creates a single file, "shadow_page_tables", that dumps
the page table information of all valid mmus.
An advantage of this is that the new ptdump file has a lifetime identical to
that of the other ptdump files, i.e. stage2_page_tables, ipa_range, etc., hence
avoiding the dentry UAF found last time [1].
With this, all ptdump files are only removed when the last kvm reference gets
dropped and kvm_destroy_vm_debugfs() is called. In their open() and show()
functions, the nested mmu array and mmu->pgt are checked with the mmu_lock held
to prevent UAF.
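For illustration only, the locking pattern described above can be sketched in
userspace C. This is not the kernel code from the series: struct demo_vm,
struct demo_mmu and the demo_vm_*() helpers are hypothetical names, and a plain
pthread mutex stands in for the kvm->mmu_lock rwlock. The reader checks the
array pointer under the lock before touching it, and teardown detaches the
array under the lock and frees it only after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one nested mmu slot. */
struct demo_mmu {
	int valid;
};

/* Hypothetical stand-in for struct kvm and its nested mmu array. */
struct demo_vm {
	pthread_mutex_t lock;	/* plays the role of kvm->mmu_lock */
	struct demo_mmu *mmus;	/* plays the role of kvm->arch.nested_mmus */
	size_t nr_mmus;
};

static int demo_vm_init(struct demo_vm *vm, size_t nr)
{
	size_t i;

	if (pthread_mutex_init(&vm->lock, NULL))
		return -1;

	vm->mmus = calloc(nr, sizeof(*vm->mmus));
	if (!vm->mmus) {
		pthread_mutex_destroy(&vm->lock);
		return -1;
	}

	vm->nr_mmus = nr;
	for (i = 0; i < nr; i++)
		vm->mmus[i].valid = 1;

	return 0;
}

/*
 * Reader side, analogous to a show() callback: the array pointer is only
 * looked at with the lock held, and a detached array yields empty output
 * instead of a use-after-free.
 */
static size_t demo_vm_show(struct demo_vm *vm)
{
	size_t i, dumped = 0;

	pthread_mutex_lock(&vm->lock);
	if (vm->mmus) {
		for (i = 0; i < vm->nr_mmus; i++) {
			if (vm->mmus[i].valid)
				dumped++;
		}
	}
	pthread_mutex_unlock(&vm->lock);

	return dumped;
}

/*
 * Teardown side: detach the array under the lock so no reader can see a
 * pointer that is about to be freed, then free it outside the lock.
 */
static void demo_vm_teardown(struct demo_vm *vm)
{
	struct demo_mmu *old;

	pthread_mutex_lock(&vm->lock);
	old = vm->mmus;
	vm->mmus = NULL;
	vm->nr_mmus = 0;
	pthread_mutex_unlock(&vm->lock);

	free(old);
}
```

In this sketch, a reader that runs after demo_vm_teardown() sees a NULL array
and reports nothing, which mirrors how the show() functions in the series
return without output once the page tables are gone.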
Patch 1-2: Undo previous shadow ptdump implementation.
Patch 3: Fix a mmu->pgt UAF that happens when ptdump files are read after
mmu->pgt is freed.
Patch 4-5: Preparation for the shadow page table dump file.
Patch 6: Implementation of the shadow page table dump file.
The fixes are tested with CONFIG_PROVE_LOCKING,
CONFIG_DEBUG_ATOMIC_SLEEP, and CONFIG_KASAN.
Thanks!
* Changes from v1 ([2]):
- Move from per mmu ptdump files to one file that will dump all shadow page
tables.
[1]: https://lore.kernel.org/kvmarm/ajty6I7ZqodP4ous@sm-arm-grace07/
[2]: https://lore.kernel.org/kvmarm/20260623142443.648972-1-weilin.chang@arm.com/
Wei-Lin Chang (6):
KVM: arm64: ptdump: Remove shadow ptdump files
KVM: arm64: ptdump: Undo making the ptdump code mmu aware
KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
KVM: arm64: ptdump: Factor out initialization of
kvm_ptdump_guest_state
KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical
ptdump path
KVM: arm64: ptdump: Introduce the shadow ptdump file
arch/arm64/include/asm/kvm_host.h | 5 +-
arch/arm64/include/asm/kvm_mmu.h | 4 -
arch/arm64/kvm/nested.c | 18 +--
arch/arm64/kvm/ptdump.c | 185 ++++++++++++++++++++----------
4 files changed, 135 insertions(+), 77 deletions(-)
--
2.43.0
* [PATCH v2 1/6] KVM: arm64: ptdump: Remove shadow ptdump files
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 2/6] KVM: arm64: ptdump: Undo making the ptdump code mmu aware Wei-Lin Chang
` (5 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Previously we exposed shadow page tables by creating a debugfs ptdump
file whenever a nested mmu instance gets bound to a new context, and
deleting the debugfs file whose context was getting unbound.
This turned out to be buggy, as the instance<->context binding process
is done with the mmu_lock held, and debugfs creation/deletion can sleep.
Revert most of commit 19e15dc73f0f ("KVM: arm64: nv: Expose shadow page
tables in debugfs"), but keep the "nested" debugfs directory for use in a
later patch where we'll expose the shadow ptdump in another way.
Fixes: 19e15dc73f0f ("KVM: arm64: nv: Expose shadow page tables in debugfs")
Reported-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Closes: https://lore.kernel.org/kvmarm/aiuF0KSvvv-ZozI1@sm-arm-grace07/
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/include/asm/kvm_host.h | 5 +----
arch/arm64/include/asm/kvm_mmu.h | 4 ----
arch/arm64/kvm/nested.c | 6 +-----
arch/arm64/kvm/ptdump.c | 23 -----------------------
4 files changed, 2 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2faa60df847d..94bced53a323 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -217,10 +217,6 @@ struct kvm_s2_mmu {
*/
bool nested_stage2_enabled;
-#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
- struct dentry *shadow_pt_debugfs_dentry;
-#endif
-
/*
* true when this MMU needs to be unmapped before being used for a new
* purpose.
@@ -424,6 +420,7 @@ struct kvm_arch {
/* Nested virtualization info */
struct dentry *debugfs_nv_dentry;
#endif
+
};
struct kvm_vcpu_fault_info {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6eae7e7e2a68..c1e535e3d931 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -392,12 +392,8 @@ static inline bool kvm_supports_cacheable_pfnmap(void)
#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
void kvm_s2_ptdump_create_debugfs(struct kvm *kvm);
-void kvm_nested_s2_ptdump_create_debugfs(struct kvm_s2_mmu *mmu);
-void kvm_nested_s2_ptdump_remove_debugfs(struct kvm_s2_mmu *mmu);
#else
static inline void kvm_s2_ptdump_create_debugfs(struct kvm *kvm) {}
-static inline void kvm_nested_s2_ptdump_create_debugfs(struct kvm_s2_mmu *mmu) {}
-static inline void kvm_nested_s2_ptdump_remove_debugfs(struct kvm_s2_mmu *mmu) {}
#endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
#endif /* __ASSEMBLER__ */
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index faa4a48f265d..6435efd65cb5 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -834,10 +834,8 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size;
/* Make sure we don't forget to do the laundry */
- if (kvm_s2_mmu_valid(s2_mmu)) {
- kvm_nested_s2_ptdump_remove_debugfs(s2_mmu);
+ if (kvm_s2_mmu_valid(s2_mmu))
s2_mmu->pending_unmap = true;
- }
/*
* The virtual VMID (modulo CnP) will be used as a key when matching
@@ -851,8 +849,6 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
s2_mmu->tlb_vtcr = vcpu_read_sys_reg(vcpu, VTCR_EL2);
s2_mmu->nested_stage2_enabled = vcpu_read_sys_reg(vcpu, HCR_EL2) & HCR_VM;
- kvm_nested_s2_ptdump_create_debugfs(s2_mmu);
-
out:
atomic_inc(&s2_mmu->refcnt);
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index c9140e22abcf..7c32f1f7772c 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -17,7 +17,6 @@
#define MARKERS_LEN 2
#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
-#define S2FNAMESZ sizeof("0x0123456789abcdef-0x0123456789abcdef-s2-disabled")
struct kvm_ptdump_guest_state {
struct kvm_s2_mmu *mmu;
@@ -273,28 +272,6 @@ static const struct file_operations kvm_pgtable_levels_fops = {
.release = kvm_pgtable_debugfs_close,
};
-void kvm_nested_s2_ptdump_create_debugfs(struct kvm_s2_mmu *mmu)
-{
- struct dentry *dent;
- char file_name[S2FNAMESZ];
-
- snprintf(file_name, sizeof(file_name), "0x%016llx-0x%016llx-s2-%sabled",
- mmu->tlb_vttbr,
- mmu->tlb_vtcr,
- mmu->nested_stage2_enabled ? "en" : "dis");
-
- dent = debugfs_create_file(file_name, 0400,
- mmu->arch->debugfs_nv_dentry, mmu,
- &kvm_ptdump_guest_fops);
-
- mmu->shadow_pt_debugfs_dentry = dent;
-}
-
-void kvm_nested_s2_ptdump_remove_debugfs(struct kvm_s2_mmu *mmu)
-{
- debugfs_remove(mmu->shadow_pt_debugfs_dentry);
-}
-
void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
{
debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
--
2.43.0
* [PATCH v2 2/6] KVM: arm64: ptdump: Undo making the ptdump code mmu aware
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 1/6] KVM: arm64: ptdump: Remove shadow ptdump files Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed Wei-Lin Chang
` (4 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Commit 204f7c018d76 ("KVM: arm64: ptdump: Make KVM ptdump code s2 mmu
aware") changed the ptdump code from storing the kvm pointer to storing
the mmu pointer, in order to reuse code for shadow ptdumps.
This turned out to be buggy as the nested mmus can be freed before the
last access to the ptdump files. To prepare for a new implementation of
the shadow ptdumps which solves this problem, revert the effects of the
commit to avoid this UAF.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/kvm/ptdump.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 7c32f1f7772c..d5aa9eff08d1 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -19,7 +19,7 @@
#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
struct kvm_ptdump_guest_state {
- struct kvm_s2_mmu *mmu;
+ struct kvm *kvm;
struct ptdump_pg_state parser_state;
struct addr_marker ipa_marker[MARKERS_LEN];
struct ptdump_pg_level level[KVM_PGTABLE_MAX_LEVELS];
@@ -112,10 +112,10 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
return 0;
}
-static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm_s2_mmu *mmu)
+static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
{
struct kvm_ptdump_guest_state *st;
- struct kvm_pgtable *pgtable = mmu->pgt;
+ struct kvm_pgtable *pgtable = kvm->arch.mmu.pgt;
int ret;
st = kzalloc_obj(struct kvm_ptdump_guest_state, GFP_KERNEL_ACCOUNT);
@@ -131,7 +131,7 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm_s2_mmu
st->ipa_marker[0].name = "Guest IPA";
st->ipa_marker[1].start_address = BIT(pgtable->ia_bits);
- st->mmu = mmu;
+ st->kvm = kvm;
return st;
}
@@ -139,8 +139,8 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
{
int ret;
struct kvm_ptdump_guest_state *st = m->private;
- struct kvm_s2_mmu *mmu = st->mmu;
- struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
+ struct kvm *kvm = st->kvm;
+ struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
.cb = kvm_ptdump_visitor,
.arg = &st->parser_state,
@@ -163,15 +163,14 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
{
- struct kvm_s2_mmu *mmu = m->i_private;
- struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
+ struct kvm *kvm = m->i_private;
struct kvm_ptdump_guest_state *st;
int ret;
if (!kvm_get_kvm_safe(kvm))
return -ENOENT;
- st = kvm_ptdump_parser_create(mmu);
+ st = kvm_ptdump_parser_create(kvm);
if (IS_ERR(st)) {
ret = PTR_ERR(st);
goto err_with_kvm_ref;
@@ -189,7 +188,7 @@ static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
{
- struct kvm *kvm = kvm_s2_mmu_to_kvm(m->i_private);
+ struct kvm *kvm = m->i_private;
void *st = ((struct seq_file *)file->private_data)->private;
kfree(st);
@@ -224,15 +223,14 @@ static int kvm_pgtable_levels_show(struct seq_file *m, void *unused)
static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file,
int (*show)(struct seq_file *, void *))
{
- struct kvm_s2_mmu *mmu = m->i_private;
- struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
+ struct kvm *kvm = m->i_private;
struct kvm_pgtable *pgtable;
int ret;
if (!kvm_get_kvm_safe(kvm))
return -ENOENT;
- pgtable = mmu->pgt;
+ pgtable = kvm->arch.mmu.pgt;
ret = single_open(file, show, pgtable);
if (ret < 0)
@@ -252,7 +250,7 @@ static int kvm_pgtable_levels_open(struct inode *m, struct file *file)
static int kvm_pgtable_debugfs_close(struct inode *m, struct file *file)
{
- struct kvm *kvm = kvm_s2_mmu_to_kvm(m->i_private);
+ struct kvm *kvm = m->i_private;
kvm_put_kvm(kvm);
return single_release(m, file);
@@ -275,11 +273,11 @@ static const struct file_operations kvm_pgtable_levels_fops = {
void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
{
debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
- &kvm->arch.mmu, &kvm_ptdump_guest_fops);
+ kvm, &kvm_ptdump_guest_fops);
debugfs_create_file("ipa_range", 0400, kvm->debugfs_dentry,
- &kvm->arch.mmu, &kvm_pgtable_range_fops);
+ kvm, &kvm_pgtable_range_fops);
debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
- &kvm->arch.mmu, &kvm_pgtable_levels_fops);
+ kvm, &kvm_pgtable_levels_fops);
if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
}
--
2.43.0
* [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 1/6] KVM: arm64: ptdump: Remove shadow ptdump files Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 2/6] KVM: arm64: ptdump: Undo making the ptdump code mmu aware Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-07-01 15:00 ` Leonardo Bras
2026-06-30 12:10 ` [PATCH v2 4/6] KVM: arm64: ptdump: Factor out initialization of kvm_ptdump_guest_state Wei-Lin Chang
` (3 subsequent siblings)
6 siblings, 1 reply; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
ptdump files can still be read after the pgt of the canonical mmu is
freed, if they are opened before the VM debugfs directory is removed.
This triggers UAF in places where we cache the pgt pointer or access it
without checking its validity.
Check the pgt is still alive under the mmu_lock before accessing the
pgt.
Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260623142443.648972-1-weilin.chang@arm.com?part=1
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/kvm/ptdump.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index d5aa9eff08d1..752d8e0cd25c 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -115,13 +115,21 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
{
struct kvm_ptdump_guest_state *st;
- struct kvm_pgtable *pgtable = kvm->arch.mmu.pgt;
+ struct kvm_pgtable *pgtable;
int ret;
st = kzalloc_obj(struct kvm_ptdump_guest_state, GFP_KERNEL_ACCOUNT);
if (!st)
return ERR_PTR(-ENOMEM);
+ guard(write_lock)(&kvm->mmu_lock);
+ if (!kvm->arch.mmu.pgt) {
+ kfree(st);
+ return ERR_PTR(-ENXIO);
+ }
+
+ pgtable = kvm->arch.mmu.pgt;
+
ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
if (ret) {
kfree(st);
@@ -137,7 +145,6 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
{
- int ret;
struct kvm_ptdump_guest_state *st = m->private;
struct kvm *kvm = st->kvm;
struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
@@ -154,11 +161,11 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
.seq = m,
};
- write_lock(&kvm->mmu_lock);
- ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
- write_unlock(&kvm->mmu_lock);
+ guard(write_lock)(&kvm->mmu_lock);
+ if (mmu->pgt)
+ return kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
- return ret;
+ return 0;
}
static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
@@ -206,17 +213,23 @@ static const struct file_operations kvm_ptdump_guest_fops = {
static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
{
- struct kvm_pgtable *pgtable = m->private;
+ struct kvm *kvm = m->private;
+
+ guard(write_lock)(&kvm->mmu_lock);
+ if (kvm->arch.mmu.pgt)
+ seq_printf(m, "%2u\n", kvm->arch.mmu.pgt->ia_bits);
- seq_printf(m, "%2u\n", pgtable->ia_bits);
return 0;
}
static int kvm_pgtable_levels_show(struct seq_file *m, void *unused)
{
- struct kvm_pgtable *pgtable = m->private;
+ struct kvm *kvm = m->private;
+
+ guard(write_lock)(&kvm->mmu_lock);
+ if (kvm->arch.mmu.pgt)
+ seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - kvm->arch.mmu.pgt->start_level);
- seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - pgtable->start_level);
return 0;
}
@@ -224,15 +237,12 @@ static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file,
int (*show)(struct seq_file *, void *))
{
struct kvm *kvm = m->i_private;
- struct kvm_pgtable *pgtable;
int ret;
if (!kvm_get_kvm_safe(kvm))
return -ENOENT;
- pgtable = kvm->arch.mmu.pgt;
-
- ret = single_open(file, show, pgtable);
+ ret = single_open(file, show, kvm);
if (ret < 0)
kvm_put_kvm(kvm);
return ret;
--
2.43.0
* [PATCH v2 4/6] KVM: arm64: ptdump: Factor out initialization of kvm_ptdump_guest_state
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
` (2 preceding siblings ...)
2026-06-30 12:10 ` [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 5/6] KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical ptdump path Wei-Lin Chang
` (2 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Extract the code for initializing kvm_ptdump_guest_state to allow
reusing the same instance for dumping multiple page tables.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/kvm/ptdump.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 752d8e0cd25c..0c9647666e65 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -112,6 +112,23 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
return 0;
}
+static int kvm_ptdump_parser_init(struct kvm_ptdump_guest_state *st, struct kvm *kvm,
+ struct kvm_pgtable *pgt)
+{
+ int ret;
+
+ ret = kvm_ptdump_build_levels(&st->level[0], pgt->start_level);
+ if (ret)
+ return ret;
+
+ st->ipa_marker[0].name = "Guest IPA";
+ st->ipa_marker[1].start_address = BIT(pgt->ia_bits);
+
+ st->kvm = kvm;
+
+ return 0;
+}
+
static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
{
struct kvm_ptdump_guest_state *st;
@@ -129,17 +146,13 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
}
pgtable = kvm->arch.mmu.pgt;
+ ret = kvm_ptdump_parser_init(st, kvm, pgtable);
- ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
if (ret) {
kfree(st);
return ERR_PTR(ret);
}
- st->ipa_marker[0].name = "Guest IPA";
- st->ipa_marker[1].start_address = BIT(pgtable->ia_bits);
-
- st->kvm = kvm;
return st;
}
--
2.43.0
* [PATCH v2 5/6] KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical ptdump path
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
` (3 preceding siblings ...)
2026-06-30 12:10 ` [PATCH v2 4/6] KVM: arm64: ptdump: Factor out initialization of kvm_ptdump_guest_state Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file Wei-Lin Chang
2026-07-02 6:55 ` [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Itaru Kitayama
6 siblings, 0 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Make kvm_ptdump_guest_open() take the show() callback as a parameter so
that the shadow ptdump can reuse it.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/kvm/ptdump.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 0c9647666e65..40f93b7c7ad9 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -156,7 +156,7 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
return st;
}
-static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
+static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
{
struct kvm_ptdump_guest_state *st = m->private;
struct kvm *kvm = st->kvm;
@@ -181,7 +181,8 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
return 0;
}
-static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
+static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
+ int (*show)(struct seq_file *, void *))
{
struct kvm *kvm = m->i_private;
struct kvm_ptdump_guest_state *st;
@@ -196,7 +197,7 @@ static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
goto err_with_kvm_ref;
}
- ret = single_open(file, kvm_ptdump_guest_show, st);
+ ret = single_open(file, show, st);
if (!ret)
return 0;
@@ -206,6 +207,11 @@ static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
return ret;
}
+static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
+{
+ return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
+}
+
static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
{
struct kvm *kvm = m->i_private;
@@ -217,8 +223,8 @@ static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
return single_release(m, file);
}
-static const struct file_operations kvm_ptdump_guest_fops = {
- .open = kvm_ptdump_guest_open,
+static const struct file_operations kvm_ptdump_guest_canonical_fops = {
+ .open = kvm_ptdump_guest_canonical_open,
.read = seq_read,
.llseek = seq_lseek,
.release = kvm_ptdump_guest_close,
@@ -296,7 +302,7 @@ static const struct file_operations kvm_pgtable_levels_fops = {
void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
{
debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
- kvm, &kvm_ptdump_guest_fops);
+ kvm, &kvm_ptdump_guest_canonical_fops);
debugfs_create_file("ipa_range", 0400, kvm->debugfs_dentry,
kvm, &kvm_pgtable_range_fops);
debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
--
2.43.0
* [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
` (4 preceding siblings ...)
2026-06-30 12:10 ` [PATCH v2 5/6] KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical ptdump path Wei-Lin Chang
@ 2026-06-30 12:10 ` Wei-Lin Chang
2026-07-01 15:28 ` Leonardo Bras
2026-07-02 21:48 ` Itaru Kitayama
2026-07-02 6:55 ` [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Itaru Kitayama
6 siblings, 2 replies; 17+ messages in thread
From: Wei-Lin Chang @ 2026-06-30 12:10 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm, linux-kernel
Cc: Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene, Wei-Lin Chang
Create a ptdump file for all shadow page tables. It will dump out all
valid shadow page tables at the time of request, with the mmu's index,
guest VTCR_EL2, VTTBR_EL2, and whether the guest stage-2 is enabled or
not.
Also detach the nested mmu array under the mmu_lock in
kvm_arch_flush_shadow_all() so readers cannot race with the array being
removed, then free the old array after dropping the lock.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
arch/arm64/kvm/nested.c | 12 ++++++--
arch/arm64/kvm/ptdump.c | 61 ++++++++++++++++++++++++++++++++++++++++-
2 files changed, 69 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 6435efd65cb5..17a180ddf6ca 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1283,6 +1283,7 @@ void kvm_nested_s2_flush(struct kvm *kvm)
void kvm_arch_flush_shadow_all(struct kvm *kvm)
{
+ struct kvm_s2_mmu *mmus;
int i;
for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
@@ -1291,9 +1292,14 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
if (!WARN_ON(atomic_read(&mmu->refcnt)))
kvm_free_stage2_pgd(mmu);
}
- kvfree(kvm->arch.nested_mmus);
- kvm->arch.nested_mmus = NULL;
- kvm->arch.nested_mmus_size = 0;
+
+ scoped_guard(write_lock, &kvm->mmu_lock) {
+ mmus = kvm->arch.nested_mmus;
+ kvm->arch.nested_mmus = NULL;
+ kvm->arch.nested_mmus_size = 0;
+ }
+
+ kvfree(mmus);
kvm_uninit_stage2_mmu(kvm);
}
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 40f93b7c7ad9..1649eaa75798 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -181,6 +181,50 @@ static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
return 0;
}
+static int kvm_ptdump_guest_nested_show(struct seq_file *m, void *unused)
+{
+ int ret = 0, i;
+ struct kvm_ptdump_guest_state *st = m->private;
+ struct kvm *kvm = st->kvm;
+ struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
+ .cb = kvm_ptdump_visitor,
+ .arg = &st->parser_state,
+ .flags = KVM_PGTABLE_WALK_LEAF,
+ };
+
+ guard(write_lock)(&kvm->mmu_lock);
+
+ if (!kvm->arch.nested_mmus)
+ return 0;
+
+ for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
+ struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
+
+ if (!mmu->pgt)
+ continue;
+
+ if (kvm_s2_mmu_valid(mmu)) {
+ memset(st, 0, sizeof(*st));
+ ret = kvm_ptdump_parser_init(st, kvm, mmu->pgt);
+ if (ret)
+ return ret;
+ st->parser_state = (struct ptdump_pg_state) {
+ .marker = &st->ipa_marker[0],
+ .level = -1,
+ .pg_level = &st->level[0],
+ .seq = m,
+ };
+ seq_printf(m, "nested mmu %d VTCR: 0x%016llx VTTBR: 0x%016llx s2: %s\n",
+ i, mmu->tlb_vtcr, mmu->tlb_vttbr,
+ mmu->nested_stage2_enabled ? "enabled" : "disabled");
+ ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
+ if (ret)
+ return ret;
+ }
+ }
+ return ret;
+}
+
static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
int (*show)(struct seq_file *, void *))
{
@@ -212,6 +256,11 @@ static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
}
+static int kvm_ptdump_guest_nested_open(struct inode *m, struct file *file)
+{
+ return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_nested_show);
+}
+
static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
{
struct kvm *kvm = m->i_private;
@@ -230,6 +279,13 @@ static const struct file_operations kvm_ptdump_guest_canonical_fops = {
.release = kvm_ptdump_guest_close,
};
+static const struct file_operations kvm_ptdump_guest_nested_fops = {
+ .open = kvm_ptdump_guest_nested_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = kvm_ptdump_guest_close,
+};
+
static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
{
struct kvm *kvm = m->private;
@@ -307,6 +363,9 @@ void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
kvm, &kvm_pgtable_range_fops);
debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
kvm, &kvm_pgtable_levels_fops);
- if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
+ if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) {
kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
+ debugfs_create_file("shadow_page_tables", 0400, kvm->arch.debugfs_nv_dentry,
+ kvm, &kvm_ptdump_guest_nested_fops);
+ }
}
--
2.43.0
* Re: [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
2026-06-30 12:10 ` [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed Wei-Lin Chang
@ 2026-07-01 15:00 ` Leonardo Bras
2026-07-01 17:27 ` Wei-Lin Chang
0 siblings, 1 reply; 17+ messages in thread
From: Leonardo Bras @ 2026-07-01 15:00 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: Leonardo Bras, linux-arm-kernel, kvmarm, linux-kernel,
Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
Hi Wei Lin,
On Tue, Jun 30, 2026 at 01:10:02PM +0100, Wei-Lin Chang wrote:
> ptdump files can still be read after the pgt of the canonical mmu is
> freed, if they are opened before the VM debugfs directory is removed.
> This triggers UAF in places where we cache the pgt pointer or access it
> without checking its validity.
>
> Check the pgt is still alive under the mmu_lock before accessing the
> pgt.
>
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Closes: https://sashiko.dev/#/patchset/20260623142443.648972-1-weilin.chang@arm.com?part=1
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
> arch/arm64/kvm/ptdump.c | 38 ++++++++++++++++++++++++--------------
> 1 file changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index d5aa9eff08d1..752d8e0cd25c 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -115,13 +115,21 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
> static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> {
> struct kvm_ptdump_guest_state *st;
> - struct kvm_pgtable *pgtable = kvm->arch.mmu.pgt;
> + struct kvm_pgtable *pgtable;
> int ret;
>
> st = kzalloc_obj(struct kvm_ptdump_guest_state, GFP_KERNEL_ACCOUNT);
> if (!st)
> return ERR_PTR(-ENOMEM);
>
> + guard(write_lock)(&kvm->mmu_lock);
> + if (!kvm->arch.mmu.pgt) {
> + kfree(st);
> + return ERR_PTR(-ENXIO);
> + }
> +
> + pgtable = kvm->arch.mmu.pgt;
> +
> ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> if (ret) {
> kfree(st);
> @@ -137,7 +145,6 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
>
> static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> {
> - int ret;
> struct kvm_ptdump_guest_state *st = m->private;
> struct kvm *kvm = st->kvm;
> struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> @@ -154,11 +161,11 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> .seq = m,
> };
>
> - write_lock(&kvm->mmu_lock);
> - ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> - write_unlock(&kvm->mmu_lock);
> + guard(write_lock)(&kvm->mmu_lock);
> + if (mmu->pgt)
> + return kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
IIUC, that's the same behavior, right?
Just changed to look the same as the rest of this file?
>
> - return ret;
> + return 0;
> }
So if the pgt does not exist anymore, it returns zero. Is that the desired
behavior?
I guess it's aligned with the idea of a single file mentioned in the cover
letter, right?
>
> static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> @@ -206,17 +213,23 @@ static const struct file_operations kvm_ptdump_guest_fops = {
>
> static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> {
> - struct kvm_pgtable *pgtable = m->private;
> + struct kvm *kvm = m->private;
> +
> + guard(write_lock)(&kvm->mmu_lock);
> + if (kvm->arch.mmu.pgt)
> + seq_printf(m, "%2u\n", kvm->arch.mmu.pgt->ia_bits);
>
> - seq_printf(m, "%2u\n", pgtable->ia_bits);
> return 0;
> }
>
> static int kvm_pgtable_levels_show(struct seq_file *m, void *unused)
> {
> - struct kvm_pgtable *pgtable = m->private;
> + struct kvm *kvm = m->private;
> +
> + guard(write_lock)(&kvm->mmu_lock);
> + if (kvm->arch.mmu.pgt)
> + seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - kvm->arch.mmu.pgt->start_level);
>
> - seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - pgtable->start_level);
> return 0;
> }
>
> @@ -224,15 +237,12 @@ static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file,
> int (*show)(struct seq_file *, void *))
> {
> struct kvm *kvm = m->i_private;
> - struct kvm_pgtable *pgtable;
> int ret;
>
> if (!kvm_get_kvm_safe(kvm))
> return -ENOENT;
>
> - pgtable = kvm->arch.mmu.pgt;
> -
> - ret = single_open(file, show, pgtable);
> + ret = single_open(file, show, kvm);
Maybe this change is more related with the previous patch?
> if (ret < 0)
> kvm_put_kvm(kvm);
> return ret;
> --
> 2.43.0
>
Thanks!
Leo
* Re: [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file
2026-06-30 12:10 ` [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file Wei-Lin Chang
@ 2026-07-01 15:28 ` Leonardo Bras
2026-07-01 17:35 ` Wei-Lin Chang
2026-07-02 21:48 ` Itaru Kitayama
1 sibling, 1 reply; 17+ messages in thread
From: Leonardo Bras @ 2026-07-01 15:28 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: Leonardo Bras, linux-arm-kernel, kvmarm, linux-kernel,
Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
On Tue, Jun 30, 2026 at 01:10:05PM +0100, Wei-Lin Chang wrote:
> Create a ptdump file for all shadow page tables. It will dump out all
> valid shadow page tables at the time of request, with the mmu's index,
> guest VTCR_EL2, VTTBR_EL2, and whether the guest stage-2 is enabled or
> not.
>
> Also detach the nested mmu array under the mmu_lock in
> kvm_arch_flush_shadow_all() so readers cannot race with the array being
> removed, then free the old array after dropping the lock.
Out of curiosity: why drop the lock before kfree'ing ?
Thanks!
Leo
>
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
> arch/arm64/kvm/nested.c | 12 ++++++--
> arch/arm64/kvm/ptdump.c | 61 ++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 69 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 6435efd65cb5..17a180ddf6ca 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1283,6 +1283,7 @@ void kvm_nested_s2_flush(struct kvm *kvm)
>
> void kvm_arch_flush_shadow_all(struct kvm *kvm)
> {
> + struct kvm_s2_mmu *mmus;
> int i;
>
> for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> @@ -1291,9 +1292,14 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> if (!WARN_ON(atomic_read(&mmu->refcnt)))
> kvm_free_stage2_pgd(mmu);
> }
> - kvfree(kvm->arch.nested_mmus);
> - kvm->arch.nested_mmus = NULL;
> - kvm->arch.nested_mmus_size = 0;
> +
> + scoped_guard(write_lock, &kvm->mmu_lock) {
> + mmus = kvm->arch.nested_mmus;
> + kvm->arch.nested_mmus = NULL;
> + kvm->arch.nested_mmus_size = 0;
> + }
> +
> + kvfree(mmus);
> kvm_uninit_stage2_mmu(kvm);
> }
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 40f93b7c7ad9..1649eaa75798 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -181,6 +181,50 @@ static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
> return 0;
> }
>
> +static int kvm_ptdump_guest_nested_show(struct seq_file *m, void *unused)
> +{
> + int ret = 0, i;
> + struct kvm_ptdump_guest_state *st = m->private;
> + struct kvm *kvm = st->kvm;
> + struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
> + .cb = kvm_ptdump_visitor,
> + .arg = &st->parser_state,
> + .flags = KVM_PGTABLE_WALK_LEAF,
> + };
> +
> + guard(write_lock)(&kvm->mmu_lock);
> +
> + if (!kvm->arch.nested_mmus)
> + return 0;
> +
> + for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> +
> + if (!mmu->pgt)
> + continue;
> +
> + if (kvm_s2_mmu_valid(mmu)) {
> + memset(st, 0, sizeof(*st));
> + ret = kvm_ptdump_parser_init(st, kvm, mmu->pgt);
> + if (ret)
> + return ret;
> + st->parser_state = (struct ptdump_pg_state) {
> + .marker = &st->ipa_marker[0],
> + .level = -1,
> + .pg_level = &st->level[0],
> + .seq = m,
> + };
> + seq_printf(m, "nested mmu %d VTCR: 0x%016llx VTTBR: 0x%016llx s2: %s\n",
> + i, mmu->tlb_vtcr, mmu->tlb_vttbr,
> + mmu->nested_stage2_enabled ? "enabled" : "disabled");
> + ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> + if (ret)
> + return ret;
> + }
> + }
> + return ret;
> +}
> +
> static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
> int (*show)(struct seq_file *, void *))
> {
> @@ -212,6 +256,11 @@ static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
> return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
> }
>
> +static int kvm_ptdump_guest_nested_open(struct inode *m, struct file *file)
> +{
> + return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_nested_show);
> +}
> +
> static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
> {
> struct kvm *kvm = m->i_private;
> @@ -230,6 +279,13 @@ static const struct file_operations kvm_ptdump_guest_canonical_fops = {
> .release = kvm_ptdump_guest_close,
> };
>
> +static const struct file_operations kvm_ptdump_guest_nested_fops = {
> + .open = kvm_ptdump_guest_nested_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = kvm_ptdump_guest_close,
> +};
> +
> static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> {
> struct kvm *kvm = m->private;
> @@ -307,6 +363,9 @@ void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
> kvm, &kvm_pgtable_range_fops);
> debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
> kvm, &kvm_pgtable_levels_fops);
> - if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
> + if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) {
> kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
> + debugfs_create_file("shadow_page_tables", 0400, kvm->arch.debugfs_nv_dentry,
> + kvm, &kvm_ptdump_guest_nested_fops);
> + }
> }
> --
> 2.43.0
>
* Re: [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
2026-07-01 15:00 ` Leonardo Bras
@ 2026-07-01 17:27 ` Wei-Lin Chang
2026-07-02 10:58 ` Leonardo Bras
0 siblings, 1 reply; 17+ messages in thread
From: Wei-Lin Chang @ 2026-07-01 17:27 UTC (permalink / raw)
To: Leonardo Bras
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
On Wed, Jul 01, 2026 at 04:00:41PM +0100, Leonardo Bras wrote:
> Hi Wei Lin,
>
> On Tue, Jun 30, 2026 at 01:10:02PM +0100, Wei-Lin Chang wrote:
> > ptdump files can still be read after the pgt of the canonical mmu is
> > freed, if they are opened before the VM debugfs directory is removed.
> > This triggers UAF in places where we cache the pgt pointer or access it
> > without checking its validity.
> >
> > Check the pgt is still alive under the mmu_lock before accessing the
> > pgt.
> >
> > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > Closes: https://sashiko.dev/#/patchset/20260623142443.648972-1-weilin.chang@arm.com?part=1
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> > arch/arm64/kvm/ptdump.c | 38 ++++++++++++++++++++++++--------------
> > 1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index d5aa9eff08d1..752d8e0cd25c 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -115,13 +115,21 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
> > static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> > {
> > struct kvm_ptdump_guest_state *st;
> > - struct kvm_pgtable *pgtable = kvm->arch.mmu.pgt;
> > + struct kvm_pgtable *pgtable;
> > int ret;
> >
> > st = kzalloc_obj(struct kvm_ptdump_guest_state, GFP_KERNEL_ACCOUNT);
> > if (!st)
> > return ERR_PTR(-ENOMEM);
> >
> > + guard(write_lock)(&kvm->mmu_lock);
> > + if (!kvm->arch.mmu.pgt) {
> > + kfree(st);
> > + return ERR_PTR(-ENXIO);
> > + }
> > +
> > + pgtable = kvm->arch.mmu.pgt;
> > +
> > ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> > if (ret) {
> > kfree(st);
> > @@ -137,7 +145,6 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> >
> > static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > {
> > - int ret;
> > struct kvm_ptdump_guest_state *st = m->private;
> > struct kvm *kvm = st->kvm;
> > struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > @@ -154,11 +161,11 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > .seq = m,
> > };
> >
> > - write_lock(&kvm->mmu_lock);
> > - ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> > - write_unlock(&kvm->mmu_lock);
> > + guard(write_lock)(&kvm->mmu_lock);
> > + if (mmu->pgt)
> > + return kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
>
> IIUC, that's the same behavior, right?
> Just changed to look about the same with the rest of this file?
I'm not too sure what you are referring to. If you mean the
write_lock/unlock -> guard(write_lock) change, then yes, mostly. It
additionally checks that mmu->pgt has not been freed.
>
> >
> > - return ret;
> > + return 0;
> > }
>
> So if the pgt does not exist anymore, it returns zero. Is that the desired
> behavior?
Good question. It comes down to what contract we want between ptdump and
the user for this case. I guess returning an error like -EIO could make
a little more sense than just printing nothing?
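To make the two candidate contracts concrete, here is a minimal user-space
sketch in C. The names toy_pgt and toy_show() and the "strict" flag are
hypothetical and not part of the patch; only the choice between returning 0
and returning -EIO comes from this discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a page table that may already be freed. */
struct toy_pgt {
	unsigned int ia_bits;
};

/*
 * Model of a show() callback. With strict == 0 a missing table yields
 * success and no output (what the patch does); with strict != 0 it
 * yields -EIO (the alternative floated in this reply).
 * *printed is set to 1 only when something would have been emitted.
 */
static int toy_show(const struct toy_pgt *pgt, int strict, int *printed)
{
	*printed = 0;
	if (!pgt)
		return strict ? -EIO : 0;
	*printed = 1;
	return 0;
}
```

With strict == 0 a reader sees an empty file and cannot tell "no table"
apart from "empty table"; with strict != 0 the read fails visibly.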
>
> I guess it's aligned with the idea of single file mentioned in the cover,
> right?
Sorry, I don't get what you are asking here?
>
> >
> > static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> > @@ -206,17 +213,23 @@ static const struct file_operations kvm_ptdump_guest_fops = {
> >
> > static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> > {
> > - struct kvm_pgtable *pgtable = m->private;
> > + struct kvm *kvm = m->private;
> > +
> > + guard(write_lock)(&kvm->mmu_lock);
> > + if (kvm->arch.mmu.pgt)
> > + seq_printf(m, "%2u\n", kvm->arch.mmu.pgt->ia_bits);
> >
> > - seq_printf(m, "%2u\n", pgtable->ia_bits);
> > return 0;
> > }
> >
> > static int kvm_pgtable_levels_show(struct seq_file *m, void *unused)
> > {
> > - struct kvm_pgtable *pgtable = m->private;
> > + struct kvm *kvm = m->private;
> > +
> > + guard(write_lock)(&kvm->mmu_lock);
> > + if (kvm->arch.mmu.pgt)
> > + seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - kvm->arch.mmu.pgt->start_level);
> >
> > - seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - pgtable->start_level);
> > return 0;
> > }
> >
> > @@ -224,15 +237,12 @@ static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file,
> > int (*show)(struct seq_file *, void *))
> > {
> > struct kvm *kvm = m->i_private;
> > - struct kvm_pgtable *pgtable;
> > int ret;
> >
> > if (!kvm_get_kvm_safe(kvm))
> > return -ENOENT;
> >
> > - pgtable = kvm->arch.mmu.pgt;
> > -
> > - ret = single_open(file, show, pgtable);
> > + ret = single_open(file, show, kvm);
>
> Maybe this change is more related with the previous patch?
I see your point, but I divided it into first fixing mmu UAF, then the
pgt UAF, which I also think makes sense.
Thanks,
Wei-Lin Chang
>
> > if (ret < 0)
> > kvm_put_kvm(kvm);
> > return ret;
> > --
> > 2.43.0
> >
>
> Thanks!
> Leo
* Re: [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file
2026-07-01 15:28 ` Leonardo Bras
@ 2026-07-01 17:35 ` Wei-Lin Chang
2026-07-02 11:00 ` Leonardo Bras
0 siblings, 1 reply; 17+ messages in thread
From: Wei-Lin Chang @ 2026-07-01 17:35 UTC (permalink / raw)
To: Leonardo Bras
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
On Wed, Jul 01, 2026 at 04:28:35PM +0100, Leonardo Bras wrote:
> On Tue, Jun 30, 2026 at 01:10:05PM +0100, Wei-Lin Chang wrote:
> > Create a ptdump file for all shadow page tables. It will dump out all
> > valid shadow page tables at the time of request, with the mmu's index,
> > guest VTCR_EL2, VTTBR_EL2, and whether the guest stage-2 is enabled or
> > not.
> >
> > Also detach the nested mmu array under the mmu_lock in
> > kvm_arch_flush_shadow_all() so readers cannot race with the array being
> > removed, then free the old array after dropping the lock.
>
> Out of curiosity: why drop the lock before kfree'ing ?
Because kvfree() can sleep! :)
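A minimal user-space sketch of the detach-then-free pattern, assuming a toy
lock that only records whether it is held. All names here (toy_lock,
toy_vm, sleeping_free(), toy_flush_all(), toy_reader_count()) are
hypothetical stand-ins, not kernel APIs; sleeping_free() models a free
routine that must not run while the lock is held.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a non-sleeping lock: it only tracks whether it is held. */
struct toy_lock {
	int held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct toy_vm {
	struct toy_lock lock;
	int *mmus;		/* stand-in for the nested mmu array */
	size_t mmus_size;
};

/* Models a free routine that may sleep: it refuses to run under the lock. */
static void sleeping_free(struct toy_vm *vm, void *p)
{
	assert(!vm->lock.held);
	free(p);
}

/* Detach the array under the lock, then free it after dropping the lock. */
static void toy_flush_all(struct toy_vm *vm)
{
	int *old;

	toy_lock_acquire(&vm->lock);
	old = vm->mmus;
	vm->mmus = NULL;
	vm->mmus_size = 0;
	toy_lock_release(&vm->lock);

	sleeping_free(vm, old);
}

/* A reader takes the same lock, so it sees either the array or NULL. */
static size_t toy_reader_count(struct toy_vm *vm)
{
	size_t n = 0;

	toy_lock_acquire(&vm->lock);
	if (vm->mmus)
		n = vm->mmus_size;
	toy_lock_release(&vm->lock);
	return n;
}
```

Calling sleeping_free() inside the locked region would trip the assert,
which is roughly the class of bug CONFIG_DEBUG_ATOMIC_SLEEP is meant to
catch in the real kernel.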
Thanks,
Wei-Lin Chang
>
> Thanks!
> Leo
>
> >
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> > arch/arm64/kvm/nested.c | 12 ++++++--
> > arch/arm64/kvm/ptdump.c | 61 ++++++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 69 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > index 6435efd65cb5..17a180ddf6ca 100644
> > --- a/arch/arm64/kvm/nested.c
> > +++ b/arch/arm64/kvm/nested.c
> > @@ -1283,6 +1283,7 @@ void kvm_nested_s2_flush(struct kvm *kvm)
> >
> > void kvm_arch_flush_shadow_all(struct kvm *kvm)
> > {
> > + struct kvm_s2_mmu *mmus;
> > int i;
> >
> > for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> > @@ -1291,9 +1292,14 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> > if (!WARN_ON(atomic_read(&mmu->refcnt)))
> > kvm_free_stage2_pgd(mmu);
> > }
> > - kvfree(kvm->arch.nested_mmus);
> > - kvm->arch.nested_mmus = NULL;
> > - kvm->arch.nested_mmus_size = 0;
> > +
> > + scoped_guard(write_lock, &kvm->mmu_lock) {
> > + mmus = kvm->arch.nested_mmus;
> > + kvm->arch.nested_mmus = NULL;
> > + kvm->arch.nested_mmus_size = 0;
> > + }
> > +
> > + kvfree(mmus);
> > kvm_uninit_stage2_mmu(kvm);
> > }
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index 40f93b7c7ad9..1649eaa75798 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -181,6 +181,50 @@ static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
> > return 0;
> > }
> >
> > +static int kvm_ptdump_guest_nested_show(struct seq_file *m, void *unused)
> > +{
> > + int ret = 0, i;
> > + struct kvm_ptdump_guest_state *st = m->private;
> > + struct kvm *kvm = st->kvm;
> > + struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
> > + .cb = kvm_ptdump_visitor,
> > + .arg = &st->parser_state,
> > + .flags = KVM_PGTABLE_WALK_LEAF,
> > + };
> > +
> > + guard(write_lock)(&kvm->mmu_lock);
> > +
> > + if (!kvm->arch.nested_mmus)
> > + return 0;
> > +
> > + for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> > + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> > +
> > + if (!mmu->pgt)
> > + continue;
> > +
> > + if (kvm_s2_mmu_valid(mmu)) {
> > + memset(st, 0, sizeof(*st));
> > + ret = kvm_ptdump_parser_init(st, kvm, mmu->pgt);
> > + if (ret)
> > + return ret;
> > + st->parser_state = (struct ptdump_pg_state) {
> > + .marker = &st->ipa_marker[0],
> > + .level = -1,
> > + .pg_level = &st->level[0],
> > + .seq = m,
> > + };
> > + seq_printf(m, "nested mmu %d VTCR: 0x%016llx VTTBR: 0x%016llx s2: %s\n",
> > + i, mmu->tlb_vtcr, mmu->tlb_vttbr,
> > + mmu->nested_stage2_enabled ? "enabled" : "disabled");
> > + ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> > + if (ret)
> > + return ret;
> > + }
> > + }
> > + return ret;
> > +}
> > +
> > static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
> > int (*show)(struct seq_file *, void *))
> > {
> > @@ -212,6 +256,11 @@ static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
> > return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
> > }
> >
> > +static int kvm_ptdump_guest_nested_open(struct inode *m, struct file *file)
> > +{
> > + return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_nested_show);
> > +}
> > +
> > static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
> > {
> > struct kvm *kvm = m->i_private;
> > @@ -230,6 +279,13 @@ static const struct file_operations kvm_ptdump_guest_canonical_fops = {
> > .release = kvm_ptdump_guest_close,
> > };
> >
> > +static const struct file_operations kvm_ptdump_guest_nested_fops = {
> > + .open = kvm_ptdump_guest_nested_open,
> > + .read = seq_read,
> > + .llseek = seq_lseek,
> > + .release = kvm_ptdump_guest_close,
> > +};
> > +
> > static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> > {
> > struct kvm *kvm = m->private;
> > @@ -307,6 +363,9 @@ void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
> > kvm, &kvm_pgtable_range_fops);
> > debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
> > kvm, &kvm_pgtable_levels_fops);
> > - if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
> > + if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) {
> > kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
> > + debugfs_create_file("shadow_page_tables", 0400, kvm->arch.debugfs_nv_dentry,
> > + kvm, &kvm_ptdump_guest_nested_fops);
> > + }
> > }
> > --
> > 2.43.0
> >
* Re: [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
` (5 preceding siblings ...)
2026-06-30 12:10 ` [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file Wei-Lin Chang
@ 2026-07-02 6:55 ` Itaru Kitayama
2026-07-02 7:41 ` Wei-Lin Chang
6 siblings, 1 reply; 17+ messages in thread
From: Itaru Kitayama @ 2026-07-02 6:55 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Sebastian Ene
Hi Wei-Lin,
On Tue, Jun 30, 2026 at 01:09:59PM +0100, Wei-Lin Chang wrote:
> Hi,
>
> This is v2 of fixing shadow ptdump debugfs files. Unfortunately I couldn't make
> per mmu ptdump files work after all, mainly because there isn't a clean way to
> locate the specific nested mmu for each ptdump file as the nested mmus could be
> freed when the file gets opened. Therefore in this series a single file
> "shadow_page_tables" is created that dumps all valid mmus' page table
> information.
>
> An advantage of this is that this new ptdump file have a lifetime identical to
> other ptdump files i.e. stage2_page_tables, ipa_range, etc., hence avoiding the
> dentry UAF found last time [1].
>
> With this all ptdump files are only removed when the last kvm reference gets
> dropped and kvm_destroy_vm_debugfs() is called, in their open(), show()
> functions the nested mmu array and mmu->pgt are checked with mmu_lock held to
> prevent UAF.
>
> Patch 1-2: Undo previous shadow ptdump implementation.
> Patch 3: Fix a mmu->pgt UAF that happens when ptdump files are read after
> mmu->pgt is freed.
> Patch 4-5: Preparation for the shadow page table dump file.
> Patch 6: Implementation of the shadow page table dump file.
>
> The fixes are tested with CONFIG_PROVE_LOCKING,
> CONFIG_DEBUG_ATOMIC_SLEEP, and CONFIG_KASAN.
>
> Thanks!
Running your shadow stage 2 kselftest under bpftrace shows that
__kvm_pgtable_stage2_init() builds the shadow stage 2 translation tables
with ia_bits = 52 and start_level = 0, but the debugfs entry for the
active shadow stage 2 tables reports 3 levels. Is this expected?
Thanks,
Itaru.
>
> * Changes from v1 ([2]):
>
> - Move from per mmu ptdump files to one file that will dump all shadow page
> tables.
>
> [1]: https://lore.kernel.org/kvmarm/ajty6I7ZqodP4ous@sm-arm-grace07/
> [2]: https://lore.kernel.org/kvmarm/20260623142443.648972-1-weilin.chang@arm.com/
>
> Wei-Lin Chang (6):
> KVM: arm64: ptdump: Remove shadow ptdump files
> KVM: arm64: ptdump: Undo making the ptdump code mmu aware
> KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
> KVM: arm64: ptdump: Factor out initialization of
> kvm_ptdump_guest_state
> KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical
> ptdump path
> KVM: arm64: ptdump: Introduce the shadow ptdump file
>
> arch/arm64/include/asm/kvm_host.h | 5 +-
> arch/arm64/include/asm/kvm_mmu.h | 4 -
> arch/arm64/kvm/nested.c | 18 +--
> arch/arm64/kvm/ptdump.c | 185 ++++++++++++++++++++----------
> 4 files changed, 135 insertions(+), 77 deletions(-)
>
> --
> 2.43.0
>
* Re: [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes
2026-07-02 6:55 ` [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Itaru Kitayama
@ 2026-07-02 7:41 ` Wei-Lin Chang
2026-07-02 23:02 ` Itaru Kitayama
0 siblings, 1 reply; 17+ messages in thread
From: Wei-Lin Chang @ 2026-07-02 7:41 UTC (permalink / raw)
To: Itaru Kitayama
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Sebastian Ene
On Thu, Jul 02, 2026 at 03:55:48PM +0900, Itaru Kitayama wrote:
> Hi Wei-Lin,
> On Tue, Jun 30, 2026 at 01:09:59PM +0100, Wei-Lin Chang wrote:
> > Hi,
> >
> > This is v2 of fixing shadow ptdump debugfs files. Unfortunately I couldn't make
> > per mmu ptdump files work after all, mainly because there isn't a clean way to
> > locate the specific nested mmu for each ptdump file as the nested mmus could be
> > freed when the file gets opened. Therefore in this series a single file
> > "shadow_page_tables" is created that dumps all valid mmus' page table
> > information.
> >
> > An advantage of this is that this new ptdump file have a lifetime identical to
> > other ptdump files i.e. stage2_page_tables, ipa_range, etc., hence avoiding the
> > dentry UAF found last time [1].
> >
> > With this all ptdump files are only removed when the last kvm reference gets
> > dropped and kvm_destroy_vm_debugfs() is called, in their open(), show()
> > functions the nested mmu array and mmu->pgt are checked with mmu_lock held to
> > prevent UAF.
> >
> > Patch 1-2: Undo previous shadow ptdump implementation.
> > Patch 3: Fix a mmu->pgt UAF that happens when ptdump files are read after
> > mmu->pgt is freed.
> > Patch 4-5: Preparation for the shadow page table dump file.
> > Patch 6: Implementation of the shadow page table dump file.
> >
> > The fixes are tested with CONFIG_PROVE_LOCKING,
> > CONFIG_DEBUG_ATOMIC_SLEEP, and CONFIG_KASAN.
> >
> > Thanks!
>
> Running your shadow stage 2 kselftest under bpftrace shows that
> __kvm_pgtable_stage2_init() builds the shadow stage 2 translation tables
> with ia_bits = 52 and start_level = 0, but the debugfs entry for the
> active shadow stage 2 tables reports 3 levels. Is this expected?
Where are you seeing this level information? If it is "stage2_levels",
that only reports the number of levels for the canonical (non-nested)
stage-2. For nested mmus, only the page tables are dumped, in
nested/shadow_page_tables.
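For reference, kvm_pgtable_levels_show() in patch 3 prints
KVM_PGTABLE_MAX_LEVELS - start_level of the canonical pgt only. A small
sketch of that arithmetic follows; toy_pgtable and toy_levels() are
hypothetical names, and the max_levels value of 4 and the canonical
start_level of 1 used in the note below are made-up inputs for
illustration, not values read from a real kernel.

```c
#include <assert.h>

/* Hypothetical stand-in holding the two fields discussed in this thread. */
struct toy_pgtable {
	int start_level;
	unsigned int ia_bits;
};

/* Same subtraction as the levels file, with the maximum passed in. */
static int toy_levels(int max_levels, const struct toy_pgtable *pgt)
{
	return max_levels - pgt->start_level;
}
```

Under those assumed inputs, a canonical table with start_level = 1 would
report 3, while a shadow table with start_level = 0 would report 4 if it
were fed to the same formula. The levels file never looks at the shadow
tables, so the two numbers need not agree.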
Thanks,
Wei-Lin Chang
>
> Thanks,
> Itaru.
>
> >
> > * Changes from v1 ([2]):
> >
> > - Move from per mmu ptdump files to one file that will dump all shadow page
> > tables.
> >
> > [1]: https://lore.kernel.org/kvmarm/ajty6I7ZqodP4ous@sm-arm-grace07/
> > [2]: https://lore.kernel.org/kvmarm/20260623142443.648972-1-weilin.chang@arm.com/
> >
> > Wei-Lin Chang (6):
> > KVM: arm64: ptdump: Remove shadow ptdump files
> > KVM: arm64: ptdump: Undo making the ptdump code mmu aware
> > KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
> > KVM: arm64: ptdump: Factor out initialization of
> > kvm_ptdump_guest_state
> > KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical
> > ptdump path
> > KVM: arm64: ptdump: Introduce the shadow ptdump file
> >
> > arch/arm64/include/asm/kvm_host.h | 5 +-
> > arch/arm64/include/asm/kvm_mmu.h | 4 -
> > arch/arm64/kvm/nested.c | 18 +--
> > arch/arm64/kvm/ptdump.c | 185 ++++++++++++++++++++----------
> > 4 files changed, 135 insertions(+), 77 deletions(-)
> >
> > --
> > 2.43.0
> >
* Re: [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
2026-07-01 17:27 ` Wei-Lin Chang
@ 2026-07-02 10:58 ` Leonardo Bras
0 siblings, 0 replies; 17+ messages in thread
From: Leonardo Bras @ 2026-07-02 10:58 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: Leonardo Bras, linux-arm-kernel, kvmarm, linux-kernel,
Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
On Wed, Jul 01, 2026 at 06:27:51PM +0100, Wei-Lin Chang wrote:
> On Wed, Jul 01, 2026 at 04:00:41PM +0100, Leonardo Bras wrote:
> > Hi Wei Lin,
> >
> > On Tue, Jun 30, 2026 at 01:10:02PM +0100, Wei-Lin Chang wrote:
> > > ptdump files can still be read after the pgt of the canonical mmu is
> > > freed, if they are opened before the VM debugfs directory is removed.
> > > This triggers UAF in places where we cache the pgt pointer or access it
> > > without checking its validity.
> > >
> > > Check the pgt is still alive under the mmu_lock before accessing the
> > > pgt.
> > >
> > > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > > Closes: https://sashiko.dev/#/patchset/20260623142443.648972-1-weilin.chang@arm.com?part=1
> > > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > > ---
> > > arch/arm64/kvm/ptdump.c | 38 ++++++++++++++++++++++++--------------
> > > 1 file changed, 24 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > index d5aa9eff08d1..752d8e0cd25c 100644
> > > --- a/arch/arm64/kvm/ptdump.c
> > > +++ b/arch/arm64/kvm/ptdump.c
> > > @@ -115,13 +115,21 @@ static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
> > > static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> > > {
> > > struct kvm_ptdump_guest_state *st;
> > > - struct kvm_pgtable *pgtable = kvm->arch.mmu.pgt;
> > > + struct kvm_pgtable *pgtable;
> > > int ret;
> > >
> > > st = kzalloc_obj(struct kvm_ptdump_guest_state, GFP_KERNEL_ACCOUNT);
> > > if (!st)
> > > return ERR_PTR(-ENOMEM);
> > >
> > > + guard(write_lock)(&kvm->mmu_lock);
> > > + if (!kvm->arch.mmu.pgt) {
> > > + kfree(st);
> > > + return ERR_PTR(-ENXIO);
> > > + }
> > > +
> > > + pgtable = kvm->arch.mmu.pgt;
> > > +
> > > ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> > > if (ret) {
> > > kfree(st);
> > > @@ -137,7 +145,6 @@ static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> > >
> > > static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > > {
> > > - int ret;
> > > struct kvm_ptdump_guest_state *st = m->private;
> > > struct kvm *kvm = st->kvm;
> > > struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > > @@ -154,11 +161,11 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > > .seq = m,
> > > };
> > >
> > > - write_lock(&kvm->mmu_lock);
> > > - ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> > > - write_unlock(&kvm->mmu_lock);
> > > + guard(write_lock)(&kvm->mmu_lock);
> > > + if (mmu->pgt)
> > > + return kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> >
> > IIUC, that's the same behavior, right?
> > Just changed to look about the same with the rest of this file?
>
> I'm not too sure what you are referring to. If you mean the
> write_lock/unlock -> guard(write_lock) change, then yes, mostly. It
> additionally checks that mmu->pgt has not been freed.
I was referring just to the lock+unlock -> guard() change.
IIUC this has the same effect, so I was wondering why you made the
change.
I then supposed it was to match the rest of the file.
>
> >
> > >
> > > - return ret;
> > > + return 0;
> > > }
> >
> > So if the pgt does not exist anymore, it returns zero. Is that the desired
> > behavior?
>
> Good question. It comes down to what contract we want between ptdump and
> the user for this case. I guess returning an error like -EIO could make
> a little more sense than just printing nothing?
>
It depends on what behaviour you want to see; see below.
> >
> > I guess it's aligned with the idea of single file mentioned in the cover,
> > right?
>
> Sorry, I don't get what you are asking here?
IIUC you mentioned in the cover letter that you planned to have a file
which would, on read, output the ptdump for every nested pgtable. Did I
get that right?
If so, I imagine that the user has no idea how many nested pgtables
there are, and if the read initially finds N but ends up printing N-1
(because a pgtable was not there anymore), that would not be an error that
the user should be worried about.
It could be an issue if you were printing in multiple steps (multiple
checks), as part of a pgtable could be printed while another part
could not, but that does not seem to be the case.
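A minimal user-space sketch of that reading, assuming hypothetical names
(toy_mmu, toy_dump_all()) that are not in the patch: each slot is checked
once, a slot whose table is gone is skipped, and the call still returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical slot: pgt is NULL once the table has been freed. */
struct toy_mmu {
	const char *pgt;
};

/*
 * Dump every slot that still has a table into buf. A missing table is
 * skipped and is not treated as an error; only running out of buffer
 * space makes the function fail. *dumped counts the slots written.
 */
static int toy_dump_all(const struct toy_mmu *mmus, size_t n,
			char *buf, size_t len, size_t *dumped)
{
	size_t off = 0;

	*dumped = 0;
	if (len)
		buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int w;

		if (!mmus[i].pgt)
			continue;

		w = snprintf(buf + off, len - off, "nested mmu %zu: %s\n",
			     i, mmus[i].pgt);
		if (w < 0 || (size_t)w >= len - off)
			return -1;
		off += (size_t)w;
		(*dumped)++;
	}
	return 0;
}
```

A caller that started with three slots and lost one in the meantime just
gets two entries and a zero return.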
>
> >
> > >
> > > static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> > > @@ -206,17 +213,23 @@ static const struct file_operations kvm_ptdump_guest_fops = {
> > >
> > > static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> > > {
> > > - struct kvm_pgtable *pgtable = m->private;
> > > + struct kvm *kvm = m->private;
> > > +
> > > + guard(write_lock)(&kvm->mmu_lock);
> > > + if (kvm->arch.mmu.pgt)
> > > + seq_printf(m, "%2u\n", kvm->arch.mmu.pgt->ia_bits);
> > >
> > > - seq_printf(m, "%2u\n", pgtable->ia_bits);
> > > return 0;
> > > }
> > >
> > > static int kvm_pgtable_levels_show(struct seq_file *m, void *unused)
> > > {
> > > - struct kvm_pgtable *pgtable = m->private;
> > > + struct kvm *kvm = m->private;
> > > +
> > > + guard(write_lock)(&kvm->mmu_lock);
> > > + if (kvm->arch.mmu.pgt)
> > > + seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - kvm->arch.mmu.pgt->start_level);
> > >
> > > - seq_printf(m, "%1d\n", KVM_PGTABLE_MAX_LEVELS - pgtable->start_level);
> > > return 0;
> > > }
> > >
> > > @@ -224,15 +237,12 @@ static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file,
> > > int (*show)(struct seq_file *, void *))
> > > {
> > > struct kvm *kvm = m->i_private;
> > > - struct kvm_pgtable *pgtable;
> > > int ret;
> > >
> > > if (!kvm_get_kvm_safe(kvm))
> > > return -ENOENT;
> > >
> > > - pgtable = kvm->arch.mmu.pgt;
> > > -
> > > - ret = single_open(file, show, pgtable);
> > > + ret = single_open(file, show, kvm);
> >
> > Maybe this change is more related with the previous patch?
>
> I see your point, but I divided it into first fixing mmu UAF, then the
> pgt UAF, which I also think makes sense.
>
Okay then
Thanks!
Leo
* Re: [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file
2026-07-01 17:35 ` Wei-Lin Chang
@ 2026-07-02 11:00 ` Leonardo Bras
0 siblings, 0 replies; 17+ messages in thread
From: Leonardo Bras @ 2026-07-02 11:00 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: Leonardo Bras, linux-arm-kernel, kvmarm, linux-kernel,
Marc Zyngier, Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Itaru Kitayama, Sebastian Ene
On Wed, Jul 01, 2026 at 06:35:41PM +0100, Wei-Lin Chang wrote:
> On Wed, Jul 01, 2026 at 04:28:35PM +0100, Leonardo Bras wrote:
> > On Tue, Jun 30, 2026 at 01:10:05PM +0100, Wei-Lin Chang wrote:
> > > Create a ptdump file for all shadow page tables. It will dump out all
> > > valid shadow page tables at the time of request, with the mmu's index,
> > > guest VTCR_EL2, VTTBR_EL2, and whether the guest stage-2 is enabled or
> > > not.
> > >
> > > Also detach the nested mmu array under the mmu_lock in
> > > kvm_arch_flush_shadow_all() so readers cannot race with the array being
> > > removed, then free the old array after dropping the lock.
> >
> > Out of curiosity: why drop the lock before kfree'ing ?
>
> Because kvfree() can sleep! :)
>
Damn, I was certain I had read kfree(), not kvfree(), LOL
You are right, then :)
Thanks!
Leo
> Thanks,
> Wei-Lin Chang
>
> >
> > Thanks!
> > Leo
> >
> > >
> > > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > > ---
> > > arch/arm64/kvm/nested.c | 12 ++++++--
> > > arch/arm64/kvm/ptdump.c | 61 ++++++++++++++++++++++++++++++++++++++++-
> > > 2 files changed, 69 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > > index 6435efd65cb5..17a180ddf6ca 100644
> > > --- a/arch/arm64/kvm/nested.c
> > > +++ b/arch/arm64/kvm/nested.c
> > > @@ -1283,6 +1283,7 @@ void kvm_nested_s2_flush(struct kvm *kvm)
> > >
> > > void kvm_arch_flush_shadow_all(struct kvm *kvm)
> > > {
> > > + struct kvm_s2_mmu *mmus;
> > > int i;
> > >
> > > for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> > > @@ -1291,9 +1292,14 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> > > if (!WARN_ON(atomic_read(&mmu->refcnt)))
> > > kvm_free_stage2_pgd(mmu);
> > > }
> > > - kvfree(kvm->arch.nested_mmus);
> > > - kvm->arch.nested_mmus = NULL;
> > > - kvm->arch.nested_mmus_size = 0;
> > > +
> > > + scoped_guard(write_lock, &kvm->mmu_lock) {
> > > + mmus = kvm->arch.nested_mmus;
> > > + kvm->arch.nested_mmus = NULL;
> > > + kvm->arch.nested_mmus_size = 0;
> > > + }
> > > +
> > > + kvfree(mmus);
> > > kvm_uninit_stage2_mmu(kvm);
> > > }
> > >
> > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > index 40f93b7c7ad9..1649eaa75798 100644
> > > --- a/arch/arm64/kvm/ptdump.c
> > > +++ b/arch/arm64/kvm/ptdump.c
> > > @@ -181,6 +181,50 @@ static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
> > > return 0;
> > > }
> > >
> > > +static int kvm_ptdump_guest_nested_show(struct seq_file *m, void *unused)
> > > +{
> > > + int ret = 0, i;
> > > + struct kvm_ptdump_guest_state *st = m->private;
> > > + struct kvm *kvm = st->kvm;
> > > + struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
> > > + .cb = kvm_ptdump_visitor,
> > > + .arg = &st->parser_state,
> > > + .flags = KVM_PGTABLE_WALK_LEAF,
> > > + };
> > > +
> > > + guard(write_lock)(&kvm->mmu_lock);
> > > +
> > > + if (!kvm->arch.nested_mmus)
> > > + return 0;
> > > +
> > > + for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> > > + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> > > +
> > > + if (!mmu->pgt)
> > > + continue;
> > > +
> > > + if (kvm_s2_mmu_valid(mmu)) {
> > > + memset(st, 0, sizeof(*st));
> > > + ret = kvm_ptdump_parser_init(st, kvm, mmu->pgt);
> > > + if (ret)
> > > + return ret;
> > > + st->parser_state = (struct ptdump_pg_state) {
> > > + .marker = &st->ipa_marker[0],
> > > + .level = -1,
> > > + .pg_level = &st->level[0],
> > > + .seq = m,
> > > + };
> > > + seq_printf(m, "nested mmu %d VTCR: 0x%016llx VTTBR: 0x%016llx s2: %s\n",
> > > + i, mmu->tlb_vtcr, mmu->tlb_vttbr,
> > > + mmu->nested_stage2_enabled ? "enabled" : "disabled");
> > > + ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> > > + if (ret)
> > > + return ret;
> > > + }
> > > + }
> > > + return ret;
> > > +}
> > > +
> > > static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
> > > int (*show)(struct seq_file *, void *))
> > > {
> > > @@ -212,6 +256,11 @@ static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
> > > return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
> > > }
> > >
> > > +static int kvm_ptdump_guest_nested_open(struct inode *m, struct file *file)
> > > +{
> > > + return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_nested_show);
> > > +}
> > > +
> > > static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
> > > {
> > > struct kvm *kvm = m->i_private;
> > > @@ -230,6 +279,13 @@ static const struct file_operations kvm_ptdump_guest_canonical_fops = {
> > > .release = kvm_ptdump_guest_close,
> > > };
> > >
> > > +static const struct file_operations kvm_ptdump_guest_nested_fops = {
> > > + .open = kvm_ptdump_guest_nested_open,
> > > + .read = seq_read,
> > > + .llseek = seq_lseek,
> > > + .release = kvm_ptdump_guest_close,
> > > +};
> > > +
> > > static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> > > {
> > > struct kvm *kvm = m->private;
> > > @@ -307,6 +363,9 @@ void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
> > > kvm, &kvm_pgtable_range_fops);
> > > debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
> > > kvm, &kvm_pgtable_levels_fops);
> > > - if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
> > > + if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) {
> > > kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
> > > + debugfs_create_file("shadow_page_tables", 0400, kvm->arch.debugfs_nv_dentry,
> > > + kvm, &kvm_ptdump_guest_nested_fops);
> > > + }
> > > }
> > > --
> > > 2.43.0
> > >
* Re: [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file
2026-06-30 12:10 ` [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file Wei-Lin Chang
2026-07-01 15:28 ` Leonardo Bras
@ 2026-07-02 21:48 ` Itaru Kitayama
1 sibling, 0 replies; 17+ messages in thread
From: Itaru Kitayama @ 2026-07-02 21:48 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Sebastian Ene
On Tue, Jun 30, 2026 at 01:10:05PM +0100, Wei-Lin Chang wrote:
> Create a ptdump file for all shadow page tables. It will dump out all
> valid shadow page tables at the time of request, with the mmu's index,
> guest VTCR_EL2, VTTBR_EL2, and whether the guest stage-2 is enabled or
> not.
>
> Also detach the nested mmu array under the mmu_lock in
> kvm_arch_flush_shadow_all() so readers cannot race with the array being
> removed, then free the old array after dropping the lock.
>
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
> arch/arm64/kvm/nested.c | 12 ++++++--
> arch/arm64/kvm/ptdump.c | 61 ++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 69 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 6435efd65cb5..17a180ddf6ca 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1283,6 +1283,7 @@ void kvm_nested_s2_flush(struct kvm *kvm)
>
> void kvm_arch_flush_shadow_all(struct kvm *kvm)
> {
> + struct kvm_s2_mmu *mmus;
> int i;
>
> for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> @@ -1291,9 +1292,14 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> if (!WARN_ON(atomic_read(&mmu->refcnt)))
> kvm_free_stage2_pgd(mmu);
> }
> - kvfree(kvm->arch.nested_mmus);
> - kvm->arch.nested_mmus = NULL;
> - kvm->arch.nested_mmus_size = 0;
> +
> + scoped_guard(write_lock, &kvm->mmu_lock) {
> + mmus = kvm->arch.nested_mmus;
> + kvm->arch.nested_mmus = NULL;
> + kvm->arch.nested_mmus_size = 0;
> + }
> +
> + kvfree(mmus);
> kvm_uninit_stage2_mmu(kvm);
> }
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 40f93b7c7ad9..1649eaa75798 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -181,6 +181,50 @@ static int kvm_ptdump_guest_canonical_show(struct seq_file *m, void *unused)
> return 0;
> }
>
> +static int kvm_ptdump_guest_nested_show(struct seq_file *m, void *unused)
> +{
> + int ret = 0, i;
> + struct kvm_ptdump_guest_state *st = m->private;
> + struct kvm *kvm = st->kvm;
> + struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
> + .cb = kvm_ptdump_visitor,
> + .arg = &st->parser_state,
> + .flags = KVM_PGTABLE_WALK_LEAF,
> + };
> +
> + guard(write_lock)(&kvm->mmu_lock);
> +
> + if (!kvm->arch.nested_mmus)
> + return 0;
> +
> + for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> +
> + if (!mmu->pgt)
> + continue;
> +
> + if (kvm_s2_mmu_valid(mmu)) {
> + memset(st, 0, sizeof(*st));
> + ret = kvm_ptdump_parser_init(st, kvm, mmu->pgt);
> + if (ret)
> + return ret;
> + st->parser_state = (struct ptdump_pg_state) {
> + .marker = &st->ipa_marker[0],
> + .level = -1,
> + .pg_level = &st->level[0],
> + .seq = m,
> + };
> + seq_printf(m, "nested mmu %d VTCR: 0x%016llx VTTBR: 0x%016llx s2: %s\n",
> + i, mmu->tlb_vtcr, mmu->tlb_vttbr,
> + mmu->nested_stage2_enabled ? "enabled" : "disabled");
This header information in the debugfs "shadow_page_tables" file, under
the nested directory, shows the guest hypervisor's configuration, while
the file is designed to show the layout of the nested guest's shadow
stage-2 translation tables owned by the host EL2 hypervisor. Is this
intentional?
Thanks,
Itaru.
> + ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> + if (ret)
> + return ret;
> + }
> + }
> + return ret;
> +}
> +
> static int kvm_ptdump_guest_open(struct inode *m, struct file *file,
> int (*show)(struct seq_file *, void *))
> {
> @@ -212,6 +256,11 @@ static int kvm_ptdump_guest_canonical_open(struct inode *m, struct file *file)
> return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_canonical_show);
> }
>
> +static int kvm_ptdump_guest_nested_open(struct inode *m, struct file *file)
> +{
> + return kvm_ptdump_guest_open(m, file, kvm_ptdump_guest_nested_show);
> +}
> +
> static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
> {
> struct kvm *kvm = m->i_private;
> @@ -230,6 +279,13 @@ static const struct file_operations kvm_ptdump_guest_canonical_fops = {
> .release = kvm_ptdump_guest_close,
> };
>
> +static const struct file_operations kvm_ptdump_guest_nested_fops = {
> + .open = kvm_ptdump_guest_nested_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = kvm_ptdump_guest_close,
> +};
> +
> static int kvm_pgtable_range_show(struct seq_file *m, void *unused)
> {
> struct kvm *kvm = m->private;
> @@ -307,6 +363,9 @@ void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
> kvm, &kvm_pgtable_range_fops);
> debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
> kvm, &kvm_pgtable_levels_fops);
> - if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT))
> + if (cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) {
> kvm->arch.debugfs_nv_dentry = debugfs_create_dir("nested", kvm->debugfs_dentry);
> + debugfs_create_file("shadow_page_tables", 0400, kvm->arch.debugfs_nv_dentry,
> + kvm, &kvm_ptdump_guest_nested_fops);
> + }
> }
> --
> 2.43.0
>
* Re: [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes
2026-07-02 7:41 ` Wei-Lin Chang
@ 2026-07-02 23:02 ` Itaru Kitayama
0 siblings, 0 replies; 17+ messages in thread
From: Itaru Kitayama @ 2026-07-02 23:02 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: linux-arm-kernel, kvmarm, linux-kernel, Marc Zyngier,
Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
Sebastian Ene
On Thu, Jul 02, 2026 at 08:41:43AM +0100, Wei-Lin Chang wrote:
> On Thu, Jul 02, 2026 at 03:55:48PM +0900, Itaru Kitayama wrote:
> > Hi Wei-Lin,
> > On Tue, Jun 30, 2026 at 01:09:59PM +0100, Wei-Lin Chang wrote:
> > > Hi,
> > >
> > > This is v2 of fixing shadow ptdump debugfs files. Unfortunately I couldn't make
> > > per mmu ptdump files work after all, mainly because there isn't a clean way to
> > > locate the specific nested mmu for each ptdump file as the nested mmus could be
> > > freed when the file gets opened. Therefore in this series a single file
> > > "shadow_page_tables" is created that dumps all valid mmus' page table
> > > information.
> > >
> > > An advantage of this is that this new ptdump file have a lifetime identical to
> > > other ptdump files i.e. stage2_page_tables, ipa_range, etc., hence avoiding the
> > > dentry UAF found last time [1].
> > >
> > > With this all ptdump files are only removed when the last kvm reference gets
> > > dropped and kvm_destroy_vm_debugfs() is called, in their open(), show()
> > > functions the nested mmu array and mmu->pgt are checked with mmu_lock held to
> > > prevent UAF.
> > >
> > > Patch 1-2: Undo previous shadow ptdump implementation.
> > > Patch 3: Fix a mmu->pgt UAF that happens when ptdump files are read after
> > > mmu->pgt is freed.
> > > Patch 4-5: Preparation for the shadow page table dump file.
> > > Patch 6: Implementation of the shadow page table dump file.
> > >
> > > The fixes are tested with CONFIG_PROVE_LOCKING,
> > > CONFIG_DEBUG_ATOMIC_SLEEP, and CONFIG_KASAN.
> > >
> > > Thanks!
> >
> > Running your shadow stage 2 kselftest with bpftrace shows me that __kvm_pgtable_stage2_init()
> > for shadow stage 2 translation tables are built with ia_bits = 52 and
> > start_level = 0, but the debugfs entry for the active shadow stage 2 tables prints
> > out that's 3 levels. Is this fully expected?
>
> Where is this level information you are seeing from? If it is
> "stage2_level", that only reports the number of levels for the canonical
> stage-2 (non nested). For nested mmus only the page tables are dumped in
> nested/shadow_page_tables.
Yes, I know. The initial stage-2 translation table structure information is
obtained by instrumenting the kernel with an eBPF fexit probe on
__kvm_pgtable_stage2_init(). Since you're correctly looping over the
nested_mmus array, and the output is correctly produced from the kvm_pgtable
reached via each kvm_s2_mmu, I am confused at the moment.
Thanks,
Itaru.
>
> Thanks,
> Wei-Lin Chang
>
> >
> > Thanks,
> > Itaru.
> >
> > >
> > > * Changes from v1 ([2]):
> > >
> > > - Move from per mmu ptdump files to one file that will dump all shadow page
> > > tables.
> > >
> > > [1]: https://lore.kernel.org/kvmarm/ajty6I7ZqodP4ous@sm-arm-grace07/
> > > [2]: https://lore.kernel.org/kvmarm/20260623142443.648972-1-weilin.chang@arm.com/
> > >
> > > Wei-Lin Chang (6):
> > > KVM: arm64: ptdump: Remove shadow ptdump files
> > > KVM: arm64: ptdump: Undo making the ptdump code mmu aware
> > > KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed
> > > KVM: arm64: ptdump: Factor out initialization of
> > > kvm_ptdump_guest_state
> > > KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical
> > > ptdump path
> > > KVM: arm64: ptdump: Introduce the shadow ptdump file
> > >
> > > arch/arm64/include/asm/kvm_host.h | 5 +-
> > > arch/arm64/include/asm/kvm_mmu.h | 4 -
> > > arch/arm64/kvm/nested.c | 18 +--
> > > arch/arm64/kvm/ptdump.c | 185 ++++++++++++++++++++----------
> > > 4 files changed, 135 insertions(+), 77 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >
Thread overview: 17+ messages
2026-06-30 12:09 [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 1/6] KVM: arm64: ptdump: Remove shadow ptdump files Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 2/6] KVM: arm64: ptdump: Undo making the ptdump code mmu aware Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 3/6] KVM: arm64: ptdump: Fix UAF when mmu->pgt is freed Wei-Lin Chang
2026-07-01 15:00 ` Leonardo Bras
2026-07-01 17:27 ` Wei-Lin Chang
2026-07-02 10:58 ` Leonardo Bras
2026-06-30 12:10 ` [PATCH v2 4/6] KVM: arm64: ptdump: Factor out initialization of kvm_ptdump_guest_state Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 5/6] KVM: arm64: ptdump: Extract kvm_ptdump_guest_open() from canonical ptdump path Wei-Lin Chang
2026-06-30 12:10 ` [PATCH v2 6/6] KVM: arm64: ptdump: Introduce the shadow ptdump file Wei-Lin Chang
2026-07-01 15:28 ` Leonardo Bras
2026-07-01 17:35 ` Wei-Lin Chang
2026-07-02 11:00 ` Leonardo Bras
2026-07-02 21:48 ` Itaru Kitayama
2026-07-02 6:55 ` [PATCH v2 0/6] KVM: arm64: ptdump: Shadow ptdump fixes Itaru Kitayama
2026-07-02 7:41 ` Wei-Lin Chang
2026-07-02 23:02 ` Itaru Kitayama