linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables
@ 2023-12-18 13:58 Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable Sebastian Ene
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Hi,

This can be used as a debugging tool for dumping the second stage
page-tables.

When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump registers 
'/sys/debug/kvm/<guest_id>/stage2_page_tables' entry with debugfs
upon guest creation. This allows userspace tools (eg. cat) to dump the
stage-2 pagetables by reading the registered file.

Reading the debugfs file shows stage-2 memory ranges in following format:
<IPA range> <size> <descriptor type> <access permissions> <mem_attributes>

Under pKVM configuration(kvm-arm.mode=protected) ptdump registers an entry
for the host stage-2 pagetables in the following path:
/sys/debug/kvm/host_stage2_page_tables/

The tool interprets the pKVM ownership annotation stored in the invalid
entries and dumps to the console the ownership information. To be able
to access the host stage-2 page-tables from the kernel, a new hypervisor
call was introduced which allows us to snapshot the page-tables in a host
provided buffer. The hypervisor call is hidden behind CONFIG_NVHE_EL2_DEBUG
as this should be used under debugging environment.

Link to the third version:
https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@google.com/

Link to the second version:
https://lore.kernel.org/all/20231019144032.2943044-1-sebastianene@google.com/#r

Link to the first version:
https://lore.kernel.org/all/20230927112517.2631674-1-sebastianene@google.com/

Changelog:
  v3 -> current_version:
  * refactorization: moved all the **KVM** specific components under
    kvm/ as suggested by Oliver. Introduced a new file
    arm64/kvm/ptdump.c which handled the second stage translation.
    re-used only the display portion from mm/ptdump.c

  * pagetable snapshot creation now uses memory donated from the host.
    The memory is no longer shared with the host as this can pose a security
    risk if the host has access to manipulate the pagetable copy while
    the hypervisor iterates it.

  * fixed a memory leak: while memory was used from the memcache for
    building the snapshot pagetable, it was no longer giving back the
    pages to the host for freeing. A separate array was introduced to
    keep track of the pages allocated from the memcache.

  v2 -> v3:
  * register the stage-2 debugfs entry for the host under
    /sys/debug/kvm/host_stage2_page_tables and in
    /sys/debug/kvm/<guest_id>/stage2_page_tables for guests.

  * don't use a static array for parsing the attributes description,
    generate it dynamically based on the number of pagetable levels

  * remove the lock that was guarding the seq_file private inode data,
    and keep the data private to the open file session.

  * minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
    CONFIG_PTDUMP_STAGE2_DEBUGFS

  v1 -> v2:
  * use the stage-2 pagetable walker for dumping descriptors instead of
    the one provided by ptdump.

  * support for guests pagetables dumping under VHE/nVHE non-protected

Thanks,

Sebastian Ene (10):
  KVM: arm64: Add snapshot interface for the host stage-2 pagetable
  KVM: arm64: Add ptdump registration with debugfs for the stage-2
    pagetables
  KVM: arm64: Invoke the snapshot interface for the host stage-2
    pagetable
  arm64: ptdump: Expose the attribute parsing functionality
  arm64: ptdump: Use the mask from the state structure
  KVM: arm64: Move pagetable definitions to common header
  KVM: arm64: Walk the pagetable snapshot and parse the ptdump
    descriptors
  arm64: ptdump: Interpret memory attributes based on the runtime config
  arm64: ptdump: Interpret pKVM ownership annotations
  arm64: ptdump: Add guest stage-2 pagetables dumping

 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pgtable.h          | 111 +++++
 arch/arm64/include/asm/ptdump.h               |  49 +-
 arch/arm64/kvm/Kconfig                        |  13 +
 arch/arm64/kvm/Makefile                       |   1 +
 arch/arm64/kvm/arm.c                          |   2 +
 arch/arm64/kvm/debug.c                        |   6 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  27 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  12 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 186 ++++++++
 arch/arm64/kvm/hyp/pgtable.c                  |  85 ++--
 arch/arm64/kvm/kvm_ptdump.h                   |  21 +
 arch/arm64/kvm/ptdump.c                       | 436 ++++++++++++++++++
 arch/arm64/mm/ptdump.c                        |  55 +--
 14 files changed, 897 insertions(+), 108 deletions(-)
 create mode 100644 arch/arm64/kvm/kvm_ptdump.h
 create mode 100644 arch/arm64/kvm/ptdump.c

-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-19 11:44   ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables Sebastian Ene
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Introduce a new hvc that allows the pKVM hypervisor to snapshot the host
stage-2 pagetables. The caller is expected to allocate & free the memory
for the pagetable snapshot. The below diagram shows the sequence of
events.

[--- HOST ---]
(1) allocate memory for the snapshot
(2) invoke the __pkvm_copy_host_stage2(snapshot) interface
|____
	[--- HYP ---]
	(3) donate the snapshot from HOST -> HYP
	(4) save the host stage-2 pagetables in the snapshot
	(5) donate the snapshot from HYP -> HOST
	[--- return from HYP ---]
[--- HOST ---]
(6) free the memory for the snapshot

When the host kernel invokes this interface, the memory is donated from
the host to the hypervisor. By doing this the memory is unmapped from the
host's translation regime to prevent the host from modifying the buffer
while the hypervisor uses it. The hypervisor walks the host stage-2
pagetable copy while re-creating the original mappings. When the copying
is done, the memory is donated from the hypervisor back to the host.
The pages that have been used from the memcache are donated back to the
host from a temporary array. The hvc is executing synchronously and the
preemption is disabled for the current CPU while running inside the
hypervisor so calling this interface blocks the current CPU until the
snapshoting is done.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pgtable.h          |  43 ++++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  12 ++
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 186 ++++++++++++++++++
 arch/arm64/kvm/hyp/pgtable.c                  |  43 ++++
 6 files changed, 286 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24b5e6b23..9df3367d8 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -81,6 +81,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
+	__KVM_HOST_SMCCC_FUNC___pkvm_host_stage2_snapshot,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d3e354bb8..f73efd8a8 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -10,6 +10,7 @@
 #include <linux/bits.h>
 #include <linux/kvm_host.h>
 #include <linux/types.h>
+#include <asm/kvm_host.h>
 
 #define KVM_PGTABLE_MAX_LEVELS		4U
 
@@ -351,6 +352,28 @@ struct kvm_pgtable {
 	kvm_pgtable_force_pte_cb_t		force_pte_cb;
 };
 
+/**
+ * struct kvm_pgtable_snapshot - Snapshot page-table wrapper.
+ * @pgtable:		The page-table configuration.
+ * @mc:			Memcache used for pagetable pages allocation.
+ * @pgd_hva:		Host virtual address of a physically contiguous buffer
+ *			used for storing the PGD.
+ * @pgd_pages:		The size of the phyisically contiguous buffer in pages.
+ * @used_pages_hva:	Host virtual address of a physically contiguous buffer
+ *			used for storing the consumed pages from the memcache.
+ * @num_used_pages		The size of the used buffer in pages.
+ * @used_pages_indx		The current index of the used pages array.
+ */
+struct kvm_pgtable_snapshot {
+	struct kvm_pgtable			pgtable;
+	struct kvm_hyp_memcache			mc;
+	void					*pgd_hva;
+	size_t					pgd_pages;
+	phys_addr_t				*used_pages_hva;
+	size_t					num_used_pages;
+	size_t					used_pages_indx;
+};
+
 /**
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:	Uninitialised page-table structure to initialise.
@@ -756,4 +779,24 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte);
  */
 void kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 				phys_addr_t addr, size_t size);
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+/**
+ * kvm_pgtable_stage2_snapshot() - Given a memcache and a destination
+ *				pagetable where we have the original PGD
+ *				copied, build a snapshot table with page table
+ *				pages from the given memcache.
+ *
+ * @to_pgt:	Destination pagetable
+ * @mc:		The memcache where we allocate the destination pagetables from
+ */
+int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt,
+				void *mc);
+#else
+static inline int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt,
+					      void *mc)
+{
+	return -EPERM;
+}
+#endif	/* CONFIG_NVHE_EL2_DEBUG */
 #endif	/* __ARM64_KVM_PGTABLE_H__ */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 0972faccc..ca8f76915 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -69,6 +69,7 @@ int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages);
 int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages);
 int __pkvm_host_share_ffa(u64 pfn, u64 nr_pages);
 int __pkvm_host_unshare_ffa(u64 pfn, u64 nr_pages);
+int __pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snapshot);
 
 bool addr_is_memory(phys_addr_t phys);
 int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2385fd03e..7b215245f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -314,6 +314,17 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
 }
 
+static void handle___pkvm_host_stage2_snapshot(struct kvm_cpu_context *host_ctxt)
+{
+#ifdef CONFIG_NVHE_EL2_DEBUG
+	DECLARE_REG(struct kvm_pgtable_snapshot *, snapshot_hva, host_ctxt, 1);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_host_stage2_snapshot(snapshot_hva);
+#else
+	cpu_reg(host_ctxt, 0) = SMCCC_RET_NOT_SUPPORTED;
+#endif
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -348,6 +359,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_init_vm),
 	HANDLE_FUNC(__pkvm_init_vcpu),
 	HANDLE_FUNC(__pkvm_teardown_vm),
+	HANDLE_FUNC(__pkvm_host_stage2_snapshot),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 8d0a5834e..aaf07f9e1 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -266,6 +266,192 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd)
 	return 0;
 }
 
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static void *snap_zalloc_page(void *mc)
+{
+	struct hyp_page *p;
+	void *addr;
+	struct kvm_pgtable_snapshot *snap;
+	phys_addr_t *used_pg;
+
+	snap = container_of(mc, struct kvm_pgtable_snapshot, mc);
+	used_pg = kern_hyp_va(snap->used_pages_hva);
+
+	/* Check if we have space to track the used page */
+	if (snap->used_pages_indx * sizeof(phys_addr_t) > snap->num_used_pages * PAGE_SIZE)
+		return NULL;
+
+	addr = pop_hyp_memcache(mc, hyp_phys_to_virt);
+	if (!addr)
+		return addr;
+	used_pg[snap->used_pages_indx++] = hyp_virt_to_phys(addr);
+
+	memset(addr, 0, PAGE_SIZE);
+	p = hyp_virt_to_page(addr);
+	memset(p, 0, sizeof(*p));
+
+	return addr;
+}
+
+static void snap_free_pages_exact(void *addr, unsigned long size)
+{
+	u8 order = get_order(size);
+	unsigned int i;
+	struct hyp_page *p;
+
+	for (i = 0; i < (1 << order); i++) {
+		p = hyp_virt_to_page(addr + (i * PAGE_SIZE));
+		hyp_page_ref_dec(p);
+	}
+}
+
+static void *pkvm_setup_snapshot(struct kvm_pgtable_snapshot *snap_hva)
+{
+	unsigned long i;
+	void *pgd, *used_pg;
+	phys_addr_t mc_page, next_mc_page;
+	struct kvm_pgtable_snapshot *snap;
+
+	snap = (void *)kern_hyp_va(snap_hva);
+	if (!PAGE_ALIGNED(snap))
+		return NULL;
+
+	if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(snap), 1))
+		return NULL;
+
+	pgd = kern_hyp_va(snap->pgd_hva);
+	if (!PAGE_ALIGNED(pgd))
+		goto error_with_snapshot;
+
+	if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(pgd), snap->pgd_pages))
+		goto error_with_snapshot;
+
+	mc_page = snap->mc.head;
+	for (i = 0; i < snap->mc.nr_pages; i++) {
+		if (!PAGE_ALIGNED(mc_page))
+			goto error_with_memcache;
+
+		if (__pkvm_host_donate_hyp(hyp_phys_to_pfn(mc_page), 1))
+			goto error_with_memcache;
+
+		mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
+	}
+
+	used_pg = kern_hyp_va(snap->used_pages_hva);
+	if (!PAGE_ALIGNED(used_pg))
+		goto error_with_memcache;
+
+	if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(used_pg), snap->num_used_pages))
+		goto error_with_memcache;
+
+	return snap;
+error_with_memcache:
+	mc_page = snap->mc.head;
+	for (; i >= 0; i--) {
+		next_mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
+		WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
+		mc_page = next_mc_page;
+	}
+
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), snap->pgd_pages));
+error_with_snapshot:
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(snap), 1));
+	return NULL;
+}
+
+static void pkvm_teardown_snapshot(struct kvm_pgtable_snapshot *snap)
+{
+	size_t i;
+	phys_addr_t mc_page, next_mc_page;
+	u64 *used_pg = kern_hyp_va(snap->used_pages_hva);
+	void *pgd = kern_hyp_va(snap->pgd_hva);
+
+	for (i = 0; i < snap->used_pages_indx; i++) {
+		mc_page = used_pg[i];
+		WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
+	}
+
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(used_pg),
+				       snap->num_used_pages));
+
+	mc_page = snap->mc.head;
+	for (i = 0; i < snap->mc.nr_pages; i++) {
+		next_mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
+		WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
+		mc_page = next_mc_page;
+	}
+
+	snap->pgtable.mm_ops = NULL;
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), snap->pgd_pages));
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(snap), 1));
+}
+
+static int pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snap)
+{
+	int ret;
+	void *pgd;
+	size_t required_pgd_len;
+	struct kvm_pgtable_mm_ops mm_ops = {0};
+	struct kvm_s2_mmu *mmu;
+	struct kvm_pgtable *to_pgt, *from_pgt;
+
+	if (snap->used_pages_indx != 0)
+		return -EINVAL;
+
+	from_pgt = &host_mmu.pgt;
+	mmu = &host_mmu.arch.mmu;
+	required_pgd_len = kvm_pgtable_stage2_pgd_size(mmu->vtcr);
+	if (snap->pgd_pages < (required_pgd_len >> PAGE_SHIFT))
+		return -ENOMEM;
+
+	if (snap->mc.nr_pages < host_s2_pgtable_pages())
+		return -ENOMEM;
+
+	to_pgt = &snap->pgtable;
+	pgd = kern_hyp_va(snap->pgd_hva);
+
+	mm_ops.zalloc_page		= snap_zalloc_page;
+	mm_ops.free_pages_exact		= snap_free_pages_exact;
+	mm_ops.phys_to_virt		= hyp_phys_to_virt;
+	mm_ops.virt_to_phys		= hyp_virt_to_phys;
+	mm_ops.page_count		= hyp_page_count;
+
+	to_pgt->ia_bits		= from_pgt->ia_bits;
+	to_pgt->start_level	= from_pgt->start_level;
+	to_pgt->flags		= from_pgt->flags;
+	to_pgt->mm_ops		= &mm_ops;
+
+	host_lock_component();
+
+	to_pgt->pgd = pgd;
+	memcpy(to_pgt->pgd, from_pgt->pgd, required_pgd_len);
+	ret = kvm_pgtable_stage2_snapshot(to_pgt, &snap->mc);
+
+	host_unlock_component();
+
+	return ret;
+}
+
+int __pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snap_hva)
+{
+	int ret;
+	struct kvm_pgtable_snapshot *snap;
+	kvm_pteref_t pgd;
+
+	snap = pkvm_setup_snapshot(snap_hva);
+	if (!snap)
+		return -EPERM;
+
+	ret = pkvm_host_stage2_snapshot(snap);
+	if (!ret) {
+		pgd = snap->pgtable.pgd;
+		snap->pgtable.pgd = (kvm_pteref_t)__hyp_pa(pgd);
+	}
+	pkvm_teardown_snapshot(snap);
+	return ret;
+}
+#endif /* CONFIG_NVHE_EL2_DEBUG */
+
 void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
 {
 	void *addr;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 1966fdee7..82fef9620 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1598,3 +1598,46 @@ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *p
 	WARN_ON(mm_ops->page_count(pgtable) != 1);
 	mm_ops->put_page(pgtable);
 }
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static int snapshot_walker(const struct kvm_pgtable_visit_ctx *ctx,
+			   enum kvm_pgtable_walk_flags visit)
+{
+	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
+	void *copy_table, *original_addr;
+	kvm_pte_t new = ctx->old;
+
+	if (!stage2_pte_is_counted(ctx->old))
+		return 0;
+
+	if (kvm_pte_table(ctx->old, ctx->level)) {
+		copy_table = mm_ops->zalloc_page(ctx->arg);
+		if (!copy_table)
+			return -ENOMEM;
+
+		original_addr = kvm_pte_follow(ctx->old, mm_ops);
+
+		memcpy(copy_table, original_addr, PAGE_SIZE);
+		new = kvm_init_table_pte(copy_table, mm_ops);
+	}
+
+	*ctx->ptep = new;
+
+	return 0;
+}
+
+int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt, void *mc)
+{
+	struct kvm_pgtable_walker walker = {
+		.cb	= snapshot_walker,
+		.flags	= KVM_PGTABLE_WALK_LEAF |
+			  KVM_PGTABLE_WALK_TABLE_PRE,
+		.arg = mc
+	};
+
+	if (!to_pgt->pgd)
+		return -ENOMEM;
+
+	return kvm_pgtable_walk(to_pgt, 0, BIT(to_pgt->ia_bits), &walker);
+}
+#endif /* CONFIG_NVHE_EL2_DEBUG */
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-19 11:47   ` Sebastian Ene
  2023-12-21 18:14   ` Oliver Upton
  2023-12-18 13:58 ` [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable Sebastian Ene
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

While arch/*/mem/ptdump handles the kernel pagetable dumping code,
introduce KVM/ptdump which deals with the stage-2 pagetables. The
separation is necessary because most of the definitions from the
stage-2 pagetable reside in the KVM path and we will be invoking
functionality **specific** to KVM.

This registers a wrapper on top of debugfs_create_file which allows us
to hook callbacks on the debugfs open/show/close. The callbacks are used
to prepare the display portion of the pagetable dumping code.
Guard this functionality under the newly introduced PTDUMP_STAGE2_DEBUGFS.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/kvm/Kconfig      | 13 +++++
 arch/arm64/kvm/Makefile     |  1 +
 arch/arm64/kvm/arm.c        |  2 +
 arch/arm64/kvm/kvm_ptdump.h | 18 +++++++
 arch/arm64/kvm/ptdump.c     | 96 +++++++++++++++++++++++++++++++++++++
 5 files changed, 130 insertions(+)
 create mode 100644 arch/arm64/kvm/kvm_ptdump.h
 create mode 100644 arch/arm64/kvm/ptdump.c

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 83c1e09be..0014e55e2 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -71,4 +71,17 @@ config PROTECTED_NVHE_STACKTRACE
 
 	  If unsure, or not using protected nVHE (pKVM), say N.
 
+config PTDUMP_STAGE2_DEBUGFS
+       bool "Present the stage-2 pagetables to debugfs"
+       depends on PTDUMP_DEBUGFS && KVM
+       default n
+       help
+         Say Y here if you want to show the stage-2 kernel pagetables
+         layout in a debugfs file. This information is only useful for kernel developers
+         who are working in architecture specific areas of the kernel.
+         It is probably not a good idea to enable this feature in a production
+         kernel.
+
+         If in doubt, say N.
+
 endif # VIRTUALIZATION
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index c0c050e53..190eac175 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -23,6 +23,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 vgic/vgic-its.o vgic/vgic-debug.o
 
 kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
+kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o
 
 always-y := hyp_constants.h hyp-constants.s
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e5f75f1f1..ee8d7cb67 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -40,6 +40,7 @@
 #include <asm/kvm_pkvm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/sections.h>
+#include <kvm_ptdump.h>
 
 #include <kvm/arm_hypercalls.h>
 #include <kvm/arm_pmu.h>
@@ -2592,6 +2593,7 @@ static __init int kvm_arm_init(void)
 	if (err)
 		goto out_subs;
 
+	kvm_ptdump_register_host();
 	kvm_arm_initialised = true;
 
 	return 0;
diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
new file mode 100644
index 000000000..98b595ce8
--- /dev/null
+++ b/arch/arm64/kvm/kvm_ptdump.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+//
+// Copyright (C) Google, 2023
+// Author: Sebastian Ene <sebastianene@google.com>
+
+#ifndef __KVM_PTDUMP_H
+#define __KVM_PTDUMP_H
+
+#include <asm/ptdump.h>
+
+
+#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
+void kvm_ptdump_register_host(void);
+#else
+static inline void kvm_ptdump_register_host(void) { }
+#endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
+
+#endif /* __KVM_PTDUMP_H */
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
new file mode 100644
index 000000000..5816fc632
--- /dev/null
+++ b/arch/arm64/kvm/ptdump.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0-only
+//
+// Debug helper used to dump the stage-2 pagetables of the system and their
+// associated permissions.
+//
+// Copyright (C) Google, 2023
+// Author: Sebastian Ene <sebastianene@google.com>
+
+#include <linux/debugfs.h>
+#include <linux/kvm_host.h>
+#include <linux/seq_file.h>
+
+#include <asm/kvm_pkvm.h>
+#include <kvm_ptdump.h>
+
+
+struct kvm_ptdump_register {
+	void *(*get_ptdump_info)(struct kvm_ptdump_register *reg);
+	void (*put_ptdump_info)(void *priv);
+	int (*show_ptdump_info)(struct seq_file *m, void *v);
+	void *priv;
+};
+
+static int kvm_ptdump_open(struct inode *inode, struct file *file);
+static int kvm_ptdump_release(struct inode *inode, struct file *file);
+static int kvm_ptdump_show(struct seq_file *m, void *);
+
+static const struct file_operations kvm_ptdump_fops = {
+	.open		= kvm_ptdump_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= kvm_ptdump_release,
+};
+
+static int kvm_ptdump_open(struct inode *inode, struct file *file)
+{
+	struct kvm_ptdump_register *reg = inode->i_private;
+	void *info = NULL;
+	int ret;
+
+	if (reg->get_ptdump_info) {
+		info = reg->get_ptdump_info(reg);
+		if (!info)
+			return -ENOMEM;
+	}
+
+	if (!reg->show_ptdump_info)
+		reg->show_ptdump_info = kvm_ptdump_show;
+
+	ret = single_open(file, reg->show_ptdump_info, info);
+	if (ret && reg->put_ptdump_info)
+		reg->put_ptdump_info(info);
+
+	return ret;
+}
+
+static int kvm_ptdump_release(struct inode *inode, struct file *file)
+{
+	struct kvm_ptdump_register *reg = inode->i_private;
+	struct seq_file *seq_file = file->private_data;
+
+	if (reg->put_ptdump_info)
+		reg->put_ptdump_info(seq_file->private);
+
+	return 0;
+}
+
+static int kvm_ptdump_show(struct seq_file *m, void *)
+{
+	return -EINVAL;
+}
+
+static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
+					const char *name, struct dentry *parent)
+{
+	debugfs_create_file(name, 0400, parent, reg, &kvm_ptdump_fops);
+}
+
+static struct kvm_ptdump_register host_reg;
+
+void kvm_ptdump_register_host(void)
+{
+	if (!is_protected_kvm_enabled())
+		return;
+
+	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
+				    kvm_debugfs_dir);
+}
+
+static int __init kvm_host_ptdump_init(void)
+{
+	host_reg.priv = (void *)host_s2_pgtable_pages();
+	return 0;
+}
+
+device_initcall(kvm_host_ptdump_init);
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-19 11:45   ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 04/10] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Allocate memory for the snapshot by creating a memory cache with empty
pages that will be used by the hypervisor during the page table copy.
Get the required size of the PGD and allocate physically contiguous
memory for it. Allocate contiguous memory for an array that is used to
keep track of the pages used from the memcache. Call the snapshot
interface and release the memory for the snapshot.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/kvm/ptdump.c | 107 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 5816fc632..e99bab427 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -25,6 +25,9 @@ static int kvm_ptdump_open(struct inode *inode, struct file *file);
 static int kvm_ptdump_release(struct inode *inode, struct file *file);
 static int kvm_ptdump_show(struct seq_file *m, void *);
 
+static phys_addr_t get_host_pa(void *addr);
+static void *get_host_va(phys_addr_t pa);
+
 static const struct file_operations kvm_ptdump_fops = {
 	.open		= kvm_ptdump_open,
 	.read		= seq_read,
@@ -32,6 +35,11 @@ static const struct file_operations kvm_ptdump_fops = {
 	.release	= kvm_ptdump_release,
 };
 
+static struct kvm_pgtable_mm_ops ptdump_host_mmops = {
+	.phys_to_virt	= get_host_va,
+	.virt_to_phys	= get_host_pa,
+};
+
 static int kvm_ptdump_open(struct inode *inode, struct file *file)
 {
 	struct kvm_ptdump_register *reg = inode->i_private;
@@ -78,11 +86,110 @@ static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
 
 static struct kvm_ptdump_register host_reg;
 
+static size_t host_stage2_get_pgd_len(void)
+{
+	u32 phys_shift = get_kvm_ipa_limit();
+	u64 vtcr = kvm_get_vtcr(read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1),
+				read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1),
+				phys_shift);
+	return (kvm_pgtable_stage2_pgd_size(vtcr) >> PAGE_SHIFT);
+}
+
+static phys_addr_t get_host_pa(void *addr)
+{
+	return __pa(addr);
+}
+
+static void *get_host_va(phys_addr_t pa)
+{
+	return __va(pa);
+}
+
+static void kvm_host_put_ptdump_info(void *snap)
+{
+	void *mc_page;
+	size_t i;
+	struct kvm_pgtable_snapshot *snapshot;
+
+	if (!snap)
+		return;
+
+	snapshot = snap;
+	while ((mc_page = pop_hyp_memcache(&snapshot->mc, get_host_va)) != NULL)
+		free_page((unsigned long)mc_page);
+
+	if (snapshot->pgd_hva)
+		free_pages_exact(snapshot->pgd_hva, snapshot->pgd_pages);
+
+	if (snapshot->used_pages_hva) {
+		for (i = 0; i < snapshot->used_pages_indx; i++) {
+			mc_page = get_host_va(snapshot->used_pages_hva[i]);
+			free_page((unsigned long)mc_page);
+		}
+
+		free_pages_exact(snapshot->used_pages_hva, snapshot->num_used_pages);
+	}
+
+	free_page((unsigned long)snapshot);
+}
+
+static void *kvm_host_get_ptdump_info(struct kvm_ptdump_register *reg)
+{
+	int i, ret;
+	void *mc_page;
+	struct kvm_pgtable_snapshot *snapshot;
+	size_t memcache_len;
+
+	snapshot = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
+	if (!snapshot)
+		return NULL;
+
+	memset(snapshot, 0, sizeof(struct kvm_pgtable_snapshot));
+
+	snapshot->pgd_pages = host_stage2_get_pgd_len();
+	snapshot->pgd_hva = alloc_pages_exact(snapshot->pgd_pages, GFP_KERNEL_ACCOUNT);
+	if (!snapshot->pgd_hva)
+		goto err;
+
+	memcache_len = (size_t)reg->priv;
+	for (i = 0; i < memcache_len; i++) {
+		mc_page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
+		if (!mc_page)
+			goto err;
+
+		push_hyp_memcache(&snapshot->mc, mc_page, get_host_pa);
+	}
+
+	snapshot->num_used_pages = DIV_ROUND_UP(sizeof(phys_addr_t) * memcache_len,
+					     PAGE_SIZE);
+	snapshot->used_pages_hva = alloc_pages_exact(snapshot->num_used_pages,
+						  GFP_KERNEL_ACCOUNT);
+	if (!snapshot->used_pages_hva)
+		goto err;
+
+	ret = kvm_call_hyp_nvhe(__pkvm_host_stage2_snapshot, snapshot);
+	if (ret) {
+		pr_err("ERROR %d snapshot host pagetables\n", ret);
+		goto err;
+	}
+
+	snapshot->pgtable.pgd = get_host_va((phys_addr_t)snapshot->pgtable.pgd);
+	snapshot->pgtable.mm_ops = &ptdump_host_mmops;
+
+	return snapshot;
+err:
+	kvm_host_put_ptdump_info(snapshot);
+	return NULL;
+}
+
 void kvm_ptdump_register_host(void)
 {
 	if (!is_protected_kvm_enabled())
 		return;
 
+	host_reg.get_ptdump_info = kvm_host_get_ptdump_info;
+	host_reg.put_ptdump_info = kvm_host_put_ptdump_info;
+
 	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
 				    kvm_debugfs_dir);
 }
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 04/10] arm64: ptdump: Expose the attribute parsing functionality
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (2 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 05/10] arm64: ptdump: Use the mask from the state structure Sebastian Ene
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

To keep the same output format as the arch specific ptdump and for the
sake of reusability, move the parser's state tracking code out
of the arch specific.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h | 41 ++++++++++++++++++++++++++++++++-
 arch/arm64/mm/ptdump.c          | 36 ++---------------------------
 2 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 581caac52..23510be35 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -9,6 +9,8 @@
 
 #include <linux/mm_types.h>
 #include <linux/seq_file.h>
+#include <linux/ptdump.h>
+
 
 struct addr_marker {
 	unsigned long start_address;
@@ -21,15 +23,52 @@ struct ptdump_info {
 	unsigned long			base_addr;
 };
 
+struct prot_bits {
+	u64		mask;
+	u64		val;
+	const char	*set;
+	const char	*clear;
+};
+
+struct pg_level {
+	const struct prot_bits	*bits;
+	const char		*name;
+	size_t			num;
+	u64			mask;
+};
+
+/*
+ * The page dumper groups page table entries of the same type into a single
+ * description. It uses pg_state to track the range information while
+ * iterating over the pte entries. When the continuity is broken it then
+ * dumps out a description of the range.
+ */
+struct pg_state {
+	struct ptdump_state		ptdump;
+	struct seq_file			*seq;
+	const struct addr_marker	*marker;
+	unsigned long			start_address;
+	int				level;
+	u64				current_prot;
+	bool				check_wx;
+	unsigned long			wx_pages;
+	unsigned long			uxn_pages;
+};
+
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
+void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
+	       u64 val);
 #ifdef CONFIG_PTDUMP_DEBUGFS
 #define EFI_RUNTIME_MAP_END	DEFAULT_MAP_WINDOW_64
 void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name);
 #else
 static inline void ptdump_debugfs_register(struct ptdump_info *info,
 					   const char *name) { }
-#endif
+#endif /* CONFIG_PTDUMP_DEBUGFS */
 void ptdump_check_wx(void);
+#else
+static inline void note_page(void *pt_st, unsigned long addr,
+			     int level, u64 val) { }
 #endif /* CONFIG_PTDUMP_CORE */
 
 #ifdef CONFIG_DEBUG_WX
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index e305b6593..64127c70b 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -66,31 +66,6 @@ static struct addr_marker address_markers[] = {
 		seq_printf(m, fmt);	\
 })
 
-/*
- * The page dumper groups page table entries of the same type into a single
- * description. It uses pg_state to track the range information while
- * iterating over the pte entries. When the continuity is broken it then
- * dumps out a description of the range.
- */
-struct pg_state {
-	struct ptdump_state ptdump;
-	struct seq_file *seq;
-	const struct addr_marker *marker;
-	unsigned long start_address;
-	int level;
-	u64 current_prot;
-	bool check_wx;
-	unsigned long wx_pages;
-	unsigned long uxn_pages;
-};
-
-struct prot_bits {
-	u64		mask;
-	u64		val;
-	const char	*set;
-	const char	*clear;
-};
-
 static const struct prot_bits pte_bits[] = {
 	{
 		.mask	= PTE_VALID,
@@ -170,13 +145,6 @@ static const struct prot_bits pte_bits[] = {
 	}
 };
 
-struct pg_level {
-	const struct prot_bits *bits;
-	const char *name;
-	size_t num;
-	u64 mask;
-};
-
 static struct pg_level pg_level[] = {
 	{ /* pgd */
 		.name	= "PGD",
@@ -248,8 +216,8 @@ static void note_prot_wx(struct pg_state *st, unsigned long addr)
 	st->wx_pages += (addr - st->start_address) / PAGE_SIZE;
 }
 
-static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
-		      u64 val)
+void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
+	       u64 val)
 {
 	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
 	static const char units[] = "KMGTPE";
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 05/10] arm64: ptdump: Use the mask from the state structure
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (3 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 04/10] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 06/10] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Printing the descriptor attributes requires accessing a mask which has a
different set of attributes for stage-2. In preparation for adding support
for the stage-2 pagetables dumping, use the mask from the local context
and not from the globally defined pg_level array. Store a pointer to
the pg_level array in the ptdump state structure. This will allow us to
extract the mask which is wrapped in the pg_level array and use it for
descriptor parsing in the note_page.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h |  1 +
 arch/arm64/mm/ptdump.c          | 13 ++++++++-----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 23510be35..4e728d2a1 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -45,6 +45,7 @@ struct pg_level {
  */
 struct pg_state {
 	struct ptdump_state		ptdump;
+	struct pg_level			*pg_level;
 	struct seq_file			*seq;
 	const struct addr_marker	*marker;
 	unsigned long			start_address;
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 64127c70b..015ed65d3 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -220,11 +220,12 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 	       u64 val)
 {
 	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
+	struct pg_level *pg_info = st->pg_level;
 	static const char units[] = "KMGTPE";
 	u64 prot = 0;
 
 	if (level >= 0)
-		prot = val & pg_level[level].mask;
+		prot = val & pg_info[level].mask;
 
 	if (st->level == -1) {
 		st->level = level;
@@ -250,10 +251,10 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 			unit++;
 		}
 		pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
-				   pg_level[st->level].name);
-		if (st->current_prot && pg_level[st->level].bits)
-			dump_prot(st, pg_level[st->level].bits,
-				  pg_level[st->level].num);
+				   pg_info[st->level].name);
+		if (st->current_prot && pg_info[st->level].bits)
+			dump_prot(st, pg_info[st->level].bits,
+				  pg_info[st->level].num);
 		pt_dump_seq_puts(st->seq, "\n");
 
 		if (addr >= st->marker[1].start_address) {
@@ -284,6 +285,7 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
 	st = (struct pg_state){
 		.seq = s,
 		.marker = info->markers,
+		.pg_level = &pg_level[0],
 		.level = -1,
 		.ptdump = {
 			.note_page = note_page,
@@ -321,6 +323,7 @@ void ptdump_check_wx(void)
 			{ 0, NULL},
 			{ -1, NULL},
 		},
+		.pg_level = &pg_level[0],
 		.level = -1,
 		.check_wx = true,
 		.ptdump = {
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 06/10] KVM: arm64: Move pagetable definitions to common header
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (4 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 05/10] arm64: ptdump: Use the mask from the state structure Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 07/10] KVM: arm64: Walk the pagetable snapshot and parse the ptdump descriptors Sebastian Ene
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

In preparation for using the stage-2 definitions in ptdump, move some of
these macros in the common header.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h | 42 ++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 42 ----------------------------
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index f73efd8a8..37f2a8532 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -45,6 +45,48 @@ typedef u64 kvm_pte_t;
 
 #define KVM_PHYS_INVALID		(-1ULL)
 
+#define KVM_PTE_LEAF_ATTR_LO		GENMASK(11, 2)
+
+#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX	GENMASK(4, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP	GENMASK(7, 6)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO		\
+	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW		\
+	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH	GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS	3
+#define KVM_PTE_LEAF_ATTR_LO_S1_AF	BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR	GENMASK(5, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R	BIT(6)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W	BIT(7)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH	GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS	3
+#define KVM_PTE_LEAF_ATTR_LO_S2_AF	BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 50)
+
+#define KVM_PTE_LEAF_ATTR_HI_SW		GENMASK(58, 55)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_XN	BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
+
+#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
+					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
+					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
+
+#define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
+#define KVM_MAX_OWNER_ID		1
+
+/*
+ * Used to indicate a pte for which a 'break-before-make' sequence is in
+ * progress.
+ */
+#define KVM_INVALID_PTE_LOCKED		BIT(10)
+
 static inline bool kvm_pte_valid(kvm_pte_t pte)
 {
 	return pte & KVM_PTE_VALID;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 82fef9620..ccd46bf56 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -17,48 +17,6 @@
 #define KVM_PTE_TYPE_PAGE		1
 #define KVM_PTE_TYPE_TABLE		1
 
-#define KVM_PTE_LEAF_ATTR_LO		GENMASK(11, 2)
-
-#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX	GENMASK(4, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP	GENMASK(7, 6)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO		\
-	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW		\
-	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH	GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS	3
-#define KVM_PTE_LEAF_ATTR_LO_S1_AF	BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR	GENMASK(5, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R	BIT(6)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W	BIT(7)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH	GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS	3
-#define KVM_PTE_LEAF_ATTR_LO_S2_AF	BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 50)
-
-#define KVM_PTE_LEAF_ATTR_HI_SW		GENMASK(58, 55)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_XN	BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
-
-#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
-					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
-					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
-
-#define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
-#define KVM_MAX_OWNER_ID		1
-
-/*
- * Used to indicate a pte for which a 'break-before-make' sequence is in
- * progress.
- */
-#define KVM_INVALID_PTE_LOCKED		BIT(10)
-
 struct kvm_pgtable_walk_data {
 	struct kvm_pgtable_walker	*walker;
 
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 07/10] KVM: arm64: Walk the pagetable snapshot and parse the ptdump descriptors
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (5 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 06/10] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 08/10] arm64: ptdump: Interpret memory attributes based on the runtime config Sebastian Ene
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Define the attributes for the stage-2 pagetables that will be used by
the ptdump parser code to interepret and show the descriptors. Build the
pagetable level description dynamically and use the KVM pagetable walker
to visit the PTEs. Display the number of the bits used by the IPA
address space and the start level.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/kvm/ptdump.c | 135 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 134 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index e99bab427..80c338e03 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -40,6 +40,66 @@ static struct kvm_pgtable_mm_ops ptdump_host_mmops = {
 	.virt_to_phys	= get_host_pa,
 };
 
+static const struct prot_bits stage2_pte_bits[] = {
+	{
+		.mask	= PTE_VALID,
+		.val	= PTE_VALID,
+		.set	= " ",
+		.clear	= "F",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_HI_S2_XN,
+		.val	= KVM_PTE_LEAF_ATTR_HI_S2_XN,
+		.set	= "XN",
+		.clear	= "  ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R,
+		.set	= "R",
+		.clear	= " ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
+		.set	= "W",
+		.clear	= " ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_AF,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_AF,
+		.set	= "AF",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_NG,
+		.val	= PTE_NG,
+		.set	= "FnXS",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_CONT,
+		.val	= PTE_CONT,
+		.set	= "CON",
+		.clear	= "   ",
+	}, {
+		.mask	= PTE_TABLE_BIT,
+		.val	= PTE_TABLE_BIT,
+		.set	= "   ",
+		.clear	= "BLK",
+	}, {
+		.mask	= KVM_PGTABLE_PROT_SW0,
+		.val	= KVM_PGTABLE_PROT_SW0,
+		.set	= "SW0", /* PKVM_PAGE_SHARED_OWNED */
+	}, {
+		.mask   = KVM_PGTABLE_PROT_SW1,
+		.val	= KVM_PGTABLE_PROT_SW1,
+		.set	= "SW1", /* PKVM_PAGE_SHARED_BORROWED */
+	}, {
+		.mask	= KVM_PGTABLE_PROT_SW2,
+		.val	= KVM_PGTABLE_PROT_SW2,
+		.set	= "SW2",
+	}, {
+		.mask   = KVM_PGTABLE_PROT_SW3,
+		.val	= KVM_PGTABLE_PROT_SW3,
+		.set	= "SW3",
+	},
+};
+
 static int kvm_ptdump_open(struct inode *inode, struct file *file)
 {
 	struct kvm_ptdump_register *reg = inode->i_private;
@@ -73,9 +133,82 @@ static int kvm_ptdump_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static int kvm_ptdump_build_levels(struct pg_level *level, unsigned int start_level)
+{
+	static const char * const level_names[] = {"PGD", "PUD", "PMD", "PTE"};
+	int i, j, name_index;
+
+	if (start_level > 2) {
+		pr_err("invalid start_level %u\n", start_level);
+		return -EINVAL;
+	}
+
+	for (i = start_level; i < KVM_PGTABLE_MAX_LEVELS; i++) {
+		name_index = i - start_level;
+		name_index += name_index * start_level;
+
+		level[i].name	= level_names[name_index];
+		level[i].num	= ARRAY_SIZE(stage2_pte_bits);
+		level[i].bits	= stage2_pte_bits;
+
+		for (j = 0; j < level[i].num; j++)
+			level[i].mask |= level[i].bits[j].mask;
+	}
+
+	return 0;
+}
+
+static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
+			      enum kvm_pgtable_walk_flags visit)
+{
+	struct pg_state *st = ctx->arg;
+	struct ptdump_state *pt_st = &st->ptdump;
+
+	note_page(pt_st, ctx->addr, ctx->level, ctx->old);
+	return 0;
+}
+
 static int kvm_ptdump_show(struct seq_file *m, void *)
 {
-	return -EINVAL;
+	u64 ipa_size;
+	char ipa_description[32];
+	struct pg_state st;
+	struct addr_marker ipa_addr_markers[3] = {0};
+	struct pg_level pg_level_descr[KVM_PGTABLE_MAX_LEVELS] = {0};
+	struct kvm_pgtable_snapshot *snapshot = m->private;
+	struct kvm_pgtable *pgtable = &snapshot->pgtable;
+	struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
+		.cb	= kvm_ptdump_visitor,
+		.arg	= &st,
+		.flags	= KVM_PGTABLE_WALK_LEAF,
+	};
+
+	if (kvm_ptdump_build_levels(pg_level_descr, pgtable->start_level) < 0)
+		return -EINVAL;
+
+	snprintf(ipa_description, sizeof(ipa_description),
+		 "IPA bits %2u start lvl %1u", pgtable->ia_bits,
+		 pgtable->start_level);
+
+	ipa_size = BIT(pgtable->ia_bits);
+	ipa_addr_markers[0].name = ipa_description;
+	ipa_addr_markers[1].start_address = ipa_size;
+
+	st = (struct pg_state) {
+		.seq		= m,
+		.marker		= &ipa_addr_markers[0],
+		.level		= -1,
+		.pg_level	= &pg_level_descr[0],
+		.ptdump	= {
+			.note_page	= note_page,
+			.range		= (struct ptdump_range[]) {
+				{0, ipa_size},
+				{0, 0},
+			},
+		},
+	};
+
+	return kvm_pgtable_walk(pgtable, 0, ipa_size, &walker);
 }
 
 static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 08/10] arm64: ptdump: Interpret memory attributes based on the runtime config
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (6 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 07/10] KVM: arm64: Walk the pagetable snapshot and parse the ptdump descriptors Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:58 ` [PATCH v4 09/10] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Introduce two callbacks that verify the current runtime configuration
before parsing the attribute fields. This is used to check when FWB is
enabled which changes the interpretation of the descriptor bits.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h |  7 +++++++
 arch/arm64/kvm/ptdump.c         | 32 ++++++++++++++++++++++++++++++++
 arch/arm64/mm/ptdump.c          |  6 ++++++
 3 files changed, 45 insertions(+)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 4e728d2a1..e150fc21f 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -23,11 +23,18 @@ struct ptdump_info {
 	unsigned long			base_addr;
 };
 
+/* Forward declaration */
+struct pg_state;
+
 struct prot_bits {
 	u64		mask;
 	u64		val;
 	const char	*set;
 	const char	*clear;
+	/* bit ignored if the callback returns false */
+	bool		(*feature_on)(const struct pg_state *ctxt);
+	/* bit ignored if the callback returns true */
+	bool		(*feature_off)(const struct pg_state *ctxt);
 };
 
 struct pg_level {
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 80c338e03..0ad7944e5 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -40,6 +40,18 @@ static struct kvm_pgtable_mm_ops ptdump_host_mmops = {
 	.virt_to_phys	= get_host_pa,
 };
 
+static bool is_fwb_enabled(const struct pg_state *m)
+{
+	struct kvm_pgtable_snapshot *snapshot = m->seq->private;
+	struct kvm_pgtable *pgtable = &snapshot->pgtable;
+	bool fwb_enabled = false;
+
+	if (cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+		fwb_enabled = !(pgtable->flags & KVM_PGTABLE_S2_NOFWB);
+
+	return fwb_enabled;
+}
+
 static const struct prot_bits stage2_pte_bits[] = {
 	{
 		.mask	= PTE_VALID,
@@ -81,6 +93,26 @@ static const struct prot_bits stage2_pte_bits[] = {
 		.val	= PTE_TABLE_BIT,
 		.set	= "   ",
 		.clear	= "BLK",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_VALID,
+		.set	= "DEVICE/nGnRE",
+		.feature_off	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_FWB_DEVICE_nGnRE) | PTE_VALID,
+		.set	= "DEVICE/nGnRE FWB",
+		.feature_on	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_VALID,
+		.set	= "MEM/NORMAL",
+		.feature_off	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) | PTE_VALID,
+		.set	= "MEM/NORMAL FWB",
+		.feature_on	= is_fwb_enabled,
 	}, {
 		.mask	= KVM_PGTABLE_PROT_SW0,
 		.val	= KVM_PGTABLE_PROT_SW0,
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 015ed65d3..6c7208f66 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -177,6 +177,12 @@ static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
 	for (i = 0; i < num; i++, bits++) {
 		const char *s;
 
+		if (bits->feature_on && !bits->feature_on(st))
+			continue;
+
+		if (bits->feature_off && bits->feature_off(st))
+			continue;
+
 		if ((st->current_prot & bits->mask) == bits->val)
 			s = bits->set;
 		else
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 09/10] arm64: ptdump: Interpret pKVM ownership annotations
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (7 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 08/10] arm64: ptdump: Interpret memory attributes based on the runtime config Sebastian Ene
@ 2023-12-18 13:58 ` Sebastian Ene
  2023-12-18 13:59 ` [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping Sebastian Ene
  2023-12-21 18:36 ` [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Oliver Upton
  10 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:58 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

When pKVM is enabled the software bits are used to keep track of the
page sharing state. Interepret these fields when pKVM is enabled and
print the sharing state. Move the definitions to common pagetable
header.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h          | 26 ++++++++++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 26 ----------
 arch/arm64/kvm/ptdump.c                       | 47 +++++++++++++++++--
 3 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 37f2a8532..7f654d4aa 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -87,6 +87,13 @@ typedef u64 kvm_pte_t;
  */
 #define KVM_INVALID_PTE_LOCKED		BIT(10)
 
+/* This corresponds to page-table locking order */
+enum pkvm_component_id {
+	PKVM_ID_HOST,
+	PKVM_ID_HYP,
+	PKVM_ID_FFA,
+};
+
 static inline bool kvm_pte_valid(kvm_pte_t pte)
 {
 	return pte & KVM_PTE_VALID;
@@ -230,6 +237,25 @@ enum kvm_pgtable_prot {
 	KVM_PGTABLE_PROT_SW3			= BIT(58),
 };
 
+/*
+ * SW bits 0-1 are reserved to track the memory ownership state of each page:
+ *   00: The page is owned exclusively by the page-table owner.
+ *   01: The page is owned by the page-table owner, but is shared
+ *       with another entity.
+ *   10: The page is shared with, but not owned by the page-table owner.
+ *   11: Reserved for future use (lending).
+ */
+enum pkvm_page_state {
+	PKVM_PAGE_OWNED			= 0ULL,
+	PKVM_PAGE_SHARED_OWNED		= KVM_PGTABLE_PROT_SW0,
+	PKVM_PAGE_SHARED_BORROWED	= KVM_PGTABLE_PROT_SW1,
+	__PKVM_PAGE_RESERVED		= KVM_PGTABLE_PROT_SW0 |
+					  KVM_PGTABLE_PROT_SW1,
+
+	/* Meta-states which aren't encoded directly in the PTE's SW bits */
+	PKVM_NOPAGE,
+};
+
 #define KVM_PGTABLE_PROT_RW	(KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
 #define KVM_PGTABLE_PROT_RWX	(KVM_PGTABLE_PROT_RW | KVM_PGTABLE_PROT_X)
 
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index ca8f76915..677686b86 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -14,25 +14,6 @@
 #include <nvhe/pkvm.h>
 #include <nvhe/spinlock.h>
 
-/*
- * SW bits 0-1 are reserved to track the memory ownership state of each page:
- *   00: The page is owned exclusively by the page-table owner.
- *   01: The page is owned by the page-table owner, but is shared
- *       with another entity.
- *   10: The page is shared with, but not owned by the page-table owner.
- *   11: Reserved for future use (lending).
- */
-enum pkvm_page_state {
-	PKVM_PAGE_OWNED			= 0ULL,
-	PKVM_PAGE_SHARED_OWNED		= KVM_PGTABLE_PROT_SW0,
-	PKVM_PAGE_SHARED_BORROWED	= KVM_PGTABLE_PROT_SW1,
-	__PKVM_PAGE_RESERVED		= KVM_PGTABLE_PROT_SW0 |
-					  KVM_PGTABLE_PROT_SW1,
-
-	/* Meta-states which aren't encoded directly in the PTE's SW bits */
-	PKVM_NOPAGE,
-};
-
 #define PKVM_PAGE_STATE_PROT_MASK	(KVM_PGTABLE_PROT_SW0 | KVM_PGTABLE_PROT_SW1)
 static inline enum kvm_pgtable_prot pkvm_mkstate(enum kvm_pgtable_prot prot,
 						 enum pkvm_page_state state)
@@ -53,13 +34,6 @@ struct host_mmu {
 };
 extern struct host_mmu host_mmu;
 
-/* This corresponds to page-table locking order */
-enum pkvm_component_id {
-	PKVM_ID_HOST,
-	PKVM_ID_HYP,
-	PKVM_ID_FFA,
-};
-
 extern unsigned long hyp_nr_cpus;
 
 int __pkvm_prot_finalize(void);
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 0ad7944e5..4296e739f 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -52,6 +52,11 @@ static bool is_fwb_enabled(const struct pg_state *m)
 	return fwb_enabled;
 }
 
+static bool is_pkvm_enabled(const struct pg_state *m)
+{
+	return is_protected_kvm_enabled();
+}
+
 static const struct prot_bits stage2_pte_bits[] = {
 	{
 		.mask	= PTE_VALID,
@@ -113,22 +118,56 @@ static const struct prot_bits stage2_pte_bits[] = {
 		.val	= PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) | PTE_VALID,
 		.set	= "MEM/NORMAL FWB",
 		.feature_on	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_INVALID_PTE_OWNER_MASK | PTE_VALID,
+		.val	= FIELD_PREP_CONST(KVM_INVALID_PTE_OWNER_MASK,
+					   PKVM_ID_HYP),
+		.set	= "HYP",
+	}, {
+		.mask	= KVM_INVALID_PTE_OWNER_MASK | PTE_VALID,
+		.val	= FIELD_PREP_CONST(KVM_INVALID_PTE_OWNER_MASK,
+					   PKVM_ID_FFA),
+		.set	= "FF-A",
+	}, {
+		.mask	= __PKVM_PAGE_RESERVED | PTE_VALID,
+		.val	= PKVM_PAGE_OWNED | PTE_VALID,
+		.set	= "PKVM_PAGE_OWNED",
+		.feature_on	= is_pkvm_enabled,
+	}, {
+		.mask   = __PKVM_PAGE_RESERVED | PTE_VALID,
+		.val	= PKVM_PAGE_SHARED_OWNED | PTE_VALID,
+		.set	= "PKVM_PAGE_SHARED_OWNED",
+		.feature_on     = is_pkvm_enabled,
+	}, {
+		.mask	= __PKVM_PAGE_RESERVED | PTE_VALID,
+		.val	= PKVM_PAGE_SHARED_BORROWED | PTE_VALID,
+		.set	= "PKVM_PAGE_SHARED_BORROWED",
+		.feature_on     = is_pkvm_enabled,
+	}, {
+		.mask	= PKVM_NOPAGE | PTE_VALID,
+		.val	= PKVM_NOPAGE,
+		.set	= "PKVM_NOPAGE",
+		.feature_on     = is_pkvm_enabled,
 	}, {
 		.mask	= KVM_PGTABLE_PROT_SW0,
 		.val	= KVM_PGTABLE_PROT_SW0,
-		.set	= "SW0", /* PKVM_PAGE_SHARED_OWNED */
+		.set    = "SW0",
+		.feature_off	= is_pkvm_enabled,
 	}, {
-		.mask   = KVM_PGTABLE_PROT_SW1,
+		.mask	= KVM_PGTABLE_PROT_SW1,
 		.val	= KVM_PGTABLE_PROT_SW1,
-		.set	= "SW1", /* PKVM_PAGE_SHARED_BORROWED */
+		.set	= "SW1",
+		.feature_off	= is_pkvm_enabled,
 	}, {
-		.mask	= KVM_PGTABLE_PROT_SW2,
+		.mask   = KVM_PGTABLE_PROT_SW2,
 		.val	= KVM_PGTABLE_PROT_SW2,
 		.set	= "SW2",
+		.feature_off	= is_pkvm_enabled,
 	}, {
 		.mask   = KVM_PGTABLE_PROT_SW3,
 		.val	= KVM_PGTABLE_PROT_SW3,
 		.set	= "SW3",
+		.feature_off	= is_pkvm_enabled,
 	},
 };
 
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (8 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH v4 09/10] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
@ 2023-12-18 13:59 ` Sebastian Ene
  2023-12-19 11:52   ` Sebastian Ene
  2023-12-21 18:27   ` Oliver Upton
  2023-12-21 18:36 ` [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Oliver Upton
  10 siblings, 2 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-18 13:59 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa, Sebastian Ene

Register a debugfs file on guest creation to be able to view their
second translation tables with ptdump. This assumes that the host is in
control of the guest stage-2 and has direct access to the pagetables.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/kvm/debug.c      |  6 ++++++
 arch/arm64/kvm/kvm_ptdump.h |  3 +++
 arch/arm64/kvm/ptdump.c     | 35 ++++++++++++++++++++++++++++++++---
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 8725291cb..7c4c2902d 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -13,6 +13,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_emulate.h>
+#include <kvm_ptdump.h>
 
 #include "trace.h"
 
@@ -342,3 +343,8 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
 	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_SPE);
 	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
 }
+
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return kvm_ptdump_register_guest(kvm);
+}
diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
index 98b595ce8..5f5a455d0 100644
--- a/arch/arm64/kvm/kvm_ptdump.h
+++ b/arch/arm64/kvm/kvm_ptdump.h
@@ -6,13 +6,16 @@
 #ifndef __KVM_PTDUMP_H
 #define __KVM_PTDUMP_H
 
+#include <linux/kvm_host.h>
 #include <asm/ptdump.h>
 
 
 #ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
 void kvm_ptdump_register_host(void);
+int kvm_ptdump_register_guest(struct kvm *kvm);
 #else
 static inline void kvm_ptdump_register_host(void) { }
+static inline int kvm_ptdump_register_guest(struct kvm *kvm) { return -1; }
 #endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
 
 #endif /* __KVM_PTDUMP_H */
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 4296e739f..62a753d6b 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -181,6 +181,8 @@ static int kvm_ptdump_open(struct inode *inode, struct file *file)
 		info = reg->get_ptdump_info(reg);
 		if (!info)
 			return -ENOMEM;
+	} else {
+		info = inode->i_private;
 	}
 
 	if (!reg->show_ptdump_info)
@@ -239,15 +241,14 @@ static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
 	return 0;
 }
 
-static int kvm_ptdump_show(struct seq_file *m, void *)
+static int kvm_ptdump_show_common(struct seq_file *m,
+				  struct kvm_pgtable *pgtable)
 {
 	u64 ipa_size;
 	char ipa_description[32];
 	struct pg_state st;
 	struct addr_marker ipa_addr_markers[3] = {0};
 	struct pg_level pg_level_descr[KVM_PGTABLE_MAX_LEVELS] = {0};
-	struct kvm_pgtable_snapshot *snapshot = m->private;
-	struct kvm_pgtable *pgtable = &snapshot->pgtable;
 	struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
 		.cb	= kvm_ptdump_visitor,
 		.arg	= &st,
@@ -282,6 +283,26 @@ static int kvm_ptdump_show(struct seq_file *m, void *)
 	return kvm_pgtable_walk(pgtable, 0, ipa_size, &walker);
 }
 
+static int kvm_host_ptdump_show(struct seq_file *m, void *)
+{
+	struct kvm_pgtable_snapshot *snapshot = m->private;
+
+	return kvm_ptdump_show_common(m, &snapshot->pgtable);
+}
+
+static int kvm_ptdump_show(struct seq_file *m, void *)
+{
+	struct kvm *guest_kvm = m->private;
+	struct kvm_s2_mmu *mmu = &guest_kvm->arch.mmu;
+	int ret;
+
+	write_lock(&guest_kvm->mmu_lock);
+	ret = kvm_ptdump_show_common(m, mmu->pgt);
+	write_unlock(&guest_kvm->mmu_lock);
+
+	return ret;
+}
+
 static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
 					const char *name, struct dentry *parent)
 {
@@ -393,11 +414,19 @@ void kvm_ptdump_register_host(void)
 
 	host_reg.get_ptdump_info = kvm_host_get_ptdump_info;
 	host_reg.put_ptdump_info = kvm_host_put_ptdump_info;
+	host_reg.show_ptdump_info = kvm_host_ptdump_show;
 
 	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
 				    kvm_debugfs_dir);
 }
 
+int kvm_ptdump_register_guest(struct kvm *kvm)
+{
+	debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
+			    kvm, &kvm_ptdump_fops);
+	return 0;
+}
+
 static int __init kvm_host_ptdump_init(void)
 {
 	host_reg.priv = (void *)host_s2_pgtable_pages();
-- 
2.43.0.472.g3155946c3a-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable
  2023-12-18 13:58 ` [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable Sebastian Ene
@ 2023-12-19 11:44   ` Sebastian Ene
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-19 11:44 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa

On Mon, Dec 18, 2023 at 01:58:51PM +0000, Sebastian Ene wrote:
> Introduce a new hvc that allows the pKVM hypervisor to snapshot the host
> stage-2 pagetables. The caller is expected to allocate & free the memory
> for the pagetable snapshot. The below diagram shows the sequence of
> events.
>
> [--- HOST ---]
> (1) allocate memory for the snapshot
> (2) invoke the __pkvm_copy_host_stage2(snapshot) interface
> |____
>       [--- HYP ---]
>       (3) donate the snapshot from HOST -> HYP
>       (4) save the host stage-2 pagetables in the snapshot
>       (5) donate the snapshot from HYP -> HOST
>       [--- return from HYP ---]
> [--- HOST ---]
> (6) free the memory for the snapshot
>
> When the host kernel invokes this interface, the memory is donated from
> the host to the hypervisor. By doing this the memory is unmapped from the
> host's translation regime to prevent the host from modifying the buffer
> while the hypervisor uses it. The hypervisor walks the host stage-2
> pagetable copy while re-creating the original mappings. When the copying
> is done, the memory is donated from the hypervisor back to the host.
> The pages that have been used from the memcache are donated back to the
> host from a temporary array. The hvc is executing synchronously and the
> preemption is disabled for the current CPU while running inside the
> hypervisor so calling this interface blocks the current CPU until the
> snapshoting is done.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h              |   1 +
>  arch/arm64/include/asm/kvm_pgtable.h          |  43 ++++
>  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   1 +
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  12 ++
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 186 ++++++++++++++++++
>  arch/arm64/kvm/hyp/pgtable.c                  |  43 ++++
>  6 files changed, 286 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 24b5e6b23..9df3367d8 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -81,6 +81,7 @@ enum __kvm_host_smccc_func {
>       __KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
>       __KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
>       __KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
> +     __KVM_HOST_SMCCC_FUNC___pkvm_host_stage2_snapshot,
>  };
>
>  #define DECLARE_KVM_VHE_SYM(sym)     extern char sym[]
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index d3e354bb8..f73efd8a8 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -10,6 +10,7 @@
>  #include <linux/bits.h>
>  #include <linux/kvm_host.h>
>  #include <linux/types.h>
> +#include <asm/kvm_host.h>
>
>  #define KVM_PGTABLE_MAX_LEVELS               4U
>
> @@ -351,6 +352,28 @@ struct kvm_pgtable {
>       kvm_pgtable_force_pte_cb_t              force_pte_cb;
>  };
>
> +/**
> + * struct kvm_pgtable_snapshot - Snapshot page-table wrapper.
> + * @pgtable:         The page-table configuration.
> + * @mc:                      Memcache used for pagetable pages allocation.
> + * @pgd_hva:         Host virtual address of a physically contiguous buffer
> + *                   used for storing the PGD.
> + * @pgd_pages:               The size of the phyisically contiguous buffer in pages.
> + * @used_pages_hva:  Host virtual address of a physically contiguous buffer
> + *                   used for storing the consumed pages from the memcache.
> + * @num_used_pages           The size of the used buffer in pages.
> + * @used_pages_indx          The current index of the used pages array.
> + */
> +struct kvm_pgtable_snapshot {
> +     struct kvm_pgtable                      pgtable;
> +     struct kvm_hyp_memcache                 mc;
> +     void                                    *pgd_hva;
> +     size_t                                  pgd_pages;
> +     phys_addr_t                             *used_pages_hva;
> +     size_t                                  num_used_pages;
> +     size_t                                  used_pages_indx;
> +};
> +
>  /**
>   * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
>   * @pgt:     Uninitialised page-table structure to initialise.
> @@ -756,4 +779,24 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte);
>   */
>  void kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
>                               phys_addr_t addr, size_t size);
> +
> +#ifdef CONFIG_NVHE_EL2_DEBUG
> +/**
> + * kvm_pgtable_stage2_snapshot() - Given a memcache and a destination
> + *                           pagetable where we have the original PGD
> + *                           copied, build a snapshot table with page table
> + *                           pages from the given memcache.
> + *
> + * @to_pgt:  Destination pagetable
> + * @mc:              The memcache where we allocate the destination pagetables from
> + */
> +int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt,
> +                             void *mc);
> +#else
> +static inline int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt,
> +                                           void *mc)
> +{
> +     return -EPERM;
> +}
> +#endif       /* CONFIG_NVHE_EL2_DEBUG */
>  #endif       /* __ARM64_KVM_PGTABLE_H__ */
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> index 0972faccc..ca8f76915 100644
> --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> @@ -69,6 +69,7 @@ int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages);
>  int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages);
>  int __pkvm_host_share_ffa(u64 pfn, u64 nr_pages);
>  int __pkvm_host_unshare_ffa(u64 pfn, u64 nr_pages);
> +int __pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snapshot);
>
>  bool addr_is_memory(phys_addr_t phys);
>  int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 2385fd03e..7b215245f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -314,6 +314,17 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
>       cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
>  }
>
> +static void handle___pkvm_host_stage2_snapshot(struct kvm_cpu_context *host_ctxt)
> +{
> +#ifdef CONFIG_NVHE_EL2_DEBUG
> +     DECLARE_REG(struct kvm_pgtable_snapshot *, snapshot_hva, host_ctxt, 1);
> +
> +     cpu_reg(host_ctxt, 1) = __pkvm_host_stage2_snapshot(snapshot_hva);
> +#else
> +     cpu_reg(host_ctxt, 0) = SMCCC_RET_NOT_SUPPORTED;
> +#endif
> +}
> +
>  typedef void (*hcall_t)(struct kvm_cpu_context *);
>
>  #define HANDLE_FUNC(x)       [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
> @@ -348,6 +359,7 @@ static const hcall_t host_hcall[] = {
>       HANDLE_FUNC(__pkvm_init_vm),
>       HANDLE_FUNC(__pkvm_init_vcpu),
>       HANDLE_FUNC(__pkvm_teardown_vm),
> +     HANDLE_FUNC(__pkvm_host_stage2_snapshot),
>  };
>
>  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index 8d0a5834e..aaf07f9e1 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -266,6 +266,192 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd)
>       return 0;
>  }
>
> +#ifdef CONFIG_NVHE_EL2_DEBUG
> +static void *snap_zalloc_page(void *mc)
> +{
> +     struct hyp_page *p;
> +     void *addr;
> +     struct kvm_pgtable_snapshot *snap;
> +     phys_addr_t *used_pg;
> +
> +     snap = container_of(mc, struct kvm_pgtable_snapshot, mc);
> +     used_pg = kern_hyp_va(snap->used_pages_hva);
> +
> +     /* Check if we have space to track the used page */
> +     if (snap->used_pages_indx * sizeof(phys_addr_t) > snap->num_used_pages * PAGE_SIZE)

This should be '>=' instead, and I will fix this.


> +             return NULL;
> +
> +     addr = pop_hyp_memcache(mc, hyp_phys_to_virt);
> +     if (!addr)
> +             return addr;
> +     used_pg[snap->used_pages_indx++] = hyp_virt_to_phys(addr);
> +
> +     memset(addr, 0, PAGE_SIZE);
> +     p = hyp_virt_to_page(addr);
> +     memset(p, 0, sizeof(*p));
> +
> +     return addr;
> +}
> +
> +static void snap_free_pages_exact(void *addr, unsigned long size)
> +{
> +     u8 order = get_order(size);
> +     unsigned int i;
> +     struct hyp_page *p;
> +
> +     for (i = 0; i < (1 << order); i++) {
> +             p = hyp_virt_to_page(addr + (i * PAGE_SIZE));
> +             hyp_page_ref_dec(p);
> +     }
> +}
> +
> +static void *pkvm_setup_snapshot(struct kvm_pgtable_snapshot *snap_hva)
> +{
> +     unsigned long i;
> +     void *pgd, *used_pg;
> +     phys_addr_t mc_page, next_mc_page;
> +     struct kvm_pgtable_snapshot *snap;
> +
> +     snap = (void *)kern_hyp_va(snap_hva);
> +     if (!PAGE_ALIGNED(snap))
> +             return NULL;
> +
> +     if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(snap), 1))
> +             return NULL;
> +
> +     pgd = kern_hyp_va(snap->pgd_hva);
> +     if (!PAGE_ALIGNED(pgd))
> +             goto error_with_snapshot;
> +
> +     if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(pgd), snap->pgd_pages))
> +             goto error_with_snapshot;
> +
> +     mc_page = snap->mc.head;
> +     for (i = 0; i < snap->mc.nr_pages; i++) {
> +             if (!PAGE_ALIGNED(mc_page))
> +                     goto error_with_memcache;
> +
> +             if (__pkvm_host_donate_hyp(hyp_phys_to_pfn(mc_page), 1))
> +                     goto error_with_memcache;
> +
> +             mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
> +     }
> +
> +     used_pg = kern_hyp_va(snap->used_pages_hva);
> +     if (!PAGE_ALIGNED(used_pg))
> +             goto error_with_memcache;
> +
> +     if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(used_pg), snap->num_used_pages))
> +             goto error_with_memcache;
> +
> +     return snap;
> +error_with_memcache:
> +     mc_page = snap->mc.head;
> +     for (; i >= 0; i--) {
> +             next_mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
> +             WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
> +             mc_page = next_mc_page;
> +     }
> +
> +     WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), snap->pgd_pages));
> +error_with_snapshot:
> +     WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(snap), 1));
> +     return NULL;
> +}
> +
> +static void pkvm_teardown_snapshot(struct kvm_pgtable_snapshot *snap)
> +{
> +     size_t i;
> +     phys_addr_t mc_page, next_mc_page;
> +     u64 *used_pg = kern_hyp_va(snap->used_pages_hva);
> +     void *pgd = kern_hyp_va(snap->pgd_hva);
> +
> +     for (i = 0; i < snap->used_pages_indx; i++) {
> +             mc_page = used_pg[i];
> +             WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
> +     }
> +
> +     WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(used_pg),
> +                                    snap->num_used_pages));
> +
> +     mc_page = snap->mc.head;
> +     for (i = 0; i < snap->mc.nr_pages; i++) {
> +             next_mc_page = *((phys_addr_t *)hyp_phys_to_virt(mc_page));
> +             WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(mc_page), 1));
> +             mc_page = next_mc_page;
> +     }
> +
> +     snap->pgtable.mm_ops = NULL;
> +     WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), snap->pgd_pages));
> +     WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(snap), 1));
> +}
> +
> +static int pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snap)
> +{
> +     int ret;
> +     void *pgd;
> +     size_t required_pgd_len;
> +     struct kvm_pgtable_mm_ops mm_ops = {0};
> +     struct kvm_s2_mmu *mmu;
> +     struct kvm_pgtable *to_pgt, *from_pgt;
> +
> +     if (snap->used_pages_indx != 0)
> +             return -EINVAL;
> +
> +     from_pgt = &host_mmu.pgt;
> +     mmu = &host_mmu.arch.mmu;
> +     required_pgd_len = kvm_pgtable_stage2_pgd_size(mmu->vtcr);
> +     if (snap->pgd_pages < (required_pgd_len >> PAGE_SHIFT))
> +             return -ENOMEM;
> +
> +     if (snap->mc.nr_pages < host_s2_pgtable_pages())
> +             return -ENOMEM;
> +
> +     to_pgt = &snap->pgtable;
> +     pgd = kern_hyp_va(snap->pgd_hva);
> +
> +     mm_ops.zalloc_page              = snap_zalloc_page;
> +     mm_ops.free_pages_exact         = snap_free_pages_exact;
> +     mm_ops.phys_to_virt             = hyp_phys_to_virt;
> +     mm_ops.virt_to_phys             = hyp_virt_to_phys;
> +     mm_ops.page_count               = hyp_page_count;
> +
> +     to_pgt->ia_bits         = from_pgt->ia_bits;
> +     to_pgt->start_level     = from_pgt->start_level;
> +     to_pgt->flags           = from_pgt->flags;
> +     to_pgt->mm_ops          = &mm_ops;
> +
> +     host_lock_component();
> +
> +     to_pgt->pgd = pgd;
> +     memcpy(to_pgt->pgd, from_pgt->pgd, required_pgd_len);
> +     ret = kvm_pgtable_stage2_snapshot(to_pgt, &snap->mc);
> +
> +     host_unlock_component();
> +
> +     return ret;
> +}
> +
> +int __pkvm_host_stage2_snapshot(struct kvm_pgtable_snapshot *snap_hva)
> +{
> +     int ret;
> +     struct kvm_pgtable_snapshot *snap;
> +     kvm_pteref_t pgd;
> +
> +     snap = pkvm_setup_snapshot(snap_hva);
> +     if (!snap)
> +             return -EPERM;
> +
> +     ret = pkvm_host_stage2_snapshot(snap);
> +     if (!ret) {
> +             pgd = snap->pgtable.pgd;
> +             snap->pgtable.pgd = (kvm_pteref_t)__hyp_pa(pgd);
> +     }
> +     pkvm_teardown_snapshot(snap);
> +     return ret;
> +}
> +#endif /* CONFIG_NVHE_EL2_DEBUG */
> +
>  void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
>  {
>       void *addr;
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 1966fdee7..82fef9620 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -1598,3 +1598,46 @@ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *p
>       WARN_ON(mm_ops->page_count(pgtable) != 1);
>       mm_ops->put_page(pgtable);
>  }
> +
> +#ifdef CONFIG_NVHE_EL2_DEBUG
> +static int snapshot_walker(const struct kvm_pgtable_visit_ctx *ctx,
> +                        enum kvm_pgtable_walk_flags visit)
> +{
> +     struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
> +     void *copy_table, *original_addr;
> +     kvm_pte_t new = ctx->old;
> +
> +     if (!stage2_pte_is_counted(ctx->old))
> +             return 0;
> +
> +     if (kvm_pte_table(ctx->old, ctx->level)) {
> +             copy_table = mm_ops->zalloc_page(ctx->arg);
> +             if (!copy_table)
> +                     return -ENOMEM;
> +
> +             original_addr = kvm_pte_follow(ctx->old, mm_ops);
> +
> +             memcpy(copy_table, original_addr, PAGE_SIZE);
> +             new = kvm_init_table_pte(copy_table, mm_ops);
> +     }
> +
> +     *ctx->ptep = new;
> +
> +     return 0;
> +}
> +
> +int kvm_pgtable_stage2_snapshot(struct kvm_pgtable *to_pgt, void *mc)
> +{
> +     struct kvm_pgtable_walker walker = {
> +             .cb     = snapshot_walker,
> +             .flags  = KVM_PGTABLE_WALK_LEAF |
> +                       KVM_PGTABLE_WALK_TABLE_PRE,
> +             .arg = mc
> +     };
> +
> +     if (!to_pgt->pgd)
> +             return -ENOMEM;
> +
> +     return kvm_pgtable_walk(to_pgt, 0, BIT(to_pgt->ia_bits), &walker);
> +}
> +#endif /* CONFIG_NVHE_EL2_DEBUG */
> --
> 2.43.0.472.g3155946c3a-goog
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable
  2023-12-18 13:58 ` [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable Sebastian Ene
@ 2023-12-19 11:45   ` Sebastian Ene
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-19 11:45 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa

On Mon, Dec 18, 2023 at 01:58:53PM +0000, Sebastian Ene wrote:
> Allocate memory for the snapshot by creating a memory cache with empty
> pages that will be used by the hypervisor during the page table copy.
> Get the required size of the PGD and allocate physically contiguous
> memory for it. Allocate contiguous memory for an array that is used to
> keep track of the pages used from the memcache. Call the snapshot
> interface and release the memory for the snapshot.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  arch/arm64/kvm/ptdump.c | 107 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 107 insertions(+)
> 
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 5816fc632..e99bab427 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -25,6 +25,9 @@ static int kvm_ptdump_open(struct inode *inode, struct file *file);
>  static int kvm_ptdump_release(struct inode *inode, struct file *file);
>  static int kvm_ptdump_show(struct seq_file *m, void *);
>  
> +static phys_addr_t get_host_pa(void *addr);
> +static void *get_host_va(phys_addr_t pa);
> +
>  static const struct file_operations kvm_ptdump_fops = {
>  	.open		= kvm_ptdump_open,
>  	.read		= seq_read,
> @@ -32,6 +35,11 @@ static const struct file_operations kvm_ptdump_fops = {
>  	.release	= kvm_ptdump_release,
>  };
>  
> +static struct kvm_pgtable_mm_ops ptdump_host_mmops = {
> +	.phys_to_virt	= get_host_va,
> +	.virt_to_phys	= get_host_pa,
> +};
> +
>  static int kvm_ptdump_open(struct inode *inode, struct file *file)
>  {
>  	struct kvm_ptdump_register *reg = inode->i_private;
> @@ -78,11 +86,110 @@ static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
>  
>  static struct kvm_ptdump_register host_reg;
>  
> +static size_t host_stage2_get_pgd_len(void)
> +{
> +	u32 phys_shift = get_kvm_ipa_limit();
> +	u64 vtcr = kvm_get_vtcr(read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1),
> +				read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1),
> +				phys_shift);
> +	return (kvm_pgtable_stage2_pgd_size(vtcr) >> PAGE_SHIFT);
> +}
> +
> +static phys_addr_t get_host_pa(void *addr)
> +{
> +	return __pa(addr);
> +}
> +
> +static void *get_host_va(phys_addr_t pa)
> +{
> +	return __va(pa);
> +}
> +
> +static void kvm_host_put_ptdump_info(void *snap)
> +{
> +	void *mc_page;
> +	size_t i;
> +	struct kvm_pgtable_snapshot *snapshot;
> +
> +	if (!snap)
> +		return;
> +
> +	snapshot = snap;
> +	while ((mc_page = pop_hyp_memcache(&snapshot->mc, get_host_va)) != NULL)
> +		free_page((unsigned long)mc_page);
> +
> +	if (snapshot->pgd_hva)
> +		free_pages_exact(snapshot->pgd_hva, snapshot->pgd_pages);
> +
> +	if (snapshot->used_pages_hva) {
> +		for (i = 0; i < snapshot->used_pages_indx; i++) {
> +			mc_page = get_host_va(snapshot->used_pages_hva[i]);
> +			free_page((unsigned long)mc_page);
> +		}
> +
> +		free_pages_exact(snapshot->used_pages_hva, snapshot->num_used_pages);
> +	}
> +
> +	free_page((unsigned long)snapshot);
> +}
> +
> +static void *kvm_host_get_ptdump_info(struct kvm_ptdump_register *reg)
> +{
> +	int i, ret;
> +	void *mc_page;
> +	struct kvm_pgtable_snapshot *snapshot;
> +	size_t memcache_len;
> +
> +	snapshot = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
> +	if (!snapshot)
> +		return NULL;
> +
> +	memset(snapshot, 0, sizeof(struct kvm_pgtable_snapshot));
> +
> +	snapshot->pgd_pages = host_stage2_get_pgd_len();
> +	snapshot->pgd_hva = alloc_pages_exact(snapshot->pgd_pages, GFP_KERNEL_ACCOUNT);

The allocation should be specified in the number of bytes here.

> +	if (!snapshot->pgd_hva)
> +		goto err;
> +
> +	memcache_len = (size_t)reg->priv;
> +	for (i = 0; i < memcache_len; i++) {
> +		mc_page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
> +		if (!mc_page)
> +			goto err;
> +
> +		push_hyp_memcache(&snapshot->mc, mc_page, get_host_pa);
> +	}
> +
> +	snapshot->num_used_pages = DIV_ROUND_UP(sizeof(phys_addr_t) * memcache_len,
> +					     PAGE_SIZE);
> +	snapshot->used_pages_hva = alloc_pages_exact(snapshot->num_used_pages,
> +						  GFP_KERNEL_ACCOUNT);

the same as previous comment.

> +	if (!snapshot->used_pages_hva)
> +		goto err;
> +
> +	ret = kvm_call_hyp_nvhe(__pkvm_host_stage2_snapshot, snapshot);
> +	if (ret) {
> +		pr_err("ERROR %d snapshot host pagetables\n", ret);
> +		goto err;
> +	}
> +
> +	snapshot->pgtable.pgd = get_host_va((phys_addr_t)snapshot->pgtable.pgd);
> +	snapshot->pgtable.mm_ops = &ptdump_host_mmops;
> +
> +	return snapshot;
> +err:
> +	kvm_host_put_ptdump_info(snapshot);
> +	return NULL;
> +}
> +
>  void kvm_ptdump_register_host(void)
>  {
>  	if (!is_protected_kvm_enabled())
>  		return;
>  
> +	host_reg.get_ptdump_info = kvm_host_get_ptdump_info;
> +	host_reg.put_ptdump_info = kvm_host_put_ptdump_info;
> +
>  	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
>  				    kvm_debugfs_dir);
>  }
> -- 
> 2.43.0.472.g3155946c3a-goog
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2023-12-18 13:58 ` [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables Sebastian Ene
@ 2023-12-19 11:47   ` Sebastian Ene
  2023-12-21 18:14   ` Oliver Upton
  1 sibling, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-19 11:47 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa

On Mon, Dec 18, 2023 at 01:58:52PM +0000, Sebastian Ene wrote:
> While arch/*/mem/ptdump handles the kernel pagetable dumping code,
> introduce KVM/ptdump which deals with the stage-2 pagetables. The
> separation is necessary because most of the definitions from the
> stage-2 pagetable reside in the KVM path and we will be invoking
> functionality **specific** to KVM.
> 
> This registers a wrapper on top of debugfs_create_file which allows us
> to hook callbacks on the debugfs open/show/close. The callbacks are used
> to prepare the display portion of the pagetable dumping code.
> Guard this functionality under the newly introduced PTDUMP_STAGE2_DEBUGFS.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  arch/arm64/kvm/Kconfig      | 13 +++++
>  arch/arm64/kvm/Makefile     |  1 +
>  arch/arm64/kvm/arm.c        |  2 +
>  arch/arm64/kvm/kvm_ptdump.h | 18 +++++++
>  arch/arm64/kvm/ptdump.c     | 96 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 130 insertions(+)
>  create mode 100644 arch/arm64/kvm/kvm_ptdump.h
>  create mode 100644 arch/arm64/kvm/ptdump.c
> 
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 83c1e09be..0014e55e2 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -71,4 +71,17 @@ config PROTECTED_NVHE_STACKTRACE
>  
>  	  If unsure, or not using protected nVHE (pKVM), say N.
>  
> +config PTDUMP_STAGE2_DEBUGFS
> +       bool "Present the stage-2 pagetables to debugfs"
> +       depends on PTDUMP_DEBUGFS && KVM
> +       default n
> +       help
> +         Say Y here if you want to show the stage-2 kernel pagetables
> +         layout in a debugfs file. This information is only useful for kernel developers
> +         who are working in architecture specific areas of the kernel.
> +         It is probably not a good idea to enable this feature in a production
> +         kernel.
> +
> +         If in doubt, say N.
> +
>  endif # VIRTUALIZATION
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index c0c050e53..190eac175 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -23,6 +23,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 vgic/vgic-its.o vgic/vgic-debug.o
>  
>  kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
> +kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o
>  
>  always-y := hyp_constants.h hyp-constants.s
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e5f75f1f1..ee8d7cb67 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -40,6 +40,7 @@
>  #include <asm/kvm_pkvm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/sections.h>
> +#include <kvm_ptdump.h>
>  
>  #include <kvm/arm_hypercalls.h>
>  #include <kvm/arm_pmu.h>
> @@ -2592,6 +2593,7 @@ static __init int kvm_arm_init(void)
>  	if (err)
>  		goto out_subs;
>  
> +	kvm_ptdump_register_host();
>  	kvm_arm_initialised = true;
>  
>  	return 0;
> diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
> new file mode 100644
> index 000000000..98b595ce8
> --- /dev/null
> +++ b/arch/arm64/kvm/kvm_ptdump.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +//
> +// Copyright (C) Google, 2023
> +// Author: Sebastian Ene <sebastianene@google.com>
> +
> +#ifndef __KVM_PTDUMP_H
> +#define __KVM_PTDUMP_H
> +
> +#include <asm/ptdump.h>
> +
> +
> +#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
> +void kvm_ptdump_register_host(void);
> +#else
> +static inline void kvm_ptdump_register_host(void) { }
> +#endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
> +
> +#endif /* __KVM_PTDUMP_H */
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> new file mode 100644
> index 000000000..5816fc632
> --- /dev/null
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -0,0 +1,96 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +//
> +// Debug helper used to dump the stage-2 pagetables of the system and their
> +// associated permissions.
> +//
> +// Copyright (C) Google, 2023
> +// Author: Sebastian Ene <sebastianene@google.com>
> +
> +#include <linux/debugfs.h>
> +#include <linux/kvm_host.h>
> +#include <linux/seq_file.h>
> +
> +#include <asm/kvm_pkvm.h>
> +#include <kvm_ptdump.h>
> +
> +
> +struct kvm_ptdump_register {
> +	void *(*get_ptdump_info)(struct kvm_ptdump_register *reg);
> +	void (*put_ptdump_info)(void *priv);
> +	int (*show_ptdump_info)(struct seq_file *m, void *v);
> +	void *priv;
> +};
> +
> +static int kvm_ptdump_open(struct inode *inode, struct file *file);
> +static int kvm_ptdump_release(struct inode *inode, struct file *file);
> +static int kvm_ptdump_show(struct seq_file *m, void *);
> +
> +static const struct file_operations kvm_ptdump_fops = {
> +	.open		= kvm_ptdump_open,
> +	.read		= seq_read,
> +	.llseek		= seq_lseek,
> +	.release	= kvm_ptdump_release,
> +};
> +
> +static int kvm_ptdump_open(struct inode *inode, struct file *file)
> +{
> +	struct kvm_ptdump_register *reg = inode->i_private;
> +	void *info = NULL;
> +	int ret;
> +
> +	if (reg->get_ptdump_info) {
> +		info = reg->get_ptdump_info(reg);
> +		if (!info)
> +			return -ENOMEM;
> +	}
> +
> +	if (!reg->show_ptdump_info)
> +		reg->show_ptdump_info = kvm_ptdump_show;
> +
> +	ret = single_open(file, reg->show_ptdump_info, info);
> +	if (ret && reg->put_ptdump_info)
> +		reg->put_ptdump_info(info);
> +
> +	return ret;
> +}
> +
> +static int kvm_ptdump_release(struct inode *inode, struct file *file)
> +{
> +	struct kvm_ptdump_register *reg = inode->i_private;
> +	struct seq_file *seq_file = file->private_data;
> +
> +	if (reg->put_ptdump_info)
> +		reg->put_ptdump_info(seq_file->private);
> +

Call single_release here.

> +	return 0;
> +}
> +
> +static int kvm_ptdump_show(struct seq_file *m, void *)
> +{
> +	return -EINVAL;
> +}
> +
> +static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
> +					const char *name, struct dentry *parent)
> +{
> +	debugfs_create_file(name, 0400, parent, reg, &kvm_ptdump_fops);
> +}
> +
> +static struct kvm_ptdump_register host_reg;
> +
> +void kvm_ptdump_register_host(void)
> +{
> +	if (!is_protected_kvm_enabled())
> +		return;
> +
> +	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
> +				    kvm_debugfs_dir);
> +}
> +
> +static int __init kvm_host_ptdump_init(void)
> +{
> +	host_reg.priv = (void *)host_s2_pgtable_pages();
> +	return 0;
> +}
> +
> +device_initcall(kvm_host_ptdump_init);
> -- 
> 2.43.0.472.g3155946c3a-goog
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping
  2023-12-18 13:59 ` [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping Sebastian Ene
@ 2023-12-19 11:52   ` Sebastian Ene
  2023-12-21 18:27   ` Oliver Upton
  1 sibling, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-19 11:52 UTC (permalink / raw)
  To: will, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	catalin.marinas, mark.rutland, akpm, maz
  Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team, vdonnefort,
	qperret, smostafa

On Mon, Dec 18, 2023 at 01:59:00PM +0000, Sebastian Ene wrote:
> Register a debugfs file on guest creation to be able to view their
> second translation tables with ptdump. This assumes that the host is in
> control of the guest stage-2 and has direct access to the pagetables.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  arch/arm64/kvm/debug.c      |  6 ++++++
>  arch/arm64/kvm/kvm_ptdump.h |  3 +++
>  arch/arm64/kvm/ptdump.c     | 35 ++++++++++++++++++++++++++++++++---
>  3 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 8725291cb..7c4c2902d 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -13,6 +13,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_emulate.h>
> +#include <kvm_ptdump.h>
>  
>  #include "trace.h"
>  
> @@ -342,3 +343,8 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
>  	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_SPE);
>  	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
>  }
> +
> +int kvm_arch_create_vm_debugfs(struct kvm *kvm)
> +{
> +	return kvm_ptdump_register_guest(kvm);
> +}
> diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
> index 98b595ce8..5f5a455d0 100644
> --- a/arch/arm64/kvm/kvm_ptdump.h
> +++ b/arch/arm64/kvm/kvm_ptdump.h
> @@ -6,13 +6,16 @@
>  #ifndef __KVM_PTDUMP_H
>  #define __KVM_PTDUMP_H
>  
> +#include <linux/kvm_host.h>
>  #include <asm/ptdump.h>
>  
>  
>  #ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
>  void kvm_ptdump_register_host(void);
> +int kvm_ptdump_register_guest(struct kvm *kvm);
>  #else
>  static inline void kvm_ptdump_register_host(void) { }
> +static inline int kvm_ptdump_register_guest(struct kvm *kvm) { return -1; }
>  #endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
>  
>  #endif /* __KVM_PTDUMP_H */
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 4296e739f..62a753d6b 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -181,6 +181,8 @@ static int kvm_ptdump_open(struct inode *inode, struct file *file)
>  		info = reg->get_ptdump_info(reg);
>  		if (!info)
>  			return -ENOMEM;
> +	} else {
> +		info = inode->i_private;

FIXME: Don't call kvm_ptdump_release with this ^ argument.

>  	}
>  
>  	if (!reg->show_ptdump_info)
> @@ -239,15 +241,14 @@ static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
>  	return 0;
>  }
>  
> -static int kvm_ptdump_show(struct seq_file *m, void *)
> +static int kvm_ptdump_show_common(struct seq_file *m,
> +				  struct kvm_pgtable *pgtable)
>  {
>  	u64 ipa_size;
>  	char ipa_description[32];
>  	struct pg_state st;
>  	struct addr_marker ipa_addr_markers[3] = {0};
>  	struct pg_level pg_level_descr[KVM_PGTABLE_MAX_LEVELS] = {0};
> -	struct kvm_pgtable_snapshot *snapshot = m->private;
> -	struct kvm_pgtable *pgtable = &snapshot->pgtable;
>  	struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
>  		.cb	= kvm_ptdump_visitor,
>  		.arg	= &st,
> @@ -282,6 +283,26 @@ static int kvm_ptdump_show(struct seq_file *m, void *)
>  	return kvm_pgtable_walk(pgtable, 0, ipa_size, &walker);
>  }
>  
> +static int kvm_host_ptdump_show(struct seq_file *m, void *)
> +{
> +	struct kvm_pgtable_snapshot *snapshot = m->private;
> +
> +	return kvm_ptdump_show_common(m, &snapshot->pgtable);
> +}
> +
> +static int kvm_ptdump_show(struct seq_file *m, void *)
> +{
> +	struct kvm *guest_kvm = m->private;
> +	struct kvm_s2_mmu *mmu = &guest_kvm->arch.mmu;
> +	int ret;
> +
> +	write_lock(&guest_kvm->mmu_lock);
> +	ret = kvm_ptdump_show_common(m, mmu->pgt);
> +	write_unlock(&guest_kvm->mmu_lock);
> +
> +	return ret;
> +}
> +
>  static void kvm_ptdump_debugfs_register(struct kvm_ptdump_register *reg,
>  					const char *name, struct dentry *parent)
>  {
> @@ -393,11 +414,19 @@ void kvm_ptdump_register_host(void)
>  
>  	host_reg.get_ptdump_info = kvm_host_get_ptdump_info;
>  	host_reg.put_ptdump_info = kvm_host_put_ptdump_info;
> +	host_reg.show_ptdump_info = kvm_host_ptdump_show;
>  
>  	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
>  				    kvm_debugfs_dir);
>  }
>  
> +int kvm_ptdump_register_guest(struct kvm *kvm)
> +{
> +	debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
> +			    kvm, &kvm_ptdump_fops);
> +	return 0;
> +}
> +
>  static int __init kvm_host_ptdump_init(void)
>  {
>  	host_reg.priv = (void *)host_s2_pgtable_pages();
> -- 
> 2.43.0.472.g3155946c3a-goog
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2023-12-18 13:58 ` [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables Sebastian Ene
  2023-12-19 11:47   ` Sebastian Ene
@ 2023-12-21 18:14   ` Oliver Upton
  2024-02-01 11:20     ` Sebastian Ene
  1 sibling, 1 reply; 22+ messages in thread
From: Oliver Upton @ 2023-12-21 18:14 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Mon, Dec 18, 2023 at 01:58:52PM +0000, Sebastian Ene wrote:
> +config PTDUMP_STAGE2_DEBUGFS
> +       bool "Present the stage-2 pagetables to debugfs"
> +       depends on PTDUMP_DEBUGFS && KVM
> +       default n
> +       help
> +         Say Y here if you want to show the stage-2 kernel pagetables
> +         layout in a debugfs file. This information is only useful for kernel developers
> +         who are working in architecture specific areas of the kernel.
> +         It is probably not a good idea to enable this feature in a production
> +         kernel.

It isn't really a good idea to mount debugfs at all in a production
system. There are already plenty worse interfaces lurking in that
filesystem. The pKVM portions already depend on CONFIG_NVHE_EL2_DEBUG,
so I don't see a need for this Kconfig option.

> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e5f75f1f1..ee8d7cb67 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -40,6 +40,7 @@
>  #include <asm/kvm_pkvm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/sections.h>
> +#include <kvm_ptdump.h>
>  
>  #include <kvm/arm_hypercalls.h>
>  #include <kvm/arm_pmu.h>
> @@ -2592,6 +2593,7 @@ static __init int kvm_arm_init(void)
>  	if (err)
>  		goto out_subs;
>  
> +	kvm_ptdump_register_host();
>  	kvm_arm_initialised = true;
>  
>  	return 0;
> diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
> new file mode 100644
> index 000000000..98b595ce8
> --- /dev/null
> +++ b/arch/arm64/kvm/kvm_ptdump.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +//
> +// Copyright (C) Google, 2023
> +// Author: Sebastian Ene <sebastianene@google.com>

You've got the comment styles backwards for these. The SPDX license uses
the 'C++' style comment (//), whereas your multiline comment should always
use a 'C' style comment (/* */).

> +struct kvm_ptdump_register {
> +	void *(*get_ptdump_info)(struct kvm_ptdump_register *reg);
> +	void (*put_ptdump_info)(void *priv);
> +	int (*show_ptdump_info)(struct seq_file *m, void *v);
> +	void *priv;
> +};

Please thoroughly consider the necessity of this. You're wrapping a
callback structure with yet another callback structure. IMO, it would
make a lot more sense to implement the file ops structure for every
walker variant you need and avoid the indirection, it's hard to
understand.

> +void kvm_ptdump_register_host(void)
> +{
> +	if (!is_protected_kvm_enabled())
> +		return;
> +
> +	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
> +				    kvm_debugfs_dir);
> +}
> +
> +static int __init kvm_host_ptdump_init(void)
> +{
> +	host_reg.priv = (void *)host_s2_pgtable_pages();
> +	return 0;
> +}
> +
> +device_initcall(kvm_host_ptdump_init);

Why can't all of this be called from finalize_pkvm()?

> -- 
> 2.43.0.472.g3155946c3a-goog
> 

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping
  2023-12-18 13:59 ` [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping Sebastian Ene
  2023-12-19 11:52   ` Sebastian Ene
@ 2023-12-21 18:27   ` Oliver Upton
  1 sibling, 0 replies; 22+ messages in thread
From: Oliver Upton @ 2023-12-21 18:27 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Mon, Dec 18, 2023 at 01:59:00PM +0000, Sebastian Ene wrote:
> Register a debugfs file on guest creation to be able to view their
> second translation tables with ptdump. This assumes that the host is in
> control of the guest stage-2 and has direct access to the pagetables.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>

I couldn't see how this patched worked at all until I went back to patch
1 and found this:

> +static int kvm_ptdump_open(struct inode *inode, struct file *file)
> +{

[...]

> +	if (!reg->show_ptdump_info)
> +		reg->show_ptdump_info = kvm_ptdump_show;

[...]

> +}


> +static int kvm_ptdump_show(struct seq_file *m, void *)
> +{
> +	struct kvm *guest_kvm = m->private;
> +	struct kvm_s2_mmu *mmu = &guest_kvm->arch.mmu;
> +	int ret;
> +
> +	write_lock(&guest_kvm->mmu_lock);
> +	ret = kvm_ptdump_show_common(m, mmu->pgt);
> +	write_unlock(&guest_kvm->mmu_lock);
> +
> +	return ret;
> +}

Where are you getting a reference on the kvm struct? You need to do this
to ensure the VM doesn't get destroyed behind your back.

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables
  2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (9 preceding siblings ...)
  2023-12-18 13:59 ` [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping Sebastian Ene
@ 2023-12-21 18:36 ` Oliver Upton
  2023-12-21 18:41   ` Sebastian Ene
  10 siblings, 1 reply; 22+ messages in thread
From: Oliver Upton @ 2023-12-21 18:36 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

Hi Sebastian,

I'm going to try and review the series a bit more after the holidays,
but in general this is unecessarily complex. Furthermore, building out
the feature for pKVM *first* rather than 'normal' VMs makes it all very
challenging to follow.

I would strongly prefer that this series be split in two, adding support
for 'normal' VMs with a follow-up series for getting the pKVM plumbing
in place for host stage-2.

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables
  2023-12-21 18:36 ` [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Oliver Upton
@ 2023-12-21 18:41   ` Sebastian Ene
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2023-12-21 18:41 UTC (permalink / raw)
  To: Oliver Upton
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Thu, Dec 21, 2023 at 06:36:28PM +0000, Oliver Upton wrote:

Hello Oliver,

> Hi Sebastian,
> 
> I'm going to try and review the series a bit more after the holidays,
> but in general this is unecessarily complex. Furthermore, building out
> the feature for pKVM *first* rather than 'normal' VMs makes it all very
> challenging to follow.
> 

Thanks for having a look. This sounds good, I can split the series into
two afer the holidays and push a new version because I am really not
happy with my current state of the patches (especially because I
discovered some issues after I subimtted on the list).


> I would strongly prefer that this series be split in two, adding support
> for 'normal' VMs with a follow-up series for getting the pKVM plumbing
> in place for host stage-2.
> 
> -- 
> Thanks,
> Oliver

Thank you,
Seb

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2023-12-21 18:14   ` Oliver Upton
@ 2024-02-01 11:20     ` Sebastian Ene
  2024-02-05 13:14       ` Oliver Upton
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Ene @ 2024-02-01 11:20 UTC (permalink / raw)
  To: Oliver Upton
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Thu, Dec 21, 2023 at 06:14:20PM +0000, Oliver Upton wrote:

Hi Oliver,

I am planning to split the series based on your suggestion and I
wanted to make sure that I understand your feedback.

> On Mon, Dec 18, 2023 at 01:58:52PM +0000, Sebastian Ene wrote:
> > +config PTDUMP_STAGE2_DEBUGFS
> > +       bool "Present the stage-2 pagetables to debugfs"
> > +       depends on PTDUMP_DEBUGFS && KVM
> > +       default n
> > +       help
> > +         Say Y here if you want to show the stage-2 kernel pagetables
> > +         layout in a debugfs file. This information is only useful for kernel developers
> > +         who are working in architecture specific areas of the kernel.
> > +         It is probably not a good idea to enable this feature in a production
> > +         kernel.
> 
> It isn't really a good idea to mount debugfs at all in a production
> system. There are already plenty worse interfaces lurking in that
> filesystem. The pKVM portions already depend on CONFIG_NVHE_EL2_DEBUG,
> so I don't see a need for this Kconfig option.
> 

I created a separate option because I wanted to re-use the parsing
functionality from the already existing ptdump code for EL1. This option
is turned off in production and only enabled for debug.

I was thinking to make use of the `CONFIG_NVHE_EL2_DEBUG` but then I abandoned 
this ideea as one can use ptdump for vHE as well.

> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e5f75f1f1..ee8d7cb67 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -40,6 +40,7 @@
> >  #include <asm/kvm_pkvm.h>
> >  #include <asm/kvm_emulate.h>
> >  #include <asm/sections.h>
> > +#include <kvm_ptdump.h>
> >  
> >  #include <kvm/arm_hypercalls.h>
> >  #include <kvm/arm_pmu.h>
> > @@ -2592,6 +2593,7 @@ static __init int kvm_arm_init(void)
> >  	if (err)
> >  		goto out_subs;
> >  
> > +	kvm_ptdump_register_host();
> >  	kvm_arm_initialised = true;
> >  
> >  	return 0;
> > diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
> > new file mode 100644
> > index 000000000..98b595ce8
> > --- /dev/null
> > +++ b/arch/arm64/kvm/kvm_ptdump.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +//
> > +// Copyright (C) Google, 2023
> > +// Author: Sebastian Ene <sebastianene@google.com>
> 
> You've got the comment styles backwards for these. The SPDX license uses
> the 'C++' style comment (//), whereas your multiline comment should always
> use a 'C' style comment (/* */).
>

Let me fix this, thanks.

> > +struct kvm_ptdump_register {
> > +	void *(*get_ptdump_info)(struct kvm_ptdump_register *reg);
> > +	void (*put_ptdump_info)(void *priv);
> > +	int (*show_ptdump_info)(struct seq_file *m, void *v);
> > +	void *priv;
> > +};
> 
> Please thoroughly consider the necessity of this. You're wrapping a
> callback structure with yet another callback structure. IMO, it would
> make a lot more sense to implement the file ops structure for every
> walker variant you need and avoid the indirection, it's hard to
> understand.
>

I think we can drop this and have different file_ops.

> > +void kvm_ptdump_register_host(void)
> > +{
> > +	if (!is_protected_kvm_enabled())
> > +		return;
> > +
> > +	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
> > +				    kvm_debugfs_dir);
> > +}
> > +
> > +static int __init kvm_host_ptdump_init(void)
> > +{
> > +	host_reg.priv = (void *)host_s2_pgtable_pages();
> > +	return 0;
> > +}
> > +
> > +device_initcall(kvm_host_ptdump_init);
> 
> Why can't all of this be called from finalize_pkvm()?
> 

I guess it can be called from finalize_pkvm before the is_protected_kvm_enabled
check. This should work for nvhe & vhe as well.

Thanks,
Seb

> > -- 
> > 2.43.0.472.g3155946c3a-goog
> > 
> 
> -- 
> Thanks,
> Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2024-02-01 11:20     ` Sebastian Ene
@ 2024-02-05 13:14       ` Oliver Upton
  2024-02-05 16:05         ` Sebastian Ene
  0 siblings, 1 reply; 22+ messages in thread
From: Oliver Upton @ 2024-02-05 13:14 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Thu, Feb 01, 2024 at 11:20:53AM +0000, Sebastian Ene wrote:
> On Thu, Dec 21, 2023 at 06:14:20PM +0000, Oliver Upton wrote:
> 
> Hi Oliver,
> 
> I am planning to split the series based on your suggestion and I
> wanted to make sure that I understand your feedback.
> 
> > On Mon, Dec 18, 2023 at 01:58:52PM +0000, Sebastian Ene wrote:
> > > +config PTDUMP_STAGE2_DEBUGFS
> > > +       bool "Present the stage-2 pagetables to debugfs"
> > > +       depends on PTDUMP_DEBUGFS && KVM
> > > +       default n
> > > +       help
> > > +         Say Y here if you want to show the stage-2 kernel pagetables
> > > +         layout in a debugfs file. This information is only useful for kernel developers
> > > +         who are working in architecture specific areas of the kernel.
> > > +         It is probably not a good idea to enable this feature in a production
> > > +         kernel.
> > 
> > It isn't really a good idea to mount debugfs at all in a production
> > system. There are already plenty worse interfaces lurking in that
> > filesystem. The pKVM portions already depend on CONFIG_NVHE_EL2_DEBUG,
> > so I don't see a need for this Kconfig option.
> > 
> 
> I created a separate option because I wanted to re-use the parsing
> functionality from the already existing ptdump code for EL1. This option
> is turned off in production and only enabled for debug.
> 
> I was thinking to make use of the `CONFIG_NVHE_EL2_DEBUG` but then I abandoned 
> this ideea as one can use ptdump for vHE as well.

Fair enough. I was going to say we could just have KVM follow
CONFIG_PTDUMP_DEBUGFS, but it doesn't matter either way.

> > > +void kvm_ptdump_register_host(void)
> > > +{
> > > +	if (!is_protected_kvm_enabled())
> > > +		return;
> > > +
> > > +	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
> > > +				    kvm_debugfs_dir);
> > > +}
> > > +
> > > +static int __init kvm_host_ptdump_init(void)
> > > +{
> > > +	host_reg.priv = (void *)host_s2_pgtable_pages();
> > > +	return 0;
> > > +}
> > > +
> > > +device_initcall(kvm_host_ptdump_init);
> > 
> > Why can't all of this be called from finalize_pkvm()?
> > 
> 
> I guess it can be called from finalize_pkvm before the is_protected_kvm_enabled
> check. This should work for nvhe & vhe as well.

What does nvhe and vhe modes have to do with it? I thought this was for
hooking up the host's S2, which does not exist outside protected mode.

-- 
Thanks,
Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables
  2024-02-05 13:14       ` Oliver Upton
@ 2024-02-05 16:05         ` Sebastian Ene
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Ene @ 2024-02-05 16:05 UTC (permalink / raw)
  To: Oliver Upton
  Cc: will, James Morse, Suzuki K Poulose, Zenghui Yu, catalin.marinas,
	mark.rutland, akpm, maz, kvmarm, linux-arm-kernel, linux-kernel,
	kernel-team, vdonnefort, qperret, smostafa

On Mon, Feb 05, 2024 at 01:14:27PM +0000, Oliver Upton wrote:
> On Thu, Feb 01, 2024 at 11:20:53AM +0000, Sebastian Ene wrote:
> > On Thu, Dec 21, 2023 at 06:14:20PM +0000, Oliver Upton wrote:
> > 
> > Hi Oliver,
> > 
> > I am planning to split the series based on your suggestion and I
> > wanted to make sure that I understand your feedback.
> > 
> > > On Mon, Dec 18, 2023 at 01:58:52PM +0000, Sebastian Ene wrote:
> > > > +config PTDUMP_STAGE2_DEBUGFS
> > > > +       bool "Present the stage-2 pagetables to debugfs"
> > > > +       depends on PTDUMP_DEBUGFS && KVM
> > > > +       default n
> > > > +       help
> > > > +         Say Y here if you want to show the stage-2 kernel pagetables
> > > > +         layout in a debugfs file. This information is only useful for kernel developers
> > > > +         who are working in architecture specific areas of the kernel.
> > > > +         It is probably not a good idea to enable this feature in a production
> > > > +         kernel.
> > > 
> > > It isn't really a good idea to mount debugfs at all in a production
> > > system. There are already plenty worse interfaces lurking in that
> > > filesystem. The pKVM portions already depend on CONFIG_NVHE_EL2_DEBUG,
> > > so I don't see a need for this Kconfig option.
> > > 
> > 
> > I created a separate option because I wanted to re-use the parsing
> > functionality from the already existing ptdump code for EL1. This option
> > is turned off in production and only enabled for debug.
> > 
> > I was thinking to make use of the `CONFIG_NVHE_EL2_DEBUG` but then I abandoned 
> > this ideea as one can use ptdump for vHE as well.
> 
> Fair enough. I was going to say we could just have KVM follow
> CONFIG_PTDUMP_DEBUGFS, but it doesn't matter either way.
> 
> > > > +void kvm_ptdump_register_host(void)
> > > > +{
> > > > +	if (!is_protected_kvm_enabled())
> > > > +		return;
> > > > +
> > > > +	kvm_ptdump_debugfs_register(&host_reg, "host_page_tables",
> > > > +				    kvm_debugfs_dir);
> > > > +}
> > > > +
> > > > +static int __init kvm_host_ptdump_init(void)
> > > > +{
> > > > +	host_reg.priv = (void *)host_s2_pgtable_pages();
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +device_initcall(kvm_host_ptdump_init);
> > > 
> > > Why can't all of this be called from finalize_pkvm()?
> > > 
> > 
> > I guess it can be called from finalize_pkvm before the is_protected_kvm_enabled
> > check. This should work for nvhe & vhe as well.
> 
> What does nvhe and vhe modes have to do with it? I thought this was for
> hooking up the host's S2, which does not exist outside protected mode.
> 

True I guess there is no other need for the initialization portion in
this function. I will split the series to address the non-protected
support first.

Thanks,
Seb

> -- 
> Thanks,
> Oliver

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-02-05 16:06 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-18 13:58 [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 01/10] KVM: arm64: Add snapshot interface for the host stage-2 pagetable Sebastian Ene
2023-12-19 11:44   ` Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 02/10] KVM: arm64: Add ptdump registration with debugfs for the stage-2 pagetables Sebastian Ene
2023-12-19 11:47   ` Sebastian Ene
2023-12-21 18:14   ` Oliver Upton
2024-02-01 11:20     ` Sebastian Ene
2024-02-05 13:14       ` Oliver Upton
2024-02-05 16:05         ` Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 03/10] KVM: arm64: Invoke the snapshot interface for the host stage-2 pagetable Sebastian Ene
2023-12-19 11:45   ` Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 04/10] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 05/10] arm64: ptdump: Use the mask from the state structure Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 06/10] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 07/10] KVM: arm64: Walk the pagetable snapshot and parse the ptdump descriptors Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 08/10] arm64: ptdump: Interpret memory attributes based on the runtime config Sebastian Ene
2023-12-18 13:58 ` [PATCH v4 09/10] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
2023-12-18 13:59 ` [PATCH v4 10/10] arm64: ptdump: Add guest stage-2 pagetables dumping Sebastian Ene
2023-12-19 11:52   ` Sebastian Ene
2023-12-21 18:27   ` Oliver Upton
2023-12-21 18:36 ` [PATCH v4 00/10] arm64: ptdump: View the second stage page-tables Oliver Upton
2023-12-21 18:41   ` Sebastian Ene

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).