linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables
@ 2023-10-19 14:40 Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables Sebastian Ene
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Hi,

This series adds a debugging tool that dumps the second stage
page-tables through debugfs.

Based on the previous feedback, I reworked the series and added support
for dumping guest page-tables under VHE & nVHE configurations. I also
extended the list of reviewers, as I missed some interested parties in
the first round.

When CONFIG_NVHE_EL2_PTDUMP_DEBUGFS is enabled in a pKVM environment,
ptdump registers the 'host_stage2_kernel_page_tables' entry with debugfs.
Each guest registers a file named '%u_guest_stage2_page_tables' when
it is created.

This allows us to dump the host stage-2 page-tables with the following command:
cat /sys/kernel/debug/host_stage2_kernel_page_tables

The output shows the entries in the following format:
<IPA range> <size> <descriptor type> <access permissions> <mem_attributes>

The tool interprets the pKVM ownership annotations stored in invalid
entries and dumps the ownership information. To access the host stage-2
page-tables from the kernel, a new hypervisor call is introduced which
snapshots the page-tables into a host-provided buffer. The hypervisor
call is hidden behind CONFIG_NVHE_EL2_DEBUG, as it should only be used
in a debugging environment.

Link to the first version:
https://lore.kernel.org/all/20230927112517.2631674-1-sebastianene@google.com/

Changelog:
  v1 -> v2:
  * use the stage-2 pagetable walker for dumping descriptors instead of
    the one provided by ptdump.

  * add support for dumping guest pagetables under VHE/nVHE non-protected

Thanks,


Sebastian Ene (11):
  KVM: arm64: Add snap shooting the host stage-2 pagetables
  arm64: ptdump: Use the mask from the state structure
  arm64: ptdump: Add the walker function to the ptdump info structure
  KVM: arm64: Move pagetable definitions to common header
  arm64: ptdump: Introduce stage-2 pagetables format description
  arm64: ptdump: Add hooks on debugfs file operations
  arm64: ptdump: Register a debugfs entry for the host stage-2
    page-tables
  arm64: ptdump: Parse the host stage-2 page-tables from the snapshot
  arm64: ptdump: Interpret memory attributes based on runtime
    configuration
  arm64: ptdump: Interpret pKVM ownership annotations
  arm64: ptdump: Add support for guest stage-2 pagetables dumping

 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pgtable.h          |  85 +++
 arch/arm64/include/asm/ptdump.h               |  27 +-
 arch/arm64/kvm/Kconfig                        |  12 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   8 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  18 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 103 ++++
 arch/arm64/kvm/hyp/pgtable.c                  |  98 ++--
 arch/arm64/kvm/mmu.c                          |   3 +
 arch/arm64/mm/ptdump.c                        | 487 +++++++++++++++++-
 arch/arm64/mm/ptdump_debugfs.c                |  42 +-
 11 files changed, 822 insertions(+), 62 deletions(-)

-- 
2.42.0.655.g421f12c284-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-26 12:45   ` kernel test robot
  2023-10-19 14:40 ` [PATCH v2 02/11] arm64: ptdump: Use the mask from the state structure Sebastian Ene
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Introduce a new HVC that allows the caller to snapshot the host stage-2
pagetables under the NVHE debug configuration. The caller allocates the
memory into which the pagetables are copied, passes its location to the
hypervisor, and must share it with the hypervisor so that the memory is
accessible from EL2.
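
For illustration only, the buffer contract can be modelled in user space
as below. This is a rough sketch, not the code from this patch: the
struct is a reduced, hypothetical stand-in for struct
kvm_pgtable_snapshot, SKETCH_PAGE_SIZE is an assumed page size, and only
the "buffer too small" check mirrors what the hypervisor side does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/* Hypothetical, reduced stand-in for struct kvm_pgtable_snapshot. */
struct snapshot_buf {
	void	*pgd_hva;	/* caller-allocated buffer for the PGD copy */
	size_t	pgd_len;	/* size of that buffer in bytes */
};

/* Caller side: allocate a page-aligned buffer of nr_pages pages. */
static int snapshot_buf_alloc(struct snapshot_buf *buf, size_t nr_pages)
{
	assert(buf != NULL && nr_pages > 0);

	buf->pgd_len = nr_pages * SKETCH_PAGE_SIZE;
	buf->pgd_hva = aligned_alloc(SKETCH_PAGE_SIZE, buf->pgd_len);
	if (!buf->pgd_hva) {
		buf->pgd_len = 0;
		return -ENOMEM;
	}
	return 0;
}

/* Callee side: reject a buffer smaller than the PGD that must be copied. */
static int snapshot_buf_check(const struct snapshot_buf *buf,
			      size_t required_pgd_len)
{
	assert(buf != NULL);

	if (buf->pgd_len < required_pgd_len)
		return -ENOMEM;
	return 0;
}
```

In the real flow the buffer must additionally be shared with the
hypervisor before the HVC is issued; that step has no user-space
equivalent and is left out of the sketch.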

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pgtable.h          |  36 ++++++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  18 +++
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 103 ++++++++++++++++++
 arch/arm64/kvm/hyp/pgtable.c                  |  56 ++++++++++
 6 files changed, 215 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24b5e6b23417..99145a24c0f6 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -81,6 +81,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
+	__KVM_HOST_SMCCC_FUNC___pkvm_copy_host_stage2,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d3e354bb8351..be615700f8ac 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -10,6 +10,7 @@
 #include <linux/bits.h>
 #include <linux/kvm_host.h>
 #include <linux/types.h>
+#include <asm/kvm_host.h>
 
 #define KVM_PGTABLE_MAX_LEVELS		4U
 
@@ -351,6 +352,21 @@ struct kvm_pgtable {
 	kvm_pgtable_force_pte_cb_t		force_pte_cb;
 };
 
+/**
+ * struct kvm_pgtable_snapshot - Snapshot page-table wrapper.
+ * @pgtable:		The page-table configuration.
+ * @mc:			Memcache used for pagetable pages allocation.
+ * @pgd_hva:		Host virtual address of a physically contiguous buffer
+ *			used for storing the PGD.
+ * @pgd_len:		The size of the physically contiguous buffer in bytes.
+ */
+struct kvm_pgtable_snapshot {
+	struct kvm_pgtable			pgtable;
+	struct kvm_hyp_memcache			mc;
+	void					*pgd_hva;
+	size_t					pgd_len;
+};
+
 /**
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:	Uninitialised page-table structure to initialise.
@@ -756,4 +772,24 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte);
  */
 void kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 				phys_addr_t addr, size_t size);
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+/**
+ * kvm_pgtable_stage2_copy() - Snapshot the pagetable
+ *
+ * @to_pgt:	Destination pagetable
+ * @from_pgt:	Source pagetable. The caller must lock the pagetables first
+ * @mc:		The memcache where we allocate the destination pagetables from
+ */
+int kvm_pgtable_stage2_copy(struct kvm_pgtable *to_pgt,
+			    const struct kvm_pgtable *from_pgt,
+			    void *mc);
+#else
+static inline int kvm_pgtable_stage2_copy(struct kvm_pgtable *to_pgt,
+					  const struct kvm_pgtable *from_pgt,
+					  void *mc)
+{
+	return -EPERM;
+}
+#endif	/* CONFIG_NVHE_EL2_DEBUG */
 #endif	/* __ARM64_KVM_PGTABLE_H__ */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 0972faccc2af..9cfb35d68850 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -69,6 +69,7 @@ int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages);
 int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages);
 int __pkvm_host_share_ffa(u64 pfn, u64 nr_pages);
 int __pkvm_host_unshare_ffa(u64 pfn, u64 nr_pages);
+int __pkvm_host_stage2_prepare_copy(struct kvm_pgtable_snapshot *snapshot);
 
 bool addr_is_memory(phys_addr_t phys);
 int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2385fd03ed87..0d9b56c31cf2 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -314,6 +314,23 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
 }
 
+static void handle___pkvm_copy_host_stage2(struct kvm_cpu_context *host_ctxt)
+{
+	int ret = -EPERM;
+#ifdef CONFIG_NVHE_EL2_DEBUG
+	DECLARE_REG(struct kvm_pgtable_snapshot *, snapshot, host_ctxt, 1);
+	kvm_pteref_t pgd;
+
+	snapshot = kern_hyp_va(snapshot);
+	ret = __pkvm_host_stage2_prepare_copy(snapshot);
+	if (!ret) {
+		pgd = snapshot->pgtable.pgd;
+		snapshot->pgtable.pgd = (kvm_pteref_t)__hyp_pa(pgd);
+	}
+#endif
+	cpu_reg(host_ctxt, 1) = ret;
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -348,6 +365,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_init_vm),
 	HANDLE_FUNC(__pkvm_init_vcpu),
 	HANDLE_FUNC(__pkvm_teardown_vm),
+	HANDLE_FUNC(__pkvm_copy_host_stage2),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 9d703441278b..fe1a6dbd6d31 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -266,6 +266,109 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd)
 	return 0;
 }
 
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static struct hyp_pool snapshot_pool = {0};
+static DEFINE_HYP_SPINLOCK(snapshot_pool_lock);
+
+static void *snapshot_zalloc_pages_exact(size_t size)
+{
+	void *addr = hyp_alloc_pages(&snapshot_pool, get_order(size));
+
+	hyp_split_page(hyp_virt_to_page(addr));
+
+	/*
+	 * The size of concatenated PGDs is always a power of two of PAGE_SIZE,
+	 * so there should be no need to free any of the tail pages to make the
+	 * allocation exact.
+	 */
+	WARN_ON(size != (PAGE_SIZE << get_order(size)));
+
+	return addr;
+}
+
+static void snapshot_get_page(void *addr)
+{
+	hyp_get_page(&snapshot_pool, addr);
+}
+
+static void *snapshot_zalloc_page(void *mc)
+{
+	struct hyp_page *p;
+	void *addr;
+
+	addr = hyp_alloc_pages(&snapshot_pool, 0);
+	if (addr)
+		return addr;
+
+	addr = pop_hyp_memcache(mc, hyp_phys_to_virt);
+	if (!addr)
+		return addr;
+
+	memset(addr, 0, PAGE_SIZE);
+	p = hyp_virt_to_page(addr);
+	memset(p, 0, sizeof(*p));
+	p->refcount = 1;
+
+	return addr;
+}
+
+static void snapshot_s2_free_pages_exact(void *addr, unsigned long size)
+{
+	u8 order = get_order(size);
+	unsigned int i;
+	struct hyp_page *p;
+
+	for (i = 0; i < (1 << order); i++) {
+		p = hyp_virt_to_page(addr + (i * PAGE_SIZE));
+		hyp_page_ref_dec_and_test(p);
+	}
+}
+
+int __pkvm_host_stage2_prepare_copy(struct kvm_pgtable_snapshot *snapshot)
+{
+	size_t required_pgd_len;
+	struct kvm_pgtable_mm_ops mm_ops = {0};
+	struct kvm_pgtable *to_pgt, *from_pgt = &host_mmu.pgt;
+	struct kvm_hyp_memcache *memcache = &snapshot->mc;
+	int ret;
+	void *pgd;
+	u64 nr_pages;
+
+	required_pgd_len = kvm_pgtable_stage2_pgd_size(host_mmu.arch.vtcr);
+	if (snapshot->pgd_len < required_pgd_len)
+		return -ENOMEM;
+
+	to_pgt = &snapshot->pgtable;
+	nr_pages = snapshot->pgd_len / PAGE_SIZE;
+	pgd = kern_hyp_va(snapshot->pgd_hva);
+
+	hyp_spin_lock(&snapshot_pool_lock);
+	hyp_pool_init(&snapshot_pool, hyp_virt_to_pfn(pgd),
+		      required_pgd_len / PAGE_SIZE, 0);
+
+	mm_ops.zalloc_pages_exact	= snapshot_zalloc_pages_exact;
+	mm_ops.zalloc_page		= snapshot_zalloc_page;
+	mm_ops.free_pages_exact		= snapshot_s2_free_pages_exact;
+	mm_ops.get_page			= snapshot_get_page;
+	mm_ops.phys_to_virt		= hyp_phys_to_virt;
+	mm_ops.virt_to_phys		= hyp_virt_to_phys;
+	mm_ops.page_count		= hyp_page_count;
+
+	to_pgt->ia_bits		= from_pgt->ia_bits;
+	to_pgt->start_level	= from_pgt->start_level;
+	to_pgt->flags		= from_pgt->flags;
+	to_pgt->mm_ops		= &mm_ops;
+
+	host_lock_component();
+	ret = kvm_pgtable_stage2_copy(to_pgt, from_pgt, memcache);
+	host_unlock_component();
+
+	hyp_spin_unlock(&snapshot_pool_lock);
+
+	return ret;
+}
+#endif /* CONFIG_NVHE_EL2_DEBUG */
+
 void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
 {
 	void *addr;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index f155b8c9e98c..256654b89c1e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1598,3 +1598,59 @@ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *p
 	WARN_ON(mm_ops->page_count(pgtable) != 1);
 	mm_ops->put_page(pgtable);
 }
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static int stage2_copy_walker(const struct kvm_pgtable_visit_ctx *ctx,
+			      enum kvm_pgtable_walk_flags visit)
+{
+	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
+	void *copy_table, *original_addr;
+	kvm_pte_t new = ctx->old;
+
+	if (!stage2_pte_is_counted(ctx->old))
+		return 0;
+
+	if (kvm_pte_table(ctx->old, ctx->level)) {
+		copy_table = mm_ops->zalloc_page(ctx->arg);
+		if (!copy_table)
+			return -ENOMEM;
+
+		original_addr = kvm_pte_follow(ctx->old, mm_ops);
+
+		memcpy(copy_table, original_addr, PAGE_SIZE);
+		new = kvm_init_table_pte(copy_table, mm_ops);
+	}
+
+	*ctx->ptep = new;
+
+	return 0;
+}
+
+int kvm_pgtable_stage2_copy(struct kvm_pgtable *to_pgt,
+			    const struct kvm_pgtable *from_pgt,
+			    void *mc)
+{
+	int ret;
+	size_t pgd_sz;
+	struct kvm_pgtable_mm_ops *mm_ops = to_pgt->mm_ops;
+	struct kvm_pgtable_walker walker = {
+		.cb	= stage2_copy_walker,
+		.flags	= KVM_PGTABLE_WALK_LEAF |
+			  KVM_PGTABLE_WALK_TABLE_PRE,
+		.arg = mc
+	};
+
+	pgd_sz = kvm_pgd_pages(to_pgt->ia_bits, to_pgt->start_level) *
+		PAGE_SIZE;
+	to_pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_pages_exact(pgd_sz);
+	if (!to_pgt->pgd)
+		return -ENOMEM;
+
+	memcpy(to_pgt->pgd, from_pgt->pgd, pgd_sz);
+
+	ret = kvm_pgtable_walk(to_pgt, 0, BIT(to_pgt->ia_bits), &walker);
+	mm_ops->free_pages_exact(to_pgt->pgd, pgd_sz);
+
+	return ret;
+}
+#endif /* CONFIG_NVHE_EL2_DEBUG */
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 02/11] arm64: ptdump: Use the mask from the state structure
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 03/11] arm64: ptdump: Add the walker function to the ptdump info structure Sebastian Ene
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Printing the descriptor attributes requires accessing a mask which has a
different set of attributes for stage-2. In preparation for adding support
for the stage-2 pagetables dumping, use the mask from the local context
and not from the globally defined pg_level array. Store a pointer to
the pg_level array in the ptdump state structure. This allows us to
extract the mask, which is wrapped in the pg_level array, and use it for
descriptor parsing in note_page().
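
As a simplified user-space model of the idea (not the kernel code): the
state carries a pointer to a per-level table, and the protection bits
are extracted through that pointer rather than through a global. The
struct names and the mask values below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct pg_level. */
struct level_desc {
	const char	*name;
	uint64_t	mask;	/* attribute bits that are meaningful here */
};

/* Hypothetical, reduced stand-in for struct pg_state. */
struct dump_state {
	const struct level_desc *levels;	/* table selected per walk */
};

/* Illustrative tables; the mask values are arbitrary. */
static const struct level_desc stage1_levels[] = {
	{ "PGD", 0x0f }, { "PTE", 0x0f },
};
static const struct level_desc stage2_levels[] = {
	{ "PGD", 0xf0 }, { "PTE", 0xf0 },
};

/* Same shape as the check in note_page(): mask only for valid levels. */
static uint64_t extract_prot(const struct dump_state *st, int level,
			     uint64_t val)
{
	assert(st != NULL && st->levels != NULL);

	if (level < 0)
		return 0;
	return val & st->levels[level].mask;
}
```

Switching st->levels between stage1_levels and stage2_levels changes
which bits survive the mask without touching extract_prot() itself.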

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/mm/ptdump.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index e305b6593c4e..8761a70f916f 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -75,6 +75,7 @@ static struct addr_marker address_markers[] = {
 struct pg_state {
 	struct ptdump_state ptdump;
 	struct seq_file *seq;
+	struct pg_level *pg_level;
 	const struct addr_marker *marker;
 	unsigned long start_address;
 	int level;
@@ -252,11 +253,12 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 		      u64 val)
 {
 	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
+	struct pg_level *pg_info = st->pg_level;
 	static const char units[] = "KMGTPE";
 	u64 prot = 0;
 
 	if (level >= 0)
-		prot = val & pg_level[level].mask;
+		prot = val & pg_info[level].mask;
 
 	if (st->level == -1) {
 		st->level = level;
@@ -282,10 +284,10 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 			unit++;
 		}
 		pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
-				   pg_level[st->level].name);
-		if (st->current_prot && pg_level[st->level].bits)
-			dump_prot(st, pg_level[st->level].bits,
-				  pg_level[st->level].num);
+				   pg_info[st->level].name);
+		if (st->current_prot && pg_info[st->level].bits)
+			dump_prot(st, pg_info[st->level].bits,
+				  pg_info[st->level].num);
 		pt_dump_seq_puts(st->seq, "\n");
 
 		if (addr >= st->marker[1].start_address) {
@@ -316,6 +318,7 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
 	st = (struct pg_state){
 		.seq = s,
 		.marker = info->markers,
+		.pg_level = &pg_level[0],
 		.level = -1,
 		.ptdump = {
 			.note_page = note_page,
@@ -353,6 +356,7 @@ void ptdump_check_wx(void)
 			{ 0, NULL},
 			{ -1, NULL},
 		},
+		.pg_level = &pg_level[0],
 		.level = -1,
 		.check_wx = true,
 		.ptdump = {
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 03/11] arm64: ptdump: Add the walker function to the ptdump info structure
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 02/11] arm64: ptdump: Use the mask from the state structure Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 04/11] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Stage-2 needs a dedicated walk function to be able to parse concatenated
pagetables. The ptdump info structure is used to hold different
configuration options for the walker. This structure is registered with
the debugfs entry and is stored in the argument for the debugfs file.
Hence, in preparation for parsing the stage-2 pagetables, add the walk
function to the ptdump info structure that is passed as the argument of
the debugfs file.
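
A small user-space sketch of this dispatch pattern follows. The names
are hypothetical and the return value of dump_show() exists only to make
the behaviour observable; the real ptdump_show() always returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct dump_info;

/* Walker callback, analogous in spirit to the new ptdump_walk member. */
typedef void (*walk_fn)(FILE *out, struct dump_info *info);

/* Hypothetical, reduced stand-in for struct ptdump_info. */
struct dump_info {
	const char	*name;
	walk_fn		walk;
};

static void stage1_walk(FILE *out, struct dump_info *info)
{
	fprintf(out, "%s: stage-1 walk\n", info->name);
}

static void stage2_walk(FILE *out, struct dump_info *info)
{
	fprintf(out, "%s: stage-2 walk (concatenated tables)\n", info->name);
}

/* Call the walker stored in info, if any. Returns 1 if a walker ran. */
static int dump_show(FILE *out, struct dump_info *info)
{
	assert(out != NULL && info != NULL);

	if (!info->walk)
		return 0;
	info->walk(out, info);
	return 1;
}
```

Each registered file can then carry its own walker, e.g. one info with
.walk = stage1_walk and another with .walk = stage2_walk, while the show
routine stays shared.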

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h | 1 +
 arch/arm64/mm/ptdump.c          | 1 +
 arch/arm64/mm/ptdump_debugfs.c  | 3 ++-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 581caac525b0..1f6e0aabf16a 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -19,6 +19,7 @@ struct ptdump_info {
 	struct mm_struct		*mm;
 	const struct addr_marker	*markers;
 	unsigned long			base_addr;
+	void (*ptdump_walk)(struct seq_file *s, struct ptdump_info *info);
 };
 
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 8761a70f916f..d531e24ea0b2 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -346,6 +346,7 @@ static struct ptdump_info kernel_ptdump_info = {
 	.mm		= &init_mm,
 	.markers	= address_markers,
 	.base_addr	= PAGE_OFFSET,
+	.ptdump_walk	= &ptdump_walk,
 };
 
 void ptdump_check_wx(void)
diff --git a/arch/arm64/mm/ptdump_debugfs.c b/arch/arm64/mm/ptdump_debugfs.c
index 68bf1a125502..7564519db1e6 100644
--- a/arch/arm64/mm/ptdump_debugfs.c
+++ b/arch/arm64/mm/ptdump_debugfs.c
@@ -10,7 +10,8 @@ static int ptdump_show(struct seq_file *m, void *v)
 	struct ptdump_info *info = m->private;
 
 	get_online_mems();
-	ptdump_walk(m, info);
+	if (info->ptdump_walk)
+		info->ptdump_walk(m, info);
 	put_online_mems();
 	return 0;
 }
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 04/11] KVM: arm64: Move pagetable definitions to common header
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (2 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 03/11] arm64: ptdump: Add the walker function to the ptdump info structure Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 05/11] arm64: ptdump: Introduce stage-2 pagetables format description Sebastian Ene
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

In preparation for using the stage-2 definitions in ptdump, move some of
these macros into the common header.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h | 42 ++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 42 ----------------------------
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index be615700f8ac..913f34d75b29 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -45,6 +45,48 @@ typedef u64 kvm_pte_t;
 
 #define KVM_PHYS_INVALID		(-1ULL)
 
+#define KVM_PTE_LEAF_ATTR_LO		GENMASK(11, 2)
+
+#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX	GENMASK(4, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP	GENMASK(7, 6)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO		\
+	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW		\
+	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH	GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS	3
+#define KVM_PTE_LEAF_ATTR_LO_S1_AF	BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR	GENMASK(5, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R	BIT(6)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W	BIT(7)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH	GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS	3
+#define KVM_PTE_LEAF_ATTR_LO_S2_AF	BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 50)
+
+#define KVM_PTE_LEAF_ATTR_HI_SW		GENMASK(58, 55)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_XN	BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
+
+#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
+					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
+					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
+
+#define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
+#define KVM_MAX_OWNER_ID		1
+
+/*
+ * Used to indicate a pte for which a 'break-before-make' sequence is in
+ * progress.
+ */
+#define KVM_INVALID_PTE_LOCKED		BIT(10)
+
 static inline bool kvm_pte_valid(kvm_pte_t pte)
 {
 	return pte & KVM_PTE_VALID;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 256654b89c1e..67fa122c6028 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -17,48 +17,6 @@
 #define KVM_PTE_TYPE_PAGE		1
 #define KVM_PTE_TYPE_TABLE		1
 
-#define KVM_PTE_LEAF_ATTR_LO		GENMASK(11, 2)
-
-#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX	GENMASK(4, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP	GENMASK(7, 6)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO		\
-	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW		\
-	({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH	GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS	3
-#define KVM_PTE_LEAF_ATTR_LO_S1_AF	BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR	GENMASK(5, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R	BIT(6)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W	BIT(7)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH	GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS	3
-#define KVM_PTE_LEAF_ATTR_LO_S2_AF	BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 50)
-
-#define KVM_PTE_LEAF_ATTR_HI_SW		GENMASK(58, 55)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_XN	BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
-
-#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
-					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
-					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
-
-#define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
-#define KVM_MAX_OWNER_ID		1
-
-/*
- * Used to indicate a pte for which a 'break-before-make' sequence is in
- * progress.
- */
-#define KVM_INVALID_PTE_LOCKED		BIT(10)
-
 struct kvm_pgtable_walk_data {
 	struct kvm_pgtable_walker	*walker;
 
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 05/11] arm64: ptdump: Introduce stage-2 pagetables format description
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (3 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 04/11] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 06/11] arm64: ptdump: Add hooks on debugfs file operations Sebastian Ene
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Add an array which holds human-readable information about the format of
a stage-2 descriptor. The array is then used by the descriptor parser
to extract information about the memory attributes.
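
To illustrate how such a table drives the decoding, here is a reduced
user-space sketch. It covers only the valid, XN, R, W and AF entries,
with the bit positions taken from the KVM_PTE_LEAF_ATTR_* definitions
(valid is bit 0). decode_desc() and the " %s" separator are assumptions
made for the example, not the kernel's dump_prot().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BIT_ULL(n)	(1ULL << (n))

/* Same layout idea as the kernel's struct prot_bits. */
struct prot_bits {
	uint64_t	mask;
	uint64_t	val;
	const char	*set;	/* printed when (desc & mask) == val */
	const char	*clear;	/* printed otherwise */
};

static const struct prot_bits s2_bits[] = {
	{ BIT_ULL(0),  BIT_ULL(0),  " ",  "F"  },	/* valid / fault */
	{ BIT_ULL(54), BIT_ULL(54), "XN", "  " },	/* execute-never */
	{ BIT_ULL(6),  BIT_ULL(6),  "R",  " "  },	/* S2AP read */
	{ BIT_ULL(7),  BIT_ULL(7),  "W",  " "  },	/* S2AP write */
	{ BIT_ULL(10), BIT_ULL(10), "AF", "  " },	/* access flag */
};

/* Render the attribute flags of one descriptor into buf. */
static void decode_desc(uint64_t desc, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(s2_bits) / sizeof(s2_bits[0]); i++) {
		const struct prot_bits *b = &s2_bits[i];
		const char *s = ((desc & b->mask) == b->val) ? b->set : b->clear;
		int n = snprintf(buf + used, len - used, " %s", s);

		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
}
```

Because every entry prints either its set or its clear string, the
columns stay aligned across lines regardless of which bits are set.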

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/mm/ptdump.c | 87 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index d531e24ea0b2..58a4ea975497 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -24,6 +24,7 @@
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptdump.h>
+#include <asm/kvm_pgtable.h>
 
 
 enum address_markers_idx {
@@ -171,6 +172,66 @@ static const struct prot_bits pte_bits[] = {
 	}
 };
 
+static const struct prot_bits stage2_pte_bits[] = {
+	{
+		.mask	= PTE_VALID,
+		.val	= PTE_VALID,
+		.set	= " ",
+		.clear	= "F",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_HI_S2_XN,
+		.val	= KVM_PTE_LEAF_ATTR_HI_S2_XN,
+		.set	= "XN",
+		.clear	= "  ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R,
+		.set	= "R",
+		.clear	= " ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
+		.set	= "W",
+		.clear	= " ",
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_AF,
+		.val	= KVM_PTE_LEAF_ATTR_LO_S2_AF,
+		.set	= "AF",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_NG,
+		.val	= PTE_NG,
+		.set	= "FnXS",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_CONT,
+		.val	= PTE_CONT,
+		.set	= "CON",
+		.clear	= "   ",
+	}, {
+		.mask	= PTE_TABLE_BIT,
+		.val	= PTE_TABLE_BIT,
+		.set	= "   ",
+		.clear	= "BLK",
+	}, {
+		.mask	= KVM_PGTABLE_PROT_SW0,
+		.val	= KVM_PGTABLE_PROT_SW0,
+		.set	= "SW0", /* PKVM_PAGE_SHARED_OWNED */
+	}, {
+		.mask   = KVM_PGTABLE_PROT_SW1,
+		.val	= KVM_PGTABLE_PROT_SW1,
+		.set	= "SW1", /* PKVM_PAGE_SHARED_BORROWED */
+	}, {
+		.mask	= KVM_PGTABLE_PROT_SW2,
+		.val	= KVM_PGTABLE_PROT_SW2,
+		.set	= "SW2",
+	}, {
+		.mask   = KVM_PGTABLE_PROT_SW3,
+		.val	= KVM_PGTABLE_PROT_SW3,
+		.set	= "SW3",
+	},
+};
+
 struct pg_level {
 	const struct prot_bits *bits;
 	const char *name;
@@ -202,6 +263,26 @@ static struct pg_level pg_level[] = {
 	},
 };
 
+static struct pg_level stage2_pg_level[] = {
+	{ /* pgd */
+		.name	= "PGD",
+		.bits	= stage2_pte_bits,
+		.num	= ARRAY_SIZE(stage2_pte_bits),
+	}, { /* pud */
+		.name	= (CONFIG_PGTABLE_LEVELS > 3) ? "PUD" : "PGD",
+		.bits	= stage2_pte_bits,
+		.num	= ARRAY_SIZE(stage2_pte_bits),
+	}, { /* pmd */
+		.name	= (CONFIG_PGTABLE_LEVELS > 2) ? "PMD" : "PGD",
+		.bits	= stage2_pte_bits,
+		.num	= ARRAY_SIZE(stage2_pte_bits),
+	}, { /* pte */
+		.name	= "PTE",
+		.bits	= stage2_pte_bits,
+		.num	= ARRAY_SIZE(stage2_pte_bits),
+	},
+};
+
 static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
 			size_t num)
 {
@@ -340,6 +421,12 @@ static void __init ptdump_initialize(void)
 		if (pg_level[i].bits)
 			for (j = 0; j < pg_level[i].num; j++)
 				pg_level[i].mask |= pg_level[i].bits[j].mask;
+
+	for (i = 0; i < ARRAY_SIZE(stage2_pg_level); i++)
+		if (stage2_pg_level[i].bits)
+			for (j = 0; j < stage2_pg_level[i].num; j++)
+				stage2_pg_level[i].mask |=
+					stage2_pg_level[i].bits[j].mask;
 }
 
 static struct ptdump_info kernel_ptdump_info = {
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 06/11] arm64: ptdump: Add hooks on debugfs file operations
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (4 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 05/11] arm64: ptdump: Introduce stage-2 pagetables format description Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 07/11] arm64: ptdump: Register a debugfs entry for the host stage-2 page-tables Sebastian Ene
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Introduce callbacks that are invoked when the debugfs entry is accessed
from userspace. These hooks allow us to allocate and prepare the memory
resources used by ptdump when the debugfs file is opened, and to release
them when it is closed.
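
The pairing of the two hooks can be sketched in user space as follows.
All names are hypothetical, the "snapshot" is just a heap buffer, and
the mutex that the patch holds between open and release is omitted to
keep the example minimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct ptdump_info with hooks. */
struct dump_ctx {
	void	*snapshot;			/* resource owned while open */
	void	(*prepare)(struct dump_ctx *ctx);	/* runs at open */
	void	(*end)(struct dump_ctx *ctx);		/* runs at release */
};

/* Example prepare hook: allocate the buffer used during the walk. */
static void snap_prepare(struct dump_ctx *ctx)
{
	ctx->snapshot = calloc(1, 4096);
}

/* Example end hook: drop the buffer again. */
static void snap_end(struct dump_ctx *ctx)
{
	free(ctx->snapshot);
	ctx->snapshot = NULL;
}

/* Model of the open path: run the optional prepare hook. */
static void ctx_open(struct dump_ctx *ctx)
{
	assert(ctx != NULL);
	if (ctx->prepare)
		ctx->prepare(ctx);
}

/* Model of the release path: run the optional end hook. */
static void ctx_release(struct dump_ctx *ctx)
{
	assert(ctx != NULL);
	if (ctx->end)
		ctx->end(ctx);
}
```

Contexts that leave both hooks NULL behave exactly as before, which is
how the existing kernel page-table dump remains unaffected.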

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h |  3 +++
 arch/arm64/mm/ptdump.c          |  1 +
 arch/arm64/mm/ptdump_debugfs.c  | 34 ++++++++++++++++++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 1f6e0aabf16a..88dcab1dab97 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -19,7 +19,10 @@ struct ptdump_info {
 	struct mm_struct		*mm;
 	const struct addr_marker	*markers;
 	unsigned long			base_addr;
+	void (*ptdump_prepare_walk)(struct ptdump_info *info);
 	void (*ptdump_walk)(struct seq_file *s, struct ptdump_info *info);
+	void (*ptdump_end_walk)(struct ptdump_info *info);
+	struct mutex			file_lock;
 };
 
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 58a4ea975497..fe239b9af50c 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -24,6 +24,7 @@
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptdump.h>
+#include <asm/kvm_pkvm.h>
 #include <asm/kvm_pgtable.h>
 
 
diff --git a/arch/arm64/mm/ptdump_debugfs.c b/arch/arm64/mm/ptdump_debugfs.c
index 7564519db1e6..14619452dd8d 100644
--- a/arch/arm64/mm/ptdump_debugfs.c
+++ b/arch/arm64/mm/ptdump_debugfs.c
@@ -15,7 +15,39 @@ static int ptdump_show(struct seq_file *m, void *v)
 	put_online_mems();
 	return 0;
 }
-DEFINE_SHOW_ATTRIBUTE(ptdump);
+
+static int ptdump_open(struct inode *inode, struct file *file)
+{
+	int ret;
+	struct ptdump_info *info = inode->i_private;
+
+	ret = single_open(file, ptdump_show, inode->i_private);
+	if (!ret && info->ptdump_prepare_walk) {
+		mutex_lock(&info->file_lock);
+		info->ptdump_prepare_walk(info);
+	}
+	return ret;
+}
+
+static int ptdump_release(struct inode *inode, struct file *file)
+{
+	struct ptdump_info *info = inode->i_private;
+
+	if (info->ptdump_end_walk) {
+		info->ptdump_end_walk(info);
+		mutex_unlock(&info->file_lock);
+	}
+
+	return single_release(inode, file);
+}
+
+static const struct file_operations ptdump_fops = {
+	.owner		= THIS_MODULE,
+	.open		= ptdump_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= ptdump_release,
+};
 
 void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name)
 {
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 07/11] arm64: ptdump: Register a debugfs entry for the host stage-2 page-tables
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (5 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 06/11] arm64: ptdump: Add hooks on debugfs file operations Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 08/11] arm64: ptdump: Parse the host stage-2 page-tables from the snapshot Sebastian Ene
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Initialize the structure used to keep the state of the host stage-2 ptdump
walker when pKVM is enabled. Create a new debugfs entry for the host
stage-2 pagetables and hook the callbacks invoked when the entry is
accessed. When the debugfs file is opened, allocate the memory that will
be shared with the hypervisor for saving the pagetable snapshot. On close,
unshare that memory from the hypervisor and release it.
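
The share-then-unwind pattern used by the prepare callback can be sketched
in userspace as follows. This is only an illustration: mock_share() and
mock_unshare() are hypothetical stand-ins for the __pkvm_host_share_hyp and
__pkvm_host_unshare_hyp hypercalls, and NR_PAGES is an arbitrary constant.
The point is that a failure part-way through must unshare exactly the pages
that were already shared, in reverse order.

```c
#include <assert.h>

#define NR_PAGES 4

/* Hypothetical stand-ins for the share/unshare hypercalls. */
static int shared_count;
static int fail_at = -1;	/* index of the share call that should fail */

static int mock_share(int idx)
{
	if (idx == fail_at)
		return -1;
	shared_count++;
	return 0;
}

static void mock_unshare(void)
{
	shared_count--;
}

/*
 * Share NR_PAGES pages. On failure, unshare the pages that were already
 * shared, walking backwards from the last successful index.
 */
static int share_all(void)
{
	int i;

	for (i = 0; i < NR_PAGES; i++)
		if (mock_share(i))
			goto unwind;
	return 0;

unwind:
	for (i = i - 1; i >= 0; i--)
		mock_unshare();
	return -1;
}
```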

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h |   2 +
 arch/arm64/kvm/Kconfig          |  12 +++
 arch/arm64/mm/ptdump.c          | 161 ++++++++++++++++++++++++++++++++
 3 files changed, 175 insertions(+)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 88dcab1dab97..35b883524462 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -23,6 +23,8 @@ struct ptdump_info {
 	void (*ptdump_walk)(struct seq_file *s, struct ptdump_info *info);
 	void (*ptdump_end_walk)(struct ptdump_info *info);
 	struct mutex			file_lock;
+	size_t				mc_len;
+	void				*priv;
 };
 
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 83c1e09be42e..4b1847704bb3 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -71,4 +71,16 @@ config PROTECTED_NVHE_STACKTRACE
 
 	  If unsure, or not using protected nVHE (pKVM), say N.
 
+config NVHE_EL2_PTDUMP_DEBUGFS
+	bool "Present the stage-2 pagetables to debugfs"
+	depends on NVHE_EL2_DEBUG && PTDUMP_DEBUGFS && KVM
+	help
+	  Say Y here if you want to show the stage-2 kernel pagetables
+	  layout in a debugfs file. This information is only useful for kernel developers
+	  who are working in architecture specific areas of the kernel.
+	  It is probably not a good idea to enable this feature in a production
+	  kernel.
+
+	  If in doubt, say N.
+
 endif # VIRTUALIZATION
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index fe239b9af50c..7c78b8994ca1 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -466,6 +466,165 @@ void ptdump_check_wx(void)
 		pr_info("Checked W+X mappings: passed, no W+X pages found\n");
 }
 
+#ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
+static struct ptdump_info stage2_kernel_ptdump_info;
+
+static phys_addr_t ptdump_host_pa(void *addr)
+{
+	return __pa(addr);
+}
+
+static void *ptdump_host_va(phys_addr_t phys)
+{
+	return __va(phys);
+}
+
+static size_t stage2_get_pgd_len(void)
+{
+	u64 mmfr0, mmfr1, vtcr;
+	u32 phys_shift = get_kvm_ipa_limit();
+
+	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
+
+	return kvm_pgtable_stage2_pgd_size(vtcr);
+}
+
+static void stage2_ptdump_prepare_walk(struct ptdump_info *info)
+{
+	struct kvm_pgtable_snapshot *snapshot;
+	int ret, pgd_index, mc_index, pgd_pages_sz;
+	void *page_hva;
+	phys_addr_t pgd;
+
+	snapshot = alloc_pages_exact(PAGE_SIZE, GFP_KERNEL_ACCOUNT);
+	if (!snapshot)
+		return;
+
+	memset(snapshot, 0, PAGE_SIZE);
+	ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp, virt_to_pfn(snapshot));
+	if (ret)
+		goto free_snapshot;
+
+	snapshot->pgd_len = stage2_get_pgd_len();
+	pgd_pages_sz = snapshot->pgd_len / PAGE_SIZE;
+	snapshot->pgd_hva = alloc_pages_exact(snapshot->pgd_len,
+					      GFP_KERNEL_ACCOUNT);
+	if (!snapshot->pgd_hva)
+		goto unshare_snapshot;
+
+	for (pgd_index = 0; pgd_index < pgd_pages_sz; pgd_index++) {
+		page_hva = snapshot->pgd_hva + pgd_index * PAGE_SIZE;
+		ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp,
+					virt_to_pfn(page_hva));
+		if (ret)
+			goto unshare_pgd_pages;
+	}
+
+	for (mc_index = 0; mc_index < info->mc_len; mc_index++) {
+		page_hva = alloc_pages_exact(PAGE_SIZE, GFP_KERNEL_ACCOUNT);
+		if (!page_hva)
+			goto free_memcache_pages;
+
+		push_hyp_memcache(&snapshot->mc, page_hva, ptdump_host_pa);
+		ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp,
+					virt_to_pfn(page_hva));
+		if (ret) {
+			pop_hyp_memcache(&snapshot->mc, ptdump_host_va);
+			free_pages_exact(page_hva, PAGE_SIZE);
+			goto free_memcache_pages;
+		}
+	}
+
+	ret = kvm_call_hyp_nvhe(__pkvm_copy_host_stage2, snapshot);
+	if (ret)
+		goto free_memcache_pages;
+
+	pgd = (phys_addr_t)snapshot->pgtable.pgd;
+	snapshot->pgtable.pgd = phys_to_virt(pgd);
+	info->priv = snapshot;
+	return;
+
+free_memcache_pages:
+	page_hva = pop_hyp_memcache(&snapshot->mc, ptdump_host_va);
+	while (page_hva) {
+		ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+					virt_to_pfn(page_hva));
+		WARN_ON(ret);
+		free_pages_exact(page_hva, PAGE_SIZE);
+		page_hva = pop_hyp_memcache(&snapshot->mc, ptdump_host_va);
+	}
+unshare_pgd_pages:
+	pgd_index = pgd_index - 1;
+	for (; pgd_index >= 0; pgd_index--) {
+		page_hva = snapshot->pgd_hva + pgd_index * PAGE_SIZE;
+		ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+					virt_to_pfn(page_hva));
+		WARN_ON(ret);
+	}
+	free_pages_exact(snapshot->pgd_hva, snapshot->pgd_len);
+unshare_snapshot:
+	WARN_ON(kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+				  virt_to_pfn(snapshot)));
+free_snapshot:
+	free_pages_exact(snapshot, PAGE_SIZE);
+	info->priv = NULL;
+}
+
+static void stage2_ptdump_end_walk(struct ptdump_info *info)
+{
+	struct kvm_pgtable_snapshot *snapshot = info->priv;
+	void *page_hva;
+	int pgd_index, ret, pgd_pages_sz;
+
+	if (!snapshot)
+		return;
+
+	page_hva = pop_hyp_memcache(&snapshot->mc, ptdump_host_va);
+	while (page_hva) {
+		ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+					virt_to_pfn(page_hva));
+		WARN_ON(ret);
+		free_pages_exact(page_hva, PAGE_SIZE);
+		page_hva = pop_hyp_memcache(&snapshot->mc, ptdump_host_va);
+	}
+
+	pgd_pages_sz = snapshot->pgd_len / PAGE_SIZE;
+	for (pgd_index = 0; pgd_index < pgd_pages_sz; pgd_index++) {
+		page_hva = snapshot->pgd_hva + pgd_index * PAGE_SIZE;
+		ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+					virt_to_pfn(page_hva));
+		WARN_ON(ret);
+	}
+
+	free_pages_exact(snapshot->pgd_hva, snapshot->pgd_len);
+	WARN_ON(kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp,
+				  virt_to_pfn(snapshot)));
+	free_pages_exact(snapshot, PAGE_SIZE);
+	info->priv = NULL;
+}
+#endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */
+
+static void __init ptdump_register_host_stage2(void)
+{
+#ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
+	if (!is_protected_kvm_enabled())
+		return;
+
+	stage2_kernel_ptdump_info = (struct ptdump_info) {
+		.mc_len			= host_s2_pgtable_pages(),
+		.ptdump_prepare_walk	= stage2_ptdump_prepare_walk,
+		.ptdump_end_walk	= stage2_ptdump_end_walk,
+	};
+
+	mutex_init(&stage2_kernel_ptdump_info.file_lock);
+
+	ptdump_debugfs_register(&stage2_kernel_ptdump_info,
+				"host_stage2_kernel_page_tables");
+#endif
+}
+
 static int __init ptdump_init(void)
 {
 	address_markers[PAGE_END_NR].start_address = PAGE_END;
@@ -474,6 +633,8 @@ static int __init ptdump_init(void)
 #endif
 	ptdump_initialize();
 	ptdump_debugfs_register(&kernel_ptdump_info, "kernel_page_tables");
+	ptdump_register_host_stage2();
+
 	return 0;
 }
 device_initcall(ptdump_init);
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 08/11] arm64: ptdump: Parse the host stage-2 page-tables from the snapshot
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (6 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 07/11] arm64: ptdump: Register a debugfs entry for the host stage-2 page-tables Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 09/11] arm64: ptdump: Interpret memory attributes based on runtime configuration Sebastian Ene
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Add a walker function which configures ptdump to parse the page-tables
from the snapshot. Convert the physical address of the pagetable's start
to a host virtual address and use the stage-2 pagetable walker to parse
the page-table descriptors.
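
The filtering done by the visitor can be pictured with a small userspace
sketch. The types and names here (visit_state, visit, walk) are invented for
the example and a flat array stands in for the real table walk. As in
stage2_ptdump_visitor() below, a descriptor is reported only when it has at
least one bit set in the per-level mask, so an all-zero descriptor is
skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_LEVELS 4

struct visit_state {
	uint64_t level_mask[NR_LEVELS];	/* OR of the attribute masks known per level */
	unsigned int reported;		/* entries handed on to the note_page step */
};

/* Mirrors the visitor's filter: report only entries with a described bit set. */
static int visit(struct visit_state *st, unsigned int level, uint64_t desc)
{
	if (st->level_mask[level] & desc)
		st->reported++;
	return 0;
}

/* Stand-in for the table walk: call the visitor on every descriptor. */
static void walk(struct visit_state *st, unsigned int level,
		 const uint64_t *descs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		visit(st, level, descs[i]);
}
```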

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/mm/ptdump.c | 63 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 7c78b8994ca1..3ba4848272df 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -479,6 +479,11 @@ static void *ptdump_host_va(phys_addr_t phys)
 	return __va(phys);
 }
 
+static struct kvm_pgtable_mm_ops host_mmops = {
+	.phys_to_virt	=	ptdump_host_va,
+	.virt_to_phys	=	ptdump_host_pa,
+};
+
 static size_t stage2_get_pgd_len(void)
 {
 	u64 mmfr0, mmfr1, vtcr;
@@ -604,6 +609,63 @@ static void stage2_ptdump_end_walk(struct ptdump_info *info)
 	free_pages_exact(snapshot, PAGE_SIZE);
 	info->priv = NULL;
 }
+
+static int stage2_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
+				 enum kvm_pgtable_walk_flags visit)
+{
+	struct pg_state *st = ctx->arg;
+	struct ptdump_state *pt_st = &st->ptdump;
+
+	if (st->pg_level[ctx->level].mask & ctx->old)
+		pt_st->note_page(pt_st, ctx->addr, ctx->level, ctx->old);
+
+	return 0;
+}
+
+static void stage2_ptdump_walk(struct seq_file *s, struct ptdump_info *info)
+{
+	struct kvm_pgtable_snapshot *snapshot = info->priv;
+	struct pg_state st;
+	struct kvm_pgtable *pgtable;
+	u64 start_ipa = 0, end_ipa;
+	struct addr_marker ipa_address_markers[3];
+	struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
+		.cb	= stage2_ptdump_visitor,
+		.arg	= &st,
+		.flags	= KVM_PGTABLE_WALK_LEAF,
+	};
+
+	if (snapshot == NULL || !snapshot->pgtable.pgd)
+		return;
+
+	pgtable = &snapshot->pgtable;
+	pgtable->mm_ops = &host_mmops;
+	end_ipa = BIT(pgtable->ia_bits) - 1;
+
+	memset(&ipa_address_markers[0], 0, sizeof(ipa_address_markers));
+
+	ipa_address_markers[0].start_address = start_ipa;
+	ipa_address_markers[0].name = "IPA start";
+
+	ipa_address_markers[1].start_address = end_ipa;
+	ipa_address_markers[1].name = "IPA end";
+
+	st = (struct pg_state) {
+		.seq		= s,
+		.marker		= &ipa_address_markers[0],
+		.level		= pgtable->start_level - 1,
+		.pg_level	= &stage2_pg_level[0],
+		.ptdump		= {
+			.note_page	= note_page,
+			.range		= (struct ptdump_range[]) {
+				{start_ipa,	end_ipa},
+				{0,		0},
+			},
+		},
+	};
+
+	kvm_pgtable_walk(pgtable, start_ipa, end_ipa, &walker);
+}
 #endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */
 
 static void __init ptdump_register_host_stage2(void)
@@ -616,6 +678,7 @@ static void __init ptdump_register_host_stage2(void)
 		.mc_len			= host_s2_pgtable_pages(),
 		.ptdump_prepare_walk	= stage2_ptdump_prepare_walk,
 		.ptdump_end_walk	= stage2_ptdump_end_walk,
+		.ptdump_walk		= stage2_ptdump_walk,
 	};
 
 	mutex_init(&stage2_kernel_ptdump_info.file_lock);
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 09/11] arm64: ptdump: Interpret memory attributes based on runtime configuration
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (7 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 08/11] arm64: ptdump: Parse the host stage-2 page-tables from the snapshot Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 10/11] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

When FWB is used the memory attributes stored in the descriptors have a
different bitfield layout. Introduce two callbacks that verify the current
runtime configuration before parsing the attribute fields.
Add support for parsing the memory attribute fields from the page table
descriptors.
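
A userspace sketch of the gating idea follows. It is a simplification and
not the patch's code: decode() returns the first matching label rather than
printing a set/clear string for every entry as dump_prot() does, struct cfg
is a made-up stand-in for the runtime state, and the 0x5 encoding is
illustrative only (the real MT_S2_* values come from the kernel headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef bool (*feature_cb)(const void *ctx);

struct prot_bit {
	uint64_t mask, val;
	const char *set;
	feature_cb feature_on;	/* entry skipped if this returns false */
	feature_cb feature_off;	/* entry skipped if this returns true */
};

struct cfg { bool fwb; };	/* hypothetical runtime configuration */

static bool fwb_enabled(const void *ctx)
{
	return ((const struct cfg *)ctx)->fwb;
}

/* Return the label of the first applicable entry matching desc, or NULL. */
static const char *decode(const struct prot_bit *bits, size_t n,
			  uint64_t desc, const struct cfg *cfg)
{
	for (size_t i = 0; i < n; i++) {
		if (bits[i].feature_on && !bits[i].feature_on(cfg))
			continue;
		if (bits[i].feature_off && bits[i].feature_off(cfg))
			continue;
		if ((desc & bits[i].mask) == bits[i].val)
			return bits[i].set;
	}
	return NULL;
}

/* The same raw value gets a different label depending on the FWB setting. */
static const struct prot_bit demo_bits[] = {
	{ 0xf, 0x5, "NORMAL",     NULL,        fwb_enabled },
	{ 0xf, 0x5, "NORMAL FWB", fwb_enabled, NULL },
};
```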

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/mm/ptdump.c | 66 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 3ba4848272df..5f9a334b0f0c 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -85,13 +85,22 @@ struct pg_state {
 	bool check_wx;
 	unsigned long wx_pages;
 	unsigned long uxn_pages;
+	struct ptdump_info *info;
 };
 
+/*
+ * This callback checks the runtime configuration before interpreting the
+ * attributes defined in the prot_bits.
+ */
+typedef bool (*is_feature_cb)(const void *ctx);
+
 struct prot_bits {
 	u64		mask;
 	u64		val;
 	const char	*set;
 	const char	*clear;
+	is_feature_cb   feature_on;  /* bit ignored if the callback returns false */
+	is_feature_cb   feature_off; /* bit ignored if the callback returns true */
 };
 
 static const struct prot_bits pte_bits[] = {
@@ -173,6 +182,34 @@ static const struct prot_bits pte_bits[] = {
 	}
 };
 
+static bool is_fwb_enabled(const void *ctx)
+{
+	const struct pg_state *st = ctx;
+	const struct ptdump_info *info = st->info;
+	struct kvm_pgtable_snapshot *snapshot = info->priv;
+	struct kvm_pgtable *pgtable = &snapshot->pgtable;
+
+	bool fwb_enabled = false;
+
+	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+		fwb_enabled = !(pgtable->flags & KVM_PGTABLE_S2_NOFWB);
+
+	return fwb_enabled;
+}
+
+static bool is_table_bit_ignored(const void *ctx)
+{
+	const struct pg_state *st = ctx;
+
+	if (!(st->current_prot & PTE_VALID))
+		return true;
+
+	if (st->level == CONFIG_PGTABLE_LEVELS)
+		return true;
+
+	return false;
+}
+
 static const struct prot_bits stage2_pte_bits[] = {
 	{
 		.mask	= PTE_VALID,
@@ -214,6 +251,27 @@ static const struct prot_bits stage2_pte_bits[] = {
 		.val	= PTE_TABLE_BIT,
 		.set	= "   ",
 		.clear	= "BLK",
+		.feature_off	= is_table_bit_ignored,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_VALID,
+		.set	= "DEVICE/nGnRE",
+		.feature_off	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_FWB_DEVICE_nGnRE) | PTE_VALID,
+		.set	= "DEVICE/nGnRE FWB",
+		.feature_on	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_VALID,
+		.set	= "MEM/NORMAL",
+		.feature_off	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | PTE_VALID,
+		.val	= PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) | PTE_VALID,
+		.set	= "MEM/NORMAL FWB",
+		.feature_on	= is_fwb_enabled,
 	}, {
 		.mask	= KVM_PGTABLE_PROT_SW0,
 		.val	= KVM_PGTABLE_PROT_SW0,
@@ -285,13 +343,19 @@ static struct pg_level stage2_pg_level[] = {
 };
 
 static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
-			size_t num)
+		      size_t num)
 {
 	unsigned i;
 
 	for (i = 0; i < num; i++, bits++) {
 		const char *s;
 
+		if (bits->feature_on && !bits->feature_on(st))
+			continue;
+
+		if (bits->feature_off && bits->feature_off(st))
+			continue;
+
 		if ((st->current_prot & bits->mask) == bits->val)
 			s = bits->set;
 		else
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 10/11] arm64: ptdump: Interpret pKVM ownership annotations
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (8 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 09/11] arm64: ptdump: Interpret memory attributes based on runtime configuration Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-19 14:40 ` [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping Sebastian Ene
  2023-10-20  8:19 ` [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Vincent Donnefort
  11 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Add support for interpreting pKVM invalid stage-2 descriptors that hold
ownership information. We use these descriptors to keep track of the
memory donations from the host side.
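
To make the decoding concrete, here is a userspace sketch. The bit layout is
an assumption made for the example (valid bit at bit 0, owner ID in bits
[9:2] of an invalid descriptor); the authoritative definition is
KVM_INVALID_PTE_OWNER_MASK in the kernel sources. The DEMO_* names are
hypothetical. As in the patch, only the HYP and FF-A owners get a label and
host-owned entries print nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout, for illustration only. */
#define DEMO_PTE_VALID		(UINT64_C(1) << 0)
#define DEMO_OWNER_SHIFT	2
#define DEMO_OWNER_MASK		(UINT64_C(0xff) << DEMO_OWNER_SHIFT)

/* Same ordering as enum pkvm_component_id in the patch. */
enum demo_owner { DEMO_ID_HOST, DEMO_ID_HYP, DEMO_ID_FFA };

/* Return the owner label of an invalid descriptor, or NULL if none applies. */
static const char *owner_name(uint64_t desc)
{
	if (desc & DEMO_PTE_VALID)
		return NULL;	/* valid entries carry no ownership annotation */

	switch ((desc & DEMO_OWNER_MASK) >> DEMO_OWNER_SHIFT) {
	case DEMO_ID_HYP:
		return "HYP";
	case DEMO_ID_FFA:
		return "FF-A";
	default:
		return NULL;	/* host-owned or unknown: no label */
	}
}
```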

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h          |  7 +++++++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  7 -------
 arch/arm64/mm/ptdump.c                        | 10 ++++++++++
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 913f34d75b29..938baffa7d4d 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -87,6 +87,13 @@ typedef u64 kvm_pte_t;
  */
 #define KVM_INVALID_PTE_LOCKED		BIT(10)
 
+/* This corresponds to page-table locking order */
+enum pkvm_component_id {
+	PKVM_ID_HOST,
+	PKVM_ID_HYP,
+	PKVM_ID_FFA,
+};
+
 static inline bool kvm_pte_valid(kvm_pte_t pte)
 {
 	return pte & KVM_PTE_VALID;
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 9cfb35d68850..cc2c439ffe75 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -53,13 +53,6 @@ struct host_mmu {
 };
 extern struct host_mmu host_mmu;
 
-/* This corresponds to page-table locking order */
-enum pkvm_component_id {
-	PKVM_ID_HOST,
-	PKVM_ID_HYP,
-	PKVM_ID_FFA,
-};
-
 extern unsigned long hyp_nr_cpus;
 
 int __pkvm_prot_finalize(void);
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 5f9a334b0f0c..4687840dcb69 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -272,6 +272,16 @@ static const struct prot_bits stage2_pte_bits[] = {
 		.val	= PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) | PTE_VALID,
 		.set	= "MEM/NORMAL FWB",
 		.feature_on	= is_fwb_enabled,
+	}, {
+		.mask	= KVM_INVALID_PTE_OWNER_MASK | PTE_VALID,
+		.val	= FIELD_PREP_CONST(KVM_INVALID_PTE_OWNER_MASK,
+					   PKVM_ID_HYP),
+		.set	= "HYP",
+	}, {
+		.mask	= KVM_INVALID_PTE_OWNER_MASK | PTE_VALID,
+		.val	= FIELD_PREP_CONST(KVM_INVALID_PTE_OWNER_MASK,
+					   PKVM_ID_FFA),
+		.set	= "FF-A",
 	}, {
 		.mask	= KVM_PGTABLE_PROT_SW0,
 		.val	= KVM_PGTABLE_PROT_SW0,
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (9 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 10/11] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
@ 2023-10-19 14:40 ` Sebastian Ene
  2023-10-20  8:40   ` Vincent Donnefort
  2023-10-20  8:19 ` [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Vincent Donnefort
  11 siblings, 1 reply; 17+ messages in thread
From: Sebastian Ene @ 2023-10-19 14:40 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, akpm, maz
  Cc: linux-arm-kernel, linux-kernel, kernel-team, vdonnefort, qperret,
	smostafa, Sebastian Ene

Register a debugfs file on guest creation so that the guest's stage-2
translation tables can be viewed with ptdump. This assumes that the host
is in control of the guest stage-2 and has direct access to the pagetables.
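
The per-guest file name is built from a running counter. The userspace
sketch below reproduces the format string and buffer size from the diff;
guest_file_name() itself is a hypothetical helper that additionally reports
truncation, which the patch does not check. Since the counter in the patch
is a u16, the number has at most five digits, and the counter wraps after
65536 guests have been created.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define GUEST_NAME_LEN 32U

/* Format the per-guest file name; return -1 if it would be truncated. */
static int guest_file_name(char *buf, size_t len, unsigned int guest_no)
{
	int n = snprintf(buf, len, "%u_guest_stage2_page_tables", guest_no);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a five-digit number the name is 30 characters long, so it fits in the
32-byte buffer including the terminating NUL.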

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 arch/arm64/include/asm/ptdump.h | 21 +++++++--
 arch/arm64/kvm/mmu.c            |  3 ++
 arch/arm64/mm/ptdump.c          | 84 +++++++++++++++++++++++++++++++++
 arch/arm64/mm/ptdump_debugfs.c  |  5 +-
 4 files changed, 108 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 35b883524462..be86244d532b 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -5,6 +5,8 @@
 #ifndef __ASM_PTDUMP_H
 #define __ASM_PTDUMP_H
 
+#include <asm/kvm_pgtable.h>
+
 #ifdef CONFIG_PTDUMP_CORE
 
 #include <linux/mm_types.h>
@@ -30,14 +32,27 @@ struct ptdump_info {
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
 #ifdef CONFIG_PTDUMP_DEBUGFS
 #define EFI_RUNTIME_MAP_END	DEFAULT_MAP_WINDOW_64
-void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name);
+struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
+				       const char *name);
 #else
-static inline void ptdump_debugfs_register(struct ptdump_info *info,
-					   const char *name) { }
+static inline struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
+						     const char *name)
+{
+	return NULL;
+}
 #endif
 void ptdump_check_wx(void);
 #endif /* CONFIG_PTDUMP_CORE */
 
+#ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
+void ptdump_register_guest_stage2(struct kvm_pgtable *pgt, void *lock);
+void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt);
+#else
+static inline void ptdump_register_guest_stage2(struct kvm_pgtable *pgt,
+						void *lock) { }
+static inline void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt) { }
+#endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */
+
 #ifdef CONFIG_DEBUG_WX
 #define debug_checkwx()	ptdump_check_wx()
 #else
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 482280fe22d7..e47988dba34d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -11,6 +11,7 @@
 #include <linux/sched/signal.h>
 #include <trace/events/kvm.h>
 #include <asm/pgalloc.h>
+#include <asm/ptdump.h>
 #include <asm/cacheflush.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
@@ -908,6 +909,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
 	if (err)
 		goto out_free_pgtable;
 
+	ptdump_register_guest_stage2(pgt, &kvm->mmu_lock);
 	mmu->last_vcpu_ran = alloc_percpu(typeof(*mmu->last_vcpu_ran));
 	if (!mmu->last_vcpu_ran) {
 		err = -ENOMEM;
@@ -1021,6 +1023,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 	write_unlock(&kvm->mmu_lock);
 
 	if (pgt) {
+		ptdump_unregister_guest_stage2(pgt);
 		kvm_pgtable_stage2_destroy(pgt);
 		kfree(pgt);
 	}
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 4687840dcb69..facfb15468f5 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -26,6 +26,7 @@
 #include <asm/ptdump.h>
 #include <asm/kvm_pkvm.h>
 #include <asm/kvm_pgtable.h>
+#include <asm/kvm_host.h>
 
 
 enum address_markers_idx {
@@ -543,6 +544,22 @@ void ptdump_check_wx(void)
 #ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
 static struct ptdump_info stage2_kernel_ptdump_info;
 
+#define GUEST_NAME_LEN	(32U)
+
+struct ptdump_registered_guest {
+	struct list_head		reg_list;
+	struct ptdump_info		info;
+	struct mm_struct		mem;
+	struct kvm_pgtable_snapshot	snapshot;
+	struct dentry			*dentry;
+	rwlock_t			*lock;
+	char				reg_name[GUEST_NAME_LEN];
+};
+
+static LIST_HEAD(ptdump_guest_list);
+static DEFINE_MUTEX(ptdump_list_lock);
+static u16 guest_no;
+
 static phys_addr_t ptdump_host_pa(void *addr)
 {
 	return __pa(addr);
@@ -740,6 +757,73 @@ static void stage2_ptdump_walk(struct seq_file *s, struct ptdump_info *info)
 
 	kvm_pgtable_walk(pgtable, start_ipa, end_ipa, &walker);
 }
+
+static void guest_stage2_ptdump_walk(struct seq_file *s,
+				     struct ptdump_info *info)
+{
+	struct kvm_pgtable_snapshot *snapshot = info->priv;
+	struct ptdump_registered_guest *guest;
+
+	guest = container_of(snapshot, struct ptdump_registered_guest,
+			     snapshot);
+	read_lock(guest->lock);
+	stage2_ptdump_walk(s, info);
+	read_unlock(guest->lock);
+}
+
+void ptdump_register_guest_stage2(struct kvm_pgtable *pgt, void *lock)
+{
+	struct ptdump_registered_guest *guest;
+	struct dentry *d;
+
+	if (pgt == NULL || lock == NULL)
+		return;
+
+	guest = kzalloc(sizeof(struct ptdump_registered_guest), GFP_KERNEL);
+	if (!guest)
+		return;
+
+	memcpy(&guest->snapshot.pgtable, pgt, sizeof(struct kvm_pgtable));
+	guest->info = (struct ptdump_info) {
+		.ptdump_walk		= guest_stage2_ptdump_walk,
+		.priv			= &guest->snapshot
+	};
+
+	mutex_init(&guest->info.file_lock);
+	guest->lock = lock;
+	mutex_lock(&ptdump_list_lock);
+	snprintf(guest->reg_name, GUEST_NAME_LEN,
+		 "%u_guest_stage2_page_tables", guest_no++);
+	d = ptdump_debugfs_register(&guest->info, guest->reg_name);
+	if (!d) {
+		mutex_unlock(&ptdump_list_lock);
+		goto free_entry;
+	}
+
+	guest->dentry = d;
+	list_add(&guest->reg_list, &ptdump_guest_list);
+	mutex_unlock(&ptdump_list_lock);
+	return;
+
+free_entry:
+	kfree(guest);
+}
+
+void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt)
+{
+	struct ptdump_registered_guest *guest;
+
+	mutex_lock(&ptdump_list_lock);
+	list_for_each_entry(guest, &ptdump_guest_list, reg_list) {
+		if (guest->snapshot.pgtable.pgd == pgt->pgd) {
+			list_del(&guest->reg_list);
+			debugfs_remove(guest->dentry);
+			kfree(guest);
+			break;
+		}
+	}
+	mutex_unlock(&ptdump_list_lock);
+}
 #endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */
 
 static void __init ptdump_register_host_stage2(void)
diff --git a/arch/arm64/mm/ptdump_debugfs.c b/arch/arm64/mm/ptdump_debugfs.c
index 14619452dd8d..356753e27dee 100644
--- a/arch/arm64/mm/ptdump_debugfs.c
+++ b/arch/arm64/mm/ptdump_debugfs.c
@@ -49,7 +49,8 @@ static const struct file_operations ptdump_fops = {
 	.release	= ptdump_release,
 };
 
-void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name)
+struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
+				       const char *name)
 {
-	debugfs_create_file(name, 0400, NULL, info, &ptdump_fops);
+	return debugfs_create_file(name, 0400, NULL, info, &ptdump_fops);
 }
-- 
2.42.0.655.g421f12c284-goog



* Re: [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables
  2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
                   ` (10 preceding siblings ...)
  2023-10-19 14:40 ` [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping Sebastian Ene
@ 2023-10-20  8:19 ` Vincent Donnefort
  2023-10-23 14:32   ` Sebastian Ene
  11 siblings, 1 reply; 17+ messages in thread
From: Vincent Donnefort @ 2023-10-20  8:19 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, catalin.marinas, mark.rutland, akpm, maz, linux-arm-kernel,
	linux-kernel, kernel-team, qperret, smostafa

On Thu, Oct 19, 2023 at 02:40:21PM +0000, Sebastian Ene wrote:
> Hi,
> 
> This can be used as a debugging tool for dumping the second stage
> page-tables under debugfs.
> 
> From the previous feedback I re-worked the series and added support for
> guest page-tables dumping under VHE & nVHE configuration. I extended the
> list of reviewers as I missed the interested parties in the first round. 
> 
> When CONFIG_NVHE_EL2_PTDUMP_DEBUGFS is enabled under pKVM environment,
> ptdump registers the 'host_stage2_kernel_page_tables' entry with debugfs.
> Guests are registering a file named '%u_guest_stage2_page_tables' when
> they are created. 

I believe guest entries should also be available for nVHE and VHE.

> 
> This allows us to dump the host stage-2 page-tables with the following command:
> cat /sys/kernel/debug/host_stage2_kernel_page_tables.

As it needs the debugfs anyway, this should probably live in the kvm/ debugfs
folder, while the VMs ptdump should be placed in their respective folder.

This is quite easy, you should get access to the global kvm_debugfs_dir and
struct kvm->debugfs_dentry.

> 
> The output is showing the entries in the following format:
> <IPA range> <size> <descriptor type> <access permissions> <mem_attributes>
> 
> The tool interprets the pKVM ownership annotation stored in the invalid
> entries and dumps to the console the ownership information. To be able
> to access the host stage-2 page-tables from the kernel, a new hypervisor
> call was introduced which allows us to snapshot the page-tables in a host
> provided buffer. The hypervisor call is hidden behind CONFIG_NVHE_EL2_DEBUG
> as this should be used under debugging environment.
> 
> Link to the first version:
> https://lore.kernel.org/all/20230927112517.2631674-1-sebastianene@google.com/
> 
> Changelog:
>   v1 -> v2:
>   * use the stage-2 pagetable walker for dumping descriptors instead of
>     the one provided by ptdump.
> 
>   * support for guests pagetables dumping under VHE/nVHE non-protected
> 
> Thanks,
> 
> 
> Sebastian Ene (11):
>   KVM: arm64: Add snap shooting the host stage-2 pagetables
>   arm64: ptdump: Use the mask from the state structure
>   arm64: ptdump: Add the walker function to the ptdump info structure
>   KVM: arm64: Move pagetable definitions to common header
>   arm64: ptdump: Introduce stage-2 pagetables format description
>   arm64: ptdump: Add hooks on debugfs file operations
>   arm64: ptdump: Register a debugfs entry for the host stage-2
>     page-tables
>   arm64: ptdump: Parse the host stage-2 page-tables from the snapshot
>   arm64: ptdump: Interpret memory attributes based on runtime
>     configuration
>   arm64: ptdump: Interpret pKVM ownership annotations
>   arm64: ptdump: Add support for guest stage-2 pagetables dumping
> 
>  arch/arm64/include/asm/kvm_asm.h              |   1 +
>  arch/arm64/include/asm/kvm_pgtable.h          |  85 +++
>  arch/arm64/include/asm/ptdump.h               |  27 +-
>  arch/arm64/kvm/Kconfig                        |  12 +
>  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   8 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  18 +
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 103 ++++
>  arch/arm64/kvm/hyp/pgtable.c                  |  98 ++--
>  arch/arm64/kvm/mmu.c                          |   3 +
>  arch/arm64/mm/ptdump.c                        | 487 +++++++++++++++++-
>  arch/arm64/mm/ptdump_debugfs.c                |  42 +-
>  11 files changed, 822 insertions(+), 62 deletions(-)
> 
> -- 
> 2.42.0.655.g421f12c284-goog
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping
  2023-10-19 14:40 ` [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping Sebastian Ene
@ 2023-10-20  8:40   ` Vincent Donnefort
  2023-10-23 14:45     ` Sebastian Ene
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Donnefort @ 2023-10-20  8:40 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: will, catalin.marinas, mark.rutland, akpm, maz, linux-arm-kernel,
	linux-kernel, kernel-team, qperret, smostafa

On Thu, Oct 19, 2023 at 02:40:33PM +0000, Sebastian Ene wrote:
> Register a debugfs file on guest creation to be able to view their
> second translation tables with ptdump. This assumes that the host is in
> control of the guest stage-2 and has direct access to the pagetables.

What about pKVM? The walker you wrote for the host stage-2 should be
reusable in that case?

> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  arch/arm64/include/asm/ptdump.h | 21 +++++++--
>  arch/arm64/kvm/mmu.c            |  3 ++
>  arch/arm64/mm/ptdump.c          | 84 +++++++++++++++++++++++++++++++++
>  arch/arm64/mm/ptdump_debugfs.c  |  5 +-
>  4 files changed, 108 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> index 35b883524462..be86244d532b 100644
> --- a/arch/arm64/include/asm/ptdump.h
> +++ b/arch/arm64/include/asm/ptdump.h
> @@ -5,6 +5,8 @@
>  #ifndef __ASM_PTDUMP_H
>  #define __ASM_PTDUMP_H
>  
> +#include <asm/kvm_pgtable.h>
> +
>  #ifdef CONFIG_PTDUMP_CORE
>  
>  #include <linux/mm_types.h>
> @@ -30,14 +32,27 @@ struct ptdump_info {
>  void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
>  #ifdef CONFIG_PTDUMP_DEBUGFS
>  #define EFI_RUNTIME_MAP_END	DEFAULT_MAP_WINDOW_64
> -void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name);
> +struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
> +				       const char *name);
>  #else
> -static inline void ptdump_debugfs_register(struct ptdump_info *info,
> -					   const char *name) { }
> +static inline struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
> +						     const char *name)
> +{
> +	return NULL;
> +}
>  #endif
>  void ptdump_check_wx(void);
>  #endif /* CONFIG_PTDUMP_CORE */
>  
> +#ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
> +void ptdump_register_guest_stage2(struct kvm_pgtable *pgt, void *lock);
> +void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt);
> +#else
> +static inline void ptdump_register_guest_stage2(struct kvm_pgtable *pgt,
> +						void *lock) { }
> +static inline void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt) { }
> +#endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */

I believe this should be compatible with VHE as well, that option should be
renamed.

> +
>  #ifdef CONFIG_DEBUG_WX
>  #define debug_checkwx()	ptdump_check_wx()
>  #else
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 482280fe22d7..e47988dba34d 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -11,6 +11,7 @@
>  #include <linux/sched/signal.h>
>  #include <trace/events/kvm.h>
>  #include <asm/pgalloc.h>
> +#include <asm/ptdump.h>
>  #include <asm/cacheflush.h>
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_mmu.h>
> @@ -908,6 +909,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
>  	if (err)
>  		goto out_free_pgtable;
>  
> +	ptdump_register_guest_stage2(pgt, &kvm->mmu_lock);
>  	mmu->last_vcpu_ran = alloc_percpu(typeof(*mmu->last_vcpu_ran));
>  	if (!mmu->last_vcpu_ran) {
>  		err = -ENOMEM;
> @@ -1021,6 +1023,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	write_unlock(&kvm->mmu_lock);
>  
>  	if (pgt) {
> +		ptdump_unregister_guest_stage2(pgt);
>  		kvm_pgtable_stage2_destroy(pgt);
>  		kfree(pgt);
>  	}
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 4687840dcb69..facfb15468f5 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -26,6 +26,7 @@
>  #include <asm/ptdump.h>
>  #include <asm/kvm_pkvm.h>
>  #include <asm/kvm_pgtable.h>
> +#include <asm/kvm_host.h>
>  
>  
>  enum address_markers_idx {
> @@ -543,6 +544,22 @@ void ptdump_check_wx(void)
>  #ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
>  static struct ptdump_info stage2_kernel_ptdump_info;
>  
> +#define GUEST_NAME_LEN	(32U)
> +
> +struct ptdump_registered_guest {
> +	struct list_head		reg_list;
> +	struct ptdump_info		info;
> +	struct mm_struct		mem;
> +	struct kvm_pgtable_snapshot	snapshot;
> +	struct dentry			*dentry;
> +	rwlock_t			*lock;
> +	char				reg_name[GUEST_NAME_LEN];
> +};
> +
> +static LIST_HEAD(ptdump_guest_list);
> +static DEFINE_MUTEX(ptdump_list_lock);
> +static u16 guest_no;

This is not robust enough: if one VM starts and then 65535 others are created
and killed, guest_no overflows. The next number is 0, which is already taken.
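
The wraparound can be reproduced in user space. A minimal sketch, assuming the
file name is built from the counter with the '%u_guest_stage2_page_tables'
format quoted in the cover letter; the helper names below are hypothetical and
are not taken from the patch:

```c
#include <stdint.h>
#include <stdio.h>

#define GUEST_NAME_LEN	32U

/*
 * Hypothetical helper: hand out the current number, then advance the
 * 16-bit counter, which wraps from 65535 back to 0.
 */
static uint16_t next_guest_no(uint16_t *counter)
{
	uint16_t id = *counter;

	*counter = (uint16_t)(*counter + 1U);
	return id;
}

/* Hypothetical helper: build the debugfs file name for a guest number. */
static int format_guest_name(char *buf, size_t len, uint16_t id)
{
	return snprintf(buf, len, "%u_guest_stage2_page_tables",
			(unsigned int)id);
}

/* Number handed out to the next guest after 'already_created' guests. */
static uint16_t guest_no_after(uint32_t already_created)
{
	uint16_t counter = 0;
	uint32_t i;

	for (i = 0; i < already_created; i++)
		(void)next_guest_no(&counter);

	return next_guest_no(&counter);
}
```

If the first guest is still alive and holds number 0, guest_no_after(65536)
yields 0 again, so format_guest_name() produces the same file name for both
guests.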

Linux has an ID allocator to solve this problem, but I don't think it is
necessary anyway. This should simply reuse the struct kvm->debugfs_dentry.
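
For reference, the in-kernel facility for this is presumably the IDA interface
(ida_alloc() / ida_free()). The idea can be sketched in user space with a small
bitmap: an ID is handed out again only after it has been released. This is an
illustration only, not the kernel implementation; SKETCH_MAX_IDS and the
sketch_* names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_IDS	128U

struct sketch_ida {
	uint8_t used[SKETCH_MAX_IDS / 8U];	/* one bit per ID */
};

static void sketch_ida_init(struct sketch_ida *ida)
{
	memset(ida->used, 0, sizeof(ida->used));
}

/* Return the lowest free ID and mark it used, or -1 if none is left. */
static int sketch_ida_alloc(struct sketch_ida *ida)
{
	unsigned int id;

	for (id = 0; id < SKETCH_MAX_IDS; id++) {
		if (!(ida->used[id / 8U] & (1U << (id % 8U)))) {
			ida->used[id / 8U] |= (uint8_t)(1U << (id % 8U));
			return (int)id;
		}
	}
	return -1;
}

/* Release an ID so that a later allocation may reuse it. */
static void sketch_ida_free(struct sketch_ida *ida, unsigned int id)
{
	if (id < SKETCH_MAX_IDS)
		ida->used[id / 8U] &= (uint8_t)~(1U << (id % 8U));
}
```

With this scheme a number that belongs to a live guest is never handed out
twice, regardless of how many guests were created and destroyed in between.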

Also, most of the information contained in ptdump_registered_guest can probably
be found in struct kvm. The debugfs file should then probably simply take
struct kvm as its private argument.

> +
>  static phys_addr_t ptdump_host_pa(void *addr)
>  {
>  	return __pa(addr);
> @@ -740,6 +757,73 @@ static void stage2_ptdump_walk(struct seq_file *s, struct ptdump_info *info)
>  
>  	kvm_pgtable_walk(pgtable, start_ipa, end_ipa, &walker);
>  }

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables
  2023-10-20  8:19 ` [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Vincent Donnefort
@ 2023-10-23 14:32   ` Sebastian Ene
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-23 14:32 UTC (permalink / raw)
  To: Vincent Donnefort
  Cc: will, catalin.marinas, mark.rutland, akpm, maz, linux-arm-kernel,
	linux-kernel, kernel-team, qperret, smostafa

On Fri, Oct 20, 2023 at 09:19:33AM +0100, Vincent Donnefort wrote:
> On Thu, Oct 19, 2023 at 02:40:21PM +0000, Sebastian Ene wrote:
> > Hi,
> > 
> > This can be used as a debugging tool for dumping the second stage
> > page-tables under debugfs.
> > 
> > From the previous feedback I re-worked the series and added support for
> > guest page-tables dumping under VHE & nVHE configuration. I extended the
> > list of reviewers as I missed the interested parties in the first round. 
> > 
> > When CONFIG_NVHE_EL2_PTDUMP_DEBUGFS is enabled under pKVM environment,
> > ptdump registers the 'host_stage2_kernel_page_tables' entry with debugfs.
> > Guests are registering a file named '%u_guest_stage2_page_tables' when
> > they are created. 

Hi,

> 
> I believe guest entries should also be available for nVHE and VHE.
> 

Yes, we support dumping the guest stage-2 pagetables with this under
both modes. The host stage-2 is available only with
kvm-arm.mode=protected.

> > 
> > This allows us to dump the host stage-2 page-tables with the following command:
> > cat /sys/kernel/debug/host_stage2_kernel_page_tables.
> 
> As it needs debugfs anyway, this should probably live in the kvm/ debugfs
> directory, while each VM's ptdump file should be placed in that VM's directory.
> 
> This is quite easy: you should have access to the global kvm_debugfs_dir and
> struct kvm->debugfs_dentry.
>

Right, I was thinking of placing them under the kvm/ debugfs entry, but then I
noticed that the ptdump files are not registered under this path.

> > 
> > The output is showing the entries in the following format:
> > <IPA range> <size> <descriptor type> <access permissions> <mem_attributes>
> > 
> > The tool interprets the pKVM ownership annotation stored in the invalid
> > entries and dumps to the console the ownership information. To be able
> > to access the host stage-2 page-tables from the kernel, a new hypervisor
> > call was introduced which allows us to snapshot the page-tables in a host
> > provided buffer. The hypervisor call is hidden behind CONFIG_NVHE_EL2_DEBUG
> > as this should be used under debugging environment.
> > 
> > Link to the first version:
> > https://lore.kernel.org/all/20230927112517.2631674-1-sebastianene@google.com/
> > 
> > Changelog:
> >   v1 -> v2:
> >   * use the stage-2 pagetable walker for dumping descriptors instead of
> >     the one provided by ptdump.
> > 
> >   * support for guests pagetables dumping under VHE/nVHE non-protected
> > 
> > Thanks,
> > 
> > 
> > Sebastian Ene (11):
> >   KVM: arm64: Add snap shooting the host stage-2 pagetables
> >   arm64: ptdump: Use the mask from the state structure
> >   arm64: ptdump: Add the walker function to the ptdump info structure
> >   KVM: arm64: Move pagetable definitions to common header
> >   arm64: ptdump: Introduce stage-2 pagetables format description
> >   arm64: ptdump: Add hooks on debugfs file operations
> >   arm64: ptdump: Register a debugfs entry for the host stage-2
> >     page-tables
> >   arm64: ptdump: Parse the host stage-2 page-tables from the snapshot
> >   arm64: ptdump: Interpret memory attributes based on runtime
> >     configuration
> >   arm64: ptdump: Interpret pKVM ownership annotations
> >   arm64: ptdump: Add support for guest stage-2 pagetables dumping
> > 
> >  arch/arm64/include/asm/kvm_asm.h              |   1 +
> >  arch/arm64/include/asm/kvm_pgtable.h          |  85 +++
> >  arch/arm64/include/asm/ptdump.h               |  27 +-
> >  arch/arm64/kvm/Kconfig                        |  12 +
> >  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   8 +-
> >  arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  18 +
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 103 ++++
> >  arch/arm64/kvm/hyp/pgtable.c                  |  98 ++--
> >  arch/arm64/kvm/mmu.c                          |   3 +
> >  arch/arm64/mm/ptdump.c                        | 487 +++++++++++++++++-
> >  arch/arm64/mm/ptdump_debugfs.c                |  42 +-
> >  11 files changed, 822 insertions(+), 62 deletions(-)
> > 
> > -- 
> > 2.42.0.655.g421f12c284-goog
> > 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping
  2023-10-20  8:40   ` Vincent Donnefort
@ 2023-10-23 14:45     ` Sebastian Ene
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Ene @ 2023-10-23 14:45 UTC (permalink / raw)
  To: Vincent Donnefort
  Cc: will, catalin.marinas, mark.rutland, akpm, maz, linux-arm-kernel,
	linux-kernel, kernel-team, qperret, smostafa

On Fri, Oct 20, 2023 at 09:40:06AM +0100, Vincent Donnefort wrote:
> On Thu, Oct 19, 2023 at 02:40:33PM +0000, Sebastian Ene wrote:
> > Register a debugfs file on guest creation to be able to view their
> > second translation tables with ptdump. This assumes that the host is in
> > control of the guest stage-2 and has direct access to the pagetables.
> 
> What about pKVM? The walker you wrote for the host stage-2 should be
> reusable in that case?
> 

Yes, once pKVM is ready upstream, the walker which duplicates the
pagetables for the host will be re-used for the guests. We will have to
add a separate HVC for this, which receives the guest vmid as an
argument.

> > 
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> >  arch/arm64/include/asm/ptdump.h | 21 +++++++--
> >  arch/arm64/kvm/mmu.c            |  3 ++
> >  arch/arm64/mm/ptdump.c          | 84 +++++++++++++++++++++++++++++++++
> >  arch/arm64/mm/ptdump_debugfs.c  |  5 +-
> >  4 files changed, 108 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> > index 35b883524462..be86244d532b 100644
> > --- a/arch/arm64/include/asm/ptdump.h
> > +++ b/arch/arm64/include/asm/ptdump.h
> > @@ -5,6 +5,8 @@
> >  #ifndef __ASM_PTDUMP_H
> >  #define __ASM_PTDUMP_H
> >  
> > +#include <asm/kvm_pgtable.h>
> > +
> >  #ifdef CONFIG_PTDUMP_CORE
> >  
> >  #include <linux/mm_types.h>
> > @@ -30,14 +32,27 @@ struct ptdump_info {
> >  void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
> >  #ifdef CONFIG_PTDUMP_DEBUGFS
> >  #define EFI_RUNTIME_MAP_END	DEFAULT_MAP_WINDOW_64
> > -void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name);
> > +struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
> > +				       const char *name);
> >  #else
> > -static inline void ptdump_debugfs_register(struct ptdump_info *info,
> > -					   const char *name) { }
> > +static inline struct dentry *ptdump_debugfs_register(struct ptdump_info *info,
> > +						     const char *name)
> > +{
> > +	return NULL;
> > +}
> >  #endif
> >  void ptdump_check_wx(void);
> >  #endif /* CONFIG_PTDUMP_CORE */
> >  
> > +#ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
> > +void ptdump_register_guest_stage2(struct kvm_pgtable *pgt, void *lock);
> > +void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt);
> > +#else
> > +static inline void ptdump_register_guest_stage2(struct kvm_pgtable *pgt,
> > +						void *lock) { }
> > +static inline void ptdump_unregister_guest_stage2(struct kvm_pgtable *pgt) { }
> > +#endif /* CONFIG_NVHE_EL2_PTDUMP_DEBUGFS */
> 
> I believe this should be compatible with VHE as well, that option should be
> renamed.
> 

Good point, I will rename this.

> > +
> >  #ifdef CONFIG_DEBUG_WX
> >  #define debug_checkwx()	ptdump_check_wx()
> >  #else
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 482280fe22d7..e47988dba34d 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/sched/signal.h>
> >  #include <trace/events/kvm.h>
> >  #include <asm/pgalloc.h>
> > +#include <asm/ptdump.h>
> >  #include <asm/cacheflush.h>
> >  #include <asm/kvm_arm.h>
> >  #include <asm/kvm_mmu.h>
> > @@ -908,6 +909,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
> >  	if (err)
> >  		goto out_free_pgtable;
> >  
> > +	ptdump_register_guest_stage2(pgt, &kvm->mmu_lock);
> >  	mmu->last_vcpu_ran = alloc_percpu(typeof(*mmu->last_vcpu_ran));
> >  	if (!mmu->last_vcpu_ran) {
> >  		err = -ENOMEM;
> > @@ -1021,6 +1023,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> >  	write_unlock(&kvm->mmu_lock);
> >  
> >  	if (pgt) {
> > +		ptdump_unregister_guest_stage2(pgt);
> >  		kvm_pgtable_stage2_destroy(pgt);
> >  		kfree(pgt);
> >  	}
> > diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> > index 4687840dcb69..facfb15468f5 100644
> > --- a/arch/arm64/mm/ptdump.c
> > +++ b/arch/arm64/mm/ptdump.c
> > @@ -26,6 +26,7 @@
> >  #include <asm/ptdump.h>
> >  #include <asm/kvm_pkvm.h>
> >  #include <asm/kvm_pgtable.h>
> > +#include <asm/kvm_host.h>
> >  
> >  
> >  enum address_markers_idx {
> > @@ -543,6 +544,22 @@ void ptdump_check_wx(void)
> >  #ifdef CONFIG_NVHE_EL2_PTDUMP_DEBUGFS
> >  static struct ptdump_info stage2_kernel_ptdump_info;
> >  
> > +#define GUEST_NAME_LEN	(32U)
> > +
> > +struct ptdump_registered_guest {
> > +	struct list_head		reg_list;
> > +	struct ptdump_info		info;
> > +	struct mm_struct		mem;
> > +	struct kvm_pgtable_snapshot	snapshot;
> > +	struct dentry			*dentry;
> > +	rwlock_t			*lock;
> > +	char				reg_name[GUEST_NAME_LEN];
> > +};
> > +
> > +static LIST_HEAD(ptdump_guest_list);
> > +static DEFINE_MUTEX(ptdump_list_lock);
> > +static u16 guest_no;
> 
> This is not robust enough: if one VM starts and then 65535 others are created
> and killed, guest_no overflows. The next number is 0, which is already taken.
>

Yes, I guess this should be improved. In the case you described we won't
register any debugfs file because of the name clash.

> Linux has an ID allocator to solve this problem, but I don't think it is
> necessary anyway. This should simply reuse the struct kvm->debugfs_dentry.
> 
> Also, most of the information contained in ptdump_registered_guest can probably
> be found in struct kvm. The debugfs file should then probably simply take
> struct kvm as its private argument.
>

I would prefer to keep it as a separate struct here, as it gives some
flexibility if we need to extend it for pKVM guest support. I think we
can drop the struct mm_struct from here.

Thanks,
Sebastian

> > +
> >  static phys_addr_t ptdump_host_pa(void *addr)
> >  {
> >  	return __pa(addr);
> > @@ -740,6 +757,73 @@ static void stage2_ptdump_walk(struct seq_file *s, struct ptdump_info *info)
> >  
> >  	kvm_pgtable_walk(pgtable, start_ipa, end_ipa, &walker);
> >  }
> 
> [...]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables
  2023-10-19 14:40 ` [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables Sebastian Ene
@ 2023-10-26 12:45   ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-10-26 12:45 UTC (permalink / raw)
  To: Sebastian Ene, will, catalin.marinas, mark.rutland, akpm, maz
  Cc: oe-kbuild-all, linux-arm-kernel, linux-kernel, kernel-team,
	vdonnefort, qperret, smostafa, Sebastian Ene

Hi Sebastian,

kernel test robot noticed the following build warnings:

[auto build test WARNING on arm64/for-next/core]
[also build test WARNING on kvmarm/next akpm-mm/mm-everything linus/master v6.6-rc7 next-20231025]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Sebastian-Ene/KVM-arm64-Add-snap-shooting-the-host-stage-2-pagetables/20231019-224346
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20231019144032.2943044-3-sebastianene%40google.com
patch subject: [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables
config: arm64-allmodconfig (https://download.01.org/0day-ci/archive/20231026/202310262036.TVwm0bsI-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231026/202310262036.TVwm0bsI-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310262036.TVwm0bsI-lkp@intel.com/

All warnings (new ones prefixed by >>):

   arch/arm64/kvm/hyp/nvhe/mem_protect.c: In function '__pkvm_host_stage2_prepare_copy':
>> arch/arm64/kvm/hyp/nvhe/mem_protect.c:335:13: warning: variable 'nr_pages' set but not used [-Wunused-but-set-variable]
     335 |         u64 nr_pages;
         |             ^~~~~~~~


vim +/nr_pages +335 arch/arm64/kvm/hyp/nvhe/mem_protect.c

   326	
   327	int __pkvm_host_stage2_prepare_copy(struct kvm_pgtable_snapshot *snapshot)
   328	{
   329		size_t required_pgd_len;
   330		struct kvm_pgtable_mm_ops mm_ops = {0};
   331		struct kvm_pgtable *to_pgt, *from_pgt = &host_mmu.pgt;
   332		struct kvm_hyp_memcache *memcache = &snapshot->mc;
   333		int ret;
   334		void *pgd;
 > 335		u64 nr_pages;
   336	
   337		required_pgd_len = kvm_pgtable_stage2_pgd_size(host_mmu.arch.vtcr);
   338		if (snapshot->pgd_len < required_pgd_len)
   339			return -ENOMEM;
   340	
   341		to_pgt = &snapshot->pgtable;
   342		nr_pages = snapshot->pgd_len / PAGE_SIZE;
   343		pgd = kern_hyp_va(snapshot->pgd_hva);
   344	
   345		hyp_spin_lock(&snapshot_pool_lock);
   346		hyp_pool_init(&snapshot_pool, hyp_virt_to_pfn(pgd),
   347			      required_pgd_len / PAGE_SIZE, 0);
   348	
   349		mm_ops.zalloc_pages_exact	= snapshot_zalloc_pages_exact;
   350		mm_ops.zalloc_page		= snapshot_zalloc_page;
   351		mm_ops.free_pages_exact		= snapshot_s2_free_pages_exact;
   352		mm_ops.get_page			= snapshot_get_page;
   353		mm_ops.phys_to_virt		= hyp_phys_to_virt;
   354		mm_ops.virt_to_phys		= hyp_virt_to_phys;
   355		mm_ops.page_count		= hyp_page_count;
   356	
   357		to_pgt->ia_bits		= from_pgt->ia_bits;
   358		to_pgt->start_level	= from_pgt->start_level;
   359		to_pgt->flags		= from_pgt->flags;
   360		to_pgt->mm_ops		= &mm_ops;
   361	
   362		host_lock_component();
   363		ret = kvm_pgtable_stage2_copy(to_pgt, from_pgt, memcache);
   364		host_unlock_component();
   365	
   366		hyp_spin_unlock(&snapshot_pool_lock);
   367	
   368		return ret;
   369	}
   370	#endif /* CONFIG_NVHE_EL2_DEBUG */
   371	
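
The warning can be reproduced outside the kernel. In the listing above,
nr_pages is assigned on line 342, but hyp_pool_init() is passed
required_pgd_len / PAGE_SIZE instead, so the variable is never read. Below is a
minimal user-space sketch of one possible fix, which simply drops the unused
variable; whether the pool was instead meant to be sized from
snapshot->pgd_len is the author's call. SKETCH_PAGE_SIZE and
snapshot_pool_pages() are hypothetical names:

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096UL

/*
 * Mirrors only the size check and the page-count computation from the
 * listing: returns the number of pages handed to the pool, or -ENOMEM
 * when the host-provided buffer is smaller than the required PGD size.
 */
static long snapshot_pool_pages(size_t pgd_len, size_t required_pgd_len)
{
	if (pgd_len < required_pgd_len)
		return -ENOMEM;

	/* No 'nr_pages' local: the value used comes from required_pgd_len. */
	return (long)(required_pgd_len / SKETCH_PAGE_SIZE);
}
```

Building the original pattern with -Wunused-but-set-variable (enabled by W=1
in the kernel build) reports the assignment; the version above has nothing
left to report.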

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-10-26 12:45 UTC | newest]

Thread overview: 17+ messages
2023-10-19 14:40 [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 01/11] KVM: arm64: Add snap shooting the host stage-2 pagetables Sebastian Ene
2023-10-26 12:45   ` kernel test robot
2023-10-19 14:40 ` [PATCH v2 02/11] arm64: ptdump: Use the mask from the state structure Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 03/11] arm64: ptdump: Add the walker function to the ptdump info structure Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 04/11] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 05/11] arm64: ptdump: Introduce stage-2 pagetables format description Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 06/11] arm64: ptdump: Add hooks on debugfs file operations Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 07/11] arm64: ptdump: Register a debugfs entry for the host stage-2 page-tables Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 08/11] arm64: ptdump: Parse the host stage-2 page-tables from the snapshot Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 09/11] arm64: ptdump: Interpret memory attributes based on runtime configuration Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 10/11] arm64: ptdump: Interpret pKVM ownership annotations Sebastian Ene
2023-10-19 14:40 ` [PATCH v2 11/11] arm64: ptdump: Add support for guest stage-2 pagetables dumping Sebastian Ene
2023-10-20  8:40   ` Vincent Donnefort
2023-10-23 14:45     ` Sebastian Ene
2023-10-20  8:19 ` [PATCH v2 00/11] arm64: ptdump: View the second stage page-tables Vincent Donnefort
2023-10-23 14:32   ` Sebastian Ene
