* [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables
@ 2024-06-21 12:32 Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 1/6] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
` (6 more replies)
0 siblings, 7 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
Hi,
This series extends the ptdump support to allow dumping the guest
stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
registers '/sys/debug/kvm/<guest_id>/stage2_page_tables' entry with debugfs
upon guest creation. This allows userspace tools (eg. cat) to dump the
stage-2 pagetables by reading the registered file.
Reading the debugfs file shows stage-2 memory ranges in following format:
<IPA range> <size> <descriptor type> <access permissions> <mem_attributes>
Below is the output of a guest stage-2 pagetable dump running under Qemu:
---[ IPA bits 33 start lvl 2 ]---
0x0000000000000000-0x0000000080000000 2G PGD
0x0000000080000000-0x0000000080c00000 12M PGD R W AF BLK
0x0000000080c00000-0x0000000080e00000 2M PGD XN R W AF BLK
0x0000000080e00000-0x0000000081000000 2M PGD R W AF BLK
0x0000000081000000-0x0000000081400000 4M PGD XN R W AF BLK
0x0000000081400000-0x000000008fe00000 234M PGD
0x000000008fe00000-0x0000000090000000 2M PGD XN R W AF BLK
0x0000000090000000-0x00000000fa000000 1696M PGD
0x00000000fa000000-0x00000000fe000000 64M PGD XN R W AF BLK
0x00000000fe000000-0x0000000100000000 32M PGD
0x0000000100000000-0x0000000101c00000 28M PGD XN R W AF BLK
0x0000000101c00000-0x0000000102000000 4M PGD
0x0000000102000000-0x0000000102200000 2M PGD XN R W AF BLK
0x0000000102200000-0x000000017b000000 1934M PGD
0x000000017b000000-0x0000000180000000 80M PGD XN R W AF BLK
Link to v6:
https://lore.kernel.org/all/20240220151035.327199-1-sebastianene@google.com/
Link to v5:
https://lore.kernel.org/all/20240207144832.1017815-2-sebastianene@google.com/
Link to v4:
https://lore.kernel.org/all/20231218135859.2513568-2-sebastianene@google.com/
Link to v3:
https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@google.com/
Changelog:
v6 -> v7:
* Reworded commit for this patch : [PATCH v6 2/6] arm64: ptdump: Expose
the attribute parsing functionality
* fixed minor conflicts in the struct pg_state definition
* moved the kvm_ptdump_guest_registration in the
* kvm_arch_create_vm_debugfs
* reset the parse state before walking the pagetables
* copy the level name to the pg_level buffer
v5 -> v6:
* don't return an error if the kvm_arch_create_vm_debugfs fails to
initialize (ref.
https://lore.kernel.org/all/20240216155941.2029458-1-oliver.upton@linux.dev/)
* fix use-after-free suggested by getting a reference to the
KVM struct while manipulating the debugfs files
and put the reference on the file close.
* do all the allocations at once for the ptdump parser state tracking
and simplify the initialization.
* move the ptdump parser state initialization as part of the file_open
* create separate files for printing the guest stage-2 pagetable
configuration such as: the start level of the pagetable walk and the
number of bits used for the IPA space representation.
* fixed the wrong header format for the newly added file
* include missing patch which hasn't been posted on the v5:
"KVM-arm64-Move-pagetable-definitions-to-common-heade.patch"
v4 -> v5:
* refactorization: split the series into two parts as per the feedback
received from Oliver. Introduce the base support which allows dumping
of the guest stage-2 pagetables.
* removed the *ops* struct wrapper built on top of the file_ops and
simplify the ptdump interface access.
* keep the page table walker away from the ptdump specific code
v3 -> current_version:
* refactorization: moved all the **KVM** specific components under
kvm/ as suggested by Oliver. Introduced a new file
arm64/kvm/ptdump.c which handled the second stage translation.
re-used only the display portion from mm/ptdump.c
* pagetable snapshot creation now uses memory donated from the host.
The memory is no longer shared with the host as this can pose a security
risk if the host has access to manipulate the pagetable copy while
the hypervisor iterates it.
* fixed a memory leak: while memory was used from the memcache for
building the snapshot pagetable, it was no longer giving back the
pages to the host for freeing. A separate array was introduced to
keep track of the pages allocated from the memcache.
v2 -> v3:
* register the stage-2 debugfs entry for the host under
/sys/debug/kvm/host_stage2_page_tables and in
/sys/debug/kvm/<guest_id>/stage2_page_tables for guests.
* don't use a static array for parsing the attributes description,
generate it dynamically based on the number of pagetable levels
* remove the lock that was guarding the seq_file private inode data,
and keep the data private to the open file session.
* minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
CONFIG_PTDUMP_STAGE2_DEBUGFS
v1 -> v2:
* use the stage-2 pagetable walker for dumping descriptors instead of
the one provided by ptdump.
* support for guests pagetables dumping under VHE/nVHE non-protected
Thanks,
Sebastian Ene (6):
KVM: arm64: Move pagetable definitions to common header
arm64: ptdump: Expose the attribute parsing functionality
arm64: ptdump: Use the mask from the state structure
KVM: arm64: Register ptdump with debugfs on guest creation
KVM: arm64: Initialize the ptdump parser with stage-2 attributes
KVM: arm64: Expose guest stage-2 pagetable config to debugfs
arch/arm64/include/asm/kvm_pgtable.h | 42 +++++
arch/arm64/include/asm/ptdump.h | 42 ++++-
arch/arm64/kvm/Kconfig | 14 ++
arch/arm64/kvm/Makefile | 1 +
arch/arm64/kvm/arm.c | 2 +
arch/arm64/kvm/hyp/pgtable.c | 42 -----
arch/arm64/kvm/kvm_ptdump.h | 20 ++
arch/arm64/kvm/ptdump.c | 272 +++++++++++++++++++++++++++
arch/arm64/mm/ptdump.c | 50 +----
9 files changed, 402 insertions(+), 83 deletions(-)
create mode 100644 arch/arm64/kvm/kvm_ptdump.h
create mode 100644 arch/arm64/kvm/ptdump.c
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v7 1/6] KVM: arm64: Move pagetable definitions to common header
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
` (5 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
In preparation for using the stage-2 definitions in ptdump, move some of
these macros in the common header.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/include/asm/kvm_pgtable.h | 42 ++++++++++++++++++++++++++++
arch/arm64/kvm/hyp/pgtable.c | 42 ----------------------------
2 files changed, 42 insertions(+), 42 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 19278dfe7978..03f4c3d7839c 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -59,6 +59,48 @@ typedef u64 kvm_pte_t;
#define KVM_PHYS_INVALID (-1ULL)
+#define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2)
+
+#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP GENMASK(7, 6)
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO \
+ ({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW \
+ ({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS 3
+#define KVM_PTE_LEAF_ATTR_LO_S1_AF BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR GENMASK(5, 2)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R BIT(6)
+#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W BIT(7)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH GENMASK(9, 8)
+#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS 3
+#define KVM_PTE_LEAF_ATTR_LO_S2_AF BIT(10)
+
+#define KVM_PTE_LEAF_ATTR_HI GENMASK(63, 50)
+
+#define KVM_PTE_LEAF_ATTR_HI_SW GENMASK(58, 55)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_XN BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S2_XN BIT(54)
+
+#define KVM_PTE_LEAF_ATTR_HI_S1_GP BIT(50)
+
+#define KVM_PTE_LEAF_ATTR_S2_PERMS (KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
+ KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
+ KVM_PTE_LEAF_ATTR_HI_S2_XN)
+
+#define KVM_INVALID_PTE_OWNER_MASK GENMASK(9, 2)
+#define KVM_MAX_OWNER_ID 1
+
+/*
+ * Used to indicate a pte for which a 'break-before-make' sequence is in
+ * progress.
+ */
+#define KVM_INVALID_PTE_LOCKED BIT(10)
+
static inline bool kvm_pte_valid(kvm_pte_t pte)
{
return pte & KVM_PTE_VALID;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 9e2bbee77491..c3e9d77bba23 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -17,48 +17,6 @@
#define KVM_PTE_TYPE_PAGE 1
#define KVM_PTE_TYPE_TABLE 1
-#define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2)
-
-#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP GENMASK(7, 6)
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO \
- ({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 2 : 3; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW \
- ({ cpus_have_final_cap(ARM64_KVM_HVHE) ? 0 : 1; })
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS 3
-#define KVM_PTE_LEAF_ATTR_LO_S1_AF BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR GENMASK(5, 2)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R BIT(6)
-#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W BIT(7)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH GENMASK(9, 8)
-#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS 3
-#define KVM_PTE_LEAF_ATTR_LO_S2_AF BIT(10)
-
-#define KVM_PTE_LEAF_ATTR_HI GENMASK(63, 50)
-
-#define KVM_PTE_LEAF_ATTR_HI_SW GENMASK(58, 55)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_XN BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S2_XN BIT(54)
-
-#define KVM_PTE_LEAF_ATTR_HI_S1_GP BIT(50)
-
-#define KVM_PTE_LEAF_ATTR_S2_PERMS (KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
- KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
- KVM_PTE_LEAF_ATTR_HI_S2_XN)
-
-#define KVM_INVALID_PTE_OWNER_MASK GENMASK(9, 2)
-#define KVM_MAX_OWNER_ID 1
-
-/*
- * Used to indicate a pte for which a 'break-before-make' sequence is in
- * progress.
- */
-#define KVM_INVALID_PTE_LOCKED BIT(10)
-
struct kvm_pgtable_walk_data {
struct kvm_pgtable_walker *walker;
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 1/6] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-07-05 11:07 ` Will Deacon
2024-06-21 12:32 ` [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure Sebastian Ene
` (4 subsequent siblings)
6 siblings, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
Reuse the descriptor parsing functionality to keep the same output format
as the original ptdump code. In order for this to happen, move the state
tracking objects into a common header.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/include/asm/ptdump.h | 41 ++++++++++++++++++++++++++++++++-
arch/arm64/mm/ptdump.c | 37 ++---------------------------
2 files changed, 42 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index 5b1701c76d1c..c550b2afcab7 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -9,6 +9,7 @@
#include <linux/mm_types.h>
#include <linux/seq_file.h>
+#include <linux/ptdump.h>
struct addr_marker {
unsigned long start_address;
@@ -21,14 +22,52 @@ struct ptdump_info {
unsigned long base_addr;
};
+struct prot_bits {
+ u64 mask;
+ u64 val;
+ const char *set;
+ const char *clear;
+};
+
+struct pg_level {
+ const struct prot_bits *bits;
+ char name[4];
+ int num;
+ u64 mask;
+};
+
+/*
+ * The page dumper groups page table entries of the same type into a single
+ * description. It uses pg_state to track the range information while
+ * iterating over the pte entries. When the continuity is broken it then
+ * dumps out a description of the range.
+ */
+struct pg_state {
+ struct ptdump_state ptdump;
+ struct seq_file *seq;
+ const struct addr_marker *marker;
+ const struct mm_struct *mm;
+ unsigned long start_address;
+ int level;
+ u64 current_prot;
+ bool check_wx;
+ unsigned long wx_pages;
+ unsigned long uxn_pages;
+};
+
void ptdump_walk(struct seq_file *s, struct ptdump_info *info);
+void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
+ u64 val);
#ifdef CONFIG_PTDUMP_DEBUGFS
#define EFI_RUNTIME_MAP_END DEFAULT_MAP_WINDOW_64
void __init ptdump_debugfs_register(struct ptdump_info *info, const char *name);
#else
static inline void ptdump_debugfs_register(struct ptdump_info *info,
const char *name) { }
-#endif
+#endif /* CONFIG_PTDUMP_DEBUGFS */
+#else
+static inline void note_page(void *pt_st, unsigned long addr,
+ int level, u64 val) { }
#endif /* CONFIG_PTDUMP_CORE */
#endif /* __ASM_PTDUMP_H */
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 6986827e0d64..e370b7a945de 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -38,32 +38,6 @@
seq_printf(m, fmt); \
})
-/*
- * The page dumper groups page table entries of the same type into a single
- * description. It uses pg_state to track the range information while
- * iterating over the pte entries. When the continuity is broken it then
- * dumps out a description of the range.
- */
-struct pg_state {
- struct ptdump_state ptdump;
- struct seq_file *seq;
- const struct addr_marker *marker;
- const struct mm_struct *mm;
- unsigned long start_address;
- int level;
- u64 current_prot;
- bool check_wx;
- unsigned long wx_pages;
- unsigned long uxn_pages;
-};
-
-struct prot_bits {
- u64 mask;
- u64 val;
- const char *set;
- const char *clear;
-};
-
static const struct prot_bits pte_bits[] = {
{
.mask = PTE_VALID,
@@ -143,13 +117,6 @@ static const struct prot_bits pte_bits[] = {
}
};
-struct pg_level {
- const struct prot_bits *bits;
- char name[4];
- int num;
- u64 mask;
-};
-
static struct pg_level pg_level[] __ro_after_init = {
{ /* pgd */
.name = "PGD",
@@ -221,8 +188,8 @@ static void note_prot_wx(struct pg_state *st, unsigned long addr)
st->wx_pages += (addr - st->start_address) / PAGE_SIZE;
}
-static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
- u64 val)
+void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
+ u64 val)
{
struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
static const char units[] = "KMGTPE";
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 1/6] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-07-05 11:12 ` Will Deacon
2024-06-21 12:32 ` [PATCH v7 4/6] KVM: arm64: Register ptdump with debugfs on guest creation Sebastian Ene
` (3 subsequent siblings)
6 siblings, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
Printing the descriptor attributes requires accessing a mask which has a
different set of attributes for stage-2. In preparation for adding support
for the stage-2 pagetables dumping, use the mask from the local context
and not from the globally defined pg_level array. Store a pointer to
the pg_level array in the ptdump state structure. This will allow us to
extract the mask which is wrapped in the pg_level array and use it for
descriptor parsing in the note_page.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/include/asm/ptdump.h | 1 +
arch/arm64/mm/ptdump.c | 13 ++++++++-----
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
index c550b2afcab7..a4125d8d5a32 100644
--- a/arch/arm64/include/asm/ptdump.h
+++ b/arch/arm64/include/asm/ptdump.h
@@ -44,6 +44,7 @@ struct pg_level {
*/
struct pg_state {
struct ptdump_state ptdump;
+ struct pg_level *pg_level;
struct seq_file *seq;
const struct addr_marker *marker;
const struct mm_struct *mm;
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index e370b7a945de..9637a6415ea7 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -192,6 +192,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
u64 val)
{
struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
+ struct pg_level *pg_info = st->pg_level;
static const char units[] = "KMGTPE";
u64 prot = 0;
@@ -201,7 +202,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
level = 0;
if (level >= 0)
- prot = val & pg_level[level].mask;
+ prot = val & pg_info[level].mask;
if (st->level == -1) {
st->level = level;
@@ -227,10 +228,10 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
unit++;
}
pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
- pg_level[st->level].name);
- if (st->current_prot && pg_level[st->level].bits)
- dump_prot(st, pg_level[st->level].bits,
- pg_level[st->level].num);
+ pg_info[st->level].name);
+ if (st->current_prot && pg_info[st->level].bits)
+ dump_prot(st, pg_info[st->level].bits,
+ pg_info[st->level].num);
pt_dump_seq_puts(st->seq, "\n");
if (addr >= st->marker[1].start_address) {
@@ -262,6 +263,7 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
.seq = s,
.marker = info->markers,
.mm = info->mm,
+ .pg_level = &pg_level[0],
.level = -1,
.ptdump = {
.note_page = note_page,
@@ -297,6 +299,7 @@ bool ptdump_check_wx(void)
{ 0, NULL},
{ -1, NULL},
},
+ .pg_level = &pg_level[0],
.level = -1,
.check_wx = true,
.ptdump = {
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 4/6] KVM: arm64: Register ptdump with debugfs on guest creation
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
` (2 preceding siblings ...)
2024-06-21 12:32 ` [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes Sebastian Ene
` (2 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
While arch/*/mem/ptdump handles the kernel pagetable dumping code,
introduce KVM/ptdump which shows the guest stage-2 pagetables. The
separation is necessary because most of the definitions from the
stage-2 pagetable reside in the KVM path and we will be invoking
functionality **specific** to KVM.
When a guest is created, register a new file entry under the guest
debugfs dir which allows userspace to show the contents of the guest
stage-2 pagetables when accessed.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/kvm/Kconfig | 14 ++++++
arch/arm64/kvm/Makefile | 1 +
arch/arm64/kvm/arm.c | 2 +
arch/arm64/kvm/kvm_ptdump.h | 20 ++++++++
arch/arm64/kvm/ptdump.c | 91 +++++++++++++++++++++++++++++++++++++
5 files changed, 128 insertions(+)
create mode 100644 arch/arm64/kvm/kvm_ptdump.h
create mode 100644 arch/arm64/kvm/ptdump.c
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 58f09370d17e..99e4a82e141d 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -65,4 +65,18 @@ config PROTECTED_NVHE_STACKTRACE
If unsure, or not using protected nVHE (pKVM), say N.
+config PTDUMP_STAGE2_DEBUGFS
+ bool "Present the stage-2 pagetables to debugfs"
+ depends on KVM
+ select PTDUMP_CORE
+ default n
+ help
+ Say Y here if you want to show the stage-2 kernel pagetables
+ layout in a debugfs file. This information is only useful for kernel developers
+ who are working in architecture specific areas of the kernel.
+ It is probably not a good idea to enable this feature in a production
+ kernel.
+
+ If in doubt, say N.
+
endif # VIRTUALIZATION
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index a6497228c5a8..f6ef4140b20a 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -24,6 +24,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o
kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o
+kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o
always-y := hyp_constants.h hyp-constants.s
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 59716789fe0f..310abc612965 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -45,6 +45,7 @@
#include <kvm/arm_hypercalls.h>
#include <kvm/arm_pmu.h>
#include <kvm/arm_psci.h>
+#include <kvm_ptdump.h>
static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
@@ -216,6 +217,7 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
void kvm_arch_create_vm_debugfs(struct kvm *kvm)
{
kvm_sys_regs_create_debugfs(kvm);
+ kvm_ptdump_guest_register(kvm);
}
static void kvm_destroy_mpidr_data(struct kvm *kvm)
diff --git a/arch/arm64/kvm/kvm_ptdump.h b/arch/arm64/kvm/kvm_ptdump.h
new file mode 100644
index 000000000000..0a62b0e2908c
--- /dev/null
+++ b/arch/arm64/kvm/kvm_ptdump.h
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) Google, 2024
+ * Author: Sebastian Ene <sebastianene@google.com>
+ */
+
+#ifndef __KVM_PTDUMP_H
+#define __KVM_PTDUMP_H
+
+#include <linux/kvm_host.h>
+#include <asm/ptdump.h>
+
+
+#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
+void kvm_ptdump_guest_register(struct kvm *kvm);
+#else
+static inline void kvm_ptdump_guest_register(struct kvm *kvm) {}
+#endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */
+
+#endif /* __KVM_PTDUMP_H */
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
new file mode 100644
index 000000000000..36dc7662729f
--- /dev/null
+++ b/arch/arm64/kvm/ptdump.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Debug helper used to dump the stage-2 pagetables of the system and their
+ * associated permissions.
+ *
+ * Copyright (C) Google, 2024
+ * Author: Sebastian Ene <sebastianene@google.com>
+ */
+#include <linux/debugfs.h>
+#include <linux/kvm_host.h>
+#include <linux/seq_file.h>
+
+#include <asm/kvm_pgtable.h>
+#include <kvm_ptdump.h>
+
+
+static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
+ enum kvm_pgtable_walk_flags visit)
+{
+ struct pg_state *st = ctx->arg;
+ struct ptdump_state *pt_st = &st->ptdump;
+
+ note_page(pt_st, ctx->addr, ctx->level, ctx->old);
+ return 0;
+}
+
+static int kvm_ptdump_show_common(struct seq_file *m,
+ struct kvm_pgtable *pgtable,
+ struct pg_state *parser_state)
+{
+ struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
+ .cb = kvm_ptdump_visitor,
+ .arg = parser_state,
+ .flags = KVM_PGTABLE_WALK_LEAF,
+ };
+
+ parser_state->level = -1;
+ parser_state->start_address = 0;
+
+ return kvm_pgtable_walk(pgtable, 0, BIT(pgtable->ia_bits), &walker);
+}
+
+static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
+{
+ struct kvm *kvm = m->private;
+ struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
+ struct pg_state parser_state = {0};
+ int ret;
+
+ write_lock(&kvm->mmu_lock);
+ ret = kvm_ptdump_show_common(m, mmu->pgt, &parser_state);
+ write_unlock(&kvm->mmu_lock);
+
+ return ret;
+}
+
+static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
+{
+ struct kvm *kvm = m->i_private;
+ int ret;
+
+ if (!kvm_get_kvm_safe(kvm))
+ return -ENOENT;
+
+ ret = single_open(file, kvm_ptdump_guest_show, m->i_private);
+ if (ret < 0)
+ kvm_put_kvm(kvm);
+
+ return ret;
+}
+
+static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
+{
+ struct kvm *kvm = m->i_private;
+
+ kvm_put_kvm(kvm);
+ return single_release(m, file);
+}
+
+static const struct file_operations kvm_ptdump_guest_fops = {
+ .open = kvm_ptdump_guest_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = kvm_ptdump_guest_close,
+};
+
+void kvm_ptdump_guest_register(struct kvm *kvm)
+{
+ debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
+ kvm, &kvm_ptdump_guest_fops);
+}
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
` (3 preceding siblings ...)
2024-06-21 12:32 ` [PATCH v7 4/6] KVM: arm64: Register ptdump with debugfs on guest creation Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-06-28 21:18 ` Oliver Upton
2024-07-01 8:42 ` Vincent Donnefort
2024-06-21 12:32 ` [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs Sebastian Ene
2024-06-28 21:25 ` [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Oliver Upton
6 siblings, 2 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
Define a set of attributes used by the ptdump parser to display the
properties of a guest memory region covered by a pagetable descriptor.
Build a description of the pagetable levels and initialize the parser
with this configuration.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 137 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index 36dc7662729f..cc1d4fdddc6e 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -14,6 +14,61 @@
#include <kvm_ptdump.h>
+#define MARKERS_LEN (2)
+#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
+
+struct kvm_ptdump_guest_state {
+ struct kvm *kvm;
+ struct pg_state parser_state;
+ struct addr_marker ipa_marker[MARKERS_LEN];
+ struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
+ struct ptdump_range range[MARKERS_LEN];
+};
+
+static const struct prot_bits stage2_pte_bits[] = {
+ {
+ .mask = PTE_VALID,
+ .val = PTE_VALID,
+ .set = " ",
+ .clear = "F",
+ }, {
+ .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
+ .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
+ .set = "XN",
+ .clear = " ",
+ }, {
+ .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
+ .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
+ .set = "R",
+ .clear = " ",
+ }, {
+ .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
+ .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
+ .set = "W",
+ .clear = " ",
+ }, {
+ .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
+ .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
+ .set = "AF",
+ .clear = " ",
+ }, {
+ .mask = PTE_NG,
+ .val = PTE_NG,
+ .set = "FnXS",
+ .clear = " ",
+ }, {
+ .mask = PTE_CONT | PTE_VALID,
+ .val = PTE_CONT | PTE_VALID,
+ .set = "CON",
+ .clear = " ",
+ }, {
+ .mask = PTE_TABLE_BIT,
+ .val = PTE_TABLE_BIT,
+ .set = " ",
+ .clear = "BLK",
+ },
+};
+
static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
enum kvm_pgtable_walk_flags visit)
{
@@ -40,15 +95,79 @@ static int kvm_ptdump_show_common(struct seq_file *m,
return kvm_pgtable_walk(pgtable, 0, BIT(pgtable->ia_bits), &walker);
}
+static int kvm_ptdump_build_levels(struct pg_level *level, u32 start_lvl)
+{
+ static const char * const level_names[] = {"PGD", "PUD", "PMD", "PTE"};
+ u32 i = 0;
+ u64 mask = 0;
+
+ if (start_lvl > 2) {
+ pr_err("invalid start_lvl %u\n", start_lvl);
+ return -EINVAL;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(stage2_pte_bits); i++)
+ mask |= stage2_pte_bits[i].mask;
+
+ for (i = start_lvl; i < KVM_PGTABLE_MAX_LEVELS; i++) {
+ strscpy(level[i].name, level_names[i], sizeof(level[i].name));
+
+ level[i].num = ARRAY_SIZE(stage2_pte_bits);
+ level[i].bits = stage2_pte_bits;
+ level[i].mask = mask;
+ }
+
+ if (start_lvl > 0)
+ strscpy(level[start_lvl].name, level_names[0], sizeof(level_names[0]));
+
+ return 0;
+}
+
+static struct kvm_ptdump_guest_state
+*kvm_ptdump_parser_init(struct kvm *kvm)
+{
+ struct kvm_ptdump_guest_state *st;
+ struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
+ struct kvm_pgtable *pgtable = mmu->pgt;
+ int ret;
+
+ st = kzalloc(sizeof(struct kvm_ptdump_guest_state), GFP_KERNEL_ACCOUNT);
+ if (!st)
+ return NULL;
+
+ ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
+ if (ret)
+ goto free_with_state;
+
+ st->ipa_marker[0].name = "Guest IPA";
+ st->ipa_marker[1].start_address = BIT(pgtable->ia_bits);
+ st->range[0].end = BIT(pgtable->ia_bits);
+
+ st->kvm = kvm;
+ st->parser_state = (struct pg_state) {
+ .marker = &st->ipa_marker[0],
+ .level = -1,
+ .pg_level = &st->level[0],
+ .ptdump.range = &st->range[0],
+ };
+
+ return st;
+free_with_state:
+ kfree(st);
+ return NULL;
+}
+
static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
{
- struct kvm *kvm = m->private;
+ struct kvm_ptdump_guest_state *st = m->private;
+ struct kvm *kvm = st->kvm;
struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
- struct pg_state parser_state = {0};
int ret;
+ st->parser_state.seq = m;
+
write_lock(&kvm->mmu_lock);
- ret = kvm_ptdump_show_common(m, mmu->pgt, &parser_state);
+ ret = kvm_ptdump_show_common(m, mmu->pgt, &st->parser_state);
write_unlock(&kvm->mmu_lock);
return ret;
@@ -57,22 +176,34 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
{
struct kvm *kvm = m->i_private;
+ struct kvm_ptdump_guest_state *st;
int ret;
if (!kvm_get_kvm_safe(kvm))
return -ENOENT;
- ret = single_open(file, kvm_ptdump_guest_show, m->i_private);
- if (ret < 0)
- kvm_put_kvm(kvm);
+ st = kvm_ptdump_parser_init(kvm);
+ if (!st) {
+ ret = -ENOMEM;
+ goto free_with_kvm_ref;
+ }
+
+ ret = single_open(file, kvm_ptdump_guest_show, st);
+ if (!ret)
+ return 0;
+ kfree(st);
+free_with_kvm_ref:
+ kvm_put_kvm(kvm);
return ret;
}
static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
{
struct kvm *kvm = m->i_private;
+ void *st = ((struct seq_file *)file->private_data)->private;
+ kfree(st);
kvm_put_kvm(kvm);
return single_release(m, file);
}
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
` (4 preceding siblings ...)
2024-06-21 12:32 ` [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes Sebastian Ene
@ 2024-06-21 12:32 ` Sebastian Ene
2024-06-28 20:49 ` Oliver Upton
2024-06-28 21:25 ` [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Oliver Upton
6 siblings, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-06-21 12:32 UTC (permalink / raw)
To: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, sebastianene, shahuang, suzuki.poulose, will,
yuzenghui
Cc: kvmarm, linux-arm-kernel, linux-kernel, kernel-team
Make the start level and the IPA bits properties available in the
virtual machine debugfs directory. Make sure that the KVM structure
doesn't disappear behind our back and keep a reference to the KVM struct
while these files are opened.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
arch/arm64/kvm/ptdump.c | 50 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
index cc1d4fdddc6e..17649e3cbc8f 100644
--- a/arch/arm64/kvm/ptdump.c
+++ b/arch/arm64/kvm/ptdump.c
@@ -215,8 +215,58 @@ static const struct file_operations kvm_ptdump_guest_fops = {
.release = kvm_ptdump_guest_close,
};
+static int kvm_pgtable_debugfs_show(struct seq_file *m, void *unused)
+{
+ const struct file *file = m->file;
+ struct kvm_pgtable *pgtable = m->private;
+
+ if (!strcmp(file_dentry(file)->d_iname, "ipa_range"))
+ seq_printf(m, "%2u\n", pgtable->ia_bits);
+ else if (!strcmp(file_dentry(file)->d_iname, "stage2_levels"))
+ seq_printf(m, "%1d\n", pgtable->start_level);
+ return 0;
+}
+
+static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file)
+{
+ struct kvm *kvm = m->i_private;
+ struct kvm_s2_mmu *mmu;
+ struct kvm_pgtable *pgtable;
+ int ret;
+
+ if (!kvm_get_kvm_safe(kvm))
+ return -ENOENT;
+
+ mmu = &kvm->arch.mmu;
+ pgtable = mmu->pgt;
+
+ ret = single_open(file, kvm_pgtable_debugfs_show, pgtable);
+ if (ret < 0)
+ kvm_put_kvm(kvm);
+ return ret;
+}
+
+static int kvm_pgtable_debugfs_close(struct inode *m, struct file *file)
+{
+ struct kvm *kvm = m->i_private;
+
+ kvm_put_kvm(kvm);
+ return single_release(m, file);
+}
+
+static const struct file_operations kvm_pgtable_debugfs_fops = {
+ .open = kvm_pgtable_debugfs_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = kvm_pgtable_debugfs_close,
+};
+
void kvm_ptdump_guest_register(struct kvm *kvm)
{
debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
kvm, &kvm_ptdump_guest_fops);
+ debugfs_create_file("ipa_range", 0400, kvm->debugfs_dentry, kvm,
+ &kvm_pgtable_debugfs_fops);
+ debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
+ kvm, &kvm_pgtable_debugfs_fops);
}
--
2.45.2.741.gdbec12cfda-goog
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs
2024-06-21 12:32 ` [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs Sebastian Ene
@ 2024-06-28 20:49 ` Oliver Upton
2024-07-01 14:10 ` Sebastian Ene
0 siblings, 1 reply; 25+ messages in thread
From: Oliver Upton @ 2024-06-28 20:49 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 21, 2024 at 12:32:30PM +0000, Sebastian Ene wrote:
> Make the start level and the IPA bits properties available in the
> virtual machine debugfs directory. Make sure that the KVM structure
> doesn't disappear behind our back and keep a reference to the KVM struct
> while these files are opened.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
> arch/arm64/kvm/ptdump.c | 50 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index cc1d4fdddc6e..17649e3cbc8f 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -215,8 +215,58 @@ static const struct file_operations kvm_ptdump_guest_fops = {
> .release = kvm_ptdump_guest_close,
> };
>
> +static int kvm_pgtable_debugfs_show(struct seq_file *m, void *unused)
> +{
> + const struct file *file = m->file;
> + struct kvm_pgtable *pgtable = m->private;
> +
> + if (!strcmp(file_dentry(file)->d_iname, "ipa_range"))
> + seq_printf(m, "%2u\n", pgtable->ia_bits);
> + else if (!strcmp(file_dentry(file)->d_iname, "stage2_levels"))
> + seq_printf(m, "%1d\n", pgtable->start_level);
The name of the file suggests sounds like this is the number of page
table levels instead of the starting level of the walk.
So instead:
seq_printf(m, "%1d\n",
KVM_PGTABLE_LAST_LEVEL - pgtable->start_level + 1);
> + return 0;
> +}
> +
> +static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file)
> +{
> + struct kvm *kvm = m->i_private;
> + struct kvm_s2_mmu *mmu;
> + struct kvm_pgtable *pgtable;
> + int ret;
> +
> + if (!kvm_get_kvm_safe(kvm))
> + return -ENOENT;
> +
> + mmu = &kvm->arch.mmu;
> + pgtable = mmu->pgt;
nitpick: pgtable = &kvm->arch.mmu.pgt
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-06-21 12:32 ` [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes Sebastian Ene
@ 2024-06-28 21:18 ` Oliver Upton
2024-07-01 14:17 ` Sebastian Ene
2024-07-19 14:01 ` Sebastian Ene
2024-07-01 8:42 ` Vincent Donnefort
1 sibling, 2 replies; 25+ messages in thread
From: Oliver Upton @ 2024-06-28 21:18 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
Hi Seb,
On Fri, Jun 21, 2024 at 12:32:29PM +0000, Sebastian Ene wrote:
> Define a set of attributes used by the ptdump parser to display the
> properties of a guest memory region covered by a pagetable descriptor.
> Build a description of the pagetable levels and initialize the parser
> with this configuration.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
This patch should come *before* patch 4, no point in exposing the
debugfs file if we aren't ready to handle it yet.
> ---
> arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 137 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 36dc7662729f..cc1d4fdddc6e 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -14,6 +14,61 @@
> #include <kvm_ptdump.h>
>
>
> +#define MARKERS_LEN (2)
> +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> +
> +struct kvm_ptdump_guest_state {
> + struct kvm *kvm;
> + struct pg_state parser_state;
> + struct addr_marker ipa_marker[MARKERS_LEN];
> + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> + struct ptdump_range range[MARKERS_LEN];
> +};
> +
> +static const struct prot_bits stage2_pte_bits[] = {
> + {
> + .mask = PTE_VALID,
> + .val = PTE_VALID,
> + .set = " ",
> + .clear = "F",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> + .set = "XN",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> + .set = "R",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> + .set = "W",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> + .set = "AF",
> + .clear = " ",
<snip>
> + }, {
> + .mask = PTE_NG,
> + .val = PTE_NG,
> + .set = "FnXS",
> + .clear = " ",
> + }, {
> + .mask = PTE_CONT | PTE_VALID,
> + .val = PTE_CONT | PTE_VALID,
> + .set = "CON",
> + .clear = " ",
> + }, {
</snip>
Neither of these bits are used at stage-2, why have descriptors for
them?
> +static int kvm_ptdump_build_levels(struct pg_level *level, u32 start_lvl)
> +{
> + static const char * const level_names[] = {"PGD", "PUD", "PMD", "PTE"};
> + u32 i = 0;
> + u64 mask = 0;
> +
> + if (start_lvl > 2) {
> + pr_err("invalid start_lvl %u\n", start_lvl);
> + return -EINVAL;
> + }
if (WARN_ON_ONCE(start_lvl >= KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
> + for (i = 0; i < ARRAY_SIZE(stage2_pte_bits); i++)
> + mask |= stage2_pte_bits[i].mask;
> +
> + for (i = start_lvl; i < KVM_PGTABLE_MAX_LEVELS; i++) {
> + strscpy(level[i].name, level_names[i], sizeof(level[i].name));
> +
> + level[i].num = ARRAY_SIZE(stage2_pte_bits);
> + level[i].bits = stage2_pte_bits;
> + level[i].mask = mask;
> + }
> +
> + if (start_lvl > 0)
> + strscpy(level[start_lvl].name, level_names[0], sizeof(level_names[0]));
This should pass the size of @dst, not the source. This becomes slightly
more self-documenting if you use a literal for "PGD" here too.
strscpy(level[start_lvl].name, "PGD", sizeof(level[start_lvl].name));
> + return 0;
> +}
> +
> +static struct kvm_ptdump_guest_state
> +*kvm_ptdump_parser_init(struct kvm *kvm)
> +{
> + struct kvm_ptdump_guest_state *st;
> + struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> + struct kvm_pgtable *pgtable = mmu->pgt;
> + int ret;
> +
> + st = kzalloc(sizeof(struct kvm_ptdump_guest_state), GFP_KERNEL_ACCOUNT);
> + if (!st)
> + return NULL;
> +
> + ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> + if (ret)
> + goto free_with_state;
I don't see any value in the use of goto here, as there isn't any sort
of cascading initialization / cleanup. This also presents an opportunity
to get an error back out to the caller.
if (ret) {
kfree(st);
return ERR_PTR(ret);
}
> @@ -57,22 +176,34 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> {
> struct kvm *kvm = m->i_private;
> + struct kvm_ptdump_guest_state *st;
> int ret;
>
> if (!kvm_get_kvm_safe(kvm))
> return -ENOENT;
>
> - ret = single_open(file, kvm_ptdump_guest_show, m->i_private);
> - if (ret < 0)
> - kvm_put_kvm(kvm);
> + st = kvm_ptdump_parser_init(kvm);
> + if (!st) {
> + ret = -ENOMEM;
> + goto free_with_kvm_ref;
> + }
(with the earlier suggestion)
st = kvm_ptdump_parser_init(kvm);
if (IS_ERR(st)) {
ret = PTR_ERR(st);
goto free_with_kvm_ref;
}
Otherwise genuine KVM bugs (-EINVAL) are getting lumped into ENOMEM.
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
` (5 preceding siblings ...)
2024-06-21 12:32 ` [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs Sebastian Ene
@ 2024-06-28 21:25 ` Oliver Upton
2024-07-01 14:22 ` Sebastian Ene
6 siblings, 1 reply; 25+ messages in thread
From: Oliver Upton @ 2024-06-28 21:25 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
Hi Seb,
I think we're getting closer on this series, thanks for reposting it. In
addition to polishing up the KVM side of things, I'd like to have acks
on the ptdump changes (patches 2-3).
Will + Catalin, any concerns?
--
Thanks,
Oliver
On Fri, Jun 21, 2024 at 12:32:24PM +0000, Sebastian Ene wrote:
> Hi,
>
>
> This series extends the ptdump support to allow dumping the guest
> stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
> registers '/sys/debug/kvm/<guest_id>/stage2_page_tables' entry with debugfs
> upon guest creation. This allows userspace tools (eg. cat) to dump the
> stage-2 pagetables by reading the registered file.
>
> Reading the debugfs file shows stage-2 memory ranges in following format:
> <IPA range> <size> <descriptor type> <access permissions> <mem_attributes>
>
> Below is the output of a guest stage-2 pagetable dump running under Qemu:
>
> ---[ IPA bits 33 start lvl 2 ]---
> 0x0000000000000000-0x0000000080000000 2G PGD
> 0x0000000080000000-0x0000000080c00000 12M PGD R W AF BLK
> 0x0000000080c00000-0x0000000080e00000 2M PGD XN R W AF BLK
> 0x0000000080e00000-0x0000000081000000 2M PGD R W AF BLK
> 0x0000000081000000-0x0000000081400000 4M PGD XN R W AF BLK
> 0x0000000081400000-0x000000008fe00000 234M PGD
> 0x000000008fe00000-0x0000000090000000 2M PGD XN R W AF BLK
> 0x0000000090000000-0x00000000fa000000 1696M PGD
> 0x00000000fa000000-0x00000000fe000000 64M PGD XN R W AF BLK
> 0x00000000fe000000-0x0000000100000000 32M PGD
> 0x0000000100000000-0x0000000101c00000 28M PGD XN R W AF BLK
> 0x0000000101c00000-0x0000000102000000 4M PGD
> 0x0000000102000000-0x0000000102200000 2M PGD XN R W AF BLK
> 0x0000000102200000-0x000000017b000000 1934M PGD
> 0x000000017b000000-0x0000000180000000 80M PGD XN R W AF BLK
>
> Link to v6:
> https://lore.kernel.org/all/20240220151035.327199-1-sebastianene@google.com/
>
> Link to v5:
> https://lore.kernel.org/all/20240207144832.1017815-2-sebastianene@google.com/
>
> Link to v4:
> https://lore.kernel.org/all/20231218135859.2513568-2-sebastianene@google.com/
>
> Link to v3:
> https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@google.com/
>
> Changelog:
> v6 -> v7:
> * Reworded commit for this patch : [PATCH v6 2/6] arm64: ptdump: Expose
> the attribute parsing functionality
> * fixed minor conflicts in the struct pg_state definition
> * moved the kvm_ptdump_guest_registration in the
> * kvm_arch_create_vm_debugfs
> * reset the parse state before walking the pagetables
> * copy the level name to the pg_level buffer
>
>
> v5 -> v6:
> * don't return an error if the kvm_arch_create_vm_debugfs fails to
> initialize (ref.
> https://lore.kernel.org/all/20240216155941.2029458-1-oliver.upton@linux.dev/)
> * fix use-after-free suggested by getting a reference to the
> KVM struct while manipulating the debugfs files
> and put the reference on the file close.
> * do all the allocations at once for the ptdump parser state tracking
> and simplify the initialization.
> * move the ptdump parser state initialization as part of the file_open
> * create separate files for printing the guest stage-2 pagetable
> configuration such as: the start level of the pagetable walk and the
> number of bits used for the IPA space representation.
> * fixed the wrong header format for the newly added file
> * include missing patch which hasn't been posted on the v5:
> "KVM-arm64-Move-pagetable-definitions-to-common-heade.patch"
>
>
> v4 -> v5:
> * refactorization: split the series into two parts as per the feedback
> received from Oliver. Introduce the base support which allows dumping
> of the guest stage-2 pagetables.
> * removed the *ops* struct wrapper built on top of the file_ops and
> simplify the ptdump interface access.
> * keep the page table walker away from the ptdump specific code
>
> v3 -> current_version:
> * refactorization: moved all the **KVM** specific components under
> kvm/ as suggested by Oliver. Introduced a new file
> arm64/kvm/ptdump.c which handled the second stage translation.
> re-used only the display portion from mm/ptdump.c
> * pagetable snapshot creation now uses memory donated from the host.
> The memory is no longer shared with the host as this can pose a security
> risk if the host has access to manipulate the pagetable copy while
> the hypervisor iterates it.
> * fixed a memory leak: while memory was used from the memcache for
> building the snapshot pagetable, it was no longer giving back the
> pages to the host for freeing. A separate array was introduced to
> keep track of the pages allocated from the memcache.
>
>
> v2 -> v3:
> * register the stage-2 debugfs entry for the host under
> /sys/debug/kvm/host_stage2_page_tables and in
> /sys/debug/kvm/<guest_id>/stage2_page_tables for guests.
> * don't use a static array for parsing the attributes description,
> generate it dynamically based on the number of pagetable levels
> * remove the lock that was guarding the seq_file private inode data,
> and keep the data private to the open file session.
> * minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
> CONFIG_PTDUMP_STAGE2_DEBUGFS
>
>
> v1 -> v2:
> * use the stage-2 pagetable walker for dumping descriptors instead of
> the one provided by ptdump.
> * support for guests pagetables dumping under VHE/nVHE non-protected
>
> Thanks,
>
> Sebastian Ene (6):
> KVM: arm64: Move pagetable definitions to common header
> arm64: ptdump: Expose the attribute parsing functionality
> arm64: ptdump: Use the mask from the state structure
> KVM: arm64: Register ptdump with debugfs on guest creation
> KVM: arm64: Initialize the ptdump parser with stage-2 attributes
> KVM: arm64: Expose guest stage-2 pagetable config to debugfs
>
> arch/arm64/include/asm/kvm_pgtable.h | 42 +++++
> arch/arm64/include/asm/ptdump.h | 42 ++++-
> arch/arm64/kvm/Kconfig | 14 ++
> arch/arm64/kvm/Makefile | 1 +
> arch/arm64/kvm/arm.c | 2 +
> arch/arm64/kvm/hyp/pgtable.c | 42 -----
> arch/arm64/kvm/kvm_ptdump.h | 20 ++
> arch/arm64/kvm/ptdump.c | 272 +++++++++++++++++++++++++++
> arch/arm64/mm/ptdump.c | 50 +----
> 9 files changed, 402 insertions(+), 83 deletions(-)
> create mode 100644 arch/arm64/kvm/kvm_ptdump.h
> create mode 100644 arch/arm64/kvm/ptdump.c
>
> --
> 2.45.2.741.gdbec12cfda-goog
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-06-21 12:32 ` [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes Sebastian Ene
2024-06-28 21:18 ` Oliver Upton
@ 2024-07-01 8:42 ` Vincent Donnefort
2024-07-01 14:18 ` Sebastian Ene
1 sibling, 1 reply; 25+ messages in thread
From: Vincent Donnefort @ 2024-07-01 8:42 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> Define a set of attributes used by the ptdump parser to display the
> properties of a guest memory region covered by a pagetable descriptor.
> Build a description of the pagetable levels and initialize the parser
> with this configuration.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
> arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 137 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> index 36dc7662729f..cc1d4fdddc6e 100644
> --- a/arch/arm64/kvm/ptdump.c
> +++ b/arch/arm64/kvm/ptdump.c
> @@ -14,6 +14,61 @@
> #include <kvm_ptdump.h>
>
>
> +#define MARKERS_LEN (2)
> +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> +
> +struct kvm_ptdump_guest_state {
> + struct kvm *kvm;
> + struct pg_state parser_state;
> + struct addr_marker ipa_marker[MARKERS_LEN];
> + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> + struct ptdump_range range[MARKERS_LEN];
> +};
> +
> +static const struct prot_bits stage2_pte_bits[] = {
> + {
> + .mask = PTE_VALID,
> + .val = PTE_VALID,
> + .set = " ",
> + .clear = "F",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> + .set = "XN",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> + .set = "R",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> + .set = "W",
> + .clear = " ",
> + }, {
> + .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> + .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> + .set = "AF",
> + .clear = " ",
> + }, {
> + .mask = PTE_NG,
> + .val = PTE_NG,
> + .set = "FnXS",
> + .clear = " ",
> + }, {
> + .mask = PTE_CONT | PTE_VALID,
> + .val = PTE_CONT | PTE_VALID,
> + .set = "CON",
> + .clear = " ",
> + }, {
> + .mask = PTE_TABLE_BIT,
>
> + .val = PTE_TABLE_BIT,
> + .set = " ",
> + .clear = "BLK",
> + },
When doing a kvm_pgtable_stage2_set_owner(), the walker will init a leaf which
has both the table-bit and the valid-bit unset. I believe this would lead to
spurious BLK annotations here.
The following should fix this problem:
.mask = PTE_TABLE_BIT | PTE_VALID,
.val = PTE_VALID,
.set = "BLK",
.clear = " ",
> +};
> +
> static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
> enum kvm_pgtable_walk_flags visit)
> {
> @@ -40,15 +95,79 @@ static int kvm_ptdump_show_common(struct seq_file *m,
> return kvm_pgtable_walk(pgtable, 0, BIT(pgtable->ia_bits), &walker);
> }
>
[...]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs
2024-06-28 20:49 ` Oliver Upton
@ 2024-07-01 14:10 ` Sebastian Ene
0 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-01 14:10 UTC (permalink / raw)
To: Oliver Upton
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 28, 2024 at 08:49:19PM +0000, Oliver Upton wrote:
> On Fri, Jun 21, 2024 at 12:32:30PM +0000, Sebastian Ene wrote:
> > Make the start level and the IPA bits properties available in the
> > virtual machine debugfs directory. Make sure that the KVM structure
> > doesn't disappear behind our back and keep a reference to the KVM struct
> > while these files are opened.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/kvm/ptdump.c | 50 +++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 50 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index cc1d4fdddc6e..17649e3cbc8f 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -215,8 +215,58 @@ static const struct file_operations kvm_ptdump_guest_fops = {
> > .release = kvm_ptdump_guest_close,
> > };
> >
> > +static int kvm_pgtable_debugfs_show(struct seq_file *m, void *unused)
> > +{
> > + const struct file *file = m->file;
> > + struct kvm_pgtable *pgtable = m->private;
> > +
> > + if (!strcmp(file_dentry(file)->d_iname, "ipa_range"))
> > + seq_printf(m, "%2u\n", pgtable->ia_bits);
> > + else if (!strcmp(file_dentry(file)->d_iname, "stage2_levels"))
> > + seq_printf(m, "%1d\n", pgtable->start_level);
>
> The name of the file suggests sounds like this is the number of page
> table levels instead of the starting level of the walk.
>
> So instead:
>
> seq_printf(m, "%1d\n",
> KVM_PGTABLE_LAST_LEVEL - pgtable->start_level + 1);
>
Ack, I will rewrite this.
> > + return 0;
> > +}
> > +
> > +static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file)
> > +{
> > + struct kvm *kvm = m->i_private;
> > + struct kvm_s2_mmu *mmu;
> > + struct kvm_pgtable *pgtable;
> > + int ret;
> > +
> > + if (!kvm_get_kvm_safe(kvm))
> > + return -ENOENT;
> > +
> > + mmu = &kvm->arch.mmu;
> > + pgtable = mmu->pgt;
>
> nitpick: pgtable = &kvm->arch.mmu.pgt
>
Will fix this.
Thanks,
Seb
> --
> Thanks,
> Oliver
>
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-06-28 21:18 ` Oliver Upton
@ 2024-07-01 14:17 ` Sebastian Ene
2024-07-08 19:47 ` Oliver Upton
2024-07-19 14:01 ` Sebastian Ene
1 sibling, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-07-01 14:17 UTC (permalink / raw)
To: Oliver Upton
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 28, 2024 at 09:18:16PM +0000, Oliver Upton wrote:
> Hi Seb,
>
> On Fri, Jun 21, 2024 at 12:32:29PM +0000, Sebastian Ene wrote:
> > Define a set of attributes used by the ptdump parser to display the
> > properties of a guest memory region covered by a pagetable descriptor.
> > Build a description of the pagetable levels and initialize the parser
> > with this configuration.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
>
> This patch should come *before* patch 4, no point in exposing the
> debugfs file if we aren't ready to handle it yet.
>
Gotcha, let me try to reorder them.
> > ---
> > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 137 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index 36dc7662729f..cc1d4fdddc6e 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -14,6 +14,61 @@
> > #include <kvm_ptdump.h>
> >
> >
> > +#define MARKERS_LEN (2)
> > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > +
> > +struct kvm_ptdump_guest_state {
> > + struct kvm *kvm;
> > + struct pg_state parser_state;
> > + struct addr_marker ipa_marker[MARKERS_LEN];
> > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > + struct ptdump_range range[MARKERS_LEN];
> > +};
> > +
> > +static const struct prot_bits stage2_pte_bits[] = {
> > + {
> > + .mask = PTE_VALID,
> > + .val = PTE_VALID,
> > + .set = " ",
> > + .clear = "F",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .set = "XN",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .set = "R",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .set = "W",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .set = "AF",
> > + .clear = " ",
>
> <snip>
>
> > + }, {
> > + .mask = PTE_NG,
> > + .val = PTE_NG,
> > + .set = "FnXS",
> > + .clear = " ",
> > + }, {
> > + .mask = PTE_CONT | PTE_VALID,
> > + .val = PTE_CONT | PTE_VALID,
> > + .set = "CON",
> > + .clear = " ",
> > + }, {
>
> </snip>
>
> Neither of these bits are used at stage-2, why have descriptors for
> them?
>
Atm, we don't make use of the contiguous bit in stage-2 in upstream (but
we have it in some experimental patches). I can remove this, no hard
feelings for them.
> > +static int kvm_ptdump_build_levels(struct pg_level *level, u32 start_lvl)
> > +{
> > + static const char * const level_names[] = {"PGD", "PUD", "PMD", "PTE"};
> > + u32 i = 0;
> > + u64 mask = 0;
> > +
> > + if (start_lvl > 2) {
> > + pr_err("invalid start_lvl %u\n", start_lvl);
> > + return -EINVAL;
> > + }
>
> if (WARN_ON_ONCE(start_lvl >= KVM_PGTABLE_LAST_LEVEL))
> return -EINVAL;
>
I will include this validation, thanks !
> > + for (i = 0; i < ARRAY_SIZE(stage2_pte_bits); i++)
> > + mask |= stage2_pte_bits[i].mask;
> > +
> > + for (i = start_lvl; i < KVM_PGTABLE_MAX_LEVELS; i++) {
> > + strscpy(level[i].name, level_names[i], sizeof(level[i].name));
> > +
> > + level[i].num = ARRAY_SIZE(stage2_pte_bits);
> > + level[i].bits = stage2_pte_bits;
> > + level[i].mask = mask;
> > + }
> > +
> > + if (start_lvl > 0)
> > + strscpy(level[start_lvl].name, level_names[0], sizeof(level_names[0]));
>
> This should pass the size of @dst, not the source. This becomes slightly
> more self-documenting if you use a literal for "PGD" here too.
>
> strscpy(level[start_lvl].name, "PGD", sizeof(level[start_lvl].name));
>
Will use this, thanks for the suggestion !
> > + return 0;
> > +}
> > +
> > +static struct kvm_ptdump_guest_state
> > +*kvm_ptdump_parser_init(struct kvm *kvm)
> > +{
> > + struct kvm_ptdump_guest_state *st;
> > + struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > + struct kvm_pgtable *pgtable = mmu->pgt;
> > + int ret;
> > +
> > + st = kzalloc(sizeof(struct kvm_ptdump_guest_state), GFP_KERNEL_ACCOUNT);
> > + if (!st)
> > + return NULL;
> > +
> > + ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> > + if (ret)
> > + goto free_with_state;
>
> I don't see any value in the use of goto here, as there isn't any sort
> of cascading initialization / cleanup. This also presents an opportunity
> to get an error back out to the caller.
>
> if (ret) {
> kfree(st);
> return ERR_PTR(ret);
> }
>
Let me remove that goto; statement.
> > @@ -57,22 +176,34 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> > {
> > struct kvm *kvm = m->i_private;
> > + struct kvm_ptdump_guest_state *st;
> > int ret;
> >
> > if (!kvm_get_kvm_safe(kvm))
> > return -ENOENT;
> >
> > - ret = single_open(file, kvm_ptdump_guest_show, m->i_private);
> > - if (ret < 0)
> > - kvm_put_kvm(kvm);
> > + st = kvm_ptdump_parser_init(kvm);
> > + if (!st) {
> > + ret = -ENOMEM;
> > + goto free_with_kvm_ref;
> > + }
>
> (with the earlier suggestion)
>
> st = kvm_ptdump_parser_init(kvm);
> if (IS_ERR(st)) {
> ret = PTR_ERR(st);
> goto free_with_kvm_ref;
> }
>
> Otherwise genuine KVM bugs (-EINVAL) are getting lumped into ENOMEM.
>
> --
> Thanks,
> Oliver
Thanks,
Sebastian
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-01 8:42 ` Vincent Donnefort
@ 2024-07-01 14:18 ` Sebastian Ene
2024-07-16 9:59 ` Vincent Donnefort
0 siblings, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-07-01 14:18 UTC (permalink / raw)
To: Vincent Donnefort
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Mon, Jul 01, 2024 at 09:42:43AM +0100, Vincent Donnefort wrote:
> O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> > Define a set of attributes used by the ptdump parser to display the
> > properties of a guest memory region covered by a pagetable descriptor.
> > Build a description of the pagetable levels and initialize the parser
> > with this configuration.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 137 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index 36dc7662729f..cc1d4fdddc6e 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -14,6 +14,61 @@
> > #include <kvm_ptdump.h>
> >
> >
> > +#define MARKERS_LEN (2)
> > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > +
> > +struct kvm_ptdump_guest_state {
> > + struct kvm *kvm;
> > + struct pg_state parser_state;
> > + struct addr_marker ipa_marker[MARKERS_LEN];
> > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > + struct ptdump_range range[MARKERS_LEN];
> > +};
> > +
> > +static const struct prot_bits stage2_pte_bits[] = {
> > + {
> > + .mask = PTE_VALID,
> > + .val = PTE_VALID,
> > + .set = " ",
> > + .clear = "F",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .set = "XN",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .set = "R",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .set = "W",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .set = "AF",
> > + .clear = " ",
> > + }, {
> > + .mask = PTE_NG,
> > + .val = PTE_NG,
> > + .set = "FnXS",
> > + .clear = " ",
> > + }, {
> > + .mask = PTE_CONT | PTE_VALID,
> > + .val = PTE_CONT | PTE_VALID,
> > + .set = "CON",
> > + .clear = " ",
> > + }, {
> > + .mask = PTE_TABLE_BIT,
> >
> > + .val = PTE_TABLE_BIT,
> > + .set = " ",
> > + .clear = "BLK",
> > + },
Hello Vincent,
>
> When doing a kvm_pgtable_stage2_set_owner(), the walker will init a leaf which
> has both the table-bit and the valid-bit unset. I believe this would lead to
> spurious BLK annotations here.
>
> The following should fix this problem:
>
> .mask = PTE_TABLE_BIT | PTE_VALID,
> .val = PTE_VALID,
> .set = "BLK",
> .clear = " ",
>
Let me try this, thanks for the suggestion !
> > +};
> > +
> > static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
> > enum kvm_pgtable_walk_flags visit)
> > {
> > @@ -40,15 +95,79 @@ static int kvm_ptdump_show_common(struct seq_file *m,
> > return kvm_pgtable_walk(pgtable, 0, BIT(pgtable->ia_bits), &walker);
> > }
> >
>
> [...]
Seb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables
2024-06-28 21:25 ` [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Oliver Upton
@ 2024-07-01 14:22 ` Sebastian Ene
0 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-01 14:22 UTC (permalink / raw)
To: Oliver Upton
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 28, 2024 at 09:25:26PM +0000, Oliver Upton wrote:
Hello Oliver,
> Hi Seb,
>
> I think we're getting closer on this series, thanks for reposting it. In
> addition to polishing up the KVM side of things, I'd like to have acks
> on the ptdump changes (patches 2-3).
>
> Will + Catalin, any concerns?
>
> --
> Thanks,
> Oliver
Thanks for the feedback, I will spin a v8 later this week if there are
no more findings on the remaining patches.
Thanks,
Sebastian
>
> On Fri, Jun 21, 2024 at 12:32:24PM +0000, Sebastian Ene wrote:
> > Hi,
> >
> >
> > This series extends the ptdump support to allow dumping the guest
> > stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
> > registers '/sys/debug/kvm/<guest_id>/stage2_page_tables' entry with debugfs
> > upon guest creation. This allows userspace tools (eg. cat) to dump the
> > stage-2 pagetables by reading the registered file.
> >
> > Reading the debugfs file shows stage-2 memory ranges in following format:
> > <IPA range> <size> <descriptor type> <access permissions> <mem_attributes>
> >
> > Below is the output of a guest stage-2 pagetable dump running under Qemu:
> >
> > ---[ IPA bits 33 start lvl 2 ]---
> > 0x0000000000000000-0x0000000080000000 2G PGD
> > 0x0000000080000000-0x0000000080c00000 12M PGD R W AF BLK
> > 0x0000000080c00000-0x0000000080e00000 2M PGD XN R W AF BLK
> > 0x0000000080e00000-0x0000000081000000 2M PGD R W AF BLK
> > 0x0000000081000000-0x0000000081400000 4M PGD XN R W AF BLK
> > 0x0000000081400000-0x000000008fe00000 234M PGD
> > 0x000000008fe00000-0x0000000090000000 2M PGD XN R W AF BLK
> > 0x0000000090000000-0x00000000fa000000 1696M PGD
> > 0x00000000fa000000-0x00000000fe000000 64M PGD XN R W AF BLK
> > 0x00000000fe000000-0x0000000100000000 32M PGD
> > 0x0000000100000000-0x0000000101c00000 28M PGD XN R W AF BLK
> > 0x0000000101c00000-0x0000000102000000 4M PGD
> > 0x0000000102000000-0x0000000102200000 2M PGD XN R W AF BLK
> > 0x0000000102200000-0x000000017b000000 1934M PGD
> > 0x000000017b000000-0x0000000180000000 80M PGD XN R W AF BLK
> >
> > Link to v6:
> > https://lore.kernel.org/all/20240220151035.327199-1-sebastianene@google.com/
> >
> > Link to v5:
> > https://lore.kernel.org/all/20240207144832.1017815-2-sebastianene@google.com/
> >
> > Link to v4:
> > https://lore.kernel.org/all/20231218135859.2513568-2-sebastianene@google.com/
> >
> > Link to v3:
> > https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@google.com/
> >
> > Changelog:
> > v6 -> v7:
> > * Reworded commit for this patch : [PATCH v6 2/6] arm64: ptdump: Expose
> > the attribute parsing functionality
> > * fixed minor conflicts in the struct pg_state definition
> > * moved the kvm_ptdump_guest_registration in the
> > * kvm_arch_create_vm_debugfs
> > * reset the parse state before walking the pagetables
> > * copy the level name to the pg_level buffer
> >
> >
> > v5 -> v6:
> > * don't return an error if the kvm_arch_create_vm_debugfs fails to
> > initialize (ref.
> > https://lore.kernel.org/all/20240216155941.2029458-1-oliver.upton@linux.dev/)
> > * fix use-after-free suggested by getting a reference to the
> > KVM struct while manipulating the debugfs files
> > and put the reference on the file close.
> > * do all the allocations at once for the ptdump parser state tracking
> > and simplify the initialization.
> > * move the ptdump parser state initialization as part of the file_open
> > * create separate files for printing the guest stage-2 pagetable
> > configuration such as: the start level of the pagetable walk and the
> > number of bits used for the IPA space representation.
> > * fixed the wrong header format for the newly added file
> > * include missing patch which hasn't been posted on the v5:
> > "KVM-arm64-Move-pagetable-definitions-to-common-heade.patch"
> >
> >
> > v4 -> v5:
> > * refactorization: split the series into two parts as per the feedback
> > received from Oliver. Introduce the base support which allows dumping
> > of the guest stage-2 pagetables.
> > * removed the *ops* struct wrapper built on top of the file_ops and
> > simplify the ptdump interface access.
> > * keep the page table walker away from the ptdump specific code
> >
> > v3 -> current_version:
> > * refactorization: moved all the **KVM** specific components under
> > kvm/ as suggested by Oliver. Introduced a new file
> > arm64/kvm/ptdump.c which handled the second stage translation.
> > re-used only the display portion from mm/ptdump.c
> > * pagetable snapshot creation now uses memory donated from the host.
> > The memory is no longer shared with the host as this can pose a security
> > risk if the host has access to manipulate the pagetable copy while
> > the hypervisor iterates it.
> > * fixed a memory leak: while memory was used from the memcache for
> > building the snapshot pagetable, it was no longer giving back the
> > pages to the host for freeing. A separate array was introduced to
> > keep track of the pages allocated from the memcache.
> >
> >
> > v2 -> v3:
> > * register the stage-2 debugfs entry for the host under
> > /sys/debug/kvm/host_stage2_page_tables and in
> > /sys/debug/kvm/<guest_id>/stage2_page_tables for guests.
> > * don't use a static array for parsing the attributes description,
> > generate it dynamically based on the number of pagetable levels
> > * remove the lock that was guarding the seq_file private inode data,
> > and keep the data private to the open file session.
> > * minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
> > CONFIG_PTDUMP_STAGE2_DEBUGFS
> >
> >
> > v1 -> v2:
> > * use the stage-2 pagetable walker for dumping descriptors instead of
> > the one provided by ptdump.
> > * support for guests pagetables dumping under VHE/nVHE non-protected
> >
> > Thanks,
> >
> > Sebastian Ene (6):
> > KVM: arm64: Move pagetable definitions to common header
> > arm64: ptdump: Expose the attribute parsing functionality
> > arm64: ptdump: Use the mask from the state structure
> > KVM: arm64: Register ptdump with debugfs on guest creation
> > KVM: arm64: Initialize the ptdump parser with stage-2 attributes
> > KVM: arm64: Expose guest stage-2 pagetable config to debugfs
> >
> > arch/arm64/include/asm/kvm_pgtable.h | 42 +++++
> > arch/arm64/include/asm/ptdump.h | 42 ++++-
> > arch/arm64/kvm/Kconfig | 14 ++
> > arch/arm64/kvm/Makefile | 1 +
> > arch/arm64/kvm/arm.c | 2 +
> > arch/arm64/kvm/hyp/pgtable.c | 42 -----
> > arch/arm64/kvm/kvm_ptdump.h | 20 ++
> > arch/arm64/kvm/ptdump.c | 272 +++++++++++++++++++++++++++
> > arch/arm64/mm/ptdump.c | 50 +----
> > 9 files changed, 402 insertions(+), 83 deletions(-)
> > create mode 100644 arch/arm64/kvm/kvm_ptdump.h
> > create mode 100644 arch/arm64/kvm/ptdump.c
> >
> > --
> > 2.45.2.741.gdbec12cfda-goog
> >
>
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality
2024-06-21 12:32 ` [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
@ 2024-07-05 11:07 ` Will Deacon
2024-07-19 11:44 ` Sebastian Ene
0 siblings, 1 reply; 25+ messages in thread
From: Will Deacon @ 2024-07-05 11:07 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 21, 2024 at 12:32:26PM +0000, Sebastian Ene wrote:
> Reuse the descriptor parsing functionality to keep the same output format
> as the original ptdump code. In order for this to happen, move the state
> tracking objects into a common header.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
> arch/arm64/include/asm/ptdump.h | 41 ++++++++++++++++++++++++++++++++-
> arch/arm64/mm/ptdump.c | 37 ++---------------------------
> 2 files changed, 42 insertions(+), 36 deletions(-)
>
> diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> index 5b1701c76d1c..c550b2afcab7 100644
> --- a/arch/arm64/include/asm/ptdump.h
> +++ b/arch/arm64/include/asm/ptdump.h
> @@ -9,6 +9,7 @@
>
> #include <linux/mm_types.h>
> #include <linux/seq_file.h>
> +#include <linux/ptdump.h>
>
> struct addr_marker {
> unsigned long start_address;
> @@ -21,14 +22,52 @@ struct ptdump_info {
> unsigned long base_addr;
> };
>
> +struct prot_bits {
> + u64 mask;
> + u64 val;
> + const char *set;
> + const char *clear;
> +};
> +
> +struct pg_level {
> + const struct prot_bits *bits;
> + char name[4];
> + int num;
> + u64 mask;
> +};
> +
> +/*
> + * The page dumper groups page table entries of the same type into a single
> + * description. It uses pg_state to track the range information while
> + * iterating over the pte entries. When the continuity is broken it then
> + * dumps out a description of the range.
> + */
> +struct pg_state {
> + struct ptdump_state ptdump;
> + struct seq_file *seq;
> + const struct addr_marker *marker;
> + const struct mm_struct *mm;
> + unsigned long start_address;
> + int level;
> + u64 current_prot;
> + bool check_wx;
> + unsigned long wx_pages;
> + unsigned long uxn_pages;
> +};
Minor nit, but if we're moving these structure definitions into the
header then I'd be inclined to give them some more specific names (e.g.
prefix them with 'ptdump_'). Granted, this header isn't used widely, but
it's included by arch/arm64/mm/mmu.c and claiming 'struct prot_bits' is
a bit over-reaching imo!
Will
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure
2024-06-21 12:32 ` [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure Sebastian Ene
@ 2024-07-05 11:12 ` Will Deacon
2024-07-19 13:27 ` Sebastian Ene
0 siblings, 1 reply; 25+ messages in thread
From: Will Deacon @ 2024-07-05 11:12 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 21, 2024 at 12:32:27PM +0000, Sebastian Ene wrote:
> Printing the descriptor attributes requires accessing a mask which has a
> different set of attributes for stage-2. In preparation for adding support
> for the stage-2 pagetables dumping, use the mask from the local context
> and not from the globally defined pg_level array. Store a pointer to
> the pg_level array in the ptdump state structure. This will allow us to
> extract the mask which is wrapped in the pg_level array and use it for
> descriptor parsing in the note_page.
>
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
> arch/arm64/include/asm/ptdump.h | 1 +
> arch/arm64/mm/ptdump.c | 13 ++++++++-----
> 2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> index c550b2afcab7..a4125d8d5a32 100644
> --- a/arch/arm64/include/asm/ptdump.h
> +++ b/arch/arm64/include/asm/ptdump.h
> @@ -44,6 +44,7 @@ struct pg_level {
> */
> struct pg_state {
> struct ptdump_state ptdump;
> + struct pg_level *pg_level;
> struct seq_file *seq;
> const struct addr_marker *marker;
> const struct mm_struct *mm;
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index e370b7a945de..9637a6415ea7 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -192,6 +192,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> u64 val)
> {
> struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
> + struct pg_level *pg_info = st->pg_level;
> static const char units[] = "KMGTPE";
> u64 prot = 0;
>
> @@ -201,7 +202,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> level = 0;
>
> if (level >= 0)
> - prot = val & pg_level[level].mask;
> + prot = val & pg_info[level].mask;
If you rename the existing 'pg_level' array to something like
'kernel_pg_levels' then I think your local 'pg_info' variable can be
called 'pg_level' and this line doesn't need to change.
>
> if (st->level == -1) {
> st->level = level;
> @@ -227,10 +228,10 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> unit++;
> }
> pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
> - pg_level[st->level].name);
> - if (st->current_prot && pg_level[st->level].bits)
> - dump_prot(st, pg_level[st->level].bits,
> - pg_level[st->level].num);
> + pg_info[st->level].name);
> + if (st->current_prot && pg_info[st->level].bits)
> + dump_prot(st, pg_info[st->level].bits,
> + pg_info[st->level].num);
I think this could then stay as-is too.
> pt_dump_seq_puts(st->seq, "\n");
>
> if (addr >= st->marker[1].start_address) {
> @@ -262,6 +263,7 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
> .seq = s,
> .marker = info->markers,
> .mm = info->mm,
> + .pg_level = &pg_level[0],
Can't this just be '.pg_level = kernel_pg_levels'?
Will
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-01 14:17 ` Sebastian Ene
@ 2024-07-08 19:47 ` Oliver Upton
0 siblings, 0 replies; 25+ messages in thread
From: Oliver Upton @ 2024-07-08 19:47 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Mon, Jul 01, 2024 at 02:17:53PM +0000, Sebastian Ene wrote:
> > <snip>
> >
> > > + }, {
> > > + .mask = PTE_NG,
> > > + .val = PTE_NG,
> > > + .set = "FnXS",
> > > + .clear = " ",
> > > + }, {
> > > + .mask = PTE_CONT | PTE_VALID,
> > > + .val = PTE_CONT | PTE_VALID,
> > > + .set = "CON",
> > > + .clear = " ",
> > > + }, {
> >
> > </snip>
> >
> > Neither of these bits are used at stage-2, why have descriptors for
> > them?
> >
>
> Atm, we don't make use of the contiguous bit in stage-2 in upstream (but
> we have it in some experimental patches). I can remove this, no hard
> feelings for them.
Yes, please drop them. I'll nag whoever adds contpte support for stage-2
to add them back :)
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-01 14:18 ` Sebastian Ene
@ 2024-07-16 9:59 ` Vincent Donnefort
2024-07-19 14:09 ` Sebastian Ene
0 siblings, 1 reply; 25+ messages in thread
From: Vincent Donnefort @ 2024-07-16 9:59 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Mon, Jul 01, 2024 at 02:18:39PM +0000, Sebastian Ene wrote:
> On Mon, Jul 01, 2024 at 09:42:43AM +0100, Vincent Donnefort wrote:
> > O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> > > Define a set of attributes used by the ptdump parser to display the
> > > properties of a guest memory region covered by a pagetable descriptor.
> > > Build a description of the pagetable levels and initialize the parser
> > > with this configuration.
> > >
> > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > ---
> > > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 137 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > index 36dc7662729f..cc1d4fdddc6e 100644
> > > --- a/arch/arm64/kvm/ptdump.c
> > > +++ b/arch/arm64/kvm/ptdump.c
> > > @@ -14,6 +14,61 @@
> > > #include <kvm_ptdump.h>
> > >
> > >
> > > +#define MARKERS_LEN (2)
> > > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > > +
> > > +struct kvm_ptdump_guest_state {
> > > + struct kvm *kvm;
> > > + struct pg_state parser_state;
> > > + struct addr_marker ipa_marker[MARKERS_LEN];
> > > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > > + struct ptdump_range range[MARKERS_LEN];
> > > +};
> > > +
> > > +static const struct prot_bits stage2_pte_bits[] = {
> > > + {
> > > + .mask = PTE_VALID,
> > > + .val = PTE_VALID,
> > > + .set = " ",
> > > + .clear = "F",
> > > + }, {
> > > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
KVM_PTE_LEAF_ATTR_HI_S2_XN is actually a mask covering
KVM_PTE_LEAF_ATTR_HI_S2_XN_XN, KVM_PTE_LEAF_ATTR_HI_S2_XN_PXN and
KVM_PTE_LEAF_ATTR_HI_S2_XN_UXN.
I believe here what we should do is something like?
.val = FIELD_PREP_CONST(KVM_PTE_LEAF_ATTR_HI_S2_XN,
KVM_PTE_LEAF_ATTR_HI_S2_XN_XN) | PTE_VALID
> > > + .set = "XN",
> > > + .clear = " ",
> > > + }, {
> > > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > + .set = "R",
> > > + .clear = " ",
> > > + }, {
[...]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality
2024-07-05 11:07 ` Will Deacon
@ 2024-07-19 11:44 ` Sebastian Ene
0 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-19 11:44 UTC (permalink / raw)
To: Will Deacon
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jul 05, 2024 at 12:07:48PM +0100, Will Deacon wrote:
> On Fri, Jun 21, 2024 at 12:32:26PM +0000, Sebastian Ene wrote:
> > Reuse the descriptor parsing functionality to keep the same output format
> > as the original ptdump code. In order for this to happen, move the state
> > tracking objects into a common header.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/include/asm/ptdump.h | 41 ++++++++++++++++++++++++++++++++-
> > arch/arm64/mm/ptdump.c | 37 ++---------------------------
> > 2 files changed, 42 insertions(+), 36 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> > index 5b1701c76d1c..c550b2afcab7 100644
> > --- a/arch/arm64/include/asm/ptdump.h
> > +++ b/arch/arm64/include/asm/ptdump.h
> > @@ -9,6 +9,7 @@
> >
> > #include <linux/mm_types.h>
> > #include <linux/seq_file.h>
> > +#include <linux/ptdump.h>
> >
> > struct addr_marker {
> > unsigned long start_address;
> > @@ -21,14 +22,52 @@ struct ptdump_info {
> > unsigned long base_addr;
> > };
> >
> > +struct prot_bits {
> > + u64 mask;
> > + u64 val;
> > + const char *set;
> > + const char *clear;
> > +};
> > +
> > +struct pg_level {
> > + const struct prot_bits *bits;
> > + char name[4];
> > + int num;
> > + u64 mask;
> > +};
> > +
> > +/*
> > + * The page dumper groups page table entries of the same type into a single
> > + * description. It uses pg_state to track the range information while
> > + * iterating over the pte entries. When the continuity is broken it then
> > + * dumps out a description of the range.
> > + */
> > +struct pg_state {
> > + struct ptdump_state ptdump;
> > + struct seq_file *seq;
> > + const struct addr_marker *marker;
> > + const struct mm_struct *mm;
> > + unsigned long start_address;
> > + int level;
> > + u64 current_prot;
> > + bool check_wx;
> > + unsigned long wx_pages;
> > + unsigned long uxn_pages;
> > +};
Hello Will,
>
> Minor nit, but if we're moving these structure definitions into the
> header then I'd be inclined to give them some more specific names (e.g.
> prefix them with 'ptdump_'). Granted, this header isn't used widely, but
> it's included by arch/arm64/mm/mmu.c and claiming 'struct prot_bits' is
> a bit over-reaching imo!
>
> Will
Thanks for having a look. I applied the prefix.
Cheers,
Seb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure
2024-07-05 11:12 ` Will Deacon
@ 2024-07-19 13:27 ` Sebastian Ene
0 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-19 13:27 UTC (permalink / raw)
To: Will Deacon
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jul 05, 2024 at 12:12:29PM +0100, Will Deacon wrote:
> On Fri, Jun 21, 2024 at 12:32:27PM +0000, Sebastian Ene wrote:
> > Printing the descriptor attributes requires accessing a mask which has a
> > different set of attributes for stage-2. In preparation for adding support
> > for the stage-2 pagetables dumping, use the mask from the local context
> > and not from the globally defined pg_level array. Store a pointer to
> > the pg_level array in the ptdump state structure. This will allow us to
> > extract the mask which is wrapped in the pg_level array and use it for
> > descriptor parsing in the note_page.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/include/asm/ptdump.h | 1 +
> > arch/arm64/mm/ptdump.c | 13 ++++++++-----
> > 2 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/ptdump.h b/arch/arm64/include/asm/ptdump.h
> > index c550b2afcab7..a4125d8d5a32 100644
> > --- a/arch/arm64/include/asm/ptdump.h
> > +++ b/arch/arm64/include/asm/ptdump.h
> > @@ -44,6 +44,7 @@ struct pg_level {
> > */
> > struct pg_state {
> > struct ptdump_state ptdump;
> > + struct pg_level *pg_level;
> > struct seq_file *seq;
> > const struct addr_marker *marker;
> > const struct mm_struct *mm;
> > diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> > index e370b7a945de..9637a6415ea7 100644
> > --- a/arch/arm64/mm/ptdump.c
> > +++ b/arch/arm64/mm/ptdump.c
> > @@ -192,6 +192,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> > u64 val)
> > {
> > struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
> > + struct pg_level *pg_info = st->pg_level;
> > static const char units[] = "KMGTPE";
> > u64 prot = 0;
> >
> > @@ -201,7 +202,7 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> > level = 0;
> >
> > if (level >= 0)
> > - prot = val & pg_level[level].mask;
> > + prot = val & pg_info[level].mask;
>
> If you rename the existing 'pg_level' array to something like
> 'kernel_pg_levels' then I think your local 'pg_info' variable can be
> called 'pg_level' and this line doesn't need to change.
>
This is a good ideea, thanks. I applied your suggestion
> >
> > if (st->level == -1) {
> > st->level = level;
> > @@ -227,10 +228,10 @@ void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
> > unit++;
> > }
> > pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
> > - pg_level[st->level].name);
> > - if (st->current_prot && pg_level[st->level].bits)
> > - dump_prot(st, pg_level[st->level].bits,
> > - pg_level[st->level].num);
> > + pg_info[st->level].name);
> > + if (st->current_prot && pg_info[st->level].bits)
> > + dump_prot(st, pg_info[st->level].bits,
> > + pg_info[st->level].num);
>
> I think this could then stay as-is too.
>
> > pt_dump_seq_puts(st->seq, "\n");
> >
> > if (addr >= st->marker[1].start_address) {
> > @@ -262,6 +263,7 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
> > .seq = s,
> > .marker = info->markers,
> > .mm = info->mm,
> > + .pg_level = &pg_level[0],
>
> Can't this just be '.pg_level = kernel_pg_levels'?
>
> Will
Seb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-06-28 21:18 ` Oliver Upton
2024-07-01 14:17 ` Sebastian Ene
@ 2024-07-19 14:01 ` Sebastian Ene
1 sibling, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-19 14:01 UTC (permalink / raw)
To: Oliver Upton
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, vdonnefort, mark.rutland, maz, rananta, ryan.roberts,
shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jun 28, 2024 at 09:18:16PM +0000, Oliver Upton wrote:
> Hi Seb,
>
> On Fri, Jun 21, 2024 at 12:32:29PM +0000, Sebastian Ene wrote:
> > Define a set of attributes used by the ptdump parser to display the
> > properties of a guest memory region covered by a pagetable descriptor.
> > Build a description of the pagetable levels and initialize the parser
> > with this configuration.
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
>
> This patch should come *before* patch 4, no point in exposing the
> debugfs file if we aren't ready to handle it yet.
>
This is true but this patch doesn't make sense without 4 because here I
add a bunch of functions which will not be invoked (they are invoked
from the debugfs calls).
IMO we can squash them (4 and 5) but it will be a bit harder to follow.
Let me know what you think, thanks.
Seb
> > ---
> > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 137 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > index 36dc7662729f..cc1d4fdddc6e 100644
> > --- a/arch/arm64/kvm/ptdump.c
> > +++ b/arch/arm64/kvm/ptdump.c
> > @@ -14,6 +14,61 @@
> > #include <kvm_ptdump.h>
> >
> >
> > +#define MARKERS_LEN (2)
> > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > +
> > +struct kvm_ptdump_guest_state {
> > + struct kvm *kvm;
> > + struct pg_state parser_state;
> > + struct addr_marker ipa_marker[MARKERS_LEN];
> > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > + struct ptdump_range range[MARKERS_LEN];
> > +};
> > +
> > +static const struct prot_bits stage2_pte_bits[] = {
> > + {
> > + .mask = PTE_VALID,
> > + .val = PTE_VALID,
> > + .set = " ",
> > + .clear = "F",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > + .set = "XN",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > + .set = "R",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > + .set = "W",
> > + .clear = " ",
> > + }, {
> > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > + .set = "AF",
> > + .clear = " ",
>
> <snip>
>
> > + }, {
> > + .mask = PTE_NG,
> > + .val = PTE_NG,
> > + .set = "FnXS",
> > + .clear = " ",
> > + }, {
> > + .mask = PTE_CONT | PTE_VALID,
> > + .val = PTE_CONT | PTE_VALID,
> > + .set = "CON",
> > + .clear = " ",
> > + }, {
>
> </snip>
>
> Neither of these bits are used at stage-2, why have descriptors for
> them?
>
> > +static int kvm_ptdump_build_levels(struct pg_level *level, u32 start_lvl)
> > +{
> > + static const char * const level_names[] = {"PGD", "PUD", "PMD", "PTE"};
> > + u32 i = 0;
> > + u64 mask = 0;
> > +
> > + if (start_lvl > 2) {
> > + pr_err("invalid start_lvl %u\n", start_lvl);
> > + return -EINVAL;
> > + }
>
> if (WARN_ON_ONCE(start_lvl >= KVM_PGTABLE_LAST_LEVEL))
> return -EINVAL;
>
> > + for (i = 0; i < ARRAY_SIZE(stage2_pte_bits); i++)
> > + mask |= stage2_pte_bits[i].mask;
> > +
> > + for (i = start_lvl; i < KVM_PGTABLE_MAX_LEVELS; i++) {
> > + strscpy(level[i].name, level_names[i], sizeof(level[i].name));
> > +
> > + level[i].num = ARRAY_SIZE(stage2_pte_bits);
> > + level[i].bits = stage2_pte_bits;
> > + level[i].mask = mask;
> > + }
> > +
> > + if (start_lvl > 0)
> > + strscpy(level[start_lvl].name, level_names[0], sizeof(level_names[0]));
>
> This should pass the size of @dst, not the source. This becomes slightly
> more self-documenting if you use a literal for "PGD" here too.
>
> strscpy(level[start_lvl].name, "PGD", sizeof(level[start_lvl].name));
>
> > + return 0;
> > +}
> > +
> > +static struct kvm_ptdump_guest_state
> > +*kvm_ptdump_parser_init(struct kvm *kvm)
> > +{
> > + struct kvm_ptdump_guest_state *st;
> > + struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > + struct kvm_pgtable *pgtable = mmu->pgt;
> > + int ret;
> > +
> > + st = kzalloc(sizeof(struct kvm_ptdump_guest_state), GFP_KERNEL_ACCOUNT);
> > + if (!st)
> > + return NULL;
> > +
> > + ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> > + if (ret)
> > + goto free_with_state;
>
> I don't see any value in the use of goto here, as there isn't any sort
> of cascading initialization / cleanup. This also presents an opportunity
> to get an error back out to the caller.
>
> if (ret) {
> kfree(st);
> return ERR_PTR(ret);
> }
>
> > @@ -57,22 +176,34 @@ static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> > {
> > struct kvm *kvm = m->i_private;
> > + struct kvm_ptdump_guest_state *st;
> > int ret;
> >
> > if (!kvm_get_kvm_safe(kvm))
> > return -ENOENT;
> >
> > - ret = single_open(file, kvm_ptdump_guest_show, m->i_private);
> > - if (ret < 0)
> > - kvm_put_kvm(kvm);
> > + st = kvm_ptdump_parser_init(kvm);
> > + if (!st) {
> > + ret = -ENOMEM;
> > + goto free_with_kvm_ref;
> > + }
>
> (with the earlier suggestion)
>
> st = kvm_ptdump_parser_init(kvm);
> if (IS_ERR(st)) {
> ret = PTR_ERR(st);
> goto free_with_kvm_ref;
> }
>
> Otherwise genuine KVM bugs (-EINVAL) are getting lumped into ENOMEM.
>
> --
> Thanks,
> Oliver
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-16 9:59 ` Vincent Donnefort
@ 2024-07-19 14:09 ` Sebastian Ene
2024-07-19 14:36 ` Vincent Donnefort
0 siblings, 1 reply; 25+ messages in thread
From: Sebastian Ene @ 2024-07-19 14:09 UTC (permalink / raw)
To: Vincent Donnefort
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Tue, Jul 16, 2024 at 10:59:41AM +0100, Vincent Donnefort wrote:
> On Mon, Jul 01, 2024 at 02:18:39PM +0000, Sebastian Ene wrote:
> > On Mon, Jul 01, 2024 at 09:42:43AM +0100, Vincent Donnefort wrote:
> > > O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> > > > Define a set of attributes used by the ptdump parser to display the
> > > > properties of a guest memory region covered by a pagetable descriptor.
> > > > Build a description of the pagetable levels and initialize the parser
> > > > with this configuration.
> > > >
> > > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > > ---
> > > > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > > > 1 file changed, 137 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > > index 36dc7662729f..cc1d4fdddc6e 100644
> > > > --- a/arch/arm64/kvm/ptdump.c
> > > > +++ b/arch/arm64/kvm/ptdump.c
> > > > @@ -14,6 +14,61 @@
> > > > #include <kvm_ptdump.h>
> > > >
> > > >
> > > > +#define MARKERS_LEN (2)
> > > > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > > > +
> > > > +struct kvm_ptdump_guest_state {
> > > > + struct kvm *kvm;
> > > > + struct pg_state parser_state;
> > > > + struct addr_marker ipa_marker[MARKERS_LEN];
> > > > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > > > + struct ptdump_range range[MARKERS_LEN];
> > > > +};
> > > > +
> > > > +static const struct prot_bits stage2_pte_bits[] = {
> > > > + {
> > > > + .mask = PTE_VALID,
> > > > + .val = PTE_VALID,
> > > > + .set = " ",
> > > > + .clear = "F",
> > > > + }, {
> > > > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > > > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
>
> KVM_PTE_LEAF_ATTR_HI_S2_XN is actually a mask covering
It is not a mask covering here but in the ACK kernel.
>
> KVM_PTE_LEAF_ATTR_HI_S2_XN_XN, KVM_PTE_LEAF_ATTR_HI_S2_XN_PXN and
> KVM_PTE_LEAF_ATTR_HI_S2_XN_UXN.
>
> I believe here what we should do is something like?
>
> .val = FIELD_PREP_CONST(KVM_PTE_LEAF_ATTR_HI_S2_XN,
> KVM_PTE_LEAF_ATTR_HI_S2_XN_XN) | PTE_VALID
>
> > > > + .set = "XN",
> > > > + .clear = " ",
> > > > + }, {
> > > > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > + .set = "R",
> > > > + .clear = " ",
> > > > + }, {
>
> [...]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-19 14:09 ` Sebastian Ene
@ 2024-07-19 14:36 ` Vincent Donnefort
2024-07-19 16:27 ` Sebastian Ene
0 siblings, 1 reply; 25+ messages in thread
From: Vincent Donnefort @ 2024-07-19 14:36 UTC (permalink / raw)
To: Sebastian Ene
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jul 19, 2024 at 02:09:19PM +0000, Sebastian Ene wrote:
> On Tue, Jul 16, 2024 at 10:59:41AM +0100, Vincent Donnefort wrote:
> > On Mon, Jul 01, 2024 at 02:18:39PM +0000, Sebastian Ene wrote:
> > > On Mon, Jul 01, 2024 at 09:42:43AM +0100, Vincent Donnefort wrote:
> > > > O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> > > > > Define a set of attributes used by the ptdump parser to display the
> > > > > properties of a guest memory region covered by a pagetable descriptor.
> > > > > Build a description of the pagetable levels and initialize the parser
> > > > > with this configuration.
> > > > >
> > > > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > > > ---
> > > > > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > > > > 1 file changed, 137 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > > > index 36dc7662729f..cc1d4fdddc6e 100644
> > > > > --- a/arch/arm64/kvm/ptdump.c
> > > > > +++ b/arch/arm64/kvm/ptdump.c
> > > > > @@ -14,6 +14,61 @@
> > > > > #include <kvm_ptdump.h>
> > > > >
> > > > >
> > > > > +#define MARKERS_LEN (2)
> > > > > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > > > > +
> > > > > +struct kvm_ptdump_guest_state {
> > > > > + struct kvm *kvm;
> > > > > + struct pg_state parser_state;
> > > > > + struct addr_marker ipa_marker[MARKERS_LEN];
> > > > > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > > > > + struct ptdump_range range[MARKERS_LEN];
> > > > > +};
> > > > > +
> > > > > +static const struct prot_bits stage2_pte_bits[] = {
> > > > > + {
> > > > > + .mask = PTE_VALID,
> > > > > + .val = PTE_VALID,
> > > > > + .set = " ",
> > > > > + .clear = "F",
> > > > > + }, {
> > > > > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > > > > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> >
> > KVM_PTE_LEAF_ATTR_HI_S2_XN is actually a mask covering
>
> It is not a mask covering here but in the ACK kernel.
You're right, I should have double-checked upstream.
That said, how about we move this __after__ RW ? and just use "X" on .clear
so we can have something R W X ?
>
> >
> > KVM_PTE_LEAF_ATTR_HI_S2_XN_XN, KVM_PTE_LEAF_ATTR_HI_S2_XN_PXN and
> > KVM_PTE_LEAF_ATTR_HI_S2_XN_UXN.
> >
> > I believe here what we should do is something like?
> >
> > .val = FIELD_PREP_CONST(KVM_PTE_LEAF_ATTR_HI_S2_XN,
> > KVM_PTE_LEAF_ATTR_HI_S2_XN_XN) | PTE_VALID
> >
> > > > > + .set = "XN",
> > > > > + .clear = " ",
> > > > > + }, {
> > > > > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > > + .set = "R",
> > > > > + .clear = " ",
> > > > > + }, {
> >
> > [...]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes
2024-07-19 14:36 ` Vincent Donnefort
@ 2024-07-19 16:27 ` Sebastian Ene
0 siblings, 0 replies; 25+ messages in thread
From: Sebastian Ene @ 2024-07-19 16:27 UTC (permalink / raw)
To: Vincent Donnefort
Cc: akpm, alexghiti, ankita, ardb, catalin.marinas, christophe.leroy,
james.morse, mark.rutland, maz, oliver.upton, rananta,
ryan.roberts, shahuang, suzuki.poulose, will, yuzenghui, kvmarm,
linux-arm-kernel, linux-kernel, kernel-team
On Fri, Jul 19, 2024 at 03:36:04PM +0100, 'Vincent Donnefort' via kernel-team wrote:
> On Fri, Jul 19, 2024 at 02:09:19PM +0000, Sebastian Ene wrote:
> > On Tue, Jul 16, 2024 at 10:59:41AM +0100, Vincent Donnefort wrote:
> > > On Mon, Jul 01, 2024 at 02:18:39PM +0000, Sebastian Ene wrote:
> > > > On Mon, Jul 01, 2024 at 09:42:43AM +0100, Vincent Donnefort wrote:
> > > > > O Fri, Jun 21, 2024 at 12:32:29PM +0000, 'Sebastian Ene' via kernel-team wrote:
> > > > > > Define a set of attributes used by the ptdump parser to display the
> > > > > > properties of a guest memory region covered by a pagetable descriptor.
> > > > > > Build a description of the pagetable levels and initialize the parser
> > > > > > with this configuration.
> > > > > >
> > > > > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > > > > ---
> > > > > > arch/arm64/kvm/ptdump.c | 143 ++++++++++++++++++++++++++++++++++++++--
> > > > > > 1 file changed, 137 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c
> > > > > > index 36dc7662729f..cc1d4fdddc6e 100644
> > > > > > --- a/arch/arm64/kvm/ptdump.c
> > > > > > +++ b/arch/arm64/kvm/ptdump.c
> > > > > > @@ -14,6 +14,61 @@
> > > > > > #include <kvm_ptdump.h>
> > > > > >
> > > > > >
> > > > > > +#define MARKERS_LEN (2)
> > > > > > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1)
> > > > > > +
> > > > > > +struct kvm_ptdump_guest_state {
> > > > > > + struct kvm *kvm;
> > > > > > + struct pg_state parser_state;
> > > > > > + struct addr_marker ipa_marker[MARKERS_LEN];
> > > > > > + struct pg_level level[KVM_PGTABLE_MAX_LEVELS];
> > > > > > + struct ptdump_range range[MARKERS_LEN];
> > > > > > +};
> > > > > > +
> > > > > > +static const struct prot_bits stage2_pte_bits[] = {
> > > > > > + {
> > > > > > + .mask = PTE_VALID,
> > > > > > + .val = PTE_VALID,
> > > > > > + .set = " ",
> > > > > > + .clear = "F",
> > > > > > + }, {
> > > > > > + .mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > > > > > + .val = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > >
> > > KVM_PTE_LEAF_ATTR_HI_S2_XN is actually a mask covering
> >
> > It is not a mask covering here but in the ACK kernel.
>
> You're right, I should have double-checked upstream.
>
> That said, how about we move this __after__ RW ? and just use "X" on .clear
> so we can have something R W X ?
>
Yes that's a good ideea to improve output format, let me move it.
> >
> > >
> > > KVM_PTE_LEAF_ATTR_HI_S2_XN_XN, KVM_PTE_LEAF_ATTR_HI_S2_XN_PXN and
> > > KVM_PTE_LEAF_ATTR_HI_S2_XN_UXN.
> > >
> > > I believe here what we should do is something like?
> > >
> > > .val = FIELD_PREP_CONST(KVM_PTE_LEAF_ATTR_HI_S2_XN,
> > > KVM_PTE_LEAF_ATTR_HI_S2_XN_XN) | PTE_VALID
> > >
> > > > > > + .set = "XN",
> > > > > > + .clear = " ",
> > > > > > + }, {
> > > > > > + .mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > > > + .val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > > > > > + .set = "R",
> > > > > > + .clear = " ",
> > > > > > + }, {
> > >
> > > [...]
>
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-07-19 16:28 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-21 12:32 [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 1/6] KVM: arm64: Move pagetable definitions to common header Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 2/6] arm64: ptdump: Expose the attribute parsing functionality Sebastian Ene
2024-07-05 11:07 ` Will Deacon
2024-07-19 11:44 ` Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 3/6] arm64: ptdump: Use the mask from the state structure Sebastian Ene
2024-07-05 11:12 ` Will Deacon
2024-07-19 13:27 ` Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 4/6] KVM: arm64: Register ptdump with debugfs on guest creation Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 5/6] KVM: arm64: Initialize the ptdump parser with stage-2 attributes Sebastian Ene
2024-06-28 21:18 ` Oliver Upton
2024-07-01 14:17 ` Sebastian Ene
2024-07-08 19:47 ` Oliver Upton
2024-07-19 14:01 ` Sebastian Ene
2024-07-01 8:42 ` Vincent Donnefort
2024-07-01 14:18 ` Sebastian Ene
2024-07-16 9:59 ` Vincent Donnefort
2024-07-19 14:09 ` Sebastian Ene
2024-07-19 14:36 ` Vincent Donnefort
2024-07-19 16:27 ` Sebastian Ene
2024-06-21 12:32 ` [PATCH v7 6/6] KVM: arm64: Expose guest stage-2 pagetable config to debugfs Sebastian Ene
2024-06-28 20:49 ` Oliver Upton
2024-07-01 14:10 ` Sebastian Ene
2024-06-28 21:25 ` [PATCH v7 0/6] arm64: ptdump: View the second stage page-tables Oliver Upton
2024-07-01 14:22 ` Sebastian Ene
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).