* [PATCH v8 0/7] device tree mapping
@ 2024-09-27 16:33 Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 1/7] xen/riscv: allow write_atomic() to work with non-scalar types Oleksii Kurochko
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
This patch series introduces device tree mapping for RISC-V,
along with the pieces it depends on:
- Fixmap mapping
- pmap
- Xen page table processing
---
Changes in v8:
- The following patch was merged to staging:
[PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to define {read,write}_atomic()
- All other changes are patch specific so please look at the patch.
---
Changes in v7:
- Drop the patch "xen/riscv: prevent recursion when ASSERT(), BUG*(), or panic() are called"
- All other changes are patch specific so please look at the patch.
---
Changes in v6:
- Add patch to fix recursion when ASSERT(), BUG*(), panic() are called.
- Add patch to allow write_atomic() to work with non-scalar types for consistency
with read_atomic().
- All other changes are patch specific so please look at the patch.
---
Changes in v5:
- The following patch was merged to staging:
[PATCH v3 3/9] xen/riscv: enable CONFIG_HAS_DEVICE_TREE
- Drop the dependency on "RISCV basic exception handling implementation" as
it was merged to the staging branch.
- All other changes are patch specific so please look at the patch.
---
Changes in v4:
- Drop the dependency on the common device tree patch series as it was merged to
staging.
- Update the cover letter message.
- All other changes are patch specific so please look at the patch.
---
Changes in v3:
- Introduce SBI RFENCE extension support.
- Introduce and initialize pcpu_info[] and __cpuid_to_hartid_map[] and functionality
to work with these arrays.
- Make page table handling arch specific instead of trying to make it generic.
- All other changes are patch specific so please look at the patch.
---
Changes in v2:
- Update the cover letter message
- introduce fixmap mapping
- introduce pmap
- introduce CONFIG_GENERIC_PT
- update to use early_fdt_map() after the MMU is enabled.
---
Oleksii Kurochko (7):
xen/riscv: allow write_atomic() to work with non-scalar types
xen/riscv: set up fixmap mappings
xen/riscv: introduce asm/pmap.h header
xen/riscv: introduce functionality to work with CPU info
xen/riscv: introduce and initialize SBI RFENCE extension
xen/riscv: page table handling
xen/riscv: introduce early_fdt_map()
xen/arch/riscv/Kconfig | 1 +
xen/arch/riscv/Makefile | 2 +
xen/arch/riscv/include/asm/atomic.h | 11 +-
xen/arch/riscv/include/asm/config.h | 16 +-
xen/arch/riscv/include/asm/current.h | 27 +-
xen/arch/riscv/include/asm/fixmap.h | 46 +++
xen/arch/riscv/include/asm/flushtlb.h | 15 +
xen/arch/riscv/include/asm/mm.h | 6 +
xen/arch/riscv/include/asm/page.h | 99 +++++
xen/arch/riscv/include/asm/pmap.h | 36 ++
xen/arch/riscv/include/asm/processor.h | 3 -
xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
xen/arch/riscv/include/asm/sbi.h | 62 +++
xen/arch/riscv/include/asm/smp.h | 18 +
xen/arch/riscv/mm.c | 101 ++++-
xen/arch/riscv/pt.c | 421 ++++++++++++++++++++
xen/arch/riscv/riscv64/asm-offsets.c | 3 +
xen/arch/riscv/riscv64/head.S | 14 +
xen/arch/riscv/sbi.c | 273 ++++++++++++-
xen/arch/riscv/setup.c | 17 +
xen/arch/riscv/smp.c | 15 +
xen/arch/riscv/xen.lds.S | 2 +-
22 files changed, 1171 insertions(+), 19 deletions(-)
create mode 100644 xen/arch/riscv/include/asm/fixmap.h
create mode 100644 xen/arch/riscv/include/asm/pmap.h
create mode 100644 xen/arch/riscv/pt.c
create mode 100644 xen/arch/riscv/smp.c
--
2.46.1
* [PATCH v8 1/7] xen/riscv: allow write_atomic() to work with non-scalar types
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 2/7] xen/riscv: set up fixmap mappings Oleksii Kurochko
` (7 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Update the definition of write_atomic() to support non-scalar types,
bringing it closer to the behavior of read_atomic().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/riscv/include/asm/atomic.h | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
index 95910ebfeb..9669a3286d 100644
--- a/xen/arch/riscv/include/asm/atomic.h
+++ b/xen/arch/riscv/include/asm/atomic.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
+ /* SPDX-License-Identifier: GPL-2.0-only */
/*
* Taken and modified from Linux.
*
@@ -69,10 +69,11 @@ static always_inline void _write_atomic(volatile void *p,
}
}
-#define write_atomic(p, x) \
-({ \
- typeof(*(p)) x_ = (x); \
- _write_atomic(p, x_, sizeof(*(p))); \
+#define write_atomic(p, x) \
+({ \
+ union { typeof(*(p)) v; unsigned long ul; } x_ = { .ul = 0UL }; \
+ x_.v = (x); \
+ _write_atomic(p, x_.ul, sizeof(*(p))); \
})
static always_inline void _add_sized(volatile void *p,
--
2.46.1
* [PATCH v8 2/7] xen/riscv: set up fixmap mappings
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 1/7] xen/riscv: allow write_atomic() to work with non-scalar types Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 3/7] xen/riscv: introduce asm/pmap.h header Oleksii Kurochko
` (6 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Set up fixmap mappings and the L0 page table for fixmap support.
Modify the PTEs (xen_fixmap[]) directly in arch_pmap_map() instead
of using set_fixmap() which is expected to be implemented using
map_pages_to_xen(), which, in turn, is expected to use
arch_pmap_map() during early boot, resulting in a loop.
Define new macros in riscv/config.h for calculating
the FIXMAP_BASE address, including BOOT_FDT_VIRT_{START, SIZE},
XEN_VIRT_SIZE, and XEN_VIRT_END.
Update the check for Xen size in riscv/xen.lds.S to use
XEN_VIRT_SIZE instead of a hardcoded constant.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/riscv/include/asm/config.h | 16 ++++++++--
xen/arch/riscv/include/asm/fixmap.h | 46 +++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/mm.h | 2 ++
xen/arch/riscv/include/asm/page.h | 13 ++++++++
xen/arch/riscv/mm.c | 43 +++++++++++++++++++++++++++
xen/arch/riscv/setup.c | 2 ++
xen/arch/riscv/xen.lds.S | 2 +-
7 files changed, 121 insertions(+), 3 deletions(-)
create mode 100644 xen/arch/riscv/include/asm/fixmap.h
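As a sanity check of the address arithmetic in this patch, the macros can be evaluated in a small stand-alone program. This is only a sketch: XEN_VIRT_START is taken from the layout table in config.h, while the 4 KiB page size and NUM_FIX_PMAP = 8 are assumptions made for illustration (NUM_FIX_PMAP comes from xen/pmap.h and is not shown in this patch). demo_virt_to_fix() mirrors virt_to_fix(), with assert() standing in for BUG_ON().

```c
#include <assert.h>
#include <stdint.h>

#define MB(x)               ((uint64_t)(x) << 20)
#define PAGE_SHIFT          12                      /* assumption: 4 KiB pages */
#define PAGE_SIZE           (UINT64_C(1) << PAGE_SHIFT)

#define XEN_VIRT_START      UINT64_C(0xffffffffc0000000) /* from the layout table */
#define GAP_SIZE            MB(2)
#define XEN_VIRT_SIZE       MB(2)
#define BOOT_FDT_VIRT_START (XEN_VIRT_START + XEN_VIRT_SIZE + GAP_SIZE)
#define BOOT_FDT_VIRT_SIZE  MB(4)
#define FIXMAP_BASE         (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE + GAP_SIZE)

#define FIXMAP_ADDR(n)      (FIXMAP_BASE + (uint64_t)(n) * PAGE_SIZE)

#define NUM_FIX_PMAP        8                       /* assumption, for illustration */
#define FIX_PMAP_BEGIN      0
#define FIX_PMAP_END        (FIX_PMAP_BEGIN + NUM_FIX_PMAP - 1)
#define FIX_MISC            (FIX_PMAP_END + 1)
#define FIX_LAST            FIX_MISC

#define FIXADDR_START       FIXMAP_ADDR(0)
#define FIXADDR_TOP         FIXMAP_ADDR(FIX_LAST + 1)

/* Mirrors virt_to_fix(): map a fixmap virtual address back to its slot. */
static inline unsigned int demo_virt_to_fix(uint64_t vaddr)
{
    assert(vaddr >= FIXADDR_START && vaddr < FIXADDR_TOP);

    return (unsigned int)((vaddr - FIXADDR_START) >> PAGE_SHIFT);
}
```

Under these assumptions, BOOT_FDT_VIRT_START evaluates to 0xffffffffc0400000 and FIXMAP_BASE to 0xffffffffc0a00000, which agrees with the updated layout comment in config.h.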
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 50583aafdc..7dbb235685 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -41,8 +41,10 @@
* Start addr | End addr | Slot | area description
* ============================================================================
* ..... L2 511 Unused
- * 0xffffffffc0600000 0xffffffffc0800000 L2 511 Fixmap
- * 0xffffffffc0200000 0xffffffffc0600000 L2 511 FDT
+ * 0xffffffffc0a00000 0xffffffffc0c00000 L2 511 Fixmap
+ * ..... ( 2 MB gap )
+ * 0xffffffffc0400000 0xffffffffc0800000 L2 511 FDT
+ * ..... ( 2 MB gap )
* 0xffffffffc0000000 0xffffffffc0200000 L2 511 Xen
* ..... L2 510 Unused
* 0x3200000000 0x7f40000000 L2 200-509 Direct map
@@ -74,6 +76,16 @@
#error "unsupported RV_STAGE1_MODE"
#endif
+#define GAP_SIZE MB(2)
+
+#define XEN_VIRT_SIZE MB(2)
+
+#define BOOT_FDT_VIRT_START (XEN_VIRT_START + XEN_VIRT_SIZE + GAP_SIZE)
+#define BOOT_FDT_VIRT_SIZE MB(4)
+
+#define FIXMAP_BASE \
+ (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE + GAP_SIZE)
+
#define DIRECTMAP_SLOT_END 509
#define DIRECTMAP_SLOT_START 200
#define DIRECTMAP_VIRT_START SLOTN(DIRECTMAP_SLOT_START)
diff --git a/xen/arch/riscv/include/asm/fixmap.h b/xen/arch/riscv/include/asm/fixmap.h
new file mode 100644
index 0000000000..63732df36c
--- /dev/null
+++ b/xen/arch/riscv/include/asm/fixmap.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * fixmap.h: compile-time virtual memory allocation
+ */
+#ifndef ASM_FIXMAP_H
+#define ASM_FIXMAP_H
+
+#include <xen/bug.h>
+#include <xen/page-size.h>
+#include <xen/pmap.h>
+
+#include <asm/page.h>
+
+#define FIXMAP_ADDR(n) (FIXMAP_BASE + (n) * PAGE_SIZE)
+
+/* Fixmap slots */
+#define FIX_PMAP_BEGIN (0) /* Start of PMAP */
+#define FIX_PMAP_END (FIX_PMAP_BEGIN + NUM_FIX_PMAP - 1) /* End of PMAP */
+#define FIX_MISC (FIX_PMAP_END + 1) /* Ephemeral mappings of hardware */
+
+#define FIX_LAST FIX_MISC
+
+#define FIXADDR_START FIXMAP_ADDR(0)
+#define FIXADDR_TOP FIXMAP_ADDR(FIX_LAST + 1)
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Direct access to xen_fixmap[] should only happen when {set,
+ * clear}_fixmap() is unusable (e.g. where we would end up to
+ * recursively call the helpers).
+ */
+extern pte_t xen_fixmap[];
+
+#define fix_to_virt(slot) ((void *)FIXMAP_ADDR(slot))
+
+static inline unsigned int virt_to_fix(vaddr_t vaddr)
+{
+ BUG_ON(vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START);
+
+ return ((vaddr - FIXADDR_START) >> PAGE_SHIFT);
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* ASM_FIXMAP_H */
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 25af9e1aaa..a0bdc2bc3a 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -255,4 +255,6 @@ static inline unsigned int arch_get_dma_bitsize(void)
return 32; /* TODO */
}
+void setup_fixmap_mappings(void);
+
#endif /* _ASM_RISCV_MM_H */
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index c831e16417..d4a5009823 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -9,6 +9,7 @@
#include <xen/bug.h>
#include <xen/types.h>
+#include <asm/atomic.h>
#include <asm/mm.h>
#include <asm/page-bits.h>
@@ -81,6 +82,18 @@ static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
BUG_ON("unimplemented");
}
+/* Write a pagetable entry. */
+static inline void write_pte(pte_t *p, pte_t pte)
+{
+ write_atomic(p, pte);
+}
+
+/* Read a pagetable entry. */
+static inline pte_t read_pte(const pte_t *p)
+{
+ return read_atomic(p);
+}
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_RISCV_PAGE_H */
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index 7d09e781bf..b8ff91cf4e 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -12,6 +12,7 @@
#include <asm/early_printk.h>
#include <asm/csr.h>
#include <asm/current.h>
+#include <asm/fixmap.h>
#include <asm/page.h>
#include <asm/processor.h>
@@ -49,6 +50,9 @@ stage1_pgtbl_root[PAGETABLE_ENTRIES];
pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
stage1_pgtbl_nonroot[PGTBL_INITIAL_COUNT * PAGETABLE_ENTRIES];
+pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
+xen_fixmap[PAGETABLE_ENTRIES];
+
#define HANDLE_PGTBL(curr_lvl_num) \
index = pt_index(curr_lvl_num, page_addr); \
if ( pte_is_valid(pgtbl[index]) ) \
@@ -191,6 +195,45 @@ static bool __init check_pgtbl_mode_support(struct mmu_desc *mmu_desc,
return is_mode_supported;
}
+void __init setup_fixmap_mappings(void)
+{
+ pte_t *pte, tmp;
+ unsigned int i;
+
+ BUILD_BUG_ON(FIX_LAST >= PAGETABLE_ENTRIES);
+
+ pte = &stage1_pgtbl_root[pt_index(HYP_PT_ROOT_LEVEL, FIXMAP_ADDR(0))];
+
+ /*
+ * In RISC-V, page table levels are numbered from Lx to L0, where
+ * x is the highest page table level for the current MMU mode (for example,
+ * Sv39 has 3 page table levels, so x = 2 (L2 -> L1 -> L0)).
+ *
+ * In this loop we want to find the L1 page table, because xen_fixmap[]
+ * will be used as the L0 page table.
+ */
+ for ( i = HYP_PT_ROOT_LEVEL; i-- > 1; )
+ {
+ BUG_ON(!pte_is_valid(*pte));
+
+ pte = (pte_t *)LOAD_TO_LINK(pte_to_paddr(*pte));
+ pte = &pte[pt_index(i, FIXMAP_ADDR(0))];
+ }
+
+ BUG_ON(pte_is_valid(*pte));
+
+ tmp = paddr_to_pte(LINK_TO_LOAD((unsigned long)&xen_fixmap), PTE_TABLE);
+ write_pte(pte, tmp);
+
+ RISCV_FENCE(rw, rw);
+ sfence_vma();
+
+ /*
+ * We only need the zeroth table allocated, but not the PTEs set, because
+ * set_fixmap() will set them on the fly.
+ */
+}
+
/*
* setup_initial_pagetables:
*
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 97c599db44..82c5752da1 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -47,6 +47,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
test_macros_from_bug_h();
#endif
+ setup_fixmap_mappings();
+
printk("All set up\n");
machine_halt();
diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
index 871b47a235..558a5a992a 100644
--- a/xen/arch/riscv/xen.lds.S
+++ b/xen/arch/riscv/xen.lds.S
@@ -174,6 +174,6 @@ ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty")
* Changing the size of Xen binary can require an update of
* PGTBL_INITIAL_COUNT.
*/
-ASSERT(_end - _start <= MB(2), "Xen too large for early-boot assumptions")
+ASSERT(_end - _start <= XEN_VIRT_SIZE, "Xen too large for early-boot assumptions")
ASSERT(_ident_end - _ident_start <= IDENT_AREA_SIZE, "identity region is too big");
--
2.46.1
* [PATCH v8 3/7] xen/riscv: introduce asm/pmap.h header
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 1/7] xen/riscv: allow write_atomic() to work with non-scalar types Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 2/7] xen/riscv: set up fixmap mappings Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info Oleksii Kurochko
` (5 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Introduce arch_pmap_{un}map functions and select HAS_PMAP for CONFIG_RISCV.
Add pte_from_mfn() for use in arch_pmap_map().
Introduce flush_tlb_one_local() and use it in arch_pmap_{un}map().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/riscv/Kconfig | 1 +
xen/arch/riscv/include/asm/flushtlb.h | 6 +++++
xen/arch/riscv/include/asm/page.h | 6 +++++
xen/arch/riscv/include/asm/pmap.h | 36 +++++++++++++++++++++++++++
4 files changed, 49 insertions(+)
create mode 100644 xen/arch/riscv/include/asm/pmap.h
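The interaction between pte_from_mfn() and arch_pmap_{un}map() can be modelled in user space. The sketch below is an illustration under stated assumptions, not the Xen code: PTE_PPN_SHIFT = 10 and the V/R/W bit positions follow the RISC-V privileged specification, DEMO_PAGE_RW is a hypothetical stand-in for PAGE_HYPERVISOR_RW (which may carry additional bits in Xen), and the TLB flush is left as a comment because it only makes sense on real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PPN_SHIFT 10                    /* PPN field starts at bit 10 */
#define PTE_VALID     (UINT64_C(1) << 0)
#define PTE_READ      (UINT64_C(1) << 1)
#define PTE_WRITE     (UINT64_C(1) << 2)

/* Hypothetical stand-in for PAGE_HYPERVISOR_RW. */
#define DEMO_PAGE_RW  (PTE_VALID | PTE_READ | PTE_WRITE)

#define DEMO_ENTRIES  512                   /* one 4 KiB table of 8-byte PTEs */

typedef struct { uint64_t pte; } demo_pte_t;

/* Stand-in for xen_fixmap[]: the L0 table backing the fixmap region. */
static demo_pte_t demo_fixmap[DEMO_ENTRIES];

static inline demo_pte_t demo_pte_from_mfn(uint64_t mfn, uint64_t flags)
{
    return (demo_pte_t){ .pte = (mfn << PTE_PPN_SHIFT) | flags };
}

static inline int demo_pte_is_valid(demo_pte_t p)
{
    return (p.pte & PTE_VALID) != 0;
}

/* Mirrors arch_pmap_map(): the slot must be free before it is populated. */
static inline void demo_pmap_map(unsigned int slot, uint64_t mfn)
{
    assert(slot < DEMO_ENTRIES);
    assert(!demo_pte_is_valid(demo_fixmap[slot]));

    demo_fixmap[slot] = demo_pte_from_mfn(mfn, DEMO_PAGE_RW);
    /* Real code: write_pte() followed by a local sfence.vma for the slot. */
}

/* Mirrors arch_pmap_unmap(): clear the entry so the slot can be reused. */
static inline void demo_pmap_unmap(unsigned int slot)
{
    assert(slot < DEMO_ENTRIES);

    demo_fixmap[slot] = (demo_pte_t){ 0 };
    /* Real code: write_pte() followed by a local sfence.vma for the slot. */
}
```

Writing xen_fixmap[] directly, as modelled here, is what lets the pmap code avoid the set_fixmap() -> map_pages_to_xen() -> arch_pmap_map() loop mentioned in the previous patch.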
diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index 7a113c7774..1858004676 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -3,6 +3,7 @@ config RISCV
select FUNCTION_ALIGNMENT_16B
select GENERIC_BUG_FRAME
select HAS_DEVICE_TREE
+ select HAS_PMAP
select HAS_VMAP
config RISCV_64
diff --git a/xen/arch/riscv/include/asm/flushtlb.h b/xen/arch/riscv/include/asm/flushtlb.h
index 7ce32bea0b..f4a735fd6c 100644
--- a/xen/arch/riscv/include/asm/flushtlb.h
+++ b/xen/arch/riscv/include/asm/flushtlb.h
@@ -5,6 +5,12 @@
#include <xen/bug.h>
#include <xen/cpumask.h>
+/* Flush TLB of local processor for address va. */
+static inline void flush_tlb_one_local(vaddr_t va)
+{
+ asm volatile ( "sfence.vma %0" :: "r" (va) : "memory" );
+}
+
/*
* Filter the given set of CPUs, removing those that definitely flushed their
* TLB since @page_timestamp.
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index d4a5009823..eb79cb9409 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -94,6 +94,12 @@ static inline pte_t read_pte(const pte_t *p)
return read_atomic(p);
}
+static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags)
+{
+ unsigned long pte = (mfn_x(mfn) << PTE_PPN_SHIFT) | flags;
+ return (pte_t){ .pte = pte };
+}
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_RISCV_PAGE_H */
diff --git a/xen/arch/riscv/include/asm/pmap.h b/xen/arch/riscv/include/asm/pmap.h
new file mode 100644
index 0000000000..60065c996f
--- /dev/null
+++ b/xen/arch/riscv/include/asm/pmap.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef ASM_PMAP_H
+#define ASM_PMAP_H
+
+#include <xen/bug.h>
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/page-size.h>
+
+#include <asm/fixmap.h>
+#include <asm/flushtlb.h>
+#include <asm/system.h>
+
+static inline void __init arch_pmap_map(unsigned int slot, mfn_t mfn)
+{
+ pte_t *entry = &xen_fixmap[slot];
+ pte_t pte;
+
+ ASSERT(!pte_is_valid(*entry));
+
+ pte = pte_from_mfn(mfn, PAGE_HYPERVISOR_RW);
+ write_pte(entry, pte);
+
+ flush_tlb_one_local(FIXMAP_ADDR(slot));
+}
+
+static inline void __init arch_pmap_unmap(unsigned int slot)
+{
+ pte_t pte = {};
+
+ write_pte(&xen_fixmap[slot], pte);
+
+ flush_tlb_one_local(FIXMAP_ADDR(slot));
+}
+
+#endif /* ASM_PMAP_H */
--
2.46.1
* [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (2 preceding siblings ...)
2024-09-27 16:33 ` [PATCH v8 3/7] xen/riscv: introduce asm/pmap.h header Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-30 7:25 ` Jan Beulich
2024-09-27 16:33 ` [PATCH v8 5/7] xen/riscv: introduce and initialize SBI RFENCE extension Oleksii Kurochko
` (4 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Introduce struct pcpu_info to store pCPU-related information.
Initially, it includes only processor_id and hart id, but it
will be extended to include guest CPU information and
temporary variables for saving/restoring vCPU registers.
Add set_processor_id() function to set processor_id stored in
pcpu_info.
Define smp_processor_id() to provide accurate information,
replacing the previous "dummy" value of 0.
Initialize the tp register to point to pcpu_info[0].
Set processor_id to 0 for logical CPU 0 and store the physical
CPU ID in pcpu_info[0].
Introduce helpers for getting/setting hart_id ( physical CPU id
in RISC-V terms ) from Xen CPU id.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/include/asm/current.h | 27 +++++++++++++++++++++++++-
xen/arch/riscv/include/asm/processor.h | 3 ---
xen/arch/riscv/include/asm/smp.h | 18 +++++++++++++++++
xen/arch/riscv/riscv64/asm-offsets.c | 3 +++
xen/arch/riscv/riscv64/head.S | 14 +++++++++++++
xen/arch/riscv/setup.c | 5 +++++
xen/arch/riscv/smp.c | 15 ++++++++++++++
8 files changed, 82 insertions(+), 4 deletions(-)
create mode 100644 xen/arch/riscv/smp.c
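The tp handling in this patch amounts to simple pointer arithmetic, which the following user-space sketch models. It is an illustration only: demo_tp is an ordinary global standing in for the tp register, NR_CPUS = 4 and the 64-byte alignment are assumptions, and the array initialiser uses the same GNU range-designator extension as smp.c.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4                           /* assumption, for illustration */

struct pcpu_info {
    unsigned int processor_id;              /* Xen CPU id */
    unsigned long hart_id;                  /* physical CPU id */
} __attribute__((aligned(64)));             /* assumed cache line size */

/* As in smp.c: processor_id == NR_CPUS marks an entry as not yet set up. */
static struct pcpu_info demo_pcpu_info[NR_CPUS] = {
    [0 ... NR_CPUS - 1] = { .processor_id = NR_CPUS },
};

/* Stand-in for the tp register. */
static struct pcpu_info *demo_tp;

/* Same arithmetic as setup_tp in head.S: base + cpuid * PCPU_INFO_SIZE. */
static inline void demo_setup_tp(unsigned int cpuid)
{
    demo_tp = (struct pcpu_info *)((char *)demo_pcpu_info +
                                   (size_t)cpuid * sizeof(struct pcpu_info));
}

/* Mirrors smp_processor_id(), with assert() standing in for BUG_ON(). */
static inline unsigned int demo_smp_processor_id(void)
{
    unsigned int id = demo_tp->processor_id;

    assert(id < NR_CPUS);

    return id;
}
```

Because PCPU_INFO_SIZE is sizeof(struct pcpu_info), including the padding added by the alignment attribute, the assembly computation lands exactly on &pcpu_info[cpuid].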
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index d192be7b55..6832549133 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_RISCV_64) += riscv64/
obj-y += sbi.o
obj-y += setup.o
obj-y += shutdown.o
+obj-y += smp.o
obj-y += stubs.o
obj-y += traps.o
obj-y += vm_event.o
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index aedb6dc732..6f1ec4e190 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -3,12 +3,37 @@
#ifndef __ASM_CURRENT_H
#define __ASM_CURRENT_H
-#include <xen/lib.h>
+#include <xen/bug.h>
+#include <xen/cache.h>
#include <xen/percpu.h>
+
#include <asm/processor.h>
#ifndef __ASSEMBLY__
+register struct pcpu_info *tp asm ( "tp" );
+
+struct pcpu_info {
+ unsigned int processor_id; /* Xen CPU id */
+ unsigned long hart_id; /* physical CPU id */
+} __cacheline_aligned;
+
+/* tp points to one of these */
+extern struct pcpu_info pcpu_info[NR_CPUS];
+
+#define set_processor_id(id) do { \
+ tp->processor_id = (id); \
+} while (0)
+
+static inline unsigned int smp_processor_id(void)
+{
+ unsigned int id = tp->processor_id;
+
+ BUG_ON(id >= NR_CPUS);
+
+ return id;
+}
+
/* Which VCPU is "current" on this PCPU. */
DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
diff --git a/xen/arch/riscv/include/asm/processor.h b/xen/arch/riscv/include/asm/processor.h
index 3ae164c265..e42b353b4c 100644
--- a/xen/arch/riscv/include/asm/processor.h
+++ b/xen/arch/riscv/include/asm/processor.h
@@ -12,9 +12,6 @@
#ifndef __ASSEMBLY__
-/* TODO: need to be implemeted */
-#define smp_processor_id() 0
-
/* On stack VCPU state */
struct cpu_user_regs
{
diff --git a/xen/arch/riscv/include/asm/smp.h b/xen/arch/riscv/include/asm/smp.h
index b1ea91b1eb..a824be8e78 100644
--- a/xen/arch/riscv/include/asm/smp.h
+++ b/xen/arch/riscv/include/asm/smp.h
@@ -5,6 +5,8 @@
#include <xen/cpumask.h>
#include <xen/percpu.h>
+#include <asm/current.h>
+
DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_mask);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
@@ -14,6 +16,22 @@ DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
*/
#define park_offline_cpus false
+/*
+ * Mapping between Xen logical cpu index and hartid.
+ */
+static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
+{
+ return pcpu_info[cpuid].hart_id;
+}
+
+static inline void set_cpuid_to_hartid(unsigned long cpuid,
+ unsigned long hartid)
+{
+ pcpu_info[cpuid].hart_id = hartid;
+}
+
+void setup_tp(unsigned int cpuid);
+
#endif
/*
diff --git a/xen/arch/riscv/riscv64/asm-offsets.c b/xen/arch/riscv/riscv64/asm-offsets.c
index 9f663b9510..3b5daf3b36 100644
--- a/xen/arch/riscv/riscv64/asm-offsets.c
+++ b/xen/arch/riscv/riscv64/asm-offsets.c
@@ -1,5 +1,6 @@
#define COMPILE_OFFSETS
+#include <asm/current.h>
#include <asm/processor.h>
#include <xen/types.h>
@@ -50,4 +51,6 @@ void asm_offsets(void)
OFFSET(CPU_USER_REGS_SSTATUS, struct cpu_user_regs, sstatus);
OFFSET(CPU_USER_REGS_PREGS, struct cpu_user_regs, pregs);
BLANK();
+ DEFINE(PCPU_INFO_SIZE, sizeof(struct pcpu_info));
+ BLANK();
}
diff --git a/xen/arch/riscv/riscv64/head.S b/xen/arch/riscv/riscv64/head.S
index 3261e9fce8..2a1b3dad91 100644
--- a/xen/arch/riscv/riscv64/head.S
+++ b/xen/arch/riscv/riscv64/head.S
@@ -1,4 +1,5 @@
#include <asm/asm.h>
+#include <asm/asm-offsets.h>
#include <asm/riscv_encoding.h>
.section .text.header, "ax", %progbits
@@ -55,6 +56,10 @@ FUNC(start)
*/
jal reset_stack
+ /* Xen's boot cpu id is equal to 0 so setup TP register for it */
+ li a0, 0
+ jal setup_tp
+
/* restore hart_id ( bootcpu_id ) and dtb address */
mv a0, s0
mv a1, s1
@@ -72,6 +77,15 @@ FUNC(reset_stack)
ret
END(reset_stack)
+/* void setup_tp(unsigned int xen_cpuid); */
+FUNC(setup_tp)
+ la t0, pcpu_info
+ li t1, PCPU_INFO_SIZE
+ mul t1, a0, t1
+ add tp, t0, t1
+ ret
+END(setup_tp)
+
.section .text.ident, "ax", %progbits
FUNC(turn_on_mmu)
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 82c5752da1..6e3a787dbe 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -9,6 +9,7 @@
#include <public/version.h>
#include <asm/early_printk.h>
+#include <asm/smp.h>
#include <asm/traps.h>
void arch_get_xen_caps(xen_capabilities_info_t *info)
@@ -41,6 +42,10 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
{
remove_identity_mapping();
+ set_processor_id(0);
+
+ set_cpuid_to_hartid(0, bootcpu_id);
+
trap_init();
#ifdef CONFIG_SELF_TESTS
diff --git a/xen/arch/riscv/smp.c b/xen/arch/riscv/smp.c
new file mode 100644
index 0000000000..4ca6a4e892
--- /dev/null
+++ b/xen/arch/riscv/smp.c
@@ -0,0 +1,15 @@
+#include <xen/smp.h>
+
+/*
+ * FIXME: make pcpu_info[] dynamically allocated once the necessary
+ * functionality is ready
+ */
+/*
+ * tp points to one of these per cpu.
+ *
+ * hart_id would be valid (no matter which value) if its
+ * processor_id field is valid (less than NR_CPUS).
+ */
+struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
+ .processor_id = NR_CPUS,
+}};
--
2.46.1
* [PATCH v8 5/7] xen/riscv: introduce and initialize SBI RFENCE extension
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (3 preceding siblings ...)
2024-09-27 16:33 ` [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 6/7] xen/riscv: page table handling Oleksii Kurochko
` (3 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Introduce functions to work with the SBI RFENCE extension for issuing
various fence operations to remote CPUs.
Add the sbi_init() function along with auxiliary functions and macro
definitions for proper initialization and checking the availability of
SBI extensions. Currently, this is implemented only for RFENCE.
Introduce sbi_remote_sfence_vma() to send SFENCE_VMA instructions to
a set of target HARTs. This will support the implementation of
flush_xen_tlb_range_va().
Integrate __sbi_rfence_v02 from Linux kernel 6.6.0-rc4 with minimal
modifications:
- Adapt to Xen code style.
- Use cpuid_to_hartid() instead of cpuid_to_hartid_map[].
- Update BIT(...) to BIT(..., UL).
- Rename __sbi_rfence_v02_call to sbi_rfence_v02_real and
remove the unused arg5.
- Handle NULL cpu_mask to execute rfence on all CPUs by calling
sbi_rfence_v02_real(..., 0UL, -1UL,...) instead of creating hmask.
- Change the type of start_addr and size to vaddr_t and size_t.
- Add an explanatory comment about when batching can and cannot occur,
and why batching happens in the first place.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/riscv/include/asm/sbi.h | 62 +++++++
xen/arch/riscv/sbi.c | 273 ++++++++++++++++++++++++++++++-
xen/arch/riscv/setup.c | 3 +
3 files changed, 337 insertions(+), 1 deletion(-)
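The hart batching described in the commit message can be exercised in isolation. The sketch below is a user-space model, not the Xen code: demo_flush() is a hypothetical stand-in for sbi_rfence_v02_real() that only records each (hbase, hmask) pair, the input is a plain array of hart IDs instead of a cpumask, and the parts of the loop that follow the range check (shifting the mask when a lower hart ID appears, tracking htop, and the final flush) are assumptions modelled on the Linux __sbi_rfence_v02 the patch says it derives from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG 64
#define DEMO_MAX_BATCHES   16

struct demo_batch { uint64_t hbase; uint64_t hmask; };

static struct demo_batch demo_batches[DEMO_MAX_BATCHES];
static size_t demo_nr_batches;

/* Stand-in for sbi_rfence_v02_real(): record the call instead of doing it. */
static inline void demo_flush(uint64_t hmask, uint64_t hbase)
{
    assert(demo_nr_batches < DEMO_MAX_BATCHES);
    demo_batches[demo_nr_batches].hbase = hbase;
    demo_batches[demo_nr_batches].hmask = hmask;
    demo_nr_batches++;
}

/* Batch hart IDs into (hbase, hmask) pairs; returns the number of batches. */
static inline size_t demo_batch_harts(const uint64_t *hartids, size_t n)
{
    uint64_t hmask = 0, hbase = 0, htop = 0;

    demo_nr_batches = 0;

    for ( size_t i = 0; i < n; i++ )
    {
        uint64_t hartid = hartids[i];

        if ( hmask )
        {
            /* The new hart does not fit in one word with the current batch. */
            if ( hartid + DEMO_BITS_PER_LONG <= htop ||
                 hbase + DEMO_BITS_PER_LONG <= hartid )
            {
                demo_flush(hmask, hbase);
                hmask = 0;
            }
            else if ( hartid < hbase )
            {
                /* Rebase the mask so the lower hart ID becomes bit 0. */
                hmask <<= hbase - hartid;
                hbase = hartid;
            }
        }

        if ( !hmask )
        {
            hbase = hartid;
            htop = hartid;
        }
        else if ( hartid > htop )
            htop = hartid;

        hmask |= UINT64_C(1) << (hartid - hbase);
    }

    if ( hmask )
        demo_flush(hmask, hbase);

    return demo_nr_batches;
}
```

For the hart IDs 0, 1, 3, 65, 66, 69 this model produces two batches: hbase = 0 with hmask = 0b1011, and hbase = 65 with hmask = 0b10011, since harts 65, 66 and 69 sit at bits 0, 1 and 4 relative to hbase.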
diff --git a/xen/arch/riscv/include/asm/sbi.h b/xen/arch/riscv/include/asm/sbi.h
index 4d72a2295e..5947fed779 100644
--- a/xen/arch/riscv/include/asm/sbi.h
+++ b/xen/arch/riscv/include/asm/sbi.h
@@ -12,9 +12,42 @@
#ifndef __ASM_RISCV_SBI_H__
#define __ASM_RISCV_SBI_H__
+#include <xen/cpumask.h>
+
#define SBI_EXT_0_1_CONSOLE_PUTCHAR 0x1
#define SBI_EXT_0_1_SHUTDOWN 0x8
+#define SBI_EXT_BASE 0x10
+#define SBI_EXT_RFENCE 0x52464E43
+
+/* SBI function IDs for BASE extension */
+#define SBI_EXT_BASE_GET_SPEC_VERSION 0x0
+#define SBI_EXT_BASE_GET_IMP_ID 0x1
+#define SBI_EXT_BASE_GET_IMP_VERSION 0x2
+#define SBI_EXT_BASE_PROBE_EXT 0x3
+
+/* SBI function IDs for RFENCE extension */
+#define SBI_EXT_RFENCE_REMOTE_FENCE_I 0x0
+#define SBI_EXT_RFENCE_REMOTE_SFENCE_VMA 0x1
+#define SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID 0x2
+#define SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA 0x3
+#define SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID 0x4
+#define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA 0x5
+#define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID 0x6
+
+#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f000000
+#define SBI_SPEC_VERSION_MINOR_MASK 0x00ffffff
+
+/* SBI return error codes */
+#define SBI_SUCCESS 0
+#define SBI_ERR_FAILURE (-1)
+#define SBI_ERR_NOT_SUPPORTED (-2)
+#define SBI_ERR_INVALID_PARAM (-3)
+#define SBI_ERR_DENIED (-4)
+#define SBI_ERR_INVALID_ADDRESS (-5)
+
+#define SBI_SPEC_VERSION_DEFAULT 0x1
+
struct sbiret {
long error;
long value;
@@ -34,4 +67,33 @@ void sbi_console_putchar(int ch);
void sbi_shutdown(void);
+/*
+ * Check underlying SBI implementation has RFENCE
+ *
+ * @return true for supported AND false for not-supported
+ */
+bool sbi_has_rfence(void);
+
+/*
+ * Instructs the remote harts to execute one or more SFENCE.VMA
+ * instructions, covering the range of virtual addresses between
+ * [start_addr, start_addr + size).
+ *
+ * Returns 0 if IPI was sent to all the targeted harts successfully
+ * or negative value if start_addr or size is not valid.
+ *
+ * @param cpu_mask a cpu mask containing all the target harts.
+ * @param start virtual address start
+ * @param size virtual address range size
+ */
+int sbi_remote_sfence_vma(const cpumask_t *cpu_mask, vaddr_t start,
+ size_t size);
+
+/*
+ * Initialize SBI library
+ *
+ * @return 0 on success, otherwise negative errno on failure
+ */
+int sbi_init(void);
+
#endif /* __ASM_RISCV_SBI_H__ */
diff --git a/xen/arch/riscv/sbi.c b/xen/arch/riscv/sbi.c
index c7984344bc..4209520389 100644
--- a/xen/arch/riscv/sbi.c
+++ b/xen/arch/riscv/sbi.c
@@ -5,13 +5,26 @@
* (anup.patel@wdc.com).
*
* Modified by Bobby Eshleman (bobby.eshleman@gmail.com).
+ * Modified by Oleksii Kurochko (oleksii.kurochko@gmail.com).
*
* Copyright (c) 2019 Western Digital Corporation or its affiliates.
- * Copyright (c) 2021-2023 Vates SAS.
+ * Copyright (c) 2021-2024 Vates SAS.
*/
+#include <xen/compiler.h>
+#include <xen/const.h>
+#include <xen/cpumask.h>
+#include <xen/errno.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/sections.h>
+#include <xen/smp.h>
+
+#include <asm/processor.h>
#include <asm/sbi.h>
+static unsigned long __ro_after_init sbi_spec_version = SBI_SPEC_VERSION_DEFAULT;
+
struct sbiret sbi_ecall(unsigned long ext, unsigned long fid,
unsigned long arg0, unsigned long arg1,
unsigned long arg2, unsigned long arg3,
@@ -38,6 +51,26 @@ struct sbiret sbi_ecall(unsigned long ext, unsigned long fid,
return ret;
}
+static int sbi_err_map_xen_errno(int err)
+{
+ switch ( err )
+ {
+ case SBI_SUCCESS:
+ return 0;
+ case SBI_ERR_DENIED:
+ return -EACCES;
+ case SBI_ERR_INVALID_PARAM:
+ return -EINVAL;
+ case SBI_ERR_INVALID_ADDRESS:
+ return -EFAULT;
+ case SBI_ERR_NOT_SUPPORTED:
+ return -EOPNOTSUPP;
+ case SBI_ERR_FAILURE:
+ default:
+ return -ENOSYS;
+ };
+}
+
void sbi_console_putchar(int ch)
{
sbi_ecall(SBI_EXT_0_1_CONSOLE_PUTCHAR, 0, ch, 0, 0, 0, 0, 0);
@@ -47,3 +80,241 @@ void sbi_shutdown(void)
{
sbi_ecall(SBI_EXT_0_1_SHUTDOWN, 0, 0, 0, 0, 0, 0, 0);
}
+
+static unsigned int sbi_major_version(void)
+{
+ return MASK_EXTR(sbi_spec_version, SBI_SPEC_VERSION_MAJOR_MASK);
+}
+
+static unsigned int sbi_minor_version(void)
+{
+ return MASK_EXTR(sbi_spec_version, SBI_SPEC_VERSION_MINOR_MASK);
+}
+
+static long sbi_ext_base_func(long fid)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_BASE, fid, 0, 0, 0, 0, 0, 0);
+
+ if ( !ret.error )
+ {
+ /*
+ * I wasn't able to find a case in the SBI spec where sbiret.value
+ * could be negative.
+ *
+ * Unfortunately, the spec does not specify the possible values of
+ * sbiret.value, but based on the description of the SBI function,
+ * ret.value >= 0 when sbiret.error = 0. The SBI spec specifies only the
+ * possible values for sbiret.error (<= 0, where 0 is SBI_SUCCESS).
+ *
+ * Just to be sure that SBI base extension functions one day won't
+ * start to return a negative value for sbiret.value when
+ * sbiret.error < 0 BUG_ON() is added.
+ */
+ BUG_ON(ret.value < 0);
+
+ return ret.value;
+ }
+ else
+ return ret.error;
+}
+
+static int sbi_rfence_v02_real(unsigned long fid,
+ unsigned long hmask, unsigned long hbase,
+ vaddr_t start, size_t size,
+ unsigned long arg4)
+{
+ struct sbiret ret = {0};
+ int result = 0;
+
+ switch ( fid )
+ {
+ case SBI_EXT_RFENCE_REMOTE_FENCE_I:
+ ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
+ 0, 0, 0, 0);
+ break;
+
+ case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
+ case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
+ case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
+ ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
+ start, size, 0, 0);
+ break;
+
+ case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
+ case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
+ case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
+ ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
+ start, size, arg4, 0);
+ break;
+
+ default:
+ printk("%s: unknown function ID [%#lx]\n",
+ __func__, fid);
+ result = -EINVAL;
+ break;
+ };
+
+ if ( ret.error )
+ {
+ result = sbi_err_map_xen_errno(ret.error);
+ printk("%s: hbase=%lu hmask=%#lx failed (error %ld)\n",
+ __func__, hbase, hmask, ret.error);
+ }
+
+ return result;
+}
+
+static int cf_check sbi_rfence_v02(unsigned long fid,
+ const cpumask_t *cpu_mask,
+ vaddr_t start, size_t size,
+ unsigned long arg4, unsigned long arg5)
+{
+ unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
+ int result = -EINVAL;
+
+ /*
+ * hart_mask_base can be set to -1 to indicate that hart_mask can be
+ * ignored and all available harts must be considered.
+ */
+ if ( !cpu_mask )
+ return sbi_rfence_v02_real(fid, 0UL, -1UL, start, size, arg4);
+
+ for_each_cpu ( cpuid, cpu_mask )
+ {
+ /*
+ * Hart IDs might not necessarily be numbered contiguously in
+ * a multiprocessor system.
+ *
+ * This means that it is possible for the hart ID mapping to look like:
+ * 0, 1, 3, 65, 66, 69
+ * In such cases, more than one call to sbi_rfence_v02_real() will be
+ * needed, as a single hmask can only cover BITS_PER_LONG harts:
+ * 1. sbi_rfence_v02_real(hmask=0b1011, hbase=0)
+ * 2. sbi_rfence_v02_real(hmask=0b10011, hbase=65)
+ *
+ * The algorithm below tries to batch as many harts as possible before
+ * making an SBI call. However, batching may not always be possible.
+ * For example, consider the hart ID mapping:
+ * 0, 64, 1, 65, 2, 66 (1)
+ *
+ * Generally, batching is also possible for (1):
+ * First (0,1,2), then (64,65,66).
+ * It just requires a different approach and updates to the current
+ * algorithm.
+ */
+ hartid = cpuid_to_hartid(cpuid);
+ if ( hmask )
+ {
+ if ( hartid + BITS_PER_LONG <= htop ||
+ hbase + BITS_PER_LONG <= hartid )
+ {
+ result = sbi_rfence_v02_real(fid, hmask, hbase,
+ start, size, arg4);
+ hmask = 0;
+ if ( result )
+ break;
+ }
+ else if ( hartid < hbase )
+ {
+ /* shift the mask to fit lower hartid */
+ hmask <<= hbase - hartid;
+ hbase = hartid;
+ }
+ }
+
+ if ( !hmask )
+ {
+ hbase = hartid;
+ htop = hartid;
+ }
+ else if ( hartid > htop )
+ htop = hartid;
+
+ hmask |= BIT(hartid - hbase, UL);
+ }
+
+ if ( hmask )
+ result = sbi_rfence_v02_real(fid, hmask, hbase,
+ start, size, arg4);
+
+ return result;
+}
+
+static int (* __ro_after_init sbi_rfence)(unsigned long fid,
+ const cpumask_t *cpu_mask,
+ vaddr_t start,
+ size_t size,
+ unsigned long arg4,
+ unsigned long arg5);
+
+int sbi_remote_sfence_vma(const cpumask_t *cpu_mask, vaddr_t start,
+ size_t size)
+{
+ ASSERT(sbi_rfence);
+
+ return sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
+ cpu_mask, start, size, 0, 0);
+}
+
+/* This function must always succeed. */
+#define sbi_get_spec_version() \
+ sbi_ext_base_func(SBI_EXT_BASE_GET_SPEC_VERSION)
+
+#define sbi_get_firmware_id() \
+ sbi_ext_base_func(SBI_EXT_BASE_GET_IMP_ID)
+
+#define sbi_get_firmware_version() \
+ sbi_ext_base_func(SBI_EXT_BASE_GET_IMP_VERSION)
+
+int sbi_probe_extension(long extid)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, extid,
+ 0, 0, 0, 0, 0);
+ if ( !ret.error && ret.value )
+ return ret.value;
+
+ return sbi_err_map_xen_errno(ret.error);
+}
+
+static bool sbi_spec_is_0_1(void)
+{
+ return (sbi_spec_version == SBI_SPEC_VERSION_DEFAULT);
+}
+
+bool sbi_has_rfence(void)
+{
+ return (sbi_rfence != NULL);
+}
+
+int __init sbi_init(void)
+{
+ sbi_spec_version = sbi_get_spec_version();
+
+ printk("SBI specification v%u.%u detected\n",
+ sbi_major_version(), sbi_minor_version());
+
+ if ( !sbi_spec_is_0_1() )
+ {
+ long sbi_fw_id = sbi_get_firmware_id();
+ long sbi_fw_version = sbi_get_firmware_version();
+
+ BUG_ON((sbi_fw_id < 0) || (sbi_fw_version < 0));
+
+ printk("SBI implementation ID=%#lx Version=%#lx\n",
+ sbi_fw_id, sbi_fw_version);
+
+ if ( sbi_probe_extension(SBI_EXT_RFENCE) > 0 )
+ {
+ sbi_rfence = sbi_rfence_v02;
+ printk("SBI v0.2 RFENCE extension detected\n");
+ }
+ }
+ else
+ panic("Ooops. SBI spec version 0.1 detected. Need to add support");
+
+ return 0;
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 6e3a787dbe..c4fadd36c6 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -9,6 +9,7 @@
#include <public/version.h>
#include <asm/early_printk.h>
+#include <asm/sbi.h>
#include <asm/smp.h>
#include <asm/traps.h>
@@ -48,6 +49,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
trap_init();
+ sbi_init();
+
#ifdef CONFIG_SELF_TESTS
test_macros_from_bug_h();
#endif
--
2.46.1
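
The hart batching loop in sbi_rfence_v02() is easier to follow in isolation.
Below is a minimal standalone sketch of the same windowing logic, not the
patch code itself: struct rfence_call and batch_harts() are hypothetical
names introduced here, and the SBI call is replaced by recording the
(hmask, hbase) pair that would have been passed to sbi_rfence_v02_real().

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

/* Hypothetical record of one SBI call: a hart mask relative to a base ID. */
struct rfence_call {
    unsigned long hmask;
    unsigned long hbase;
};

/*
 * Batch hart IDs into (hmask, hbase) pairs, following the loop in
 * sbi_rfence_v02(). Returns the number of calls written to @out.
 */
static size_t batch_harts(const unsigned long *hartids, size_t n,
                          struct rfence_call *out, size_t max_calls)
{
    unsigned long hmask = 0, hbase = 0, htop = 0;
    size_t ncalls = 0;

    for ( size_t i = 0; i < n; i++ )
    {
        unsigned long hartid = hartids[i];

        if ( hmask )
        {
            if ( hartid + BITS_PER_LONG <= htop ||
                 hbase + BITS_PER_LONG <= hartid )
            {
                /* The hart does not fit the current window: emit the batch. */
                assert(ncalls < max_calls);
                out[ncalls++] = (struct rfence_call){ hmask, hbase };
                hmask = 0;
            }
            else if ( hartid < hbase )
            {
                /* Shift the mask so that bit 0 matches the lower hart ID. */
                hmask <<= hbase - hartid;
                hbase = hartid;
            }
        }

        if ( !hmask )
        {
            /* Start a new window at this hart. */
            hbase = hartid;
            htop = hartid;
        }
        else if ( hartid > htop )
            htop = hartid;

        hmask |= 1UL << (hartid - hbase);
    }

    /* Emit whatever is left in the last window. */
    if ( hmask )
    {
        assert(ncalls < max_calls);
        out[ncalls++] = (struct rfence_call){ hmask, hbase };
    }

    return ncalls;
}
```

For the hart IDs 0, 1, 3, 65, 66, 69 this yields two batches, one based at
hart 0 and one based at hart 65. An interleaved order such as
0, 64, 1, 65, 2, 66 produces more batches than strictly necessary, which is
the limitation the code comment already mentions.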
* [PATCH v8 6/7] xen/riscv: page table handling
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (4 preceding siblings ...)
2024-09-27 16:33 ` [PATCH v8 5/7] xen/riscv: introduce and initialize SBI RFENCE extension Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-30 8:35 ` oleksii.kurochko
2024-09-30 15:30 ` Jan Beulich
2024-09-27 16:33 ` [PATCH v8 7/7] xen/riscv: introduce early_fdt_map() Oleksii Kurochko
` (2 subsequent siblings)
8 siblings, 2 replies; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Implement map_pages_to_xen() which requires several
functions to manage page tables and entries:
- pt_update()
- pt_mapping_level()
- pt_update_entry()
- pt_next_level()
- pt_check_entry()
To support these operations, add functions for creating,
mapping, and unmapping Xen tables:
- create_table()
- map_table()
- unmap_table()
Introduce PTE_SMALL to indicate that a 4KB mapping is needed, and
PTE_POPULATE to indicate that intermediate page tables should be
populated without installing a mapping.
In addition introduce flush_tlb_range_va() for TLB flushing across
CPUs after updating the PTE for the requested mapping.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
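Not part of the commit message: a minimal standalone sketch of how the
mapping level is chosen, for readers who want to check the alignment rule
without the Xen headers. It assumes Sv39 geometry (root level 2, 9 bits per
level, so a level-N entry covers 2^(9*N) pages); mapping_level() and its
has_mfn/force_small parameters are hypothetical stand-ins for
pt_mapping_level(), INVALID_MFN and PTE_SMALL.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed Sv39 geometry: 4KB pages, 9 VPN bits per level, root level 2. */
#define PAGETABLE_ORDER  9
#define ROOT_LEVEL       2
#define LEVEL_ORDER(lvl) ((lvl) * PAGETABLE_ORDER)

/*
 * Pick the largest level at which @vfn and @mfn are both aligned and at
 * least one full entry's worth of pages (@nr) remains. @has_mfn is false
 * when no frame is involved (removal, permission change, population), in
 * which case only the virtual frame number constrains the level.
 */
static unsigned int mapping_level(unsigned long vfn, unsigned long mfn,
                                  bool has_mfn, unsigned long nr,
                                  bool force_small)
{
    unsigned long mask = (has_mfn ? mfn : 0) | vfn;

    assert(nr > 0);

    /* The caller explicitly asked for 4KB mappings. */
    if ( force_small )
        return 0;

    for ( unsigned int i = ROOT_LEVEL; i != 0; i-- )
    {
        unsigned long pages = 1UL << LEVEL_ORDER(i);

        if ( !(mask & (pages - 1)) && nr >= pages )
            return i;
    }

    return 0;
}
```

With these assumptions, a request for 2MB + 4KB starting at a 2MB-aligned
address is split into one level-1 entry followed by one level-0 entry: the
first call sees nr = 513 and returns 1, the second sees nr = 1 and returns 0.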
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/include/asm/flushtlb.h | 9 +
xen/arch/riscv/include/asm/mm.h | 2 +
xen/arch/riscv/include/asm/page.h | 80 ++++
xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
xen/arch/riscv/mm.c | 9 -
xen/arch/riscv/pt.c | 421 ++++++++++++++++++++
7 files changed, 515 insertions(+), 9 deletions(-)
create mode 100644 xen/arch/riscv/pt.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 6832549133..a5eb2aed4b 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -1,6 +1,7 @@
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-y += entry.o
obj-y += mm.o
+obj-y += pt.o
obj-$(CONFIG_RISCV_64) += riscv64/
obj-y += sbi.o
obj-y += setup.o
diff --git a/xen/arch/riscv/include/asm/flushtlb.h b/xen/arch/riscv/include/asm/flushtlb.h
index f4a735fd6c..43214f5e95 100644
--- a/xen/arch/riscv/include/asm/flushtlb.h
+++ b/xen/arch/riscv/include/asm/flushtlb.h
@@ -5,12 +5,21 @@
#include <xen/bug.h>
#include <xen/cpumask.h>
+#include <asm/sbi.h>
+
/* Flush TLB of local processor for address va. */
static inline void flush_tlb_one_local(vaddr_t va)
{
asm volatile ( "sfence.vma %0" :: "r" (va) : "memory" );
}
+/* Flush a range of VA's hypervisor mappings from the TLB of all processors. */
+static inline void flush_tlb_range_va(vaddr_t va, size_t size)
+{
+ BUG_ON(!sbi_has_rfence());
+ sbi_remote_sfence_vma(NULL, va, size);
+}
+
/*
* Filter the given set of CPUs, removing those that definitely flushed their
* TLB since @page_timestamp.
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index a0bdc2bc3a..ce1557bb27 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -42,6 +42,8 @@ static inline void *maddr_to_virt(paddr_t ma)
#define virt_to_mfn(va) __virt_to_mfn(va)
#define mfn_to_virt(mfn) __mfn_to_virt(mfn)
+#define mfn_from_pte(pte) maddr_to_mfn(pte_to_paddr(pte))
+
struct page_info
{
/* Each frame can be threaded onto a doubly-linked list. */
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index eb79cb9409..89fa290697 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -21,6 +21,11 @@
#define XEN_PT_LEVEL_MAP_MASK(lvl) (~(XEN_PT_LEVEL_SIZE(lvl) - 1))
#define XEN_PT_LEVEL_MASK(lvl) (VPN_MASK << XEN_PT_LEVEL_SHIFT(lvl))
+/*
+ * PTE format:
+ * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
+ * PFN reserved for SW D A G U X W R V
+ */
#define PTE_VALID BIT(0, UL)
#define PTE_READABLE BIT(1, UL)
#define PTE_WRITABLE BIT(2, UL)
@@ -34,15 +39,49 @@
#define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
#define PTE_TABLE (PTE_VALID)
+#define PAGE_HYPERVISOR_RO (PTE_VALID | PTE_READABLE)
#define PAGE_HYPERVISOR_RW (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
+#define PAGE_HYPERVISOR_RX (PTE_VALID | PTE_READABLE | PTE_EXECUTABLE)
#define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW
+/*
+ * The PTE format does not contain the following bits within itself;
+ * they are created artificially to inform the Xen page table
+ * handling algorithm. These bits should not be explicitly written
+ * to the PTE entry.
+ */
+#define PTE_SMALL BIT(10, UL)
+#define PTE_POPULATE BIT(11, UL)
+
+#define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE | PTE_EXECUTABLE)
+
/* Calculate the offsets into the pagetables for a given VA */
#define pt_linear_offset(lvl, va) ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
#define pt_index(lvl, va) (pt_linear_offset((lvl), (va)) & VPN_MASK)
+#define PAGETABLE_ORDER_MASK ((_AC(1, U) << PAGETABLE_ORDER) - 1)
+#define TABLE_OFFSET(offs) (_AT(unsigned int, offs) & PAGETABLE_ORDER_MASK)
+
+#if RV_STAGE1_MODE > SATP_MODE_SV39
+#error "need to update DECLARE_OFFSETS macros"
+#else
+
+#define l0_table_offset(va) TABLE_OFFSET(pt_linear_offset(0, va))
+#define l1_table_offset(va) TABLE_OFFSET(pt_linear_offset(1, va))
+#define l2_table_offset(va) TABLE_OFFSET(pt_linear_offset(2, va))
+
+/* Generate an array @var containing the offset for each level from @addr */
+#define DECLARE_OFFSETS(var, addr) \
+ const unsigned int var[] = { \
+ l0_table_offset(addr), \
+ l1_table_offset(addr), \
+ l2_table_offset(addr), \
+ }
+
+#endif
+
/* Page Table entry */
typedef struct {
#ifdef CONFIG_RISCV_64
@@ -68,6 +107,47 @@ static inline bool pte_is_valid(pte_t p)
return p.pte & PTE_VALID;
}
+/*
+ * From the RISC-V spec:
+ * The V bit indicates whether the PTE is valid; if it is 0, all other bits
+ * in the PTE are don’t-cares and may be used freely by software.
+ *
+ * If V=1 the encoding of PTE R/W/X bits can be found in "the encoding
+ * of the permission bits" table.
+ *
+ * The encoding of the permission bits table:
+ * X W R Meaning
+ * 0 0 0 Pointer to next level of page table.
+ * 0 0 1 Read-only page.
+ * 0 1 0 Reserved for future use.
+ * 0 1 1 Read-write page.
+ * 1 0 0 Execute-only page.
+ * 1 0 1 Read-execute page.
+ * 1 1 0 Reserved for future use.
+ * 1 1 1 Read-write-execute page.
+ */
+static inline bool pte_is_table(pte_t p)
+{
+ /*
+ * According to the spec if V=1 and W=1 then R also needs to be 1 as
+ * R = 0 is reserved for future use ( look at the Table 4.5 ) so check
+ * in ASSERT that if (V==1 && W==1) then R isn't 0.
+ *
+ * PAGE_HYPERVISOR_RW contains PTE_VALID too.
+ */
+ ASSERT(((p.pte & PAGE_HYPERVISOR_RW) != (PTE_VALID | PTE_WRITABLE)));
+
+ return ((p.pte & (PTE_VALID | PTE_ACCESS_MASK)) == PTE_VALID);
+}
+
+static inline bool pte_is_mapping(pte_t p)
+{
+ /* See pte_is_table() */
+ ASSERT(((p.pte & PAGE_HYPERVISOR_RW) != (PTE_VALID | PTE_WRITABLE)));
+
+ return (p.pte & PTE_VALID) && (p.pte & PTE_ACCESS_MASK);
+}
+
static inline void invalidate_icache(void)
{
BUG_ON("unimplemented");
diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
index 58abe5eccc..e31e94e77e 100644
--- a/xen/arch/riscv/include/asm/riscv_encoding.h
+++ b/xen/arch/riscv/include/asm/riscv_encoding.h
@@ -164,6 +164,7 @@
#define SSTATUS_SD SSTATUS64_SD
#define SATP_MODE SATP64_MODE
#define SATP_MODE_SHIFT SATP64_MODE_SHIFT
+#define SATP_PPN_MASK SATP64_PPN
#define HGATP_PPN HGATP64_PPN
#define HGATP_VMID_SHIFT HGATP64_VMID_SHIFT
@@ -174,6 +175,7 @@
#define SSTATUS_SD SSTATUS32_SD
#define SATP_MODE SATP32_MODE
#define SATP_MODE_SHIFT SATP32_MODE_SHIFT
+#define SATP_PPN_MASK SATP32_PPN
#define HGATP_PPN HGATP32_PPN
#define HGATP_VMID_SHIFT HGATP32_VMID_SHIFT
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index b8ff91cf4e..e8430def14 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -369,12 +369,3 @@ int destroy_xen_mappings(unsigned long s, unsigned long e)
BUG_ON("unimplemented");
return -1;
}
-
-int map_pages_to_xen(unsigned long virt,
- mfn_t mfn,
- unsigned long nr_mfns,
- unsigned int flags)
-{
- BUG_ON("unimplemented");
- return -1;
-}
diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c
new file mode 100644
index 0000000000..a5552a4871
--- /dev/null
+++ b/xen/arch/riscv/pt.c
@@ -0,0 +1,421 @@
+#include <xen/bug.h>
+#include <xen/domain_page.h>
+#include <xen/errno.h>
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/pfn.h>
+#include <xen/pmap.h>
+#include <xen/spinlock.h>
+
+#include <asm/flushtlb.h>
+#include <asm/page.h>
+
+static inline mfn_t get_root_page(void)
+{
+ paddr_t root_maddr = pfn_to_paddr(csr_read(CSR_SATP) & SATP_PPN_MASK);
+
+ return maddr_to_mfn(root_maddr);
+}
+
+/*
+ * Sanity check a page table entry about to be updated as per an (MFN,flags)
+ * tuple.
+ * See the comment about the possible combination of (mfn, flags) in
+ * the comment above pt_update().
+ */
+static bool pt_check_entry(pte_t entry, mfn_t mfn, unsigned int flags)
+{
+ /* Sanity check when modifying an entry. */
+ if ( (flags & PTE_VALID) && mfn_eq(mfn, INVALID_MFN) )
+ {
+ /* We don't allow modifying an invalid entry. */
+ if ( !pte_is_valid(entry) )
+ {
+ dprintk(XENLOG_ERR, "Modifying invalid entry is not allowed\n");
+ return false;
+ }
+
+ /* We don't allow modifying a table entry */
+ if ( pte_is_table(entry) )
+ {
+ dprintk(XENLOG_ERR, "Modifying a table entry is not allowed\n");
+ return false;
+ }
+ }
+ /* Sanity check when inserting a mapping */
+ else if ( flags & PTE_VALID )
+ {
+ /*
+ * We don't allow replacing any valid entry.
+ *
+ * Note that the function pt_update() relies on this
+ * assumption and will skip the TLB flush (when Svvptc
+ * extension will be ratified). The function will need
+ * to be updated if the check is relaxed.
+ */
+ if ( pte_is_valid(entry) )
+ {
+ if ( pte_is_mapping(entry) )
+ dprintk(XENLOG_ERR, "Changing MFN for valid PTE is not allowed (%#"PRI_mfn" -> %#"PRI_mfn")\n",
+ mfn_x(mfn_from_pte(entry)), mfn_x(mfn));
+ else
+ dprintk(XENLOG_ERR, "Trying to replace table with mapping\n");
+ return false;
+ }
+ }
+ /* Sanity check when removing a mapping. */
+ else if ( !(flags & PTE_POPULATE) )
+ {
+ /* We should be here with an invalid MFN. */
+ ASSERT(mfn_eq(mfn, INVALID_MFN));
+
+ /* We don't allow removing a table */
+ if ( pte_is_table(entry) )
+ {
+ dprintk(XENLOG_ERR, "Removing a table is not allowed\n");
+ return false;
+ }
+ }
+ /* Sanity check when populating the page-table. No check so far. */
+ else
+ {
+ /* We should be here with an invalid MFN */
+ ASSERT(mfn_eq(mfn, INVALID_MFN));
+ }
+
+ return true;
+}
+
+static pte_t *map_table(mfn_t mfn)
+{
+ /*
+ * During early boot, map_domain_page() may be unusable. Use the
+ * PMAP to temporarily map a page table.
+ */
+ if ( system_state == SYS_STATE_early_boot )
+ return pmap_map(mfn);
+
+ return map_domain_page(mfn);
+}
+
+static void unmap_table(const pte_t *table)
+{
+ /*
+ * During early boot, map_table() will not use map_domain_page()
+ * but the PMAP.
+ */
+ if ( system_state == SYS_STATE_early_boot )
+ pmap_unmap(table);
+ else
+ unmap_domain_page(table);
+}
+
+static int create_table(pte_t *entry)
+{
+ mfn_t mfn;
+ void *p;
+ pte_t pte;
+
+ if ( system_state != SYS_STATE_early_boot )
+ {
+ struct page_info *pg = alloc_domheap_page(NULL, 0);
+
+ if ( pg == NULL )
+ return -ENOMEM;
+
+ mfn = page_to_mfn(pg);
+ }
+ else
+ mfn = alloc_boot_pages(1, 1);
+
+ p = map_table(mfn);
+ clear_page(p);
+ unmap_table(p);
+
+ pte = pte_from_mfn(mfn, PTE_TABLE);
+ write_pte(entry, pte);
+
+ return 0;
+}
+
+#define XEN_TABLE_MAP_NONE 0
+#define XEN_TABLE_MAP_NOMEM 1
+#define XEN_TABLE_SUPER_PAGE 2
+#define XEN_TABLE_NORMAL 3
+
+/*
+ * Take the currently mapped table, find the corresponding entry,
+ * and map the next table, if available.
+ *
+ * The alloc_tbl parameter indicates whether intermediate tables should
+ * be allocated when not present.
+ *
+ * Return values:
+ * XEN_TABLE_MAP_NONE: alloc_tbl was false and the entry was empty.
+ * XEN_TABLE_MAP_NOMEM: allocating a new page failed.
+ * XEN_TABLE_NORMAL: next level or leaf mapped normally
+ * XEN_TABLE_SUPER_PAGE: The next entry points to a superpage.
+ */
+static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset)
+{
+ pte_t *entry;
+ mfn_t mfn;
+
+ entry = *table + offset;
+
+ if ( !pte_is_valid(*entry) )
+ {
+ if ( !alloc_tbl )
+ return XEN_TABLE_MAP_NONE;
+
+ if ( create_table(entry) )
+ return XEN_TABLE_MAP_NOMEM;
+ }
+
+ if ( pte_is_mapping(*entry) )
+ return XEN_TABLE_SUPER_PAGE;
+
+ mfn = mfn_from_pte(*entry);
+
+ unmap_table(*table);
+ *table = map_table(mfn);
+
+ return XEN_TABLE_NORMAL;
+}
+
+/* Update an entry at the level @target. */
+static int pt_update_entry(mfn_t root, vaddr_t virt,
+ mfn_t mfn, unsigned int target,
+ unsigned int flags)
+{
+ int rc;
+ unsigned int level = HYP_PT_ROOT_LEVEL;
+ pte_t *table;
+ /*
+ * The intermediate page table shouldn't be allocated when MFN isn't
+ * valid and we are not populating page table.
+ * This means we are either modifying permissions or removing an entry;
+ * inserting a brand new entry always supplies a valid MFN.
+ *
+ * See the comment above pt_update() for an additional explanation about
+ * combinations of (mfn, flags).
+ */
+ bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags & PTE_POPULATE);
+ pte_t pte, *entry;
+
+ /* convenience aliases */
+ DECLARE_OFFSETS(offsets, virt);
+
+ table = map_table(root);
+ for ( ; level > target; level-- )
+ {
+ rc = pt_next_level(alloc_tbl, &table, offsets[level]);
+ if ( rc == XEN_TABLE_MAP_NOMEM )
+ {
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ if ( rc == XEN_TABLE_MAP_NONE )
+ {
+ rc = 0;
+ goto out;
+ }
+
+ if ( rc != XEN_TABLE_NORMAL )
+ break;
+ }
+
+ if ( level != target )
+ {
+ dprintk(XENLOG_ERR,
+ "%s: Shattering superpage is not supported\n", __func__);
+ rc = -EOPNOTSUPP;
+ goto out;
+ }
+
+ entry = table + offsets[level];
+
+ rc = -EINVAL;
+ if ( !pt_check_entry(*entry, mfn, flags) )
+ goto out;
+
+ /* We are removing the page */
+ if ( !(flags & PTE_VALID) )
+ /*
+ * There is also a check in pt_check_entry() which checks that
+ * mfn=INVALID_MFN
+ */
+ pte.pte = 0;
+ else
+ {
+ /* We are inserting a mapping => Create new pte. */
+ if ( !mfn_eq(mfn, INVALID_MFN) )
+ pte = pte_from_mfn(mfn, PTE_VALID);
+ else /* We are updating the permission => Copy the current pte. */
+ {
+ pte = *entry;
+ pte.pte &= ~PTE_ACCESS_MASK;
+ }
+
+ /* update permission according to the flags */
+ pte.pte |= (flags & PTE_ACCESS_MASK) | PTE_ACCESSED | PTE_DIRTY;
+ }
+
+ write_pte(entry, pte);
+
+ rc = 0;
+
+ out:
+ unmap_table(table);
+
+ return rc;
+}
+
+/* Return the level where mapping should be done */
+static int pt_mapping_level(unsigned long vfn, mfn_t mfn, unsigned long nr,
+ unsigned int flags)
+{
+ unsigned int level = 0;
+ unsigned long mask;
+ unsigned int i;
+
+ /*
+ * Use a larger mapping than 4K unless the caller specifically requests
+ * 4K mapping
+ */
+ if ( unlikely(flags & PTE_SMALL) )
+ return level;
+
+ /*
+ * Don't take into account the MFN when removing mapping (i.e.
+ * INVALID_MFN) to calculate the correct target order.
+ *
+ * `vfn` and `mfn` must be both superpage aligned.
+ * They are or-ed together and then checked against the size of
+ * each level.
+ *
+ * `left` ( variable declared in pt_update() ) is not included
+ * and checked separately to allow superpage mapping even if it
+ * is not properly aligned (the user may have asked to map 2MB + 4k).
+ */
+ mask = !mfn_eq(mfn, INVALID_MFN) ? mfn_x(mfn) : 0;
+ mask |= vfn;
+
+ for ( i = HYP_PT_ROOT_LEVEL; i != 0; i-- )
+ {
+ if ( !(mask & (BIT(XEN_PT_LEVEL_ORDER(i), UL) - 1)) &&
+ (nr >= BIT(XEN_PT_LEVEL_ORDER(i), UL)) )
+ {
+ level = i;
+ break;
+ }
+ }
+
+ return level;
+}
+
+static DEFINE_SPINLOCK(pt_lock);
+
+/*
+ * If `mfn` equals `INVALID_MFN`, it indicates that the following page table
+ * update operation might be related to either:
+ * - populating the table (PTE_POPULATE will be set additionally),
+ * - destroying a mapping (PTE_VALID=0),
+ * - modifying an existing mapping (PTE_VALID=1).
+ *
+ * If `mfn` != INVALID_MFN and flags has PTE_VALID bit set then it means that
+ * inserting will be done.
+ */
+static int pt_update(vaddr_t virt, mfn_t mfn,
+ unsigned long nr_mfns, unsigned int flags)
+{
+ int rc = 0;
+ unsigned long vfn = PFN_DOWN(virt);
+ unsigned long left = nr_mfns;
+ const mfn_t root = get_root_page();
+
+ /*
+ * It is a bad idea to have a mapping that is both writeable and
+ * executable.
+ * When modifying/creating a mapping (i.e. PTE_VALID is set),
+ * prevent any update if this happens.
+ */
+ if ( (flags & PTE_VALID) && (flags & PTE_WRITABLE) &&
+ (flags & PTE_EXECUTABLE) )
+ {
+ dprintk(XENLOG_ERR,
+ "Mappings should not be both Writeable and Executable\n");
+ return -EINVAL;
+ }
+
+ if ( !IS_ALIGNED(virt, PAGE_SIZE) )
+ {
+ dprintk(XENLOG_ERR,
+ "The virtual address is not aligned to the page-size\n");
+ return -EINVAL;
+ }
+
+ spin_lock(&pt_lock);
+
+ while ( left )
+ {
+ unsigned int order, level;
+
+ level = pt_mapping_level(vfn, mfn, left, flags);
+ order = XEN_PT_LEVEL_ORDER(level);
+
+ ASSERT(left >= BIT(order, UL));
+
+ rc = pt_update_entry(root, vfn << PAGE_SHIFT, mfn, level, flags);
+ if ( rc )
+ break;
+
+ vfn += 1UL << order;
+ if ( !mfn_eq(mfn, INVALID_MFN) )
+ mfn = mfn_add(mfn, 1UL << order);
+
+ left -= (1UL << order);
+ }
+
+ /* Ensure that PTEs are all updated before flushing */
+ RISCV_FENCE(rw, rw);
+
+ spin_unlock(&pt_lock);
+
+ /*
+ * Always flush TLB at the end of the function as non-present entries
+ * can be put in the TLB.
+ *
+ * The remote fence operation applies to the entire address space if
+ * either:
+ * - start and size are both 0, or
+ * - size is equal to 2^XLEN-1.
+ *
+ * TODO: come up with something which avoids flushing the entire
+ * address space.
+ */
+ flush_tlb_range_va(0, 0);
+
+ return rc;
+}
+
+int map_pages_to_xen(unsigned long virt,
+ mfn_t mfn,
+ unsigned long nr_mfns,
+ unsigned int flags)
+{
+ /*
+ * Ensure that flags has PTE_VALID bit as map_pages_to_xen() is supposed
+ * to create a mapping.
+ *
+ * Ensure that we have a valid MFN before proceeding.
+ *
+ * If the MFN is invalid, pt_update() might misinterpret the operation,
+ * treating it as either a population, a mapping destruction,
+ * or a mapping modification.
+ */
+ ASSERT(!mfn_eq(mfn, INVALID_MFN) && (flags & PTE_VALID));
+
+ return pt_update(virt, mfn, nr_mfns, flags);
+}
--
2.46.1
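
The comments above pt_update() and in pt_check_entry() describe four
operations that are distinguished only by the (mfn, flags) pair. The
following standalone sketch restates that decision in one place. The enum
and both helper names are hypothetical and exist only for illustration;
mfn_valid stands in for !mfn_eq(mfn, INVALID_MFN), and the two flag values
are copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PTE_VALID    (1UL << 0)
#define PTE_POPULATE (1UL << 11)   /* software-only flag, as in the patch */

/* Hypothetical names for the four cases handled by pt_check_entry(). */
enum pt_op { PT_OP_INSERT, PT_OP_MODIFY, PT_OP_REMOVE, PT_OP_POPULATE };

static enum pt_op classify_update(bool mfn_valid, unsigned long flags)
{
    /* PTE_VALID set: a new mapping if an MFN is given, else a permission change. */
    if ( flags & PTE_VALID )
        return mfn_valid ? PT_OP_INSERT : PT_OP_MODIFY;

    /* Without PTE_VALID the patch expects the MFN to be INVALID_MFN. */
    assert(!mfn_valid);

    return (flags & PTE_POPULATE) ? PT_OP_POPULATE : PT_OP_REMOVE;
}

/* Intermediate tables are only allocated for insertion and population. */
static bool needs_table_alloc(bool mfn_valid, unsigned long flags)
{
    return mfn_valid || (flags & PTE_POPULATE);
}
```

needs_table_alloc() corresponds to the alloc_tbl computation in
pt_update_entry(): removal and permission changes walk the existing tables
but never create new ones.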
* [PATCH v8 7/7] xen/riscv: introduce early_fdt_map()
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (5 preceding siblings ...)
2024-09-27 16:33 ` [PATCH v8 6/7] xen/riscv: page table handling Oleksii Kurochko
@ 2024-09-27 16:33 ` Oleksii Kurochko
2024-09-30 8:36 ` oleksii.kurochko
2024-09-30 7:27 ` [PATCH v8 0/7] device tree mapping Jan Beulich
2024-09-30 8:17 ` Jan Beulich
8 siblings, 1 reply; 20+ messages in thread
From: Oleksii Kurochko @ 2024-09-27 16:33 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Introduce a function that maps the FDT into Xen.
Also, initialize device_tree_flattened using
early_fdt_map().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
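Not part of the commit message: a small standalone sketch of the address
arithmetic early_fdt_map() relies on, i.e. which 2MB-aligned base gets
mapped, where the blob sits inside it, and when a second superpage is
needed. struct fdt_window and fdt_window() are hypothetical names, and the
2MB size limit asserted here is an assumption made so that two superpages
always suffice; the patch itself checks against BOOT_FDT_VIRT_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define MB(x)          ((uint64_t)(x) << 20)
#define SUPERPAGE_SIZE MB(2)

/* Hypothetical summary of what needs to be mapped for a given blob. */
struct fdt_window {
    uint64_t base_paddr;        /* 2MB-aligned physical base to map */
    uint64_t offset;            /* offset of the FDT in the first superpage */
    unsigned int nr_superpages; /* 1 or 2 */
};

static struct fdt_window fdt_window(uint64_t fdt_paddr, uint32_t size)
{
    struct fdt_window w;

    /* Assumption for this sketch only: the blob is at most 2MB. */
    assert(size <= SUPERPAGE_SIZE);

    w.base_paddr = fdt_paddr & ~(SUPERPAGE_SIZE - 1);
    w.offset = fdt_paddr % SUPERPAGE_SIZE;

    /* A blob that crosses the first 2MB boundary needs the next superpage. */
    w.nr_superpages = (w.offset + size > SUPERPAGE_SIZE) ? 2 : 1;

    return w;
}
```

The virtual address handed back to the caller is then the fixed virtual
base plus w.offset, which is why the first superpage must be mapped before
the header fields (magic, total size) can be read.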
xen/arch/riscv/include/asm/mm.h | 2 ++
xen/arch/riscv/mm.c | 55 +++++++++++++++++++++++++++++++++
xen/arch/riscv/setup.c | 7 +++++
3 files changed, 64 insertions(+)
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index ce1557bb27..4b7b00b850 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -259,4 +259,6 @@ static inline unsigned int arch_get_dma_bitsize(void)
void setup_fixmap_mappings(void);
+void *early_fdt_map(paddr_t fdt_paddr);
+
#endif /* _ASM_RISCV_MM_H */
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index e8430def14..4a628aef83 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -1,13 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/bootfdt.h>
#include <xen/bug.h>
#include <xen/compiler.h>
#include <xen/init.h>
#include <xen/kernel.h>
+#include <xen/libfdt/libfdt.h>
#include <xen/macros.h>
#include <xen/mm.h>
#include <xen/pfn.h>
#include <xen/sections.h>
+#include <xen/sizes.h>
#include <asm/early_printk.h>
#include <asm/csr.h>
@@ -369,3 +372,55 @@ int destroy_xen_mappings(unsigned long s, unsigned long e)
BUG_ON("unimplemented");
return -1;
}
+
+void * __init early_fdt_map(paddr_t fdt_paddr)
+{
+ /* We are using 2MB superpage for mapping the FDT */
+ paddr_t base_paddr = fdt_paddr & XEN_PT_LEVEL_MAP_MASK(1);
+ paddr_t offset;
+ void *fdt_virt;
+ uint32_t size;
+ int rc;
+
+ /*
+ * Check whether the physical FDT address is set and meets the minimum
+ * alignment requirement. Since we are relying on MIN_FDT_ALIGN to be at
+ * least 8 bytes so that we always access the magic and size fields
+ * of the FDT header after mapping the first chunk, double check if
+ * that is indeed the case.
+ */
+ BUILD_BUG_ON(MIN_FDT_ALIGN < 8);
+ if ( !fdt_paddr || fdt_paddr % MIN_FDT_ALIGN )
+ return NULL;
+
+ /* The FDT is mapped using 2MB superpage */
+ BUILD_BUG_ON(BOOT_FDT_VIRT_START % MB(2));
+
+ rc = map_pages_to_xen(BOOT_FDT_VIRT_START, maddr_to_mfn(base_paddr),
+ MB(2) >> PAGE_SHIFT,
+ PAGE_HYPERVISOR_RO);
+ if ( rc )
+ panic("Unable to map the device-tree.\n");
+
+ offset = fdt_paddr % XEN_PT_LEVEL_SIZE(1);
+ fdt_virt = (void *)BOOT_FDT_VIRT_START + offset;
+
+ if ( fdt_magic(fdt_virt) != FDT_MAGIC )
+ return NULL;
+
+ size = fdt_totalsize(fdt_virt);
+ if ( size > BOOT_FDT_VIRT_SIZE )
+ return NULL;
+
+ if ( (offset + size) > MB(2) )
+ {
+ rc = map_pages_to_xen(BOOT_FDT_VIRT_START + MB(2),
+ maddr_to_mfn(base_paddr + MB(2)),
+ MB(2) >> PAGE_SHIFT,
+ PAGE_HYPERVISOR_RO);
+ if ( rc )
+ panic("Unable to map the device-tree\n");
+ }
+
+ return fdt_virt;
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index c4fadd36c6..a316901fd4 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -2,6 +2,7 @@
#include <xen/bug.h>
#include <xen/compile.h>
+#include <xen/device_tree.h>
#include <xen/init.h>
#include <xen/mm.h>
#include <xen/shutdown.h>
@@ -57,6 +58,12 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
setup_fixmap_mappings();
+ device_tree_flattened = early_fdt_map(dtb_addr);
+ if ( !device_tree_flattened )
+ panic("Invalid device tree blob at physical address %#lx. The DTB must be 8-byte aligned and must not exceed %lld bytes in size.\n\n"
+ "Please check your bootloader.\n",
+ dtb_addr, BOOT_FDT_VIRT_SIZE);
+
printk("All set up\n");
machine_halt();
--
2.46.1
* Re: [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info
2024-09-27 16:33 ` [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info Oleksii Kurochko
@ 2024-09-30 7:25 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 7:25 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 27.09.2024 18:33, Oleksii Kurochko wrote:
> Introduce struct pcpu_info to store pCPU-related information.
> Initially, it includes only processor_id and hart id, but it
> will be extended to include guest CPU information and
> temporary variables for saving/restoring vCPU registers.
>
> Add set_processor_id() function to set processor_id stored in
> pcpu_info.
>
> Define smp_processor_id() to provide accurate information,
> replacing the previous "dummy" value of 0.
>
> Initialize tp registers to point to pcpu_info[0].
> Set processor_id to 0 for logical CPU 0 and store the physical
> CPU ID in pcpu_info[0].
>
> Introduce helpers for getting/setting hart_id ( physical CPU id
> in RISC-V terms ) from Xen CPU id.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v8 0/7] device tree mapping
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (6 preceding siblings ...)
2024-09-27 16:33 ` [PATCH v8 7/7] xen/riscv: introduce early_fdt_map() Oleksii Kurochko
@ 2024-09-30 7:27 ` Jan Beulich
2024-09-30 8:14 ` oleksii.kurochko
2024-09-30 8:17 ` Jan Beulich
8 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 7:27 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 27.09.2024 18:33, Oleksii Kurochko wrote:
> Current patch series introduces device tree mapping for RISC-V
> and necessary things for that such as:
> - Fixmap mapping
> - pmap
> - Xen page table processing
>
> ---
> Changes in v8:
> - The following patch was merged to staging:
> [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to define {read,write}_atomic()
> - All other changes are patch specific so please look at the patch.
Except that afaics none of the patches has any revision log.
Jan
* Re: [PATCH v8 0/7] device tree mapping
2024-09-30 7:27 ` [PATCH v8 0/7] device tree mapping Jan Beulich
@ 2024-09-30 8:14 ` oleksii.kurochko
2024-09-30 8:21 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: oleksii.kurochko @ 2024-09-30 8:14 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On Mon, 2024-09-30 at 09:27 +0200, Jan Beulich wrote:
> On 27.09.2024 18:33, Oleksii Kurochko wrote:
> > Current patch series introduces device tree mapping for RISC-V
> > and necessary things for that such as:
> > - Fixmap mapping
> > - pmap
> > - Xen page table processing
> >
> > ---
> > Changes in v8:
> > - The following patch was merged to staging:
> > [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to
> > define {read,write}_atomic()
> > - All other changes are patch specific so please look at the
> > patch.
>
> Except that afaics none of the patches has any revision log.
Would it be helpful if I sent the revision log as a reply to each
patch? Or would it be better to just re-send the patch series?
~ Oleksii
* Re: [PATCH v8 0/7] device tree mapping
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
` (7 preceding siblings ...)
2024-09-30 7:27 ` [PATCH v8 0/7] device tree mapping Jan Beulich
@ 2024-09-30 8:17 ` Jan Beulich
2024-09-30 8:24 ` oleksii.kurochko
8 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 8:17 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 27.09.2024 18:33, Oleksii Kurochko wrote:
> Current patch series introduces device tree mapping for RISC-V
> and necessary things for that such as:
> - Fixmap mapping
> - pmap
> - Xen page table processing
While nothing is being said here towards a dependency, ...
> ---
> Changes in v8:
> - The following patch was merged to staging:
> [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to define {read,write}_atomic()
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v7:
> - Drop the patch "xen/riscv: prevent recursion when ASSERT(), BUG*(), or panic() are called"
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v6:
> - Add patch to fix recursion when ASSERT(), BUG*(), panic() are called.
> - Add patch to allow write_atomic() to work with non-scalar types for consistence
> with read_atomic().
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v5:
> - The following patch was merged to staging:
> [PATCH v3 3/9] xen/riscv: enable CONFIG_HAS_DEVICE_TREE
> - Drop depedency from "RISCV basic exception handling implementation" as
> it was meged to staging branch.
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v4:
> - Drop depedency from common devicre tree patch series as it was merged to
> staging.
> - Update the cover letter message.
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v3:
> - Introduce SBI RFENCE extension support.
> - Introduce and initialize pcpu_info[] and __cpuid_to_hartid_map[] and functionality
> to work with this arrays.
> - Make page table handling arch specific instead of trying to make it generic.
> - All other changes are patch specific so please look at the patch.
> ---
> Changes in v2:
> - Update the cover letter message
> - introduce fixmap mapping
> - introduce pmap
> - introduce CONFIG_GENREIC_PT
> - update use early_fdt_map() after MMU is enabled.
> ---
>
> Oleksii Kurochko (7):
> xen/riscv: allow write_atomic() to work with non-scalar types
> xen/riscv: set up fixmap mappings
> xen/riscv: introduce asm/pmap.h header
> xen/riscv: introduce functionality to work with CPU info
> xen/riscv: introduce and initialize SBI RFENCE extension
> xen/riscv: page table handling
> xen/riscv: introduce early_fdt_map()
>
> xen/arch/riscv/Kconfig | 1 +
> xen/arch/riscv/Makefile | 2 +
> xen/arch/riscv/include/asm/atomic.h | 11 +-
> xen/arch/riscv/include/asm/config.h | 16 +-
> xen/arch/riscv/include/asm/current.h | 27 +-
> xen/arch/riscv/include/asm/fixmap.h | 46 +++
> xen/arch/riscv/include/asm/flushtlb.h | 15 +
> xen/arch/riscv/include/asm/mm.h | 6 +
> xen/arch/riscv/include/asm/page.h | 99 +++++
> xen/arch/riscv/include/asm/pmap.h | 36 ++
> xen/arch/riscv/include/asm/processor.h | 3 -
> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> xen/arch/riscv/include/asm/sbi.h | 62 +++
> xen/arch/riscv/include/asm/smp.h | 18 +
> xen/arch/riscv/mm.c | 101 ++++-
> xen/arch/riscv/pt.c | 421 ++++++++++++++++++++
> xen/arch/riscv/riscv64/asm-offsets.c | 3 +
> xen/arch/riscv/riscv64/head.S | 14 +
> xen/arch/riscv/sbi.c | 273 ++++++++++++-
> xen/arch/riscv/setup.c | 17 +
... I had to fiddle with three of the patches touching this file, to
accommodate for an apparent debugging patch you have in your tree.
Please can you make sure to submit patches against plain staging, or
to clearly state dependencies?
Jan
* Re: [PATCH v8 0/7] device tree mapping
2024-09-30 8:14 ` oleksii.kurochko
@ 2024-09-30 8:21 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 8:21 UTC (permalink / raw)
To: oleksii.kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 30.09.2024 10:14, oleksii.kurochko@gmail.com wrote:
> On Mon, 2024-09-30 at 09:27 +0200, Jan Beulich wrote:
>> On 27.09.2024 18:33, Oleksii Kurochko wrote:
>>> Current patch series introduces device tree mapping for RISC-V
>>> and necessary things for that such as:
>>> - Fixmap mapping
>>> - pmap
>>> - Xen page table processing
>>>
>>> ---
>>> Changes in v8:
>>> - The following patch was merged to staging:
>>> [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to
>>> define {read,write}_atomic()
>>> - All other changes are patch specific so please look at the
>>> patch.
>>
>> Except that afaics none of the patches has any revision log.
> Would it be helpful if I sent the revision log as a reply to each
> patch? Or would it be better just to re-send the patch series?
To me the one for 6/7 is relevant, to aid review. Sending that as reply
will be okay I guess. Patches 1-5 went in a few minutes ago anyway.
Jan
* Re: [PATCH v8 0/7] device tree mapping
2024-09-30 8:17 ` Jan Beulich
@ 2024-09-30 8:24 ` oleksii.kurochko
2024-09-30 8:32 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: oleksii.kurochko @ 2024-09-30 8:24 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On Mon, 2024-09-30 at 10:17 +0200, Jan Beulich wrote:
> On 27.09.2024 18:33, Oleksii Kurochko wrote:
> > Current patch series introduces device tree mapping for RISC-V
> > and necessary things for that such as:
> > - Fixmap mapping
> > - pmap
> > - Xen page table processing
>
> While nothing is being said here towards a dependency, ...
>
> > ---
> > Changes in v8:
> > - The following patch was merged to staging:
> > [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to
> > define {read,write}_atomic()
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v7:
> > - Drop the patch "xen/riscv: prevent recursion when ASSERT(),
> > BUG*(), or panic() are called"
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v6:
> > - Add patch to fix recursion when ASSERT(), BUG*(), panic() are
> > called.
> > - Add patch to allow write_atomic() to work with non-scalar types
> > for consistence
> > with read_atomic().
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v5:
> > - The following patch was merged to staging:
> > [PATCH v3 3/9] xen/riscv: enable CONFIG_HAS_DEVICE_TREE
> > - Drop depedency from "RISCV basic exception handling
> > implementation" as
> > it was meged to staging branch.
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v4:
> > - Drop depedency from common devicre tree patch series as it was
> > merged to
> > staging.
> > - Update the cover letter message.
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v3:
> > - Introduce SBI RFENCE extension support.
> > - Introduce and initialize pcpu_info[] and __cpuid_to_hartid_map[]
> > and functionality
> > to work with this arrays.
> > - Make page table handling arch specific instead of trying to make
> > it generic.
> > - All other changes are patch specific so please look at the
> > patch.
> > ---
> > Changes in v2:
> > - Update the cover letter message
> > - introduce fixmap mapping
> > - introduce pmap
> > - introduce CONFIG_GENREIC_PT
> > - update use early_fdt_map() after MMU is enabled.
> > ---
> >
> > Oleksii Kurochko (7):
> > xen/riscv: allow write_atomic() to work with non-scalar types
> > xen/riscv: set up fixmap mappings
> > xen/riscv: introduce asm/pmap.h header
> > xen/riscv: introduce functionality to work with CPU info
> > xen/riscv: introduce and initialize SBI RFENCE extension
> > xen/riscv: page table handling
> > xen/riscv: introduce early_fdt_map()
> >
> > xen/arch/riscv/Kconfig | 1 +
> > xen/arch/riscv/Makefile | 2 +
> > xen/arch/riscv/include/asm/atomic.h | 11 +-
> > xen/arch/riscv/include/asm/config.h | 16 +-
> > xen/arch/riscv/include/asm/current.h | 27 +-
> > xen/arch/riscv/include/asm/fixmap.h | 46 +++
> > xen/arch/riscv/include/asm/flushtlb.h | 15 +
> > xen/arch/riscv/include/asm/mm.h | 6 +
> > xen/arch/riscv/include/asm/page.h | 99 +++++
> > xen/arch/riscv/include/asm/pmap.h | 36 ++
> > xen/arch/riscv/include/asm/processor.h | 3 -
> > xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> > xen/arch/riscv/include/asm/sbi.h | 62 +++
> > xen/arch/riscv/include/asm/smp.h | 18 +
> > xen/arch/riscv/mm.c | 101 ++++-
> > xen/arch/riscv/pt.c | 421
> > ++++++++++++++++++++
> > xen/arch/riscv/riscv64/asm-offsets.c | 3 +
> > xen/arch/riscv/riscv64/head.S | 14 +
> > xen/arch/riscv/sbi.c | 273 ++++++++++++-
> > xen/arch/riscv/setup.c | 17 +
>
> ... I had to fiddle with three of the patches touching this file, to
> accommodate for an apparent debugging patch you have in your tree.
> Please can you make sure to submit patches against plain staging, or
> to clearly state dependencies?
I always try to remember to rebase on top of staging for this
patch series:
65c49e7aa2 (HEAD -> riscv-dt-support-v8, origin/riscv-dt-support-v8)
xen/riscv: introduce early_fdt_map()
ead52f68ce xen/riscv: page table handling
c3aba0520f xen/riscv: introduce and initialize SBI RFENCE extension
3ffb3ffd38 xen/riscv: introduce functionality to work with CPU info
4bfd2bfdb2 xen/riscv: introduce asm/pmap.h header
87bc91db10 xen/riscv: set up fixmap mappings
09b925f973 xen/riscv: allow write_atomic() to work with non-scalar
types
625ee7650c xen/README: add compiler and binutils versions for RISC-V64
5379a23ad7 xen/riscv: test basic exception handling stuff
2b6fb9f3c4 (origin/staging, origin/HEAD, staging) blkif: Fix a couple
of typos
6e73a16230 blkif: Fix alignment description for discard request
0111c86bfa x86/boot: Refactor BIOS/PVH start
Only some patches have been merged to staging today on top of "blkif:
Fix a couple of typos".
There shouldn't be any issue with applying the patches from this patch
series.
~ Oleksii
* Re: [PATCH v8 0/7] device tree mapping
2024-09-30 8:24 ` oleksii.kurochko
@ 2024-09-30 8:32 ` Jan Beulich
2024-09-30 8:38 ` oleksii.kurochko
0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 8:32 UTC (permalink / raw)
To: oleksii.kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 30.09.2024 10:24, oleksii.kurochko@gmail.com wrote:
> On Mon, 2024-09-30 at 10:17 +0200, Jan Beulich wrote:
>> On 27.09.2024 18:33, Oleksii Kurochko wrote:
>>> Current patch series introduces device tree mapping for RISC-V
>>> and necessary things for that such as:
>>> - Fixmap mapping
>>> - pmap
>>> - Xen page table processing
>>
>> While nothing is being said here towards a dependency, ...
>>
>>> ---
>>> Changes in v8:
>>> - The following patch was merged to staging:
>>> [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu() to
>>> define {read,write}_atomic()
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v7:
>>> - Drop the patch "xen/riscv: prevent recursion when ASSERT(),
>>> BUG*(), or panic() are called"
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v6:
>>> - Add patch to fix recursion when ASSERT(), BUG*(), panic() are
>>> called.
>>> - Add patch to allow write_atomic() to work with non-scalar types
>>> for consistence
>>> with read_atomic().
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v5:
>>> - The following patch was merged to staging:
>>> [PATCH v3 3/9] xen/riscv: enable CONFIG_HAS_DEVICE_TREE
>>> - Drop depedency from "RISCV basic exception handling
>>> implementation" as
>>> it was meged to staging branch.
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v4:
>>> - Drop depedency from common devicre tree patch series as it was
>>> merged to
>>> staging.
>>> - Update the cover letter message.
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v3:
>>> - Introduce SBI RFENCE extension support.
>>> - Introduce and initialize pcpu_info[] and __cpuid_to_hartid_map[]
>>> and functionality
>>> to work with this arrays.
>>> - Make page table handling arch specific instead of trying to make
>>> it generic.
>>> - All other changes are patch specific so please look at the
>>> patch.
>>> ---
>>> Changes in v2:
>>> - Update the cover letter message
>>> - introduce fixmap mapping
>>> - introduce pmap
>>> - introduce CONFIG_GENREIC_PT
>>> - update use early_fdt_map() after MMU is enabled.
>>> ---
>>>
>>> Oleksii Kurochko (7):
>>> xen/riscv: allow write_atomic() to work with non-scalar types
>>> xen/riscv: set up fixmap mappings
>>> xen/riscv: introduce asm/pmap.h header
>>> xen/riscv: introduce functionality to work with CPU info
>>> xen/riscv: introduce and initialize SBI RFENCE extension
>>> xen/riscv: page table handling
>>> xen/riscv: introduce early_fdt_map()
>>>
>>> xen/arch/riscv/Kconfig | 1 +
>>> xen/arch/riscv/Makefile | 2 +
>>> xen/arch/riscv/include/asm/atomic.h | 11 +-
>>> xen/arch/riscv/include/asm/config.h | 16 +-
>>> xen/arch/riscv/include/asm/current.h | 27 +-
>>> xen/arch/riscv/include/asm/fixmap.h | 46 +++
>>> xen/arch/riscv/include/asm/flushtlb.h | 15 +
>>> xen/arch/riscv/include/asm/mm.h | 6 +
>>> xen/arch/riscv/include/asm/page.h | 99 +++++
>>> xen/arch/riscv/include/asm/pmap.h | 36 ++
>>> xen/arch/riscv/include/asm/processor.h | 3 -
>>> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
>>> xen/arch/riscv/include/asm/sbi.h | 62 +++
>>> xen/arch/riscv/include/asm/smp.h | 18 +
>>> xen/arch/riscv/mm.c | 101 ++++-
>>> xen/arch/riscv/pt.c | 421
>>> ++++++++++++++++++++
>>> xen/arch/riscv/riscv64/asm-offsets.c | 3 +
>>> xen/arch/riscv/riscv64/head.S | 14 +
>>> xen/arch/riscv/sbi.c | 273 ++++++++++++-
>>> xen/arch/riscv/setup.c | 17 +
>>
>> ... I had to fiddle with three of the patches touching this file, to
>> accommodate for an apparent debugging patch you have in your tree.
>> Please can you make sure to submit patches against plain staging, or
>> to clearly state dependencies?
> I always try to remember to rebase on top of staging for this
> patch series:
>
> 65c49e7aa2 (HEAD -> riscv-dt-support-v8, origin/riscv-dt-support-v8)
> xen/riscv: introduce early_fdt_map()
> ead52f68ce xen/riscv: page table handling
> c3aba0520f xen/riscv: introduce and initialize SBI RFENCE extension
> 3ffb3ffd38 xen/riscv: introduce functionality to work with CPU info
> 4bfd2bfdb2 xen/riscv: introduce asm/pmap.h header
> 87bc91db10 xen/riscv: set up fixmap mappings
> 09b925f973 xen/riscv: allow write_atomic() to work with non-scalar
> types
> 625ee7650c xen/README: add compiler and binutils versions for RISC-V64
> 5379a23ad7 xen/riscv: test basic exception handling stuff
> 2b6fb9f3c4 (origin/staging, origin/HEAD, staging) blkif: Fix a couple
> of typos
> 6e73a16230 blkif: Fix alignment description for discard request
> 0111c86bfa x86/boot: Refactor BIOS/PVH start
This looks to be a mix of several series. And "xen/riscv: test basic
exception handling stuff" looks to be what the problem was with, as that
wasn't committed yet (and imo also shouldn't be committed, as expressed
before; of course you can try to find someone else to approve it).
Jan
* Re: [PATCH v8 6/7] xen/riscv: page table handling
2024-09-27 16:33 ` [PATCH v8 6/7] xen/riscv: page table handling Oleksii Kurochko
@ 2024-09-30 8:35 ` oleksii.kurochko
2024-09-30 15:30 ` Jan Beulich
1 sibling, 0 replies; 20+ messages in thread
From: oleksii.kurochko @ 2024-09-30 8:35 UTC (permalink / raw)
To: xen-devel, Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini
I forgot to add the revision log:
---
Changes in V8:
- drop PTE_LEAF_MASK.
- update the comment above pte_is_table(): drop table number and use
just "the encoding of the permission bits".
- declare pte_is_table() as static.
- drop const from the argument of pte_is_table
- drop the "const" comment before the argument of pte_is_mapping().
- update the comment above ASSERT() in pte_is_mapping() to: "See
pte_is_table()".
- drop "const" from the return type of get_root_page().
- update the comment above "pt_check_entry()".
- start the comment with capital letter.
- update how the PTE_ACCESS_MASK bits are cleared before being updated by
the value in flags.
- use dprintk() instead of printk() in riscv/pt.c
- introduce XEN_TABLE_MAP_NONE and XEN_TABLE_MAP_NOMEM instead of
XEN_TABLE_MAP_FAILED and correspondingly update the handling of
pt_next_level()'s return value in pt_update_entry().
- update type of virt to vaddr_t for pt_update_entry()
---
Changes in V7:
- rename PTE_XWV_BITS to PTE_LEAF_MASK.
- drop PTE_XWV_MASK, PTE_RWX_MASK.
- introduce PTE_ACCESS_MASK.
- update the ASSERT and the comment about it in pte_is_mapping().
- add the same ASSERT as in pte_is_mapping() to pte_is_table().
- update the comment above pte_is_table().
- use PTE_ACCESS_MASK inside pte_is_{table,mapping} instead of encoding
the access bits explicitly.
- define SATP_PPN_MASK using SATP{32,64}_PPN.
- drop inclusion of <xen/mm-frame.h> in riscv/pt.c as xen/mm.h is
included.
- use pfn_to_paddr() in get_root_page() instead of open-coding
pfn_to_paddr().
- update the comment and the if (...) in pt_update_entry() above the check
for the case where pt_next_level() returns XEN_TABLE_MAP_FAILED.
- update the comment above pt_update(): drop unnecessary mentioning
of INVALID_MFN and blanks inside parentheses.
- drop "full stops" in printk().
- correct the condition in ASSERT() in map_pages_to_xen().
- clear permission bits before updating the permissions in
pt_update_entry().
---
Changes in V6:
- update the commit message.
- correct the comment above flush_tlb_range_va().
- add PTE_READABLE to the check of pte.rwx permissions in
pte_is_mapping().
- s/printk/dprintk in pt_check_entry().
- drop unnecessary ASSERTS() in pt_check_entry().
- drop checking of PTE_VALID flags in /* Sanity check when removing
a mapping */ because of the earlier check.
- drop ASSERT(flags & PTE_POPULATE) in /* Sanity check when populating
the page-table */
section as in the earlier if it is checked.
- pt_next_level() changes:
- invert if ( alloc_tbl ) condition.
- drop local variable ret.
- pt_update_entry() changes:
- invert definition of alloc_tbl.
- update the comment inside "if ( rc == XEN_TABLE_MAP_FAILED )".
- drop else for mentioned above if (...).
- clear some PTE flags before update.
- s/xen_pt_lock/pt_lock
- use PFN_DOWN() for vfn variable definition in pt_update().
- drop definition of PTE_{R,W,X}_MASK.
- introduce PTE_XWV_BITS and PTE_XWV_MASK() for convenience and use
them in if (...)
in pt_update().
- update the comment above pt_update().
- change memset(&pte, 0x00, sizeof(pte)) to pte.pte = 0.
- add the comment above pte_is_table().
- add ASSERT in pte_is_mapping() to check the cases which are reserved
for future
use.
---
Changes in V5:
- s/xen_{un}map/{un}map
- introduce PTE_SMALL instead of PTE_BLOCK.
- update the comment above the definition of PTE_4K_PAGES.
- code style fixes.
- s/RV_STAGE1_MODE > SATP_MODE_SV48/RV_STAGE1_MODE > SATP_MODE_SV39
around the DECLARE_OFFSETS macros.
- change type of root_maddr from unsigned long to maddr_t.
- drop duplicated check ( if (rc) break ) in pt_update() inside the while
cycle.
- s/1U/1UL
- put 'spin_unlock(&xen_pt_lock);' ahead of TLB flush in pt_update().
- update the commit message.
- update the comment above ASSERT() in map_pages_to_xen() and also
update
the check within ASSERT() to check that flags has PTE_VALID bit set.
- update the comment above pt_update() function.
- add the comment inside pt_check_entry().
- update the TLB flushing region in pt_update().
- s/alloc_only/alloc_tbl
---
Changes in V4:
- update the commit message.
- drop xen_ prefix for functions: xen_pt_update(),
xen_pt_mapping_level(),
xen_pt_update_entry(), xen_pt_next_level(), xen_pt_check_entry().
- drop 'select GENERIC_PT' for CONFIG_RISCV. There is no GENERIC_PT
anymore.
- update implementation of flush_xen_tlb_range_va and
s/flush_xen_tlb_range_va/flush_tlb_range_va
- s/pte_get_mfn/mfn_from_pte. I decided not to touch other similar
definitions, as they were introduced earlier; this naming pattern will be
applied to newly introduced macros.
- drop _PAGE_* definitions and use analogues of PTE_*.
- introduce PTE_{W,X,R}_MASK and drop PAGE_{XN,W,X}_MASK. Also drop
_PAGE_{*}_BIT
- introduce PAGE_HYPERVISOR_RX.
- drop unused now l3_table_offset.
- drop struct pt_t as it was used only for one function. If it will be
needed in the future
pt_t will be re-introduced.
- code styles fixes in pte_is_table(). drop level argument from t.
- update implementation and prototype of pte_is_mapping().
- drop level argument from pt_next_level().
- introduce definition of SATP_PPN_MASK.
- isolate PPN of CSR_SATP before shift by PAGE_SHIFT.
- drop the set_permission() function as it is not used more than once.
- update prototype of pt_check_entry(): drop level argument as it is
not used.
- pt_check_entry():
- code style fixes
- update the sanity check when modifying an entry
- update the sanity check when removing a mapping.
- s/read_only/alloc_only.
- code style fixes for pt_next_level().
- pt_update_entry() changes:
- drop arch_level variable inside pt_update_entry()
- drop conversion of virt to paddr_t in DECLARE_OFFSETS(offsets, virt);
- pull out "goto out" inside the first 'for' cycle.
- drop braces for 'if' cases which have only one line.
- indent 'out' label with one blank.
- update the comment above alloc_only and also its definition to take into
account whether pte population was requested or not.
- drop target variable and rename arch_target argument of the function to
target.
- pt_mapping_level() changes:
- move the check if PTE_BLOCK should be mapped on the top of the
function.
- change int i to unsigned int and update 'for' cycle
correspondingly.
- update prototype of pt_update():
- drop the comment above nr_mfns and drop const to be consistent
with other
arguments.
- always flush TLB at the end of the function as non-present entries
can be put
in the TLB.
- add fence before TLB flush to ensure that PTEs are all updated
before flushing.
- s/XEN_TABLE_NORMAL_PAGE/XEN_TABLE_NORMAL
- add a check in map_pages_to_xen() the mfn is not INVALID_MFN.
- add the comment on top of pt_update() how mfn = INVALID_MFN is
considered.
- s/_PAGE_BLOCK/PTE_BLOCK.
- add the comment with additional explanation for PTE_BLOCK.
- drop definition of FIRST_SIZE as it isn't used.
---
Changes in V3:
- new patch. ( Technically it is reworked version of the generic
approach
which I tried to suggest in the previous version )
---
~ Oleksii
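
To illustrate the V7/V8 items above about pte_is_table(), pte_is_mapping()
and clearing the PTE_ACCESS_MASK bits before applying new flags, here is a
minimal standalone sketch. The macros and the two predicates mirror the ones
in the patch quoted below; pte_set_access() is a hypothetical helper name
used only for illustration, and the actual update in pt_update_entry() may
well be written differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions follow the RISC-V PTE format: bit 0 = V, 1 = R, 2 = W, 3 = X. */
#define PTE_VALID       (1UL << 0)
#define PTE_READABLE    (1UL << 1)
#define PTE_WRITABLE    (1UL << 2)
#define PTE_EXECUTABLE  (1UL << 3)
#define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE | PTE_EXECUTABLE)

typedef struct {
    unsigned long pte;
} pte_t;

/* W=1 with R=0 is reserved for future use, so a valid PTE must not use it. */
static bool pte_encoding_is_reserved(pte_t p)
{
    return (p.pte & (PTE_VALID | PTE_READABLE | PTE_WRITABLE)) ==
           (PTE_VALID | PTE_WRITABLE);
}

/* V=1 and R=W=X=0: the PTE points to the next level of page table. */
static bool pte_is_table(pte_t p)
{
    assert(!pte_encoding_is_reserved(p));

    return (p.pte & (PTE_VALID | PTE_ACCESS_MASK)) == PTE_VALID;
}

/* V=1 and at least one of R/W/X set: the PTE is a leaf mapping. */
static bool pte_is_mapping(pte_t p)
{
    assert(!pte_encoding_is_reserved(p));

    return (p.pte & PTE_VALID) && (p.pte & PTE_ACCESS_MASK);
}

/*
 * Hypothetical helper (not part of the patch): replace the permission bits
 * of a PTE. The old R/W/X bits are cleared first so that stale permissions
 * cannot survive the update; all other bits are left untouched.
 */
static pte_t pte_set_access(pte_t p, unsigned long flags)
{
    p.pte &= ~PTE_ACCESS_MASK;
    p.pte |= flags & PTE_ACCESS_MASK;

    return p;
}
```

Without the clearing step, changing a read-write mapping to read-execute
would only OR in the new bits and leave PTE_WRITABLE set.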
On Fri, 2024-09-27 at 18:33 +0200, Oleksii Kurochko wrote:
> Implement map_pages_to_xen() which requires several
> functions to manage page tables and entries:
> - pt_update()
> - pt_mapping_level()
> - pt_update_entry()
> - pt_next_level()
> - pt_check_entry()
>
> To support these operations, add functions for creating,
> mapping, and unmapping Xen tables:
> - create_table()
> - map_table()
> - unmap_table()
>
> Introduce PTE_SMALL to indicate that 4KB mapping is needed
> and PTE_POPULATE.
>
> In addition introduce flush_tlb_range_va() for TLB flushing across
> CPUs after updating the PTE for the requested mapping.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> xen/arch/riscv/Makefile | 1 +
> xen/arch/riscv/include/asm/flushtlb.h | 9 +
> xen/arch/riscv/include/asm/mm.h | 2 +
> xen/arch/riscv/include/asm/page.h | 80 ++++
> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> xen/arch/riscv/mm.c | 9 -
> xen/arch/riscv/pt.c | 421
> ++++++++++++++++++++
> 7 files changed, 515 insertions(+), 9 deletions(-)
> create mode 100644 xen/arch/riscv/pt.c
>
> diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
> index 6832549133..a5eb2aed4b 100644
> --- a/xen/arch/riscv/Makefile
> +++ b/xen/arch/riscv/Makefile
> @@ -1,6 +1,7 @@
> obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
> obj-y += entry.o
> obj-y += mm.o
> +obj-y += pt.o
> obj-$(CONFIG_RISCV_64) += riscv64/
> obj-y += sbi.o
> obj-y += setup.o
> diff --git a/xen/arch/riscv/include/asm/flushtlb.h
> b/xen/arch/riscv/include/asm/flushtlb.h
> index f4a735fd6c..43214f5e95 100644
> --- a/xen/arch/riscv/include/asm/flushtlb.h
> +++ b/xen/arch/riscv/include/asm/flushtlb.h
> @@ -5,12 +5,21 @@
> #include <xen/bug.h>
> #include <xen/cpumask.h>
>
> +#include <asm/sbi.h>
> +
> /* Flush TLB of local processor for address va. */
> static inline void flush_tlb_one_local(vaddr_t va)
> {
> asm volatile ( "sfence.vma %0" :: "r" (va) : "memory" );
> }
>
> +/* Flush a range of VA's hypervisor mappings from the TLB of all
> processors. */
> +static inline void flush_tlb_range_va(vaddr_t va, size_t size)
> +{
> + BUG_ON(!sbi_has_rfence());
> + sbi_remote_sfence_vma(NULL, va, size);
> +}
> +
> /*
> * Filter the given set of CPUs, removing those that definitely
> flushed their
> * TLB since @page_timestamp.
> diff --git a/xen/arch/riscv/include/asm/mm.h
> b/xen/arch/riscv/include/asm/mm.h
> index a0bdc2bc3a..ce1557bb27 100644
> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -42,6 +42,8 @@ static inline void *maddr_to_virt(paddr_t ma)
> #define virt_to_mfn(va) __virt_to_mfn(va)
> #define mfn_to_virt(mfn) __mfn_to_virt(mfn)
>
> +#define mfn_from_pte(pte) maddr_to_mfn(pte_to_paddr(pte))
> +
> struct page_info
> {
> /* Each frame can be threaded onto a doubly-linked list. */
> diff --git a/xen/arch/riscv/include/asm/page.h
> b/xen/arch/riscv/include/asm/page.h
> index eb79cb9409..89fa290697 100644
> --- a/xen/arch/riscv/include/asm/page.h
> +++ b/xen/arch/riscv/include/asm/page.h
> @@ -21,6 +21,11 @@
> #define XEN_PT_LEVEL_MAP_MASK(lvl) (~(XEN_PT_LEVEL_SIZE(lvl) - 1))
> #define XEN_PT_LEVEL_MASK(lvl) (VPN_MASK <<
> XEN_PT_LEVEL_SHIFT(lvl))
>
> +/*
> + * PTE format:
> + * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> + * PFN reserved for SW D A G U X W R V
> + */
> #define PTE_VALID BIT(0, UL)
> #define PTE_READABLE BIT(1, UL)
> #define PTE_WRITABLE BIT(2, UL)
> @@ -34,15 +39,49 @@
> #define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE |
> PTE_WRITABLE)
> #define PTE_TABLE (PTE_VALID)
>
> +#define PAGE_HYPERVISOR_RO (PTE_VALID | PTE_READABLE)
> #define PAGE_HYPERVISOR_RW (PTE_VALID | PTE_READABLE |
> PTE_WRITABLE)
> +#define PAGE_HYPERVISOR_RX (PTE_VALID | PTE_READABLE |
> PTE_EXECUTABLE)
>
> #define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW
>
> +/*
> + * The PTE format does not contain the following bits within itself;
> + * they are created artificially to inform the Xen page table
> + * handling algorithm. These bits should not be explicitly written
> + * to the PTE entry.
> + */
> +#define PTE_SMALL BIT(10, UL)
> +#define PTE_POPULATE BIT(11, UL)
> +
> +#define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE |
> PTE_EXECUTABLE)
> +
> /* Calculate the offsets into the pagetables for a given VA */
> #define pt_linear_offset(lvl, va) ((va) >>
> XEN_PT_LEVEL_SHIFT(lvl))
>
> #define pt_index(lvl, va) (pt_linear_offset((lvl), (va)) & VPN_MASK)
>
> +#define PAGETABLE_ORDER_MASK ((_AC(1, U) << PAGETABLE_ORDER) - 1)
> +#define TABLE_OFFSET(offs) (_AT(unsigned int, offs) &
> PAGETABLE_ORDER_MASK)
> +
> +#if RV_STAGE1_MODE > SATP_MODE_SV39
> +#error "need to to update DECLARE_OFFSETS macros"
> +#else
> +
> +#define l0_table_offset(va) TABLE_OFFSET(pt_linear_offset(0, va))
> +#define l1_table_offset(va) TABLE_OFFSET(pt_linear_offset(1, va))
> +#define l2_table_offset(va) TABLE_OFFSET(pt_linear_offset(2, va))
> +
> +/* Generate an array @var containing the offset for each level from
> @addr */
> +#define DECLARE_OFFSETS(var, addr) \
> + const unsigned int var[] = { \
> + l0_table_offset(addr), \
> + l1_table_offset(addr), \
> + l2_table_offset(addr), \
> + }
> +
> +#endif
> +
> /* Page Table entry */
> typedef struct {
> #ifdef CONFIG_RISCV_64
> @@ -68,6 +107,47 @@ static inline bool pte_is_valid(pte_t p)
> return p.pte & PTE_VALID;
> }
>
> +/*
> + * From the RISC-V spec:
> + * The V bit indicates whether the PTE is valid; if it is 0, all
> other bits
> + * in the PTE are don’t-cares and may be used freely by software.
> + *
> + * If V=1 the encoding of PTE R/W/X bits could be find in "the
> encoding
> + * of the permission bits" table.
> + *
> + * The encoding of the permission bits table:
> + * X W R Meaning
> + * 0 0 0 Pointer to next level of page table.
> + * 0 0 1 Read-only page.
> + * 0 1 0 Reserved for future use.
> + * 0 1 1 Read-write page.
> + * 1 0 0 Execute-only page.
> + * 1 0 1 Read-execute page.
> + * 1 1 0 Reserved for future use.
> + * 1 1 1 Read-write-execute page.
> + */
> +static inline bool pte_is_table(pte_t p)
> +{
> + /*
> + * According to the spec if V=1 and W=1 then R also needs to be
> 1 as
> + * R = 0 is reserved for future use ( look at the Table 4.5 ) so
> check
> + * in ASSERT that if (V==1 && W==1) then R isn't 0.
> + *
> + * PAGE_HYPERVISOR_RW contains PTE_VALID too.
> + */
> + ASSERT(((p.pte & PAGE_HYPERVISOR_RW) != (PTE_VALID |
> PTE_WRITABLE)));
> +
> + return ((p.pte & (PTE_VALID | PTE_ACCESS_MASK)) == PTE_VALID);
> +}
> +
> +static inline bool pte_is_mapping(pte_t p)
> +{
> + /* See pte_is_table() */
> + ASSERT(((p.pte & PAGE_HYPERVISOR_RW) != (PTE_VALID |
> PTE_WRITABLE)));
> +
> + return (p.pte & PTE_VALID) && (p.pte & PTE_ACCESS_MASK);
> +}
> +
> static inline void invalidate_icache(void)
> {
> BUG_ON("unimplemented");
> diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h
> b/xen/arch/riscv/include/asm/riscv_encoding.h
> index 58abe5eccc..e31e94e77e 100644
> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
> @@ -164,6 +164,7 @@
> #define SSTATUS_SD SSTATUS64_SD
> #define SATP_MODE SATP64_MODE
> #define SATP_MODE_SHIFT SATP64_MODE_SHIFT
> +#define SATP_PPN_MASK SATP64_PPN
>
> #define HGATP_PPN HGATP64_PPN
> #define HGATP_VMID_SHIFT HGATP64_VMID_SHIFT
> @@ -174,6 +175,7 @@
> #define SSTATUS_SD SSTATUS32_SD
> #define SATP_MODE SATP32_MODE
> #define SATP_MODE_SHIFT SATP32_MODE_SHIFT
> +#define SATP_PPN_MASK SATP32_PPN
>
> #define HGATP_PPN HGATP32_PPN
> #define HGATP_VMID_SHIFT HGATP32_VMID_SHIFT
> diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
> index b8ff91cf4e..e8430def14 100644
> --- a/xen/arch/riscv/mm.c
> +++ b/xen/arch/riscv/mm.c
> @@ -369,12 +369,3 @@ int destroy_xen_mappings(unsigned long s,
> unsigned long e)
> BUG_ON("unimplemented");
> return -1;
> }
> -
> -int map_pages_to_xen(unsigned long virt,
> - mfn_t mfn,
> - unsigned long nr_mfns,
> - unsigned int flags)
> -{
> - BUG_ON("unimplemented");
> - return -1;
> -}
> diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c
> new file mode 100644
> index 0000000000..a5552a4871
> --- /dev/null
> +++ b/xen/arch/riscv/pt.c
> @@ -0,0 +1,421 @@
> +#include <xen/bug.h>
> +#include <xen/domain_page.h>
> +#include <xen/errno.h>
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/pfn.h>
> +#include <xen/pmap.h>
> +#include <xen/spinlock.h>
> +
> +#include <asm/flushtlb.h>
> +#include <asm/page.h>
> +
> +static inline mfn_t get_root_page(void)
> +{
> + paddr_t root_maddr = pfn_to_paddr(csr_read(CSR_SATP) &
> SATP_PPN_MASK);
> +
> + return maddr_to_mfn(root_maddr);
> +}
> +
> +/*
> + * Sanity check a page table entry about to be updated as per an
> (MFN,flags)
> + * tuple.
> + * See the comment about the possible combination of (mfn, flags) in
> + * the comment above pt_update().
> + */
> +static bool pt_check_entry(pte_t entry, mfn_t mfn, unsigned int
> flags)
> +{
> + /* Sanity check when modifying an entry. */
> + if ( (flags & PTE_VALID) && mfn_eq(mfn, INVALID_MFN) )
> + {
> + /* We don't allow modifying an invalid entry. */
> + if ( !pte_is_valid(entry) )
> + {
> + dprintk(XENLOG_ERR, "Modifying invalid entry is not
> allowed\n");
> + return false;
> + }
> +
> + /* We don't allow modifying a table entry */
> + if ( pte_is_table(entry) )
> + {
> + dprintk(XENLOG_ERR, "Modifying a table entry is not
> allowed\n");
> + return false;
> + }
> + }
> + /* Sanity check when inserting a mapping */
> + else if ( flags & PTE_VALID )
> + {
> + /*
> + * We don't allow replacing any valid entry.
> + *
> + * Note that the function pt_update() relies on this
> + * assumption and will skip the TLB flush (when Svvptc
> + * extension will be ratified). The function will need
> + * to be updated if the check is relaxed.
> + */
> + if ( pte_is_valid(entry) )
> + {
> + if ( pte_is_mapping(entry) )
> + dprintk(XENLOG_ERR, "Changing MFN for valid PTE is
> not allowed (%#"PRI_mfn" -> %#"PRI_mfn")\n",
> + mfn_x(mfn_from_pte(entry)), mfn_x(mfn));
> + else
> + dprintk(XENLOG_ERR, "Trying to replace table with
> mapping\n");
> + return false;
> + }
> + }
> + /* Sanity check when removing a mapping. */
> + else if ( !(flags & PTE_POPULATE) )
> + {
> + /* We should be here with an invalid MFN. */
> + ASSERT(mfn_eq(mfn, INVALID_MFN));
> +
> + /* We don't allow removing a table */
> + if ( pte_is_table(entry) )
> + {
> + dprintk(XENLOG_ERR, "Removing a table is not
> allowed\n");
> + return false;
> + }
> + }
> + /* Sanity check when populating the page-table. No check so far.
> */
> + else
> + {
> + /* We should be here with an invalid MFN */
> + ASSERT(mfn_eq(mfn, INVALID_MFN));
> + }
> +
> + return true;
> +}
> +
> +static pte_t *map_table(mfn_t mfn)
> +{
> + /*
> + * During early boot, map_domain_page() may be unusable. Use the
> + * PMAP to map temporarily a page-table.
> + */
> + if ( system_state == SYS_STATE_early_boot )
> + return pmap_map(mfn);
> +
> + return map_domain_page(mfn);
> +}
> +
> +static void unmap_table(const pte_t *table)
> +{
> + /*
> + * During early boot, map_table() will not use map_domain_page()
> + * but the PMAP.
> + */
> + if ( system_state == SYS_STATE_early_boot )
> + pmap_unmap(table);
> + else
> + unmap_domain_page(table);
> +}
> +
> +static int create_table(pte_t *entry)
> +{
> + mfn_t mfn;
> + void *p;
> + pte_t pte;
> +
> + if ( system_state != SYS_STATE_early_boot )
> + {
> + struct page_info *pg = alloc_domheap_page(NULL, 0);
> +
> + if ( pg == NULL )
> + return -ENOMEM;
> +
> + mfn = page_to_mfn(pg);
> + }
> + else
> + mfn = alloc_boot_pages(1, 1);
> +
> + p = map_table(mfn);
> + clear_page(p);
> + unmap_table(p);
> +
> + pte = pte_from_mfn(mfn, PTE_TABLE);
> + write_pte(entry, pte);
> +
> + return 0;
> +}
> +
> +#define XEN_TABLE_MAP_NONE 0
> +#define XEN_TABLE_MAP_NOMEM 1
> +#define XEN_TABLE_SUPER_PAGE 2
> +#define XEN_TABLE_NORMAL 3
> +
> +/*
> + * Take the currently mapped table, find the corresponding entry,
> + * and map the next table, if available.
> + *
> + * The alloc_tbl parameters indicates whether intermediate tables
> should
> + * be allocated when not present.
> + *
> + * Return values:
> + * XEN_TABLE_MAP_FAILED: Either alloc_only was set and the entry
> + * was empty, or allocating a new page failed.
> + * XEN_TABLE_NORMAL: next level or leaf mapped normally
> + * XEN_TABLE_SUPER_PAGE: The next entry points to a superpage.
> + */
> +static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int
> offset)
> +{
> + pte_t *entry;
> + mfn_t mfn;
> +
> + entry = *table + offset;
> +
> + if ( !pte_is_valid(*entry) )
> + {
> + if ( !alloc_tbl )
> + return XEN_TABLE_MAP_NONE;
> +
> + if ( create_table(entry) )
> + return XEN_TABLE_MAP_NOMEM;
> + }
> +
> + if ( pte_is_mapping(*entry) )
> + return XEN_TABLE_SUPER_PAGE;
> +
> + mfn = mfn_from_pte(*entry);
> +
> + unmap_table(*table);
> + *table = map_table(mfn);
> +
> + return XEN_TABLE_NORMAL;
> +}
> +
> +/* Update an entry at the level @target. */
> +static int pt_update_entry(mfn_t root, vaddr_t virt,
> + mfn_t mfn, unsigned int target,
> + unsigned int flags)
> +{
> + int rc;
> + unsigned int level = HYP_PT_ROOT_LEVEL;
> + pte_t *table;
> + /*
> + * The intermediate page table shouldn't be allocated when MFN
> isn't
> + * valid and we are not populating page table.
> + * This means we either modify permissions or remove an entry,
> or
> + * inserting brand new entry.
> + *
> + * See the comment above pt_update() for an additional
> explanation about
> + * combinations of (mfn, flags).
> + */
> + bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags &
> PTE_POPULATE);
> + pte_t pte, *entry;
> +
> + /* convenience aliases */
> + DECLARE_OFFSETS(offsets, virt);
> +
> + table = map_table(root);
> + for ( ; level > target; level-- )
> + {
> + rc = pt_next_level(alloc_tbl, &table, offsets[level]);
> + if ( rc == XEN_TABLE_MAP_NOMEM )
> + {
> + rc = -ENOMEM;
> + goto out;
> + }
> +
> + if ( rc == XEN_TABLE_MAP_NONE )
> + {
> + rc = 0;
> + goto out;
> + }
> +
> + if ( rc != XEN_TABLE_NORMAL )
> + break;
> + }
> +
> + if ( level != target )
> + {
> + dprintk(XENLOG_ERR,
> + "%s: Shattering superpage is not supported\n",
> __func__);
> + rc = -EOPNOTSUPP;
> + goto out;
> + }
> +
> + entry = table + offsets[level];
> +
> + rc = -EINVAL;
> + if ( !pt_check_entry(*entry, mfn, flags) )
> + goto out;
> +
> + /* We are removing the page */
> + if ( !(flags & PTE_VALID) )
> + /*
> + * There is also a check in pt_check_entry() which check
> that
> + * mfn=INVALID_MFN
> + */
> + pte.pte = 0;
> + else
> + {
> + /* We are inserting a mapping => Create new pte. */
> + if ( !mfn_eq(mfn, INVALID_MFN) )
> + pte = pte_from_mfn(mfn, PTE_VALID);
> + else /* We are updating the permission => Copy the current
> pte. */
> + {
> + pte = *entry;
> + pte.pte &= ~PTE_ACCESS_MASK;
> + }
> +
> + /* update permission according to the flags */
> + pte.pte |= (flags & PTE_ACCESS_MASK) | PTE_ACCESSED |
> PTE_DIRTY;
> + }
> +
> + write_pte(entry, pte);
> +
> + rc = 0;
> +
> + out:
> + unmap_table(table);
> +
> + return rc;
> +}
> +
> +/* Return the level where mapping should be done */
> +static int pt_mapping_level(unsigned long vfn, mfn_t mfn, unsigned
> long nr,
> + unsigned int flags)
> +{
> + unsigned int level = 0;
> + unsigned long mask;
> + unsigned int i;
> +
> + /*
> + * Use a larger mapping than 4K unless the caller specifically
> requests
> + * 4K mapping
> + */
> + if ( unlikely(flags & PTE_SMALL) )
> + return level;
> +
> + /*
> + * Don't take into account the MFN when removing mapping (i.e
> + * MFN_INVALID) to calculate the correct target order.
> + *
> + * `vfn` and `mfn` must be both superpage aligned.
> + * They are or-ed together and then checked against the size of
> + * each level.
> + *
> + * `left` ( variable declared in pt_update() ) is not included
> + * and checked separately to allow superpage mapping even if it
> + * is not properly aligned (the user may have asked to map 2MB +
> 4k).
> + */
> + mask = !mfn_eq(mfn, INVALID_MFN) ? mfn_x(mfn) : 0;
> + mask |= vfn;
> +
> + for ( i = HYP_PT_ROOT_LEVEL; i != 0; i-- )
> + {
> + if ( !(mask & (BIT(XEN_PT_LEVEL_ORDER(i), UL) - 1)) &&
> + (nr >= BIT(XEN_PT_LEVEL_ORDER(i), UL)) )
> + {
> + level = i;
> + break;
> + }
> + }
> +
> + return level;
> +}
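For illustration, the level selection in pt_mapping_level() above can be tried out as a standalone program. This is only a minimal sketch, assuming an Sv39-like layout with 9 index bits per level and a root level of 2. The names sketch_mapping_level, SKETCH_ROOT_LEVEL, SKETCH_LEVEL_ORDER and SKETCH_INVALID_MFN are hypothetical stand-ins for the Xen definitions, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: Sv39-like layout, 9 index bits per level, root level 2. */
#define SKETCH_ROOT_LEVEL 2U
#define SKETCH_LEVEL_ORDER(lvl) ((lvl) * 9U)
#define SKETCH_INVALID_MFN (~0UL)

/* Hypothetical stand-in for pt_mapping_level(); not the Xen function. */
static unsigned int sketch_mapping_level(unsigned long vfn, unsigned long mfn,
                                         unsigned long nr, bool force_small)
{
    unsigned long mask;
    unsigned int i;

    /* The caller only asks for a level while pages remain to be handled. */
    assert(nr > 0);

    /* An explicit request for 4K mappings always wins. */
    if ( force_small )
        return 0;

    /* The MFN does not constrain alignment when it is invalid. */
    mask = (mfn != SKETCH_INVALID_MFN) ? mfn : 0;
    mask |= vfn;

    /* Pick the largest level that is aligned and fully covered by nr. */
    for ( i = SKETCH_ROOT_LEVEL; i != 0; i-- )
    {
        unsigned long pages = 1UL << SKETCH_LEVEL_ORDER(i);

        if ( !(mask & (pages - 1)) && nr >= pages )
            return i;
    }

    return 0;
}
```

With these assumptions, a request for 513 pages starting at a 2MB-aligned vfn/mfn pair yields level 1 for the first 512 pages, and the remaining single page falls back to level 0 on the next loop iteration in the caller.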
> +
> +static DEFINE_SPINLOCK(pt_lock);
> +
> +/*
> + * If `mfn` equals `INVALID_MFN`, it indicates that the following
> page table
> + * update operation might be related to either:
> + * - populating the table (PTE_POPULATE will be set additionally),
> + * - destroying a mapping (PTE_VALID=0),
> + * - modifying an existing mapping (PTE_VALID=1).
> + *
> + * If `mfn` != INVALID_MFN and flags has PTE_VALID bit set then it
> means that
> + * inserting will be done.
> + */
> +static int pt_update(vaddr_t virt, mfn_t mfn,
> + unsigned long nr_mfns, unsigned int flags)
> +{
> + int rc = 0;
> + unsigned long vfn = PFN_DOWN(virt);
> + unsigned long left = nr_mfns;
> + const mfn_t root = get_root_page();
> +
> + /*
> + * It is bad idea to have mapping both writeable and
> + * executable.
> + * When modifying/creating mapping (i.e PTE_VALID is set),
> + * prevent any update if this happen.
> + */
> + if ( (flags & PTE_VALID) && (flags & PTE_WRITABLE) &&
> + (flags & PTE_EXECUTABLE) )
> + {
> + dprintk(XENLOG_ERR,
> + "Mappings should not be both Writeable and
> Executable\n");
> + return -EINVAL;
> + }
> +
> + if ( !IS_ALIGNED(virt, PAGE_SIZE) )
> + {
> + dprintk(XENLOG_ERR,
> + "The virtual address is not aligned to the page-
> size\n");
> + return -EINVAL;
> + }
> +
> + spin_lock(&pt_lock);
> +
> + while ( left )
> + {
> + unsigned int order, level;
> +
> + level = pt_mapping_level(vfn, mfn, left, flags);
> + order = XEN_PT_LEVEL_ORDER(level);
> +
> + ASSERT(left >= BIT(order, UL));
> +
> + rc = pt_update_entry(root, vfn << PAGE_SHIFT, mfn, level,
> flags);
> + if ( rc )
> + break;
> +
> + vfn += 1UL << order;
> + if ( !mfn_eq(mfn, INVALID_MFN) )
> + mfn = mfn_add(mfn, 1UL << order);
> +
> + left -= (1UL << order);
> + }
> +
> + /* Ensure that PTEs are all updated before flushing */
> + RISCV_FENCE(rw, rw);
> +
> + spin_unlock(&pt_lock);
> +
> + /*
> + * Always flush TLB at the end of the function as non-present
> entries
> + * can be put in the TLB.
> + *
> + * The remote fence operation applies to the entire address
> space if
> + * either:
> + * - start and size are both 0, or
> + * - size is equal to 2^XLEN-1.
> + *
> + * TODO: come up with something which will allow not to flush
> the entire
> + * address space.
> + */
> + flush_tlb_range_va(0, 0);
> +
> + return rc;
> +}
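The (mfn, flags) combinations described in the comment above pt_update(), and checked in pt_check_entry(), can be summarised as a small classifier. This is a sketch under stated assumptions: the flag bit values, the SK_ prefixed names and the enum are hypothetical and chosen only for illustration; the real definitions live in the RISC-V page headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SK_PTE_VALID    (1U << 0)
#define SK_PTE_POPULATE (1U << 11)
#define SK_INVALID_MFN  (~0UL)

enum sk_pt_op { SK_OP_INSERT, SK_OP_MODIFY, SK_OP_REMOVE, SK_OP_POPULATE };

/* Same branch order as the if/else chain in pt_check_entry(). */
static enum sk_pt_op sk_classify(unsigned long mfn, unsigned int flags)
{
    bool valid = flags & SK_PTE_VALID;

    if ( valid && mfn == SK_INVALID_MFN )
        return SK_OP_MODIFY;      /* change permissions of a mapping */
    if ( valid )
        return SK_OP_INSERT;      /* create a brand new mapping */
    if ( !(flags & SK_PTE_POPULATE) )
        return SK_OP_REMOVE;      /* destroy a mapping */
    return SK_OP_POPULATE;        /* only allocate intermediate tables */
}

/* Mirrors the alloc_tbl expression in pt_update_entry(). */
static bool sk_alloc_tables(unsigned long mfn, unsigned int flags)
{
    return mfn != SK_INVALID_MFN || (flags & SK_PTE_POPULATE);
}
```

Under these assumptions, only the insert and populate cases allocate missing intermediate tables; modify and remove stop at the first missing level.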
> +
> +int map_pages_to_xen(unsigned long virt,
> + mfn_t mfn,
> + unsigned long nr_mfns,
> + unsigned int flags)
> +{
> + /*
> + * Ensure that flags has PTE_VALID bit as map_pages_to_xen() is
> supposed
> + * to create a mapping.
> + *
> + * Ensure that we have a valid MFN before proceeding.
> + *
> + * If the MFN is invalid, pt_update() might misinterpret the
> operation,
> + * treating it as either a population, a mapping destruction,
> + * or a mapping modification.
> + */
> + ASSERT(!mfn_eq(mfn, INVALID_MFN) && (flags & PTE_VALID));
> +
> + return pt_update(virt, mfn, nr_mfns, flags);
> +}
* Re: [PATCH v8 7/7] xen/riscv: introduce early_fdt_map()
2024-09-27 16:33 ` [PATCH v8 7/7] xen/riscv: introduce early_fdt_map() Oleksii Kurochko
@ 2024-09-30 8:36 ` oleksii.kurochko
0 siblings, 0 replies; 20+ messages in thread
From: oleksii.kurochko @ 2024-09-30 8:36 UTC (permalink / raw)
To: xen-devel
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini
Add missing revision log:
---
Changes in V6-V8:
- Nothing changed. Only rebase.
---
Changes in V5:
- drop usage of PTE_BLOCK for flag argument of map_pages_to_xen() in
early_fdt_map()
as block mapping is now default behaviour. Also PTE_BLOCK was
dropped in the patch
"xen/riscv: page table handling".
---
Changes in V4:
- s/_PAGE_BLOCK/PTE_BLOCK
- Add Acked-by: Jan Beulich <jbeulich@suse.com>
- unwrap two lines in panic() in the case when device_tree_flattened is
NULL so that grepping for any part of the message line will always
produce a hit.
- slightly update the commit message.
---
Changes in V3:
- Code style fixes
- s/SZ_2M/MB(2)
- fix the condition checking whether early_fdt_map() in setup.c returns
NULL or not.
---
Changes in V2:
- rework early_fdt_map to use map_pages_to_xen()
- move call early_fdt_map() to C code after MMU is enabled.
---
~ Oleksii
On Fri, 2024-09-27 at 18:33 +0200, Oleksii Kurochko wrote:
> Introduce function which allows to map FDT to Xen.
>
> Also, initialization of device_tree_flattened happens using
> early_fdt_map().
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
> xen/arch/riscv/include/asm/mm.h | 2 ++
> xen/arch/riscv/mm.c | 55
> +++++++++++++++++++++++++++++++++
> xen/arch/riscv/setup.c | 7 +++++
> 3 files changed, 64 insertions(+)
>
> diff --git a/xen/arch/riscv/include/asm/mm.h
> b/xen/arch/riscv/include/asm/mm.h
> index ce1557bb27..4b7b00b850 100644
> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -259,4 +259,6 @@ static inline unsigned int
> arch_get_dma_bitsize(void)
>
> void setup_fixmap_mappings(void);
>
> +void *early_fdt_map(paddr_t fdt_paddr);
> +
> #endif /* _ASM_RISCV_MM_H */
> diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
> index e8430def14..4a628aef83 100644
> --- a/xen/arch/riscv/mm.c
> +++ b/xen/arch/riscv/mm.c
> @@ -1,13 +1,16 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
>
> +#include <xen/bootfdt.h>
> #include <xen/bug.h>
> #include <xen/compiler.h>
> #include <xen/init.h>
> #include <xen/kernel.h>
> +#include <xen/libfdt/libfdt.h>
> #include <xen/macros.h>
> #include <xen/mm.h>
> #include <xen/pfn.h>
> #include <xen/sections.h>
> +#include <xen/sizes.h>
>
> #include <asm/early_printk.h>
> #include <asm/csr.h>
> @@ -369,3 +372,55 @@ int destroy_xen_mappings(unsigned long s,
> unsigned long e)
> BUG_ON("unimplemented");
> return -1;
> }
> +
> +void * __init early_fdt_map(paddr_t fdt_paddr)
> +{
> + /* We are using 2MB superpage for mapping the FDT */
> + paddr_t base_paddr = fdt_paddr & XEN_PT_LEVEL_MAP_MASK(1);
> + paddr_t offset;
> + void *fdt_virt;
> + uint32_t size;
> + int rc;
> +
> + /*
> + * Check whether the physical FDT address is set and meets the
> minimum
> + * alignment requirement. Since we are relying on MIN_FDT_ALIGN
> to be at
> + * least 8 bytes so that we always access the magic and size
> fields
> + * of the FDT header after mapping the first chunk, double check
> if
> + * that is indeed the case.
> + */
> + BUILD_BUG_ON(MIN_FDT_ALIGN < 8);
> + if ( !fdt_paddr || fdt_paddr % MIN_FDT_ALIGN )
> + return NULL;
> +
> + /* The FDT is mapped using 2MB superpage */
> + BUILD_BUG_ON(BOOT_FDT_VIRT_START % MB(2));
> +
> + rc = map_pages_to_xen(BOOT_FDT_VIRT_START,
> maddr_to_mfn(base_paddr),
> + MB(2) >> PAGE_SHIFT,
> + PAGE_HYPERVISOR_RO);
> + if ( rc )
> + panic("Unable to map the device-tree.\n");
> +
> + offset = fdt_paddr % XEN_PT_LEVEL_SIZE(1);
> + fdt_virt = (void *)BOOT_FDT_VIRT_START + offset;
> +
> + if ( fdt_magic(fdt_virt) != FDT_MAGIC )
> + return NULL;
> +
> + size = fdt_totalsize(fdt_virt);
> + if ( size > BOOT_FDT_VIRT_SIZE )
> + return NULL;
> +
> + if ( (offset + size) > MB(2) )
> + {
> + rc = map_pages_to_xen(BOOT_FDT_VIRT_START + MB(2),
> + maddr_to_mfn(base_paddr + MB(2)),
> + MB(2) >> PAGE_SHIFT,
> + PAGE_HYPERVISOR_RO);
> + if ( rc )
> + panic("Unable to map the device-tree\n");
> + }
> +
> + return fdt_virt;
> +}
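The address arithmetic in early_fdt_map() above can be checked in isolation. The following is a minimal sketch, assuming 2MB superpages; struct sk_fdt_layout, sk_fdt_compute_layout and SK_MB are hypothetical names introduced here and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_MB(n) ((uint64_t)(n) << 20)

struct sk_fdt_layout {
    uint64_t base_paddr;   /* 2MB-aligned start of the first mapping */
    uint64_t offset;       /* offset of the FDT within that mapping */
    bool need_second;      /* FDT crosses into the next 2MB chunk */
};

/* Compute what early_fdt_map() would map for a given FDT address/size. */
static struct sk_fdt_layout sk_fdt_compute_layout(uint64_t fdt_paddr,
                                                  uint32_t size)
{
    struct sk_fdt_layout l;

    /* Round the physical address down to a 2MB boundary. */
    l.base_paddr = fdt_paddr & ~(SK_MB(2) - 1);
    /* Where the blob starts inside the first 2MB mapping. */
    l.offset = fdt_paddr % SK_MB(2);
    /* A second superpage is needed only if the blob spills over. */
    l.need_second = (l.offset + size) > SK_MB(2);

    return l;
}
```

For example, a blob of 0x2000 bytes at 0x823ff000 starts 0x1ff000 bytes into the superpage at 0x82200000 and therefore needs the second mapping, while a 0x1000-byte blob at the same address ends exactly on the boundary and does not.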
> diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
> index c4fadd36c6..a316901fd4 100644
> --- a/xen/arch/riscv/setup.c
> +++ b/xen/arch/riscv/setup.c
> @@ -2,6 +2,7 @@
>
> #include <xen/bug.h>
> #include <xen/compile.h>
> +#include <xen/device_tree.h>
> #include <xen/init.h>
> #include <xen/mm.h>
> #include <xen/shutdown.h>
> @@ -57,6 +58,12 @@ void __init noreturn start_xen(unsigned long
> bootcpu_id,
>
> setup_fixmap_mappings();
>
> + device_tree_flattened = early_fdt_map(dtb_addr);
> + if ( !device_tree_flattened )
> + panic("Invalid device tree blob at physical address %#lx.
> The DTB must be 8-byte aligned and must not exceed %lld bytes in
> size.\n\n"
> + "Please check your bootloader.\n",
> + dtb_addr, BOOT_FDT_VIRT_SIZE);
> +
> printk("All set up\n");
>
> machine_halt();
* Re: [PATCH v8 0/7] device tree mapping
2024-09-30 8:32 ` Jan Beulich
@ 2024-09-30 8:38 ` oleksii.kurochko
0 siblings, 0 replies; 20+ messages in thread
From: oleksii.kurochko @ 2024-09-30 8:38 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On Mon, 2024-09-30 at 10:32 +0200, Jan Beulich wrote:
> On 30.09.2024 10:24, oleksii.kurochko@gmail.com wrote:
> > On Mon, 2024-09-30 at 10:17 +0200, Jan Beulich wrote:
> > > On 27.09.2024 18:33, Oleksii Kurochko wrote:
> > > > Current patch series introduces device tree mapping for RISC-V
> > > > and necessary things for that such as:
> > > > - Fixmap mapping
> > > > - pmap
> > > > - Xen page table processing
> > >
> > > While nothing is being said here towards a dependency, ...
> > >
> > > > ---
> > > > Changes in v8:
> > > > - The following patch was merged to staging:
> > > > [PATCH v5 1/7] xen/riscv: use {read,write}{b,w,l,q}_cpu()
> > > > to
> > > > define {read,write}_atomic()
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v7:
> > > > - Drop the patch "xen/riscv: prevent recursion when ASSERT(),
> > > > BUG*(), or panic() are called"
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v6:
> > > > - Add patch to fix recursion when ASSERT(), BUG*(), panic()
> > > > are
> > > > called.
> > > > - Add patch to allow write_atomic() to work with non-scalar
> > > > types
> > > > for consistence
> > > > with read_atomic().
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v5:
> > > > - The following patch was merged to staging:
> > > > [PATCH v3 3/9] xen/riscv: enable CONFIG_HAS_DEVICE_TREE
> > > > > - Drop dependency on "RISCV basic exception handling
> > > > > implementation" as
> > > > > it was merged to staging branch.
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v4:
> > > > > - Drop dependency on common device tree patch series as it
> > > > was
> > > > merged to
> > > > staging.
> > > > - Update the cover letter message.
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v3:
> > > > - Introduce SBI RFENCE extension support.
> > > > - Introduce and initialize pcpu_info[] and
> > > > __cpuid_to_hartid_map[]
> > > > and functionality
> > > > to work with this arrays.
> > > > - Make page table handling arch specific instead of trying to
> > > > make
> > > > it generic.
> > > > - All other changes are patch specific so please look at the
> > > > patch.
> > > > ---
> > > > Changes in v2:
> > > > - Update the cover letter message
> > > > - introduce fixmap mapping
> > > > - introduce pmap
> > > > - introduce CONFIG_GENREIC_PT
> > > > - update use early_fdt_map() after MMU is enabled.
> > > > ---
> > > >
> > > > Oleksii Kurochko (7):
> > > > xen/riscv: allow write_atomic() to work with non-scalar types
> > > > xen/riscv: set up fixmap mappings
> > > > xen/riscv: introduce asm/pmap.h header
> > > > xen/riscv: introduce functionality to work with CPU info
> > > > xen/riscv: introduce and initialize SBI RFENCE extension
> > > > xen/riscv: page table handling
> > > > xen/riscv: introduce early_fdt_map()
> > > >
> > > > xen/arch/riscv/Kconfig | 1 +
> > > > xen/arch/riscv/Makefile | 2 +
> > > > xen/arch/riscv/include/asm/atomic.h | 11 +-
> > > > xen/arch/riscv/include/asm/config.h | 16 +-
> > > > xen/arch/riscv/include/asm/current.h | 27 +-
> > > > xen/arch/riscv/include/asm/fixmap.h | 46 +++
> > > > xen/arch/riscv/include/asm/flushtlb.h | 15 +
> > > > xen/arch/riscv/include/asm/mm.h | 6 +
> > > > xen/arch/riscv/include/asm/page.h | 99 +++++
> > > > xen/arch/riscv/include/asm/pmap.h | 36 ++
> > > > xen/arch/riscv/include/asm/processor.h | 3 -
> > > > xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> > > > xen/arch/riscv/include/asm/sbi.h | 62 +++
> > > > xen/arch/riscv/include/asm/smp.h | 18 +
> > > > xen/arch/riscv/mm.c | 101 ++++-
> > > > xen/arch/riscv/pt.c | 421
> > > > ++++++++++++++++++++
> > > > xen/arch/riscv/riscv64/asm-offsets.c | 3 +
> > > > xen/arch/riscv/riscv64/head.S | 14 +
> > > > xen/arch/riscv/sbi.c | 273
> > > > ++++++++++++-
> > > > xen/arch/riscv/setup.c | 17 +
> > >
> > > ... I had to fiddle with three of the patches touching this file,
> > > to
> > > accommodate for an apparent debugging patch you have in your
> > > tree.
> > > Please can you make sure to submit patches against plain staging,
> > > or
> > > to clearly state dependencies?
> > I am always trying not to forget to rebase on top of staging for
> > this
> > patch series:
> >
> > 65c49e7aa2 (HEAD -> riscv-dt-support-v8, origin/riscv-dt-support-
> > v8)
> > xen/riscv: introduce early_fdt_map()
> > ead52f68ce xen/riscv: page table handling
> > c3aba0520f xen/riscv: introduce and initialize SBI RFENCE extension
> > 3ffb3ffd38 xen/riscv: introduce functionality to work with CPU info
> > 4bfd2bfdb2 xen/riscv: introduce asm/pmap.h header
> > 87bc91db10 xen/riscv: set up fixmap mappings
> > 09b925f973 xen/riscv: allow write_atomic() to work with non-scalar
> > types
> > 625ee7650c xen/README: add compiler and binutils versions for RISC-
> > V64
> > 5379a23ad7 xen/riscv: test basic exception handling stuff
> > 2b6fb9f3c4 (origin/staging, origin/HEAD, staging) blkif: Fix a
> > couple
> > of typos
> > 6e73a16230 blkif: Fix alignment description for discard request
> > 0111c86bfa x86/boot: Refactor BIOS/PVH start
>
> This looks to be a mix of several series. And "xen/riscv: test basic
> exception handling stuff" looks to be what the problem was with, as
> that
> wasn't committed yet (and imo also shouldn't be committed, as
> expressed
> before; of course you can try to find someone else to approve it).
Oh, you are right. I will move it to a separate branch so that it does
not break things. Thanks.
~ Oleksii
* Re: [PATCH v8 6/7] xen/riscv: page table handling
2024-09-27 16:33 ` [PATCH v8 6/7] xen/riscv: page table handling Oleksii Kurochko
2024-09-30 8:35 ` oleksii.kurochko
@ 2024-09-30 15:30 ` Jan Beulich
2024-10-01 9:29 ` oleksii.kurochko
1 sibling, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2024-09-30 15:30 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On 27.09.2024 18:33, Oleksii Kurochko wrote:
> +static bool pt_check_entry(pte_t entry, mfn_t mfn, unsigned int flags)
> +{
> + /* Sanity check when modifying an entry. */
> + if ( (flags & PTE_VALID) && mfn_eq(mfn, INVALID_MFN) )
> + {
> + /* We don't allow modifying an invalid entry. */
> + if ( !pte_is_valid(entry) )
> + {
> + dprintk(XENLOG_ERR, "Modifying invalid entry is not allowed\n");
> + return false;
> + }
> +
> + /* We don't allow modifying a table entry */
> + if ( pte_is_table(entry) )
> + {
> + dprintk(XENLOG_ERR, "Modifying a table entry is not allowed\n");
> + return false;
> + }
> + }
> + /* Sanity check when inserting a mapping */
> + else if ( flags & PTE_VALID )
> + {
> + /*
> + * We don't allow replacing any valid entry.
> + *
> + * Note that the function pt_update() relies on this
> + * assumption and will skip the TLB flush (when Svvptc
> + * extension will be ratified). The function will need
> + * to be updated if the check is relaxed.
> + */
> + if ( pte_is_valid(entry) )
> + {
> + if ( pte_is_mapping(entry) )
> + dprintk(XENLOG_ERR, "Changing MFN for valid PTE is not allowed (%#"PRI_mfn" -> %#"PRI_mfn")\n",
> + mfn_x(mfn_from_pte(entry)), mfn_x(mfn));
Nit: Indentation is now off by one.
> +#define XEN_TABLE_MAP_NONE 0
> +#define XEN_TABLE_MAP_NOMEM 1
> +#define XEN_TABLE_SUPER_PAGE 2
> +#define XEN_TABLE_NORMAL 3
> +
> +/*
> + * Take the currently mapped table, find the corresponding entry,
> + * and map the next table, if available.
> + *
> + * The alloc_tbl parameters indicates whether intermediate tables should
> + * be allocated when not present.
> + *
> + * Return values:
> + * XEN_TABLE_MAP_FAILED: Either alloc_only was set and the entry
> + * was empty, or allocating a new page failed.
> + * XEN_TABLE_NORMAL: next level or leaf mapped normally
> + * XEN_TABLE_SUPER_PAGE: The next entry points to a superpage.
This part of the comment is now stale.
> + */
> +static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset)
> +{
> + pte_t *entry;
> + mfn_t mfn;
> +
> + entry = *table + offset;
> +
> + if ( !pte_is_valid(*entry) )
> + {
> + if ( !alloc_tbl )
> + return XEN_TABLE_MAP_NONE;
> +
> + if ( create_table(entry) )
> + return XEN_TABLE_MAP_NOMEM;
At the risk of being overly picky: You still lose -ENOMEM here. I'd suggest
to make create_table() return boolean (success / fail) to eliminate that
concern. Any hypothetical plan for someone to add another error code there
will then hit the return type aspect, pointing out that call sites need
looking at for such a change to be correct.
Jan
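The suggestion above, making create_table() return a boolean so that the caller maps failure to a single well-defined status, could look roughly like the following. This is a standalone sketch, not the eventual Xen code: sk_pte_t, sk_create_table, sk_next_level, sk_walk_once and the sk_fail_alloc knob are hypothetical, and calloc() stands in for the hypervisor page allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum { SK_MAP_NONE, SK_MAP_NOMEM, SK_NORMAL };

/* Hypothetical one-word "PTE": NULL means invalid, else a table pointer. */
typedef struct { void *next; } sk_pte_t;

static bool sk_fail_alloc;  /* test knob, not part of any real API */

/* Returns true on success; there is no error code to lose. */
static bool sk_create_table(sk_pte_t *entry)
{
    void *page = sk_fail_alloc ? NULL : calloc(1, 4096);

    if ( page == NULL )
        return false;

    entry->next = page;
    return true;
}

static int sk_next_level(bool alloc_tbl, sk_pte_t *entry)
{
    if ( entry->next == NULL )
    {
        if ( !alloc_tbl )
            return SK_MAP_NONE;

        /* The only failure mode is allocation, so NOMEM is accurate. */
        if ( !sk_create_table(entry) )
            return SK_MAP_NOMEM;
    }

    return SK_NORMAL;
}

/* Small driver: walk one empty entry and release whatever was allocated. */
static int sk_walk_once(bool alloc_tbl, bool fail_alloc)
{
    sk_pte_t entry = { NULL };
    int rc;

    sk_fail_alloc = fail_alloc;
    rc = sk_next_level(alloc_tbl, &entry);
    free(entry.next);   /* free(NULL) is a no-op */

    return rc;
}
```

With a boolean return type, adding a second failure reason later forces a change of the signature, which in turn makes every call site visible for review.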
* Re: [PATCH v8 6/7] xen/riscv: page table handling
2024-09-30 15:30 ` Jan Beulich
@ 2024-10-01 9:29 ` oleksii.kurochko
0 siblings, 0 replies; 20+ messages in thread
From: oleksii.kurochko @ 2024-10-01 9:29 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Julien Grall, Stefano Stabellini, xen-devel
On Mon, 2024-09-30 at 17:30 +0200, Jan Beulich wrote:
> > + */
> > +static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned
> > int offset)
> > +{
> > + pte_t *entry;
> > + mfn_t mfn;
> > +
> > + entry = *table + offset;
> > +
> > + if ( !pte_is_valid(*entry) )
> > + {
> > + if ( !alloc_tbl )
> > + return XEN_TABLE_MAP_NONE;
> > +
> > + if ( create_table(entry) )
> > + return XEN_TABLE_MAP_NOMEM;
>
> At the risk of being overly picky: You still lose -ENOMEM here. I'd
> suggest
> to make create_table() return boolean (success / fail) to eliminate
> that
> concern. Any hypothetical plan for someone to add another error code
> there
> will then hit the return type aspect, pointing out that call sites
> need
> looking at for such a change to be correct.
I agree with these changes. Let's go with this approach.
Thanks.
~ Oleksii
Thread overview: 20+ messages
2024-09-27 16:33 [PATCH v8 0/7] device tree mapping Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 1/7] xen/riscv: allow write_atomic() to work with non-scalar types Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 2/7] xen/riscv: set up fixmap mappings Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 3/7] xen/riscv: introduce asm/pmap.h header Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 4/7] xen/riscv: introduce functionality to work with CPU info Oleksii Kurochko
2024-09-30 7:25 ` Jan Beulich
2024-09-27 16:33 ` [PATCH v8 5/7] xen/riscv: introduce and initialize SBI RFENCE extension Oleksii Kurochko
2024-09-27 16:33 ` [PATCH v8 6/7] xen/riscv: page table handling Oleksii Kurochko
2024-09-30 8:35 ` oleksii.kurochko
2024-09-30 15:30 ` Jan Beulich
2024-10-01 9:29 ` oleksii.kurochko
2024-09-27 16:33 ` [PATCH v8 7/7] xen/riscv: introduce early_fdt_map() Oleksii Kurochko
2024-09-30 8:36 ` oleksii.kurochko
2024-09-30 7:27 ` [PATCH v8 0/7] device tree mapping Jan Beulich
2024-09-30 8:14 ` oleksii.kurochko
2024-09-30 8:21 ` Jan Beulich
2024-09-30 8:17 ` Jan Beulich
2024-09-30 8:24 ` oleksii.kurochko
2024-09-30 8:32 ` Jan Beulich
2024-09-30 8:38 ` oleksii.kurochko