[PATCH v1 0/6] xen/riscv: introduce p2m functionality

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v1 0/6] xen/riscv: introduce p2m functionality
@ 2025-05-09 15:57 Oleksii Kurochko
  2025-05-09 15:57 ` [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h Oleksii Kurochko
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

In this patch series are introduced necessary functions to build and manage
RISC-V guest page tables and MMIO/RAM mappings.

This patch series is based on the patch [1]:
  https://lore.kernel.org/xen-devel/da9273c20dc7ac1c131322e38a8cef361dfd86a9.1746530883.git.oleksii.kurochko@gmail.com/T/#u

Oleksii Kurochko (6):
  xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h
  xen/riscv: introduce things necessary for p2m initialization
  xen/riscv: construct the P2M pages pool for guests
  xen/riscv: define pt_t and pt_walk_t structures
  xen/riscv: add new p2m types and helper macros for type classification
  xen/riscv: implement p2m mapping functionality

 xen/arch/riscv/Makefile              |    1 +
 xen/arch/riscv/include/asm/cmpxchg.h |    1 +
 xen/arch/riscv/include/asm/domain.h  |   16 +
 xen/arch/riscv/include/asm/mm.h      |   36 +-
 xen/arch/riscv/include/asm/p2m.h     |  121 ++-
 xen/arch/riscv/include/asm/page.h    |   65 +-
 xen/arch/riscv/p2m.c                 | 1015 ++++++++++++++++++++++++++
 7 files changed, 1243 insertions(+), 12 deletions(-)
 create mode 100644 xen/arch/riscv/p2m.c

-- 
2.49.0



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-09 16:00   ` Andrew Cooper
  2025-05-09 15:57 ` [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Add inclusion of xen/bitops.h to asm/cmpxchg.h to avoid compilation issues
connected to GENMASK() which is used inside asm/cmpxchg.h.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
This patch should go first; otherwise one of the further patches of this
patch series could face a compilation issue.
---
 xen/arch/riscv/include/asm/cmpxchg.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
index 7d7c89b6fa..d47ebaffce 100644
--- a/xen/arch/riscv/include/asm/cmpxchg.h
+++ b/xen/arch/riscv/include/asm/cmpxchg.h
@@ -4,6 +4,7 @@
 #ifndef ASM__RISCV__CMPXCHG_H
 #define ASM__RISCV__CMPXCHG_H
 
+#include <xen/bitops.h>
 #include <xen/compiler.h>
 #include <xen/lib.h>
 
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
  2025-05-09 15:57 ` [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-09 16:14   ` Andrew Cooper
  2025-05-20 13:37   ` Jan Beulich
  2025-05-09 15:57 ` [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests Oleksii Kurochko
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce the following things:
- p2m_domain structure which describe per p2m-table state.
- Update arch_domain structure with the mentioned above structure.
- p2m_get_hostp2m() to recieve domain's p2m structure.
- Introudce p2m_write_lock() and p2m_is_write_locked().
- p2m_init() to initalize p2m:
  - allocate p2m table by using of p2m_alloc_table().
  - initialize lock premitive necessary to protect updates to the p2m.
- Introduce the following functions to implement p2m_alloc_table():
  - p2m_allocate_root() to allocate p2m root table by using another introduced
    helpers p2m_get_clean_page() and clear_and_clean_page().
  - introduce p2m_force_tlb_flush_sync() to flush TLBs after p2m table
    allocation before being used. (it isn't necessary at the current stage of
    development but could be useful once the VMID is marked unused, a new domain
    can reuse the VMID for its own. If the TLB is not flushed, entries can
    contain wrong translation.)
- Implement maddr_to_page() and page_to_maddr().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/Makefile             |   1 +
 xen/arch/riscv/include/asm/domain.h |   6 +
 xen/arch/riscv/include/asm/mm.h     |   4 +
 xen/arch/riscv/include/asm/p2m.h    |  76 +++++++++++++
 xen/arch/riscv/p2m.c                | 168 ++++++++++++++++++++++++++++
 5 files changed, 255 insertions(+)
 create mode 100644 xen/arch/riscv/p2m.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index d882c57528..87c5e7e7f2 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -5,6 +5,7 @@ obj-y += entry.o
 obj-y += intc.o
 obj-y += mm.o
 obj-y += pt.o
+obj-y += p2m.o
 obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index c3d965a559..48be90a395 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -5,6 +5,8 @@
 #include <xen/xmalloc.h>
 #include <public/hvm/params.h>
 
+#include <asm/p2m.h>
+
 struct hvm_domain
 {
     uint64_t              params[HVM_NR_PARAMS];
@@ -16,8 +18,12 @@ struct arch_vcpu_io {
 struct arch_vcpu {
 };
 
+
 struct arch_domain {
     struct hvm_domain hvm;
+
+    struct p2m_domain p2m;
+
 };
 
 #include <xen/sched.h>
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 01bbd92a06..972ec45448 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -149,6 +149,10 @@ extern struct page_info *frametable_virt_start;
 #define mfn_to_page(mfn)    (frametable_virt_start + mfn_x(mfn))
 #define page_to_mfn(pg)     _mfn((pg) - frametable_virt_start)
 
+/* Convert between machine addresses and page-info structures. */
+#define maddr_to_page(ma) mfn_to_page(maddr_to_mfn(ma))
+#define page_to_maddr(pg) (mfn_to_maddr(page_to_mfn(pg)))
+
 static inline void *page_to_virt(const struct page_info *pg)
 {
     return mfn_to_virt(mfn_x(page_to_mfn(pg)));
diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
index 28f57a74f2..8b46210768 100644
--- a/xen/arch/riscv/include/asm/p2m.h
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -3,11 +3,73 @@
 #define ASM__RISCV__P2M_H
 
 #include <xen/errno.h>
+#include <xen/mem_access.h>
+#include <xen/mm.h>
+#include <xen/radix-tree.h>
+#include <xen/rwlock.h>
+#include <xen/types.h>
 
 #include <asm/page-bits.h>
 
 #define paddr_bits PADDR_BITS
 
+/* Get host p2m table */
+#define p2m_get_hostp2m(d) (&(d)->arch.p2m)
+
+/* Per-p2m-table state */
+struct p2m_domain {
+    /*
+     * Lock that protects updates to the p2m.
+     */
+    rwlock_t lock;
+
+    /* Page containing root p2m table */
+    struct page_info *root;
+
+    /* Pages used to construct the p2m */
+    struct page_list_head pages;
+
+    /* Address Translation Table for the p2m */
+    paddr_t hgatp;
+
+    /*
+     * P2M updates may required TLBs to be flushed (invalidated).
+     *
+     * Flushes may be deferred by setting 'need_flush' and then flushing
+     * when the p2m write lock is released.
+     *
+     * If an immediate flush is required (e.g, if a super page is
+     * shattered), call p2m_tlb_flush_sync().
+     */
+    bool need_flush;
+
+    /* Indicate if it is required to clean the cache when writing an entry */
+    bool clean_pte;
+
+    struct radix_tree_root p2m_type;
+
+    /*
+     * Default P2M access type for each page in the the domain: new pages,
+     * swapped in pages, cleared pages, and pages that are ambiguously
+     * retyped get this access type.  See definition of p2m_access_t.
+     */
+    p2m_access_t default_access;
+
+    /* Highest guest frame that's ever been mapped in the p2m */
+    gfn_t max_mapped_gfn;
+
+    /*
+     * Lowest mapped gfn in the p2m. When releasing mapped gfn's in a
+     * preemptible manner this is update to track recall where to
+     * resume the search. Apart from during teardown this can only
+     * decrease.
+     */
+    gfn_t lowest_mapped_gfn;
+
+    /* Back pointer to domain */
+    struct domain *domain;
+};
+
 /*
  * List of possible type for each page in the p2m entry.
  * The number of available bit per page in the pte for this purpose is 2 bits.
@@ -93,6 +155,20 @@ static inline void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
     /* Not supported on RISCV. */
 }
 
+int p2m_init(struct domain *d);
+
+static inline void p2m_write_lock(struct p2m_domain *p2m)
+{
+    write_lock(&p2m->lock);
+}
+
+void p2m_write_unlock(struct p2m_domain *p2m);
+
+static inline int p2m_is_write_locked(struct p2m_domain *p2m)
+{
+    return rw_is_write_locked(&p2m->lock);
+}
+
 #endif /* ASM__RISCV__P2M_H */
 
 /*
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
new file mode 100644
index 0000000000..ad4beef8f9
--- /dev/null
+++ b/xen/arch/riscv/p2m.c
@@ -0,0 +1,168 @@
+#include <xen/domain_page.h>
+#include <xen/iommu.h>
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/pfn.h>
+#include <xen/rwlock.h>
+#include <xen/sched.h>
+#include <xen/spinlock.h>
+
+#include <asm/page.h>
+#include <asm/p2m.h>
+
+/*
+ * Force a synchronous P2M TLB flush.
+ *
+ * Must be called with the p2m lock held.
+ *
+ * TODO: add support of flushing TLB connected to VMID.
+ */
+static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
+{
+    ASSERT(p2m_is_write_locked(p2m));
+
+    /*
+     * TODO: shouldn't be this flush done for each physical CPU?
+     *       If yes, then SBI call sbi_remote_hfence_gvma() could
+     *       be used for that.
+     */
+#if defined(__riscv_hh) || defined(__riscv_h)
+    asm volatile ( "hfence.gvma" ::: "memory" );
+#else
+    asm volatile ( ".insn r 0x73, 0x0, 0x31, x0, x0, x0" ::: "memory" );
+#endif
+
+    p2m->need_flush = false;
+}
+
+static void p2m_tlb_flush_sync(struct p2m_domain *p2m)
+{
+    if ( p2m->need_flush )
+        p2m_force_tlb_flush_sync(p2m);
+}
+
+/* Unlock the flush and do a P2M TLB flush if necessary */
+void p2m_write_unlock(struct p2m_domain *p2m)
+{
+    /*
+     * The final flush is done with the P2M write lock taken to avoid
+     * someone else modifying the P2M wbefore the TLB invalidation has
+     * completed.
+     */
+    p2m_tlb_flush_sync(p2m);
+
+    write_unlock(&p2m->lock);
+}
+
+static void clear_and_clean_page(struct page_info *page)
+{
+    void *p = __map_domain_page(page);
+
+    clear_page(p);
+    unmap_domain_page(p);
+}
+
+static struct page_info *p2m_get_clean_page(struct domain *d)
+{
+    struct page_info *page;
+
+    /*
+     * As mentioned in the Priviliged Architecture Spec (version 20240411)
+     * As explained in Section 18.5.1, for the paged virtual-memory schemes
+     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
+     * and must be aligned to a 16-KiB boundary.
+     */
+    page = alloc_domheap_pages(NULL, 2, 0);
+    if ( page == NULL )
+        return NULL;
+
+    clear_and_clean_page(page);
+
+    return page;
+}
+
+static struct page_info *p2m_allocate_root(struct domain *d)
+{
+    return p2m_get_clean_page(d);
+}
+
+static unsigned long hgatp_from_page_info(struct page_info *page_info)
+{
+    unsigned long ppn;
+    unsigned long hgatp_mode;
+
+    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
+
+    /* ASID (VMID) not supported yet */
+
+#if RV_STAGE1_MODE == SATP_MODE_SV39
+    hgatp_mode = HGATP_MODE_SV39X4;
+#elif RV_STAGE1_MODE == SATP_MODE_SV48
+    hgatp_mode = HGATP_MODE_SV48X4;
+#else
+    #error "add HGATP_MODE"
+#endif
+
+    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);
+}
+
+static int p2m_alloc_table(struct domain *d)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    p2m->root = p2m_allocate_root(d);
+    if ( !p2m->root )
+        return -ENOMEM;
+
+    p2m->hgatp = hgatp_from_page_info(p2m->root);
+
+    /*
+     * Make sure that all TLBs corresponding to the new VMID are flushed
+     * before using it.
+     */
+    p2m_write_lock(p2m);
+    p2m_force_tlb_flush_sync(p2m);
+    p2m_write_unlock(p2m);
+
+    return 0;
+}
+
+int p2m_init(struct domain *d)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    rwlock_init(&p2m->lock);
+    INIT_PAGE_LIST_HEAD(&p2m->pages);
+
+    p2m->max_mapped_gfn = _gfn(0);
+    p2m->lowest_mapped_gfn = _gfn(ULONG_MAX);
+
+    p2m->default_access = p2m_access_rwx;
+
+    radix_tree_init(&p2m->p2m_type);
+
+#ifdef CONFIG_HAS_PASSTHROUGH
+    /*
+     * Some IOMMUs don't support coherent PT walk. When the p2m is
+     * shared with the CPU, Xen has to make sure that the PT changes have
+     * reached the memory
+     */
+    p2m->clean_pte = is_iommu_enabled(d) &&
+        !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
+#else
+    p2m->clean_pte = true;
+#endif
+
+    /*
+     * "Trivial" initialisation is now complete.  Set the backpointer so
+     * p2m_teardown() and friends know to do something.
+     */
+    p2m->domain = d;
+
+    rc = p2m_alloc_table(d);
+    if ( rc )
+        return rc;
+
+    return 0;
+}
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
  2025-05-09 15:57 ` [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h Oleksii Kurochko
  2025-05-09 15:57 ` [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-20 14:38   ` Jan Beulich
  2025-05-09 15:57 ` [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures Oleksii Kurochko
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Implement p2m_set_allocation() to construct p2m pages pool for guests
based on required number of pages.

This is implemented by:
- Adding a `struct paging_domain` which contains a freelist, a
  counter variable and a spinlock to `struct arch_domain` to
  indicate the free p2m pages and the number of p2m total pages in
  the p2m pages pool.
- Adding a helper `p2m_set_allocation` to set the p2m pages pool
  size. This helper should be called before allocating memory for
  a guest and is called from domain_p2m_set_allocation(), the latter
  is a part of common dom0less code.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/include/asm/domain.h | 10 +++++
 xen/arch/riscv/p2m.c                | 67 +++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 48be90a395..b818127f9f 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -2,6 +2,8 @@
 #ifndef ASM__RISCV__DOMAIN_H
 #define ASM__RISCV__DOMAIN_H
 
+#include <xen/mm.h>
+#include <xen/spinlock.h>
 #include <xen/xmalloc.h>
 #include <public/hvm/params.h>
 
@@ -18,12 +20,20 @@ struct arch_vcpu_io {
 struct arch_vcpu {
 };
 
+struct paging_domain {
+    spinlock_t lock;
+    /* Free P2M pages from the pre-allocated P2M pool */
+    struct page_list_head p2m_freelist;
+    /* Number of pages from the pre-allocated P2M pool */
+    unsigned long p2m_total_pages;
+};
 
 struct arch_domain {
     struct hvm_domain hvm;
 
     struct p2m_domain p2m;
 
+    struct paging_domain paging;
 };
 
 #include <xen/sched.h>
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index ad4beef8f9..a890870391 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -1,4 +1,12 @@
 #include <xen/domain_page.h>
+/*
+ * Because of general_preempt_check() from xen/sched.h which uses
+ * local_events_need_delivery() but latter is declared in <asm/event.h>.
+ * Thereby it is needed to icnlude <xen/event.h> here before xen/sched.h.
+ *
+ * Shouldn't be xen/event.h be included in <xen/sched.h>?
+ */
+#include <xen/event.h>
 #include <xen/iommu.h>
 #include <xen/lib.h>
 #include <xen/mm.h>
@@ -133,7 +141,9 @@ int p2m_init(struct domain *d)
     int rc;
 
     rwlock_init(&p2m->lock);
+    spin_lock_init(&d->arch.paging.lock);
     INIT_PAGE_LIST_HEAD(&p2m->pages);
+    INIT_PAGE_LIST_HEAD(&d->arch.paging.p2m_freelist);
 
     p2m->max_mapped_gfn = _gfn(0);
     p2m->lowest_mapped_gfn = _gfn(ULONG_MAX);
@@ -166,3 +176,60 @@ int p2m_init(struct domain *d)
 
     return 0;
 }
+
+/*
+ * Set the pool of pages to the required number of pages.
+ * Returns 0 for success, non-zero for failure.
+ * Call with d->arch.paging.lock held.
+ */
+int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
+{
+    struct page_info *pg;
+
+    ASSERT(spin_is_locked(&d->arch.paging.lock));
+
+    for ( ; ; )
+    {
+        if ( d->arch.paging.p2m_total_pages < pages )
+        {
+            /* Need to allocate more memory from domheap */
+            pg = alloc_domheap_page(d, MEMF_no_owner);
+            if ( pg == NULL )
+            {
+                printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
+                return -ENOMEM;
+            }
+            ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
+                d->arch.paging.p2m_total_pages + 1;
+            page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
+        }
+        else if ( d->arch.paging.p2m_total_pages > pages )
+        {
+            /* Need to return memory to domheap */
+            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
+            if( pg )
+            {
+                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
+                    d->arch.paging.p2m_total_pages - 1;
+                free_domheap_page(pg);
+            }
+            else
+            {
+                printk(XENLOG_ERR
+                       "Failed to free P2M pages, P2M freelist is empty.\n");
+                return -ENOMEM;
+            }
+        }
+        else
+            break;
+
+        /* Check to see if we need to yield and try again */
+        if ( preempted && general_preempt_check() )
+        {
+            *preempted = true;
+            return -ERESTART;
+        }
+    }
+
+    return 0;
+}
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
                   ` (2 preceding siblings ...)
  2025-05-09 15:57 ` [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-20 15:04   ` Jan Beulich
  2025-05-09 15:57 ` [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification Oleksii Kurochko
  2025-05-09 15:57 ` [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality Oleksii Kurochko
  5 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Refactor pte_t to be a union which hold page table entry plus
pt_t and pt_walk_t structures to simpilfy p2m functions.

Also, introduce some helpers which are using pt_walk_t.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/include/asm/page.h | 54 ++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index 2af4823170..cb3dea309c 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -99,15 +99,67 @@
 
 #endif
 
-/* Page Table entry */
 typedef struct {
+    unsigned long v:1;
+    unsigned long r:1;
+    unsigned long w:1;
+    unsigned long x:1;
+    unsigned long u:1;
+    unsigned long g:1;
+    unsigned long a:1;
+    unsigned long d:1;
+    unsigned long rsw:2;
+#if RV_STAGE1_MODE == SATP_MODE_SV39
+    unsigned long ppn0:9;
+    unsigned long ppn1:9;
+    unsigned long ppn2:26;
+    unsigned long rsw2:7;
+    unsigned long pbmt:2;
+    unsigned long n:1;
+#elif RV_STAGE1_MODE == SATP_MODE_SV48
+    unsigned long ppn0:9;
+    unsigned long ppn1:9;
+    unsigned long ppn2:9;
+    unsigned long ppn3:17;
+    unsigned long rsw2:7;
+    unsigned long pbmt:2;
+    unsigned long n:1;
+#else
+#error "Add proper bits for SATP_MODE"
+#endif
+} pt_t;
+
+typedef struct {
+    unsigned long rsw:10;
+#if RV_STAGE1_MODE == SATP_MODE_SV39 || RV_STAGE1_MODE == SATP_MODE_SV48
+    unsigned long ppn: 44;
+#else
+#error "Add proper bits for SATP_MODE"
+#endif
+    unsigned long rsw2:10;
+} pt_walk_t;
+
+/* Page Table entry */
+typedef union {
 #ifdef CONFIG_RISCV_64
     uint64_t pte;
 #else
     uint32_t pte;
 #endif
+    pt_t bits;
+    pt_walk_t walk;
 } pte_t;
 
+static inline void pte_set_mfn(pte_t *pte, mfn_t mfn)
+{
+    pte->walk.ppn = mfn_x(mfn);
+}
+
+static inline mfn_t pte_get_mfn(pte_t pte)
+{
+    return _mfn(pte.walk.ppn);
+}
+
 static inline pte_t paddr_to_pte(paddr_t paddr,
                                  unsigned int permissions)
 {
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
                   ` (3 preceding siblings ...)
  2025-05-09 15:57 ` [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-20 15:11   ` Jan Beulich
  2025-05-09 15:57 ` [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality Oleksii Kurochko
  5 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

- Extended p2m_type_t with additional types: p2m_ram_ro, p2m_mmio_direct_dev,
  p2m_map_foreign_{rw,ro}, p2m_grant_map_{rw,ro}.
- Added macros to classify memory types: P2M_RAM_TYPES, P2M_GRANT_TYPES,
  P2M_FOREIGN_TYPES.
- Introduced helper predicates: p2m_is_ram(), p2m_is_foreign(),
  p2m_is_any_ram().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/include/asm/p2m.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
index 8b46210768..0fcfec7f03 100644
--- a/xen/arch/riscv/include/asm/p2m.h
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -80,8 +80,36 @@ struct p2m_domain {
 typedef enum {
     p2m_invalid = 0,    /* Nothing mapped here */
     p2m_ram_rw,         /* Normal read/write domain RAM */
+    p2m_ram_ro,         /* Read-only; writes are silently dropped */
+    p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
+    p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
+    p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
+    p2m_grant_map_rw,   /* Read/write grant mapping */
+    p2m_grant_map_ro,   /* Read-only grant mapping */
 } p2m_type_t;
 
+/* We use bitmaps and mask to handle groups of types */
+#define p2m_to_mask(t_) BIT(t_, UL)
+
+/* RAM types, which map to real machine frames */
+#define P2M_RAM_TYPES (p2m_to_mask(p2m_ram_rw) | \
+                       p2m_to_mask(p2m_ram_ro))
+
+/* Grant mapping types, which map to a real frame in another VM */
+#define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) | \
+                         p2m_to_mask(p2m_grant_map_ro))
+
+/* Foreign mappings types */
+#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \
+                           p2m_to_mask(p2m_map_foreign_ro))
+
+/* Useful predicates */
+#define p2m_is_ram(t_) (p2m_to_mask(t_) & P2M_RAM_TYPES)
+#define p2m_is_foreign(t_) (p2m_to_mask(t_) & P2M_FOREIGN_TYPES)
+#define p2m_is_any_ram(t_) (p2m_to_mask(t_) & \
+                            (P2M_RAM_TYPES | P2M_GRANT_TYPES | \
+                             P2M_FOREIGN_TYPES))
+
 #include <xen/p2m-common.h>
 
 static inline int get_page_and_type(struct page_info *page,
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality
  2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
                   ` (4 preceding siblings ...)
  2025-05-09 15:57 ` [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification Oleksii Kurochko
@ 2025-05-09 15:57 ` Oleksii Kurochko
  2025-05-20 15:16   ` Jan Beulich
  5 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-09 15:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

These utilities are needed for building and managing RISC-V guest page
tables and MMIO mappings by using functions map_regions_p2mt() and
guest_physmap_add_entry().

To implement p2m mapping functionality the following is introduced:
- Define P2M root level/order and entry count.
- Introdude radix type for p2m types as it isn't enough free bits in pte
  and the helpers (p2m_type_radix_{get,set}()) to deal with them.
- Introduce p2m_is_*() helpers() as  pte_is_*() helpers are checking
  the valid bit set in the PTE but we have to check p2m_type instead
  (look at the comment above p2m_is_valid() for some details).
- Introduce helper to set p2m's pte permission: p2m_set_permissions().
- Introduce helper to create p2m entry based on mfn, p2m_type_t and
  p2m_access_t.
- Introduce helper to generate table entry with correct attributes:
  page_to_p2m_table().
- Introduce p2m page allocation function: p2m_alloc_page().
- Introduce functions to write/remove p2m's entries: p2m_{write,remove}_pte().
- Introduce function to allocate p2m table: p2m_create_table().
- Introduce functions used to free p2m entry.
- Introduce function for table walking: p2m_next_level().
- Introduce function to insert an entry in the p2m (p2m_set_entry()).
- Introduce superpage splitting: p2m_split_superpage()).
- Introduce page table type defines (PGT_{none,writable_page}, etc).

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/include/asm/mm.h   |  32 +-
 xen/arch/riscv/include/asm/p2m.h  |  17 +-
 xen/arch/riscv/include/asm/page.h |  11 +
 xen/arch/riscv/p2m.c              | 780 ++++++++++++++++++++++++++++++
 4 files changed, 829 insertions(+), 11 deletions(-)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 972ec45448..c1e4519839 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -12,6 +12,7 @@
 #include <xen/sections.h>
 #include <xen/types.h>
 
+#include <asm/cmpxchg.h>
 #include <asm/page-bits.h>
 
 extern vaddr_t directmap_virt_start;
@@ -229,9 +230,21 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 #define PGT_writable_page PG_mask(1, 1)  /* has writable mappings?         */
 #define PGT_type_mask     PG_mask(1, 1)  /* Bits 31 or 63.                 */
 
-/* Count of uses of this frame as its current type. */
-#define PGT_count_width   PG_shift(2)
-#define PGT_count_mask    ((1UL << PGT_count_width) - 1)
+ /* 9-bit count of uses of this frame as its current type. */
+#define PGT_count_mask    PG_mask(0x3FF, 10)
+
+/*
+ * Sv32 has 22-bit GFN. Sv{39, 48, 57} have 44-bit GFN.
+ * Thereby we can use for `type_info` 10 bits for all modes, having the same
+ * amount of bits for `type_info` for all MMU modes let us avoid introducing
+ * an extra #ifdef to that header:
+ *   if we go with maximum possible bits for count on each configuration
+ *   we would need to have a set of PGT_count_* and PGT_gfn_*).
+ */
+#define PGT_gfn_width     PG_shift(10)
+#define PGT_gfn_mask      (BIT(PGT_gfn_width, UL) - 1)
+
+#define PGT_INVALID_XENHEAP_GFN   _gfn(PGT_gfn_mask)
 
 /*
  * Page needs to be scrubbed. Since this bit can only be set on a page that is
@@ -283,6 +296,19 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 
 #define PFN_ORDER(pg) ((pg)->v.free.order)
 
+static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
+{
+    gfn_t gfn_ = gfn_eq(gfn, INVALID_GFN) ? PGT_INVALID_XENHEAP_GFN : gfn;
+    unsigned long x, nx, y = p->u.inuse.type_info;
+
+    ASSERT(is_xen_heap_page(p));
+
+    do {
+        x = y;
+        nx = (x & ~PGT_gfn_mask) | gfn_x(gfn_);
+    } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
+}
+
 extern unsigned char cpu0_boot_stack[];
 
 void setup_initial_pagetables(void);
diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
index 0fcfec7f03..9f4633501f 100644
--- a/xen/arch/riscv/include/asm/p2m.h
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -9,8 +9,13 @@
 #include <xen/rwlock.h>
 #include <xen/types.h>
 
+#include <asm/page.h>
 #include <asm/page-bits.h>
 
+#define P2M_ROOT_LEVEL  HYP_PT_ROOT_LEVEL
+#define P2M_ROOT_ORDER  XEN_PT_LEVEL_ORDER(P2M_ROOT_LEVEL)
+#define P2M_ROOT_PAGES  (1U << P2M_ROOT_ORDER)
+
 #define paddr_bits PADDR_BITS
 
 /* Get host p2m table */
@@ -145,14 +150,10 @@ static inline int guest_physmap_mark_populate_on_demand(struct domain *d,
     return -EOPNOTSUPP;
 }
 
-static inline int guest_physmap_add_entry(struct domain *d,
-                                          gfn_t gfn, mfn_t mfn,
-                                          unsigned long page_order,
-                                          p2m_type_t t)
-{
-    BUG_ON("unimplemented");
-    return -EINVAL;
-}
+int guest_physmap_add_entry(struct domain *d,
+                            gfn_t gfn, mfn_t mfn,
+                            unsigned long page_order,
+                            p2m_type_t t);
 
 /* Untyped version for RAM only, for compatibility */
 static inline int __must_check
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index cb3dea309c..a5c6f5140d 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -22,6 +22,7 @@
 #define XEN_PT_LEVEL_SIZE(lvl)      (_AT(paddr_t, 1) << XEN_PT_LEVEL_SHIFT(lvl))
 #define XEN_PT_LEVEL_MAP_MASK(lvl)  (~(XEN_PT_LEVEL_SIZE(lvl) - 1))
 #define XEN_PT_LEVEL_MASK(lvl)      (VPN_MASK << XEN_PT_LEVEL_SHIFT(lvl))
+#define XEN_PT_ENTRIES              (_AT(unsigned int, 1) << PAGETABLE_ORDER)
 
 /*
  * PTE format:
@@ -69,10 +70,20 @@
 #define PTE_PMBT_NOCACHE    BIT(61, UL)
 #define PTE_PMBT_IO         BIT(62, UL)
 
+enum pmbt_type_t {
+    pbmt_pma,
+    pbmt_nc,
+    pbmt_io,
+    pbmt_rsvd,
+    pbmt_max,
+};
+
 #define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE | PTE_EXECUTABLE)
 
 #define PTE_PBMT_MASK   (PTE_PMBT_NOCACHE | PTE_PMBT_IO)
 
+#define P2M_CLEAR_PERM(p2m_pte) ((p2m_pte).pte & ~PTE_ACCESS_MASK)
+
 /* Calculate the offsets into the pagetables for a given VA */
 #define pt_linear_offset(lvl, va)   ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
 
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index a890870391..84cb3d28af 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -135,6 +135,37 @@ static int p2m_alloc_table(struct domain *d)
     return 0;
 }
 
+static p2m_type_t p2m_type_radix_get(struct p2m_domain *p2m, pte_t pte)
+{
+    void *ptr;
+
+    ptr = radix_tree_lookup(&p2m->p2m_type, pte.pte);
+
+    if ( !ptr )
+        return p2m_invalid;
+
+    return radix_tree_ptr_to_int(ptr);
+}
+
+static int p2m_type_radix_set(struct p2m_domain *p2m, pte_t pte, p2m_type_t t)
+{
+    int rc;
+
+    rc = radix_tree_insert(&p2m->p2m_type, pte.pte,
+                           radix_tree_int_to_ptr(t));
+    if ( rc == -EEXIST )
+    {
+        /* If a setting already exists, change it to the new one */
+        radix_tree_replace_slot(
+            radix_tree_lookup_slot(
+                &p2m->p2m_type, pte.pte),
+            radix_tree_int_to_ptr(t));
+        rc = 0;
+    }
+
+    return rc;
+}
+
 int p2m_init(struct domain *d)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
@@ -233,3 +264,752 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
 
     return 0;
 }
+
+/*
+ * Find and map the root page table. The caller is responsible for
+ * unmapping the table.
+ *
+ * The function will return NULL if the offset of the root table is
+ * invalid.
+ */
+static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn)
+{
+    unsigned long root_table;
+
+    root_table = gfn_x(gfn) >> XEN_PT_LEVEL_ORDER(P2M_ROOT_LEVEL);
+    if ( root_table >= P2M_ROOT_PAGES )
+        return NULL;
+
+    return __map_domain_page(p2m->root + root_table);
+}
+
+/*
+ * In the case of the P2M, the valid bit is used for other purpose. Use
+ * the type to check whether an entry is valid.
+ */
+static inline bool p2m_is_valid(struct p2m_domain *p2m, pte_t pte)
+{
+    return p2m_type_radix_get(p2m, pte) != p2m_invalid;
+}
+
+/*
+ * pte_is_* helpers are checking the valid bit set in the
+ * PTE but we have to check p2m_type instead (look at the comment above
+ * p2m_is_valid())
+ * Provide our own overlay to check the valid bit.
+ */
+static inline bool p2m_is_mapping(struct p2m_domain *p2m, pte_t pte)
+{
+    return p2m_is_valid(p2m, pte) && (pte.pte & PTE_ACCESS_MASK);
+}
+
+static inline bool p2m_is_superpage(struct p2m_domain *p2m, pte_t pte,
+                                    unsigned int level)
+{
+    return p2m_is_valid(p2m, pte) && (pte.pte & PTE_ACCESS_MASK) &&
+           (level > 0);
+}
+
+static void p2m_set_permission(pte_t *e, p2m_type_t t, p2m_access_t a)
+{
+    /* First apply type permissions */
+    switch ( t )
+    {
+    case p2m_ram_rw:
+        e->bits.r = 1;
+        e->bits.w = 1;
+        e->bits.x = 1;
+
+        break;
+
+    case p2m_mmio_direct_dev:
+        e->bits.r = 1;
+        e->bits.w = 1;
+        e->bits.x = 0;
+        break;
+
+    case p2m_invalid:
+        e->bits.r = 0;
+        e->bits.w = 0;
+        e->bits.x = 0;
+        break;
+
+    default:
+        BUG();
+        break;
+    }
+
+    /* Then restrict with access permissions */
+    switch ( a )
+    {
+    case p2m_access_rwx:
+        break;
+    case p2m_access_wx:
+        e->bits.r = 0;
+        break;
+    case p2m_access_rw:
+        e->bits.x = 0;
+        break;
+    case p2m_access_w:
+        e->bits.r = 0;
+        e->bits.x = 0;
+        break;
+    case p2m_access_rx:
+    case p2m_access_rx2rw:
+        e->bits.w = 0;
+        break;
+    case p2m_access_x:
+        e->bits.r = 0;
+        e->bits.w = 0;
+        break;
+    case p2m_access_r:
+        e->bits.w = 0;
+        e->bits.x = 0;
+        break;
+    case p2m_access_n:
+    case p2m_access_n2rwx:
+        e->bits.r = 0;
+        e->bits.w = 0;
+        e->bits.x = 0;
+        break;
+    default:
+        BUG();
+        break;
+    }
+}
+
+static pte_t p2m_entry_from_mfn(struct p2m_domain *p2m, mfn_t mfn, p2m_type_t t, p2m_access_t a)
+{
+    pte_t e = (pte_t) {
+        .bits.v = 1,
+    };
+
+    switch ( t )
+    {
+    case p2m_mmio_direct_dev:
+        e.bits.pbmt = pbmt_io;
+        break;
+
+    default:
+        e.bits.pbmt = pbmt_pma;
+        break;
+    }
+
+    p2m_set_permission(&e, t, a);
+
+    ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK));
+
+    pte_set_mfn(&e, mfn);
+
+    BUG_ON(p2m_type_radix_set(p2m, e, t));
+
+    return e;
+}
+
+/* Generate table entry with correct attributes. */
+static pte_t page_to_p2m_table(struct p2m_domain *p2m, struct page_info *page)
+{
+    /*
+     * Since this function generates a table entry, according to "Encoding
+     * of PTE R/W/X fields," the entry's r, w, and x fields must be set to 0
+     * to point to the next level of the page table.
+     * Therefore, to ensure that an entry is a page table entry,
+     * `p2m_access_n2rwx` is passed to `mfn_to_p2m_entry()` as the access value,
+     * which overrides whatever was passed as `p2m_type_t` and guarantees that
+     * the entry is a page table entry by setting r = w = x = 0.
+     */
+    return p2m_entry_from_mfn(p2m, page_to_mfn(page), p2m_ram_rw, p2m_access_n2rwx);
+}
+
+static struct page_info *p2m_alloc_page(struct domain *d)
+{
+    struct page_info *pg;
+
+    /*
+     * For hardware domain, there should be no limit in the number of pages that
+     * can be allocated, so that the kernel may take advantage of the extended
+     * regions. Hence, allocate p2m pages for hardware domains from heap.
+     */
+    if ( is_hardware_domain(d) )
+    {
+        pg = alloc_domheap_page(d, MEMF_no_owner);
+        if ( pg == NULL )
+            printk(XENLOG_G_ERR "Failed to allocate P2M pages for hwdom.\n");
+    }
+    else
+    {
+        spin_lock(&d->arch.paging.lock);
+        pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
+        spin_unlock(&d->arch.paging.lock);
+    }
+
+    return pg;
+}
+
+static inline void p2m_write_pte(pte_t *p, pte_t pte, bool clean_pte)
+{
+    write_pte(p, pte);
+    if ( clean_pte )
+        clean_dcache_va_range(p, sizeof(*p));
+}
+
+static inline void p2m_remove_pte(pte_t *p, bool clean_pte)
+{
+    pte_t pte;
+
+    memset(&pte, 0x00, sizeof(pte));
+    p2m_write_pte(p, pte, clean_pte);
+}
+
+/* Allocate a new page table page and hook it in via the given entry. */
+static int p2m_create_table(struct p2m_domain *p2m, pte_t *entry)
+{
+    struct page_info *page;
+    pte_t *p;
+
+    ASSERT(!p2m_is_valid(p2m, *entry));
+
+    page = p2m_alloc_page(p2m->domain);
+    if ( page == NULL )
+        return -ENOMEM;
+
+    page_list_add(page, &p2m->pages);
+
+    p = __map_domain_page(page);
+    clear_page(p);
+
+    unmap_domain_page(p);
+
+    p2m_write_pte(entry, page_to_p2m_table(p2m, page), p2m->clean_pte);
+
+    return 0;
+}
+
+#define GUEST_TABLE_MAP_NONE 0
+#define GUEST_TABLE_MAP_NOMEM 1
+#define GUEST_TABLE_SUPER_PAGE 2
+#define GUEST_TABLE_NORMAL 3
+
+/*
+ * Take the currently mapped table, find the corresponding GFN entry,
+ * and map the next table, if available. The previous table will be
+ * unmapped if the next level was mapped (e.g GUEST_TABLE_NORMAL
+ * returned).
+ *
+ * `alloc_tbl` parameter indicates whether intermediate tables should
+ * be allocated when not present.
+ *
+ * Return values:
+ *  GUEST_TABLE_MAP_NONE: a table allocation isn't permitted.
+ *  GUEST_TABLE_MAP_NOMEM: allocating a new page failed.
+ *  GUEST_TABLE_SUPER_PAGE: next level or leaf mapped normally.
+ *  GUEST_TABLE_NORMAL: The next entry points to a superpage.
+ */
+static int p2m_next_level(struct p2m_domain *p2m, bool alloc_tbl,
+                          unsigned int level, pte_t **table,
+                          unsigned int offset)
+{
+    pte_t *entry;
+    int ret;
+    mfn_t mfn;
+
+    entry = *table + offset;
+
+    if ( !p2m_is_valid(p2m, *entry) )
+    {
+        if ( !alloc_tbl )
+            return GUEST_TABLE_MAP_NONE;
+
+        ret = p2m_create_table(p2m, entry);
+        if ( ret )
+            return GUEST_TABLE_MAP_NOMEM;
+    }
+
+    /* The function p2m_next_level() is never called at the last level */
+    ASSERT(level != 0);
+    if ( p2m_is_mapping(p2m, *entry) )
+        return GUEST_TABLE_SUPER_PAGE;
+
+    mfn = mfn_from_pte(*entry);
+
+    unmap_domain_page(*table);
+    *table = map_domain_page(mfn);
+
+    return GUEST_TABLE_NORMAL;
+}
+
+static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
+                                unsigned int level, unsigned int target,
+                                const unsigned int *offsets)
+{
+    struct page_info *page;
+    unsigned int i;
+    pte_t pte, *table;
+    bool rv = true;
+
+    /* Convenience aliases */
+    mfn_t mfn = pte_get_mfn(*entry);
+    unsigned int next_level = level - 1;
+    unsigned int level_order = XEN_PT_LEVEL_ORDER(next_level);
+
+    /*
+     * This should only be called with target != level and the entry is
+     * a superpage.
+     */
+    ASSERT(level > target);
+    ASSERT(p2m_is_superpage(p2m, *entry, level));
+
+    page = p2m_alloc_page(p2m->domain);
+    if ( !page )
+        return false;
+
+    page_list_add(page, &p2m->pages);
+    table = __map_domain_page(page);
+
+    /*
+     * We are either splitting a first level 1G page into 512 second level
+     * 2M pages, or a second level 2M page into 512 third level 4K pages.
+     */
+    for ( i = 0; i < XEN_PT_ENTRIES; i++ )
+    {
+        pte_t *new_entry = table + i;
+
+        /*
+         * Use the content of the superpage entry and override
+         * the necessary fields. So the correct permission are kept.
+         */
+        pte = *entry;
+        pte_set_mfn(&pte, mfn_add(mfn, i << level_order));
+
+        write_pte(new_entry, pte);
+    }
+
+    /*
+     * Shatter superpage in the page to the level we want to make the
+     * changes.
+     * This is done outside the loop to avoid checking the offset to
+     * know whether the entry should be shattered for every entry.
+     */
+    if ( next_level != target )
+        rv = p2m_split_superpage(p2m, table + offsets[next_level],
+                                 level - 1, target, offsets);
+
+    /* TODO: why it is necessary to have clean here? Not somewhere in the caller */
+    if ( p2m->clean_pte )
+        clean_dcache_va_range(table, PAGE_SIZE);
+
+    unmap_domain_page(table);
+
+    /*
+     * Even if we failed, we should install the newly allocated PTE
+     * entry. The caller will be in charge to free the sub-tree.
+     */
+    p2m_write_pte(entry, page_to_p2m_table(p2m, page), p2m->clean_pte);
+
+    return rv;
+}
+
+static void p2m_put_foreign_page(struct page_info *pg)
+{
+    /*
+     * It's safe to do the put_page here because page_alloc will
+     * flush the TLBs if the page is reallocated before the end of
+     * this loop.
+     */
+    put_page(pg);
+}
+
+/* Put any references on the single 4K page referenced by mfn. */
+static void p2m_put_4k_page(mfn_t mfn, p2m_type_t type)
+{
+    /* TODO: Handle other p2m types */
+    if ( p2m_is_foreign(type) )
+    {
+        ASSERT(mfn_valid(mfn));
+        p2m_put_foreign_page(mfn_to_page(mfn));
+    }
+    /* Detect the xenheap page and mark the stored GFN as invalid. */
+    else if ( p2m_is_ram(type) && is_xen_heap_mfn(mfn) )
+        page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN);
+}
+
+/* Put any references on the superpage referenced by mfn. */
+static void p2m_put_2m_superpage(mfn_t mfn, p2m_type_t type)
+{
+    struct page_info *pg;
+    unsigned int i;
+
+    /*
+     * TODO: Handle other p2m types, but be aware that any changes to handle
+     * different types should require an update on the relinquish code to handle
+     * preemption.
+     */
+    if ( !p2m_is_foreign(type) )
+        return;
+
+    ASSERT(mfn_valid(mfn));
+
+    pg = mfn_to_page(mfn);
+
+    for ( i = 0; i < XEN_PT_ENTRIES; i++, pg++ )
+        p2m_put_foreign_page(pg);
+}
+
+/* Put any references on the page referenced by pte. */
+static void p2m_put_page(struct p2m_domain *p2m, const pte_t pte,
+                         unsigned int level)
+{
+    mfn_t mfn = pte_get_mfn(pte);
+    p2m_type_t p2m_type = p2m_type_radix_get(p2m, pte);
+
+    ASSERT(p2m_is_valid(p2m, pte));
+
+    /*
+     * TODO: Currently we don't handle level 2 super-page, Xen is not
+     * preemptible and therefore some work is needed to handle such
+     * superpages, for which at some point Xen might end up freeing memory
+     * and therefore for such a big mapping it could end up in a very long
+     * operation.
+     */
+    if ( level == 1 )
+        return p2m_put_2m_superpage(mfn, p2m_type);
+    else if ( level == 0 )
+        return p2m_put_4k_page(mfn, p2m_type);
+}
+
+static void p2m_free_page(struct domain *d, struct page_info *pg)
+{
+    if ( is_hardware_domain(d) )
+        free_domheap_page(pg);
+    else
+    {
+        spin_lock(&d->arch.paging.lock);
+        page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
+        spin_unlock(&d->arch.paging.lock);
+    }
+}
+
+/* Free pte sub-tree behind an entry */
+static void p2m_free_entry(struct p2m_domain *p2m,
+                           pte_t entry, unsigned int level)
+{
+    unsigned int i;
+    pte_t *table;
+    mfn_t mfn;
+    struct page_info *pg;
+
+    /* Nothing to do if the entry is invalid. */
+    if ( !p2m_is_valid(p2m, entry) )
+        return;
+
+    if ( p2m_is_superpage(p2m, entry, level) || (level == 0) )
+    {
+#ifdef CONFIG_IOREQ_SERVER
+        /*
+         * If this gets called then either the entry was replaced by an entry
+         * with a different base (valid case) or the shattering of a superpage
+         * has failed (error case).
+         * So, at worst, the spurious mapcache invalidation might be sent.
+         */
+        if ( p2m_is_ram( p2m_type_radix_get(p2m, entry)) &&
+             domain_has_ioreq_server(p2m->domain) )
+            ioreq_request_mapcache_invalidate(p2m->domain);
+#endif
+
+        p2m_put_page(p2m, entry, level);
+
+        return;
+    }
+
+    table = map_domain_page(pte_get_mfn(entry));
+    for ( i = 0; i < XEN_PT_ENTRIES; i++ )
+        p2m_free_entry(p2m, *(table + i), level - 1);
+
+    unmap_domain_page(table);
+
+    /*
+     * Make sure all the references in the TLB have been removed before
+     * freing the intermediate page table.
+     * XXX: Should we defer the free of the page table to avoid the
+     * flush?
+     */
+    p2m_tlb_flush_sync(p2m);
+
+    mfn = pte_get_mfn(entry);
+    ASSERT(mfn_valid(mfn));
+
+    pg = mfn_to_page(mfn);
+
+    page_list_del(pg, &p2m->pages);
+    p2m_free_page(p2m->domain, pg);
+}
+
+/*
+ * Insert an entry in the p2m. This should be called with a mapping
+ * equal to a page/superpage.
+ */
+static int __p2m_set_entry(struct p2m_domain *p2m,
+                           gfn_t sgfn,
+                           unsigned int page_order,
+                           mfn_t smfn,
+                           p2m_type_t t,
+                           p2m_access_t a)
+{
+    unsigned int level;
+    unsigned int target = page_order / PAGETABLE_ORDER;
+    pte_t *entry, *table, orig_pte;
+    int rc;
+    /* A mapping is removed if the MFN is invalid. */
+    bool removing_mapping = mfn_eq(smfn, INVALID_MFN);
+    DECLARE_OFFSETS(offsets, gfn_to_gaddr(sgfn));
+
+    ASSERT(p2m_is_write_locked(p2m));
+
+    /*
+     * Check if the level target is valid: we only support
+     * 4K - 2M - 1G mapping.
+     */
+    ASSERT(target <= 2);
+
+    table = p2m_get_root_pointer(p2m, sgfn);
+    if ( !table )
+        return -EINVAL;
+
+    for ( level = P2M_ROOT_LEVEL; level > target; level-- )
+    {
+        /*
+         * Don't try to allocate intermediate page table if the mapping
+         * is about to be removed.
+         */
+        rc = p2m_next_level(p2m, !removing_mapping,
+                            level, &table, offsets[level]);
+        if ( (rc == GUEST_TABLE_MAP_NONE) || (rc == GUEST_TABLE_MAP_NOMEM) )
+        {
+            /*
+             * We are here because p2m_next_level has failed to map
+             * the intermediate page table (e.g the table does not exist
+             * and they p2m tree is read-only). It is a valid case
+             * when removing a mapping as it may not exist in the
+             * page table. In this case, just ignore it.
+             */
+            rc = removing_mapping ?  0 : -ENOENT;
+            goto out;
+        }
+        else if ( rc != GUEST_TABLE_NORMAL )
+            break;
+    }
+
+    entry = table + offsets[level];
+
+    /*
+     * If we are here with level > target, we must be at a leaf node,
+     * and we need to break up the superpage.
+     */
+    if ( level > target )
+    {
+        /* We need to split the original page. */
+        pte_t split_pte = *entry;
+
+        ASSERT(p2m_is_superpage(p2m, *entry, level));
+
+        if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) )
+        {
+            /* Free the allocated sub-tree */
+            p2m_free_entry(p2m, split_pte, level);
+
+            rc = -ENOMEM;
+            goto out;
+        }
+
+        /* Follow the break-before-sequence to update the entry. */
+        p2m_remove_pte(entry, p2m->clean_pte);
+        p2m_force_tlb_flush_sync(p2m);
+
+        p2m_write_pte(entry, split_pte, p2m->clean_pte);
+
+        /* Then move to the level we want to make real changes */
+        for ( ; level < target; level++ )
+        {
+            rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
+
+            /*
+             * The entry should be found and either be a table
+             * or a superpage if level 0 is not targeted
+             */
+            ASSERT(rc == GUEST_TABLE_NORMAL ||
+                   (rc == GUEST_TABLE_SUPER_PAGE && target > 0));
+        }
+
+        entry = table + offsets[level];
+    }
+
+    /*
+     * We should always be there with the correct level because
+     * all the intermediate tables have been installed if necessary.
+     */
+    ASSERT(level == target);
+
+    orig_pte = *entry;
+
+    /*
+     * The access type should always be p2m_access_rwx when the mapping
+     * is removed.
+     */
+    ASSERT(!mfn_eq(INVALID_MFN, smfn) || (a == p2m_access_rwx));
+
+    /*
+     * Always remove the entry in order to follow the break-before-make
+     * sequence when updating the translation table.
+     */
+    if ( pte_is_valid(orig_pte) || removing_mapping )
+        p2m_remove_pte(entry, p2m->clean_pte);
+
+    if ( removing_mapping )
+        /* Flush can be deferred if the entry is removed */
+        p2m->need_flush |= !!pte_is_valid(orig_pte);
+    else
+    {
+        pte_t pte = p2m_entry_from_mfn(p2m, smfn, t, a);
+
+        /*
+         * It is necessary to flush the TLB before writing the new entry
+         * to keep coherency when the previous entry was valid.
+         *
+         * Although, it could be defered when only the permissions are
+         * changed (e.g in case of memaccess).
+         */
+        if ( pte_is_valid(orig_pte) )
+        {
+            if ( P2M_CLEAR_PERM(pte) != P2M_CLEAR_PERM(orig_pte) )
+                p2m_force_tlb_flush_sync(p2m);
+            else
+                p2m->need_flush = true;
+        }
+
+        p2m_write_pte(entry, pte, p2m->clean_pte);
+
+        p2m->max_mapped_gfn = gfn_max(p2m->max_mapped_gfn,
+                                      gfn_add(sgfn, (1UL << page_order) - 1));
+        p2m->lowest_mapped_gfn = gfn_min(p2m->lowest_mapped_gfn, sgfn);
+    }
+
+#ifdef CONFIG_HAS_PASSTHROUGH
+    if ( is_iommu_enabled(p2m->domain) &&
+         (pte_is_valid(orig_pte) || pte_is_valid(*entry)) )
+    {
+        unsigned int flush_flags = 0;
+
+        if ( pte_is_valid(orig_pte) )
+            flush_flags |= IOMMU_FLUSHF_modified;
+        if ( pte_is_valid(*entry) )
+            flush_flags |= IOMMU_FLUSHF_added;
+
+        rc = iommu_iotlb_flush(p2m->domain, _dfn(gfn_x(sgfn)),
+                               1UL << page_order, flush_flags);
+    }
+    else
+#endif
+        rc = 0;
+
+    /*
+     * Free the entry only if the original pte was valid and the base
+     * is different (to avoid freeing when permission is changed).
+     */
+    if ( p2m_is_valid(p2m, orig_pte) &&
+         !mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) )
+        p2m_free_entry(p2m, orig_pte, level);
+
+out:
+    unmap_domain_page(table);
+
+    return rc;
+}
+
+int p2m_set_entry(struct p2m_domain *p2m,
+                  gfn_t sgfn,
+                  unsigned long nr,
+                  mfn_t smfn,
+                  p2m_type_t t,
+                  p2m_access_t a)
+{
+    int rc = 0;
+
+    /*
+     * Any reference taken by the P2M mappings (e.g. foreign mapping) will
+     * be dropped in relinquish_p2m_mapping(). As the P2M will still
+     * be accessible after, we need to prevent mapping to be added when the
+     * domain is dying.
+     */
+    if ( unlikely(p2m->domain->is_dying) )
+        return -ENOMEM;
+
+    while ( nr )
+    {
+        unsigned long mask;
+        unsigned long order = 0;
+        /* 1gb, 2mb, 4k mappings are supported */
+        unsigned int i = ( P2M_ROOT_LEVEL > 2 ) ? 2 : P2M_ROOT_LEVEL;
+
+        /*
+         * Don't take into account the MFN when removing mapping (i.e
+         * MFN_INVALID) to calculate the correct target order.
+         *
+         * XXX: Support superpage mappings if nr is not aligned to a
+         * superpage size.
+         */
+        mask = !mfn_eq(smfn, INVALID_MFN) ? mfn_x(smfn) : 0;
+        mask |= gfn_x(sgfn) | nr;
+
+        for ( ; i != 0; i-- )
+        {
+            if ( !(mask & (BIT(XEN_PT_LEVEL_ORDER(i), UL) - 1)) )
+            {
+                    order = XEN_PT_LEVEL_ORDER(i);
+                    break;
+            }
+        }
+
+        rc = __p2m_set_entry(p2m, sgfn, order, smfn, t, a);
+        if ( rc )
+            break;
+
+        sgfn = gfn_add(sgfn, (1 << order));
+        if ( !mfn_eq(smfn, INVALID_MFN) )
+           smfn = mfn_add(smfn, (1 << order));
+
+        nr -= (1 << order);
+    }
+
+    return rc;
+}
+
+static int p2m_insert_mapping(struct domain *d, gfn_t start_gfn,
+                              unsigned long nr, mfn_t mfn, p2m_type_t t)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    p2m_write_lock(p2m);
+    rc = p2m_set_entry(p2m, start_gfn, nr, mfn, t, p2m->default_access);
+    p2m_write_unlock(p2m);
+
+    return rc;
+}
+
+int map_regions_p2mt(struct domain *d,
+                     gfn_t gfn,
+                     unsigned long nr,
+                     mfn_t mfn,
+                     p2m_type_t p2mt)
+{
+    return p2m_insert_mapping(d, gfn, nr, mfn, p2mt);
+}
+
+int guest_physmap_add_entry(struct domain *d,
+                            gfn_t gfn,
+                            mfn_t mfn,
+                            unsigned long page_order,
+                            p2m_type_t t)
+{
+    return p2m_insert_mapping(d, gfn, (1 << page_order), mfn, t);
+}
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h
  2025-05-09 15:57 ` [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h Oleksii Kurochko
@ 2025-05-09 16:00   ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2025-05-09 16:00 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini

On 09/05/2025 4:57 pm, Oleksii Kurochko wrote:
> Add inclusion of xen/bitops.h to asm/cmpxchg.h to avoid compilation issues
> connected to GENMASK() which is used inside asm/cmpxchg.h.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-09 15:57 ` [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
@ 2025-05-09 16:14   ` Andrew Cooper
  2025-05-12  9:24     ` Oleksii Kurochko
  2025-05-20 13:37   ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2025-05-09 16:14 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini

On 09/05/2025 4:57 pm, Oleksii Kurochko wrote:
> diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
> index 28f57a74f2..8b46210768 100644
> --- a/xen/arch/riscv/include/asm/p2m.h
> +++ b/xen/arch/riscv/include/asm/p2m.h
> @@ -3,11 +3,73 @@
>  #define ASM__RISCV__P2M_H
>  
>  #include <xen/errno.h>
> +#include <xen/mem_access.h>
> +#include <xen/mm.h>
> +#include <xen/radix-tree.h>
> +#include <xen/rwlock.h>
> +#include <xen/types.h>

We're phasing out the inclusion of xen/types.h for complex headers like
this, as it's pulled in by almost all dependencies.

> diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
> new file mode 100644
> index 0000000000..ad4beef8f9
> --- /dev/null
> +++ b/xen/arch/riscv/p2m.c
> @@ -0,0 +1,168 @@
> +#include <xen/domain_page.h>
> +#include <xen/iommu.h>
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/pfn.h>
> +#include <xen/rwlock.h>
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +
> +#include <asm/page.h>
> +#include <asm/p2m.h>
> +
> +/*
> + * Force a synchronous P2M TLB flush.
> + *
> + * Must be called with the p2m lock held.
> + *
> + * TODO: add support of flushing TLB connected to VMID.
> + */
> +static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
> +{
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    /*
> +     * TODO: shouldn't be this flush done for each physical CPU?
> +     *       If yes, then SBI call sbi_remote_hfence_gvma() could
> +     *       be used for that.
> +     */
> +#if defined(__riscv_hh) || defined(__riscv_h)
> +    asm volatile ( "hfence.gvma" ::: "memory" );
> +#else
> +    asm volatile ( ".insn r 0x73, 0x0, 0x31, x0, x0, x0" ::: "memory" );
> +#endif

TLB flushing needs to happen for each pCPU which potentially has cached
a mapping.

In other arches, this is tracked by d->dirty_cpumask which is the bitmap
of pCPUs where this domain is scheduled.

CPUs need to flush their TLBs before removing themselves from
d->dirty_cpumask, which is typically done during context switch, but it
means that to flush the P2M, you only need to IPI a subset of CPUs.


> +
> +    p2m->need_flush = false;
> +}
> +
> +static void p2m_tlb_flush_sync(struct p2m_domain *p2m)
> +{
> +    if ( p2m->need_flush )
> +        p2m_force_tlb_flush_sync(p2m);
> +}
> +
> +/* Unlock the flush and do a P2M TLB flush if necessary */
> +void p2m_write_unlock(struct p2m_domain *p2m)
> +{
> +    /*
> +     * The final flush is done with the P2M write lock taken to avoid
> +     * someone else modifying the P2M wbefore the TLB invalidation has
> +     * completed.
> +     */
> +    p2m_tlb_flush_sync(p2m);
> +
> +    write_unlock(&p2m->lock);
> +}
> +
> +static void clear_and_clean_page(struct page_info *page)
> +{
> +    void *p = __map_domain_page(page);
> +
> +    clear_page(p);
> +    unmap_domain_page(p);
> +}
> +
> +static struct page_info *p2m_get_clean_page(struct domain *d)
> +{
> +    struct page_info *page;
> +
> +    /*
> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
> +     * and must be aligned to a 16-KiB boundary.
> +     */
> +    page = alloc_domheap_pages(NULL, 2, 0);
> +    if ( page == NULL )
> +        return NULL;
> +
> +    clear_and_clean_page(page);

You appear to have allocated 4 pages, but only zeroed one.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-09 16:14   ` Andrew Cooper
@ 2025-05-12  9:24     ` Oleksii Kurochko
  2025-05-12  9:33       ` Oleksii Kurochko
  2025-05-20 13:47       ` Jan Beulich
  0 siblings, 2 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-12  9:24 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 2608 bytes --]


On 5/9/25 6:14 PM, Andrew Cooper wrote:
> On 09/05/2025 4:57 pm, Oleksii Kurochko wrote:
>> diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
>> new file mode 100644
>> index 0000000000..ad4beef8f9
>> --- /dev/null
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -0,0 +1,168 @@
>> +#include <xen/domain_page.h>
>> +#include <xen/iommu.h>
>> +#include <xen/lib.h>
>> +#include <xen/mm.h>
>> +#include <xen/pfn.h>
>> +#include <xen/rwlock.h>
>> +#include <xen/sched.h>
>> +#include <xen/spinlock.h>
>> +
>> +#include <asm/page.h>
>> +#include <asm/p2m.h>
>> +
>> +/*
>> + * Force a synchronous P2M TLB flush.
>> + *
>> + * Must be called with the p2m lock held.
>> + *
>> + * TODO: add support of flushing TLB connected to VMID.
>> + */
>> +static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>> +{
>> +    ASSERT(p2m_is_write_locked(p2m));
>> +
>> +    /*
>> +     * TODO: shouldn't be this flush done for each physical CPU?
>> +     *       If yes, then SBI call sbi_remote_hfence_gvma() could
>> +     *       be used for that.
>> +     */
>> +#if defined(__riscv_hh) || defined(__riscv_h)
>> +    asm volatile ( "hfence.gvma" ::: "memory" );
>> +#else
>> +    asm volatile ( ".insn r 0x73, 0x0, 0x31, x0, x0, x0" ::: "memory" );
>> +#endif
> TLB flushing needs to happen for each pCPU which potentially has cached
> a mapping.
>
> In other arches, this is tracked by d->dirty_cpumask which is the bitmap
> of pCPUs where this domain is scheduled.

I could only find usage of|d->dirty_cpumask| in x86 and common code (grant
tables) for flushing the TLB. However, it seems that|d->dirty_cpumask| is
not set anywhere for ARM. Is it sufficient to set a bit in|d->dirty_cpumask|
in|startup_cpu_idle_loop()|?

In addition, it’s also necessary to set and clear bits in|d->dirty_cpumask|
during|context_switch|, correct? Set it before switching from the previous
domain, and clear it after switching to the new domain?

Also, when a bit is set in|d->dirty_cpumask|, the|v->processor| value is also
stored in|v->dirty_cpu|. Is this needed to track which processor is
currently being used for the vCPU?

> CPUs need to flush their TLBs before removing themselves from
> d->dirty_cpumask, which is typically done during context switch, but it
> means that to flush the P2M, you only need to IPI a subset of CPUs.

I can't find where the P2M flush happens for x86/ARM. Could you please point me
to where it is handled?

Also, I found guest_flush_tlb_mask() for x86. I assume that it is x86 specific
and generally it is enough to have only flush_tlb_mask(), right?

Thanks in advance for the answers.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 3851 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-12  9:24     ` Oleksii Kurochko
@ 2025-05-12  9:33       ` Oleksii Kurochko
  2025-05-20 13:47       ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-12  9:33 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]


On 5/12/25 11:24 AM, Oleksii Kurochko wrote:
>
>
> On 5/9/25 6:14 PM, Andrew Cooper wrote:
>> On 09/05/2025 4:57 pm, Oleksii Kurochko wrote:
>>> diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
>>> new file mode 100644
>>> index 0000000000..ad4beef8f9
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -0,0 +1,168 @@
>>> +#include <xen/domain_page.h>
>>> +#include <xen/iommu.h>
>>> +#include <xen/lib.h>
>>> +#include <xen/mm.h>
>>> +#include <xen/pfn.h>
>>> +#include <xen/rwlock.h>
>>> +#include <xen/sched.h>
>>> +#include <xen/spinlock.h>
>>> +
>>> +#include <asm/page.h>
>>> +#include <asm/p2m.h>
>>> +
>>> +/*
>>> + * Force a synchronous P2M TLB flush.
>>> + *
>>> + * Must be called with the p2m lock held.
>>> + *
>>> + * TODO: add support of flushing TLB connected to VMID.
>>> + */
>>> +static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>> +{
>>> +    ASSERT(p2m_is_write_locked(p2m));
>>> +
>>> +    /*
>>> +     * TODO: shouldn't be this flush done for each physical CPU?
>>> +     *       If yes, then SBI call sbi_remote_hfence_gvma() could
>>> +     *       be used for that.
>>> +     */
>>> +#if defined(__riscv_hh) || defined(__riscv_h)
>>> +    asm volatile ( "hfence.gvma" ::: "memory" );
>>> +#else
>>> +    asm volatile ( ".insn r 0x73, 0x0, 0x31, x0, x0, x0" ::: "memory" );
>>> +#endif
>> TLB flushing needs to happen for each pCPU which potentially has cached
>> a mapping.
>>
>> In other arches, this is tracked by d->dirty_cpumask which is the bitmap
>> of pCPUs where this domain is scheduled.
> I could only find usage of|d->dirty_cpumask| in x86 and common code (grant
> tables) for flushing the TLB. However, it seems that|d->dirty_cpumask| is
> not set anywhere for ARM. Is it sufficient to set a bit in|d->dirty_cpumask|
> in|startup_cpu_idle_loop()|?

And one more thing.
If|d->dirty_cpumask| is empty (for example, on p2m initialization stage) then
p2m TLB flush could be skipped at all, right?

~ Oleksii

> In addition, it’s also necessary to set and clear bits in|d->dirty_cpumask|
> during|context_switch|, correct? Set it before switching from the previous
> domain, and clear it after switching to the new domain?
> Also, when a bit is set in|d->dirty_cpumask|, the|v->processor| value is also
> stored in|v->dirty_cpu|. Is this needed to track which processor is
> currently being used for the vCPU?
>> CPUs need to flush their TLBs before removing themselves from
>> d->dirty_cpumask, which is typically done during context switch, but it
>> means that to flush the P2M, you only need to IPI a subset of CPUs.
> I can't find where the P2M flush happens for x86/ARM. Could you please point me
> to where it is handled?
>
> Also, I found guest_flush_tlb_mask() for x86. I assume that it is x86 specific
> and generally it is enough to have only flush_tlb_mask(), right?
>
> Thanks in advance for the answers.
> ~ Oleksii

[-- Attachment #2: Type: text/html, Size: 4545 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-09 15:57 ` [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
  2025-05-09 16:14   ` Andrew Cooper
@ 2025-05-20 13:37   ` Jan Beulich
  2025-05-22 15:53     ` Oleksii Kurochko
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 13:37 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.05.2025 17:57, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/domain.h
> +++ b/xen/arch/riscv/include/asm/domain.h
> @@ -5,6 +5,8 @@
>  #include <xen/xmalloc.h>
>  #include <public/hvm/params.h>
>  
> +#include <asm/p2m.h>
> +
>  struct hvm_domain
>  {
>      uint64_t              params[HVM_NR_PARAMS];
> @@ -16,8 +18,12 @@ struct arch_vcpu_io {
>  struct arch_vcpu {
>  };
>  
> +

Nit: As before, no double blank lines please.

>  struct arch_domain {
>      struct hvm_domain hvm;
> +
> +    struct p2m_domain p2m;
> +
>  };

Similarly, no blank lines please at the end of a struct/union/enum.

> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -149,6 +149,10 @@ extern struct page_info *frametable_virt_start;
>  #define mfn_to_page(mfn)    (frametable_virt_start + mfn_x(mfn))
>  #define page_to_mfn(pg)     _mfn((pg) - frametable_virt_start)
>  
> +/* Convert between machine addresses and page-info structures. */
> +#define maddr_to_page(ma) mfn_to_page(maddr_to_mfn(ma))
> +#define page_to_maddr(pg) (mfn_to_maddr(page_to_mfn(pg)))

Nit: The outermost parentheses aren't really needed here. Or if you really
want them, then please be consistent and also add them for the other macro
you add.

> --- /dev/null
> +++ b/xen/arch/riscv/p2m.c
> @@ -0,0 +1,168 @@
> +#include <xen/domain_page.h>
> +#include <xen/iommu.h>
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/pfn.h>
> +#include <xen/rwlock.h>
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +
> +#include <asm/page.h>
> +#include <asm/p2m.h>
> +
> +/*
> + * Force a synchronous P2M TLB flush.
> + *
> + * Must be called with the p2m lock held.
> + *
> + * TODO: add support of flushing TLB connected to VMID.
> + */
> +static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
> +{
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    /*
> +     * TODO: shouldn't be this flush done for each physical CPU?
> +     *       If yes, then SBI call sbi_remote_hfence_gvma() could
> +     *       be used for that.
> +     */
> +#if defined(__riscv_hh) || defined(__riscv_h)

This is a compiler capability check (which would be applicable if you
used a built-in function below).

> +    asm volatile ( "hfence.gvma" ::: "memory" );

Whereas here you use a feature in the assembler, at least for the GNU
toolchain.

> +static void clear_and_clean_page(struct page_info *page)
> +{
> +    void *p = __map_domain_page(page);
> +
> +    clear_page(p);
> +    unmap_domain_page(p);
> +}

What's the "clean" about in the function name? The "clear" is referring
to the clear_page() call afaict. Also aren't you largely open-coding
clear_domain_page() here?

> +static struct page_info *p2m_get_clean_page(struct domain *d)
> +{
> +    struct page_info *page;
> +
> +    /*
> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
> +     * and must be aligned to a 16-KiB boundary.
> +     */
> +    page = alloc_domheap_pages(NULL, 2, 0);

Shouldn't this allocation come from the domain's P2M pool (which is yet
to be introduced)? Also hard-coding 2 here as order effectively builds
in an assumption that PAGE_SIZE will only ever be 4k. I think to wants
properly calculating instead.

> +    if ( page == NULL )
> +        return NULL;
> +
> +    clear_and_clean_page(page);
> +
> +    return page;
> +}

Contrary to the function name you obtained 4 pages here, which is suitable
for ...

> +static struct page_info *p2m_allocate_root(struct domain *d)
> +{
> +    return p2m_get_clean_page(d);
> +}

... this but - I expect - no anywhere else.

> +static unsigned long hgatp_from_page_info(struct page_info *page_info)

Except for the struct name please drop _info; we don#t use such anywhere
else.

> +{
> +    unsigned long ppn;
> +    unsigned long hgatp_mode;
> +
> +    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
> +
> +    /* ASID (VMID) not supported yet */
> +
> +#if RV_STAGE1_MODE == SATP_MODE_SV39
> +    hgatp_mode = HGATP_MODE_SV39X4;
> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
> +    hgatp_mode = HGATP_MODE_SV48X4;
> +#else
> +    #error "add HGATP_MODE"

As before, please have the # of pre-processor directives in the first column.

> +#endif
> +
> +    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);

Use MASK_INSR()?

> +}
> +
> +static int p2m_alloc_table(struct domain *d)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +    p2m->root = p2m_allocate_root(d);
> +    if ( !p2m->root )
> +        return -ENOMEM;
> +
> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
> +
> +    /*
> +     * Make sure that all TLBs corresponding to the new VMID are flushed
> +     * before using it.
> +     */
> +    p2m_write_lock(p2m);
> +    p2m_force_tlb_flush_sync(p2m);
> +    p2m_write_unlock(p2m);

While Andrew directed you towards a better model in general, it won't be
usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
want to do a single global flush e.g. when VMIDs wrap around. That'll be
fewer global flushes than one per VM creation.

> +int p2m_init(struct domain *d)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    int rc;
> +
> +    rwlock_init(&p2m->lock);
> +    INIT_PAGE_LIST_HEAD(&p2m->pages);
> +
> +    p2m->max_mapped_gfn = _gfn(0);
> +    p2m->lowest_mapped_gfn = _gfn(ULONG_MAX);

You don't ever read these two fields. Likely better to introduce them when
they're actually needed. Same possibly for further things done in this
function.

> +    p2m->default_access = p2m_access_rwx;
> +
> +    radix_tree_init(&p2m->p2m_type);
> +
> +#ifdef CONFIG_HAS_PASSTHROUGH
> +    /*
> +     * Some IOMMUs don't support coherent PT walk. When the p2m is
> +     * shared with the CPU, Xen has to make sure that the PT changes have
> +     * reached the memory
> +     */
> +    p2m->clean_pte = is_iommu_enabled(d) &&
> +        !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
> +#else
> +    p2m->clean_pte = true;

When there's no IOMMU (in use), doesn't this want to be "false"?

> +#endif
> +
> +    /*
> +     * "Trivial" initialisation is now complete.  Set the backpointer so
> +     * p2m_teardown() and friends know to do something.
> +     */
> +    p2m->domain = d;

And where is that p2m_teardown(), to cross-check the comment against?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-12  9:24     ` Oleksii Kurochko
  2025-05-12  9:33       ` Oleksii Kurochko
@ 2025-05-20 13:47       ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 13:47 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Anthony PERARD,
	Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Andrew Cooper, xen-devel

On 12.05.2025 11:24, Oleksii Kurochko wrote:
> On 5/9/25 6:14 PM, Andrew Cooper wrote:
>> On 09/05/2025 4:57 pm, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -0,0 +1,168 @@
>>> +#include <xen/domain_page.h>
>>> +#include <xen/iommu.h>
>>> +#include <xen/lib.h>
>>> +#include <xen/mm.h>
>>> +#include <xen/pfn.h>
>>> +#include <xen/rwlock.h>
>>> +#include <xen/sched.h>
>>> +#include <xen/spinlock.h>
>>> +
>>> +#include <asm/page.h>
>>> +#include <asm/p2m.h>
>>> +
>>> +/*
>>> + * Force a synchronous P2M TLB flush.
>>> + *
>>> + * Must be called with the p2m lock held.
>>> + *
>>> + * TODO: add support of flushing TLB connected to VMID.
>>> + */
>>> +static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>> +{
>>> +    ASSERT(p2m_is_write_locked(p2m));
>>> +
>>> +    /*
>>> +     * TODO: shouldn't be this flush done for each physical CPU?
>>> +     *       If yes, then SBI call sbi_remote_hfence_gvma() could
>>> +     *       be used for that.
>>> +     */
>>> +#if defined(__riscv_hh) || defined(__riscv_h)
>>> +    asm volatile ( "hfence.gvma" ::: "memory" );
>>> +#else
>>> +    asm volatile ( ".insn r 0x73, 0x0, 0x31, x0, x0, x0" ::: "memory" );
>>> +#endif
>> TLB flushing needs to happen for each pCPU which potentially has cached
>> a mapping.
>>
>> In other arches, this is tracked by d->dirty_cpumask which is the bitmap
>> of pCPUs where this domain is scheduled.
> 
> I could only find usage of|d->dirty_cpumask| in x86 and common code (grant
> tables) for flushing the TLB. However, it seems that|d->dirty_cpumask| is
> not set anywhere for ARM. Is it sufficient to set a bit in|d->dirty_cpumask|
> in|startup_cpu_idle_loop()|?

No, how would the idle loop be relevant here? The bit needs setting for any
pCPU a vCPU of the domain is running on, i.e. somewhere in context switch
code.

> In addition, it’s also necessary to set and clear bits in|d->dirty_cpumask|
> during|context_switch|, correct? Set it before switching from the previous
> domain, and clear it after switching to the new domain?
> 
> Also, when a bit is set in|d->dirty_cpumask|, the|v->processor| value is also
> stored in|v->dirty_cpu|. Is this needed to track which processor is
> currently being used for the vCPU?
> 
>> CPUs need to flush their TLBs before removing themselves from
>> d->dirty_cpumask, which is typically done during context switch, but it
>> means that to flush the P2M, you only need to IPI a subset of CPUs.
> 
> I can't find where the P2M flush happens for x86/ARM. Could you please point me
> to where it is handled?

Grep for ept_sync_domain, which will give you several involved functions
(for the Intel, i.e. EPT case).

> Also, I found guest_flush_tlb_mask() for x86. I assume that it is x86 specific
> and generally it is enough to have only flush_tlb_mask(), right?

Yes, that's an x86-specific helper which you may or may not want a
counterpart of.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests
  2025-05-09 15:57 ` [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests Oleksii Kurochko
@ 2025-05-20 14:38   ` Jan Beulich
  2025-05-23 10:27     ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 14:38 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.05.2025 17:57, Oleksii Kurochko wrote:
> Implement p2m_set_allocation() to construct p2m pages pool for guests
> based on required number of pages.
> 
> This is implemented by:
> - Adding a `struct paging_domain` which contains a freelist, a
>   counter variable and a spinlock to `struct arch_domain` to
>   indicate the free p2m pages and the number of p2m total pages in
>   the p2m pages pool.
> - Adding a helper `p2m_set_allocation` to set the p2m pages pool
>   size. This helper should be called before allocating memory for
>   a guest and is called from domain_p2m_set_allocation(), the latter
>   is a part of common dom0less code.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

As already indicated in reply to patch 2, I expect this pool will want
introducing ahead of that.

> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -1,4 +1,12 @@
>  #include <xen/domain_page.h>
> +/*
> + * Because of general_preempt_check() from xen/sched.h which uses
> + * local_events_need_delivery() but latter is declared in <asm/event.h>.
> + * Thereby it is needed to icnlude <xen/event.h> here before xen/sched.h.
> + *
> + * Shouldn't be xen/event.h be included in <xen/sched.h>?
> + */
> +#include <xen/event.h>

The question doesn't belong here; such could be put in the post-commit-
message area. And the answer depends on what dependency you found missing.

> @@ -166,3 +176,60 @@ int p2m_init(struct domain *d)
>  
>      return 0;
>  }
> +
> +/*
> + * Set the pool of pages to the required number of pages.
> + * Returns 0 for success, non-zero for failure.
> + * Call with d->arch.paging.lock held.
> + */
> +int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
> +{
> +    struct page_info *pg;
> +
> +    ASSERT(spin_is_locked(&d->arch.paging.lock));
> +
> +    for ( ; ; )
> +    {
> +        if ( d->arch.paging.p2m_total_pages < pages )
> +        {
> +            /* Need to allocate more memory from domheap */
> +            pg = alloc_domheap_page(d, MEMF_no_owner);
> +            if ( pg == NULL )
> +            {
> +                printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
> +                return -ENOMEM;
> +            }
> +            ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
> +                d->arch.paging.p2m_total_pages + 1;

Looks like you copied this from Arm, but this code is bogus: Using
ACCESS_ONCE() just on the lhs is pretty pointless. Once also used on the
rhs the expression can easily become

                ACCESS_ONCE(d->arch.paging.p2m_total_pages) += 1;

or even

                ACCESS_ONCE(d->arch.paging.p2m_total_pages)++;

.

> +            page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
> +        }
> +        else if ( d->arch.paging.p2m_total_pages > pages )
> +        {
> +            /* Need to return memory to domheap */
> +            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
> +            if( pg )
> +            {
> +                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
> +                    d->arch.paging.p2m_total_pages - 1;

Same here then, obviously.

> +                free_domheap_page(pg);
> +            }
> +            else
> +            {
> +                printk(XENLOG_ERR
> +                       "Failed to free P2M pages, P2M freelist is empty.\n");
> +                return -ENOMEM;
> +            }
> +        }
> +        else
> +            break;
> +
> +        /* Check to see if we need to yield and try again */
> +        if ( preempted && general_preempt_check() )

While it's this way on both Arm and x86, I wonder how useful it is
to check on every iteration, especially when freeing pages back to the
buddy allocator.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures
  2025-05-09 15:57 ` [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures Oleksii Kurochko
@ 2025-05-20 15:04   ` Jan Beulich
  2025-05-23 10:48     ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 15:04 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.05.2025 17:57, Oleksii Kurochko wrote:
> Refactor pte_t to be a union which hold page table entry plus
> pt_t and pt_walk_t structures to simpilfy p2m functions.

Is this really simplifying things? I really view ...

> Also, introduce some helpers which are using pt_walk_t.

... these helpers as confusing things, by using the wrong part of the
union relative to what their names are. (I'll re-phrase this some at
the bottom.)

With the union it's also unclear to me how to know which part of the
union is the one that's valid to use, when there's no auxiliary info
available.

> --- a/xen/arch/riscv/include/asm/page.h
> +++ b/xen/arch/riscv/include/asm/page.h
> @@ -99,15 +99,67 @@
>  
>  #endif
>  
> -/* Page Table entry */
>  typedef struct {
> +    unsigned long v:1;
> +    unsigned long r:1;
> +    unsigned long w:1;
> +    unsigned long x:1;
> +    unsigned long u:1;
> +    unsigned long g:1;
> +    unsigned long a:1;
> +    unsigned long d:1;
> +    unsigned long rsw:2;
> +#if RV_STAGE1_MODE == SATP_MODE_SV39
> +    unsigned long ppn0:9;
> +    unsigned long ppn1:9;
> +    unsigned long ppn2:26;
> +    unsigned long rsw2:7;
> +    unsigned long pbmt:2;
> +    unsigned long n:1;
> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
> +    unsigned long ppn0:9;
> +    unsigned long ppn1:9;
> +    unsigned long ppn2:9;
> +    unsigned long ppn3:17;
> +    unsigned long rsw2:7;
> +    unsigned long pbmt:2;
> +    unsigned long n:1;
> +#else

Imo you want to settle on whether you want to use bitfields or #define-s
to manipulate page table entries. Using a mix is going to be confusing
(or worse).

> +#error "Add proper bits for SATP_MODE"
> +#endif
> +} pt_t;
> +
> +typedef struct {
> +    unsigned long rsw:10;
> +#if RV_STAGE1_MODE == SATP_MODE_SV39 || RV_STAGE1_MODE == SATP_MODE_SV48
> +    unsigned long ppn: 44;

Nit: Why's there a blank after the colon here when there's none anywhere else?

> +static inline void pte_set_mfn(pte_t *pte, mfn_t mfn)
> +{
> +    pte->walk.ppn = mfn_x(mfn);
> +}
> +
> +static inline mfn_t pte_get_mfn(pte_t pte)
> +{
> +    return _mfn(pte.walk.ppn);
> +}

Following to my remark at the top - if you do it this way, what use are the
ppn<N> fields?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification
  2025-05-09 15:57 ` [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification Oleksii Kurochko
@ 2025-05-20 15:11   ` Jan Beulich
  2025-05-23 11:34     ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 15:11 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.05.2025 17:57, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/p2m.h
> +++ b/xen/arch/riscv/include/asm/p2m.h
> @@ -80,8 +80,36 @@ struct p2m_domain {
>  typedef enum {
>      p2m_invalid = 0,    /* Nothing mapped here */
>      p2m_ram_rw,         /* Normal read/write domain RAM */
> +    p2m_ram_ro,         /* Read-only; writes are silently dropped */

This is pretty special a type, which imo better wouldn't be introduced
without there being proper support for it. (I don't suppose RISC-V
hardware alone can effect this type?)

> +    p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
> +    p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
> +    p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */

Aiui you took these from Arm. Looking at its sole use, I'm not convinced
it's used correctly. If it is, the same comment as for p2m_ram_ro above
would apply here, too.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality
  2025-05-09 15:57 ` [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality Oleksii Kurochko
@ 2025-05-20 15:16   ` Jan Beulich
  2025-05-23 11:47     ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-20 15:16 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.05.2025 17:57, Oleksii Kurochko wrote:
> These utilities are needed for building and managing RISC-V guest page
> tables and MMIO mappings by using functions map_regions_p2mt() and
> guest_physmap_add_entry().
> 
> To implement p2m mapping functionality the following is introduced:
> - Define P2M root level/order and entry count.
> - Introdude radix type for p2m types as it isn't enough free bits in pte
>   and the helpers (p2m_type_radix_{get,set}()) to deal with them.
> - Introduce p2m_is_*() helpers() as  pte_is_*() helpers are checking
>   the valid bit set in the PTE but we have to check p2m_type instead
>   (look at the comment above p2m_is_valid() for some details).

May I suggest to name them at least p2me_is_*() then, as they check entries
rather than entire P2Ms? Same perhaps elsewhere.

> - Introduce helper to set p2m's pte permission: p2m_set_permissions().
> - Introduce helper to create p2m entry based on mfn, p2m_type_t and
>   p2m_access_t.
> - Introduce helper to generate table entry with correct attributes:
>   page_to_p2m_table().
> - Introduce p2m page allocation function: p2m_alloc_page().
> - Introduce functions to write/remove p2m's entries: p2m_{write,remove}_pte().
> - Introduce function to allocate p2m table: p2m_create_table().
> - Introduce functions used to free p2m entry.
> - Introduce function for table walking: p2m_next_level().
> - Introduce function to insert an entry in the p2m (p2m_set_entry()).
> - Introduce superpage splitting: p2m_split_superpage()).
> - Introduce page table type defines (PGT_{none,writable_page}, etc).
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
>  xen/arch/riscv/include/asm/mm.h   |  32 +-
>  xen/arch/riscv/include/asm/p2m.h  |  17 +-
>  xen/arch/riscv/include/asm/page.h |  11 +
>  xen/arch/riscv/p2m.c              | 780 ++++++++++++++++++++++++++++++
>  4 files changed, 829 insertions(+), 11 deletions(-)

It's imo too many things you do in one go here.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-20 13:37   ` Jan Beulich
@ 2025-05-22 15:53     ` Oleksii Kurochko
  2025-05-22 16:09       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-22 15:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 7299 bytes --]

On 5/20/25 3:37 PM, Jan Beulich wrote:
> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>> --- /dev/null
>> +++ b/xen/arch/riscv/p2m.c
>> +static void clear_and_clean_page(struct page_info *page)
>> +{
>> +    void *p = __map_domain_page(page);
>> +
>> +    clear_page(p);
>> +    unmap_domain_page(p);
>> +}
> What's the "clean" about in the function name? The "clear" is referring
> to the clear_page() call afaict.

Missed to add clean_dcache_va_range() between clear_page() and unmap_domain_page().

> Also aren't you largely open-coding
> clear_domain_page() here?

Yes, missed that it is almost the sane as clear_domain_page(), so we could re-write
this function as:
   static void clear_and_clean_page(struct page_info *page)
   {
       clean_dcache_va_range(page, PAGE_SIZE);
       clear_domain_page(page_to_mfn(page));
   }

>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>> +{
>> +    struct page_info *page;
>> +
>> +    /*
>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>> +     * and must be aligned to a 16-KiB boundary.
>> +     */
>> +    page = alloc_domheap_pages(NULL, 2, 0);
> Shouldn't this allocation come from the domain's P2M pool (which is yet
> to be introduced)?

First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
table allocation.

p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
[https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].

When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
so domain should be created first.

And only after domain_create() will created domain, p2m pool could be initialized during
domain construction:
   https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L756
and the size of p2m pool depends on the value from memory property of domain node in DT.
(line 748, the link the same as above).

Also, if CONFIG_ARCH_PAGING_MEMPOOL=n, then p2m pool isn't used. But it isn't a case for RISC-V
for the moment. Probably one day it would be useful if someone wanted to add support for MMU-less
case. Something like Arm is doing now for R-cores.

> Also hard-coding 2 here as order effectively builds
> in an assumption that PAGE_SIZE will only ever be 4k. I think to wants
> properly calculating instead.

I haven't thought about that. I will update it with:
   page = alloc_domheap_pages(NULL, get_order_from_bytes(KB(16)), 0);

>
>> +    if ( page == NULL )
>> +        return NULL;
>> +
>> +    clear_and_clean_page(page);
>> +
>> +    return page;
>> +}
> Contrary to the function name you obtained 4 pages here, which is suitable
> for ...
>
>> +static struct page_info *p2m_allocate_root(struct domain *d)
>> +{
>> +    return p2m_get_clean_page(d);
>> +}
> ... this but - I expect - no anywhere else.

Totally agree, as mentioned above this function is used only for p2m_allocate_root().
I will just open-code it in p2m_allocate_root().

>> +{
>> +    unsigned long ppn;
>> +    unsigned long hgatp_mode;
>> +
>> +    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
>> +
>> +    /* ASID (VMID) not supported yet */
>> +
>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>> +    hgatp_mode = HGATP_MODE_SV39X4;
>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>> +    hgatp_mode = HGATP_MODE_SV48X4;
>> +#else
>> +    #error "add HGATP_MODE"
> As before, please have the # of pre-processor directives in the first column.
>
>> +#endif
>> +
>> +    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);
> Use MASK_INSR()?

Do you mean MASK_INSR(hgatp_mode, HGATP_MODE_MASK)?
If yes, then I didn't get what is the point then?

>
>> +}
>> +
>> +static int p2m_alloc_table(struct domain *d)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +    p2m->root = p2m_allocate_root(d);
>> +    if ( !p2m->root )
>> +        return -ENOMEM;
>> +
>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>> +
>> +    /*
>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>> +     * before using it.
>> +     */
>> +    p2m_write_lock(p2m);
>> +    p2m_force_tlb_flush_sync(p2m);
>> +    p2m_write_unlock(p2m);
> While Andrew directed you towards a better model in general, it won't be
> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
> want to do a single global flush e.g. when VMIDs wrap around. That'll be
> fewer global flushes than one per VM creation.

I am not sure that I get a phrase 'VMIDs wrap around'.

I am going to implement, p2m_force_tlb_flush_sync() as:
  static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
  {
    ...
      sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
    ...
  }

With such implementation if the guest didn't run on any pCPU(s) yet
then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
I am not sure that it is a good idea as I can't find a guarantee in the spec
that TLB will be empty during boot time.

But if another VM is being created then we should flush stage2 before run a VM
so, the new VM won't re-use something from the old VM.
Or in case of VMID if VMID is reused by new VM in case if, for example, the
previous owner(domain) was destroyed and a new domain is reusing VMID, it is
needed to flush stage2.

p2m_alloc_table() looks a good place for that and I am not sure that we can
do a single global flush, and I don't really know in first glance where it
should be done.

>> +    p2m->default_access = p2m_access_rwx;
>> +
>> +    radix_tree_init(&p2m->p2m_type);
>> +
>> +#ifdef CONFIG_HAS_PASSTHROUGH
>> +    /*
>> +     * Some IOMMUs don't support coherent PT walk. When the p2m is
>> +     * shared with the CPU, Xen has to make sure that the PT changes have
>> +     * reached the memory
>> +     */
>> +    p2m->clean_pte = is_iommu_enabled(d) &&
>> +        !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
>> +#else
>> +    p2m->clean_pte = true;
> When there's no IOMMU (in use), doesn't this want to be "false"?

I think you are right, "false" is more correct here.

>
>> +#endif
>> +
>> +    /*
>> +     * "Trivial" initialisation is now complete.  Set the backpointer so
>> +     * p2m_teardown() and friends know to do something.
>> +     */
>> +    p2m->domain = d;
> And where is that p2m_teardown(), to cross-check the comment against?

It is not introduced now as I expected it is need only when domain is needed to
be stop for some reason. And it isn't really needed now.

Anyway, it seems like it is a stale comment as on other arch-es p2m_teardown() has
an argument with struct domain *d.

I can update the commit to:
  "Trivial" initialisation is now complete.  Set the backpointer so the users of p2m
   could get an access to domain structure.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 10563 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-22 15:53     ` Oleksii Kurochko
@ 2025-05-22 16:09       ` Jan Beulich
  2025-05-23  9:44         ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-05-22 16:09 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 22.05.2025 17:53, Oleksii Kurochko wrote:
> On 5/20/25 3:37 PM, Jan Beulich wrote:
>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>>> +{
>>> +    struct page_info *page;
>>> +
>>> +    /*
>>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>> +     * and must be aligned to a 16-KiB boundary.
>>> +     */
>>> +    page = alloc_domheap_pages(NULL, 2, 0);
>> Shouldn't this allocation come from the domain's P2M pool (which is yet
>> to be introduced)?
> 
> First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
> table allocation.
> 
> p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].
> 
> When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
> is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
> using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
> so domain should be created first.

Yet that is part of my point: This allocation should be against the domain,
so it is properly accounted. What's the problem with allocating the root
table when the pools is being created / filled?

>>> +{
>>> +    unsigned long ppn;
>>> +    unsigned long hgatp_mode;
>>> +
>>> +    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
>>> +
>>> +    /* ASID (VMID) not supported yet */
>>> +
>>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>>> +    hgatp_mode = HGATP_MODE_SV39X4;
>>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>>> +    hgatp_mode = HGATP_MODE_SV48X4;
>>> +#else
>>> +    #error "add HGATP_MODE"
>> As before, please have the # of pre-processor directives in the first column.
>>
>>> +#endif
>>> +
>>> +    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);
>> Use MASK_INSR()?
> 
> Do you mean MASK_INSR(hgatp_mode, HGATP_MODE_MASK)?
> If yes, then I didn't get what is the point then?

The point is that generally ..._SHIFT is redundant when you also have
..._MASK; that's what MASK_EXTR() and MASK_INSR() leverage.

>>> +static int p2m_alloc_table(struct domain *d)
>>> +{
>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +    p2m->root = p2m_allocate_root(d);
>>> +    if ( !p2m->root )
>>> +        return -ENOMEM;
>>> +
>>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>>> +
>>> +    /*
>>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>>> +     * before using it.
>>> +     */
>>> +    p2m_write_lock(p2m);
>>> +    p2m_force_tlb_flush_sync(p2m);
>>> +    p2m_write_unlock(p2m);
>> While Andrew directed you towards a better model in general, it won't be
>> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
>> want to do a single global flush e.g. when VMIDs wrap around. That'll be
>> fewer global flushes than one per VM creation.
> 
> I am not sure that I get a phrase 'VMIDs wrap around'.

You have to allocate them somehow. Typically you'll use the next one available.
At some point you will need to start over, searching from the beginning. Prior
to that now allocation of a new one will require any flush, as none of them
had be in use before (after boot or the last such flush).

> I am going to implement, p2m_force_tlb_flush_sync() as:
>   static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>   {
>     ...
>       sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
>     ...
>   }
> 
> With such implementation if the guest didn't run on any pCPU(s) yet
> then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
> as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
> I am not sure that it is a good idea as I can't find a guarantee in the spec
> that TLB will be empty during boot time.

If in doubt, do one global flush while booting.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-22 16:09       ` Jan Beulich
@ 2025-05-23  9:44         ` Oleksii Kurochko
  2025-06-02 11:04           ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-23  9:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 7233 bytes --]


On 5/22/25 6:09 PM, Jan Beulich wrote:
> On 22.05.2025 17:53, Oleksii Kurochko wrote:
>> On 5/20/25 3:37 PM, Jan Beulich wrote:
>>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>>>> +{
>>>> +    struct page_info *page;
>>>> +
>>>> +    /*
>>>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>>> +     * and must be aligned to a 16-KiB boundary.
>>>> +     */
>>>> +    page = alloc_domheap_pages(NULL, 2, 0);
>>> Shouldn't this allocation come from the domain's P2M pool (which is yet
>>> to be introduced)?
>> First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
>> table allocation.
>>
>> p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
>> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].
>>
>> When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
>> is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
>> using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
>> so domain should be created first.
> Yet that is part of my point: This allocation should be against the domain,
> so it is properly accounted. What's the problem with allocating the root
> table when the pools is being created / filled?

I can't use pages from pool for root table as they aren't properly aligned.

At the moment, creation of p2m pool looks like:
  int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
  {
      struct page_info *pg;

      ASSERT(spin_is_locked(&d->arch.paging.lock));

      for ( ; ; )
      {
          if ( d->arch.paging.p2m_total_pages < pages )
          {
              /* Need to allocate more memory from domheap */
              pg = alloc_domheap_page(d, MEMF_no_owner);
              if ( pg == NULL )
              {
                  printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
                  return -ENOMEM;
              }
              ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
                  d->arch.paging.p2m_total_pages + 1;
              page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
          }
          ...
      }

      return 0;
  }
alloc_domheap_page(d, MEMF_no_owner) allocates page table with order 0, so 4k-aligned page table.
But if I needed 16k for root table and it should be 16k-aligned then I still have to use
alloc_domheap_pages(NULL, 2, 0);

Or do you mean that I have to something like:
  int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
  {
      struct page_info *pg;
  
      ASSERT(spin_is_locked(&d->arch.paging.lock));
  
+    if ( !d->arch.p2m.root )
+    {
+        unsigned int order = get_order_from_bytes(KB(16));
+        unsigned int nr_pages = _AC(1,U) << order;
+        /*
+        * As mentioned in the Priviliged Architecture Spec (version 20240411)
+        * As explained in Section 18.5.1, for the paged virtual-memory schemes
+        * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
+        * and must be aligned to a 16-KiB boundary.
+        */
+        d->arch.p2m.root = alloc_domheap_pages(d, order, MEMF_no_owner);
+        if ( d->arch.p2m.root == NULL )
+            panic("root page table hasn't been allocated\n");
+
+        clear_and_clean_page(d->arch.p2m.root);
+
+        /* TODO: do I need TLB flush here? */
+
+        ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
+            d->arch.paging.p2m_total_pages + nr_pages;
+    }
+
...
}
(I will the current version of p2m_alloc_table() instead of open-coding.)


>>>> +{
>>>> +    unsigned long ppn;
>>>> +    unsigned long hgatp_mode;
>>>> +
>>>> +    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
>>>> +
>>>> +    /* ASID (VMID) not supported yet */
>>>> +
>>>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>>>> +    hgatp_mode = HGATP_MODE_SV39X4;
>>>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>>>> +    hgatp_mode = HGATP_MODE_SV48X4;
>>>> +#else
>>>> +    #error "add HGATP_MODE"
>>> As before, please have the # of pre-processor directives in the first column.
>>>
>>>> +#endif
>>>> +
>>>> +    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);
>>> Use MASK_INSR()?
>> Do you mean MASK_INSR(hgatp_mode, HGATP_MODE_MASK)?
>> If yes, then I didn't get what is the point then?
> The point is that generally ..._SHIFT is redundant when you also have
> ..._MASK; that's what MASK_EXTR() and MASK_INSR() leverage.

At the moment, there is no mask for HGATP_MODE so if to use *_MASK then I
have to introduce it if it better to have *_MASK instead of *_SHIFT.

>
>>>> +static int p2m_alloc_table(struct domain *d)
>>>> +{
>>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>> +
>>>> +    p2m->root = p2m_allocate_root(d);
>>>> +    if ( !p2m->root )
>>>> +        return -ENOMEM;
>>>> +
>>>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>>>> +
>>>> +    /*
>>>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>>>> +     * before using it.
>>>> +     */
>>>> +    p2m_write_lock(p2m);
>>>> +    p2m_force_tlb_flush_sync(p2m);
>>>> +    p2m_write_unlock(p2m);
>>> While Andrew directed you towards a better model in general, it won't be
>>> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
>>> want to do a single global flush e.g. when VMIDs wrap around. That'll be
>>> fewer global flushes than one per VM creation.
>> I am not sure that I get a phrase 'VMIDs wrap around'.
> You have to allocate them somehow. Typically you'll use the next one available.
> At some point you will need to start over, searching from the beginning. Prior
> to that now allocation of a new one will require any flush, as none of them
> had be in use before (after boot or the last such flush).

Thanks. Now I get your point.

Won't be better to do TLB flushing during destroying of a domain so then we will
be sure that TLBs connected to freed VMID aren't present in TLB anymore?

IIUC, it will work only if VMID is used, right?
In case if VMID isn't used, probably we can drop flushing here and do a flush
during booting, right?
Won't be enough to flushing of guest TLB only during context switch?

>
>> I am going to implement, p2m_force_tlb_flush_sync() as:
>>    static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>    {
>>      ...
>>        sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
>>      ...
>>    }
>>
>> With such implementation if the guest didn't run on any pCPU(s) yet
>> then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
>> as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
>> I am not sure that it is a good idea as I can't find a guarantee in the spec
>> that TLB will be empty during boot time.
> If in doubt, do one global flush while booting.

By booting you mean somewhere in continue_new_vcpu()?

~ Oleksii


[-- Attachment #2: Type: text/html, Size: 9663 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests
  2025-05-20 14:38   ` Jan Beulich
@ 2025-05-23 10:27     ` Oleksii Kurochko
  2025-06-02 11:08       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-23 10:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 5199 bytes --]


On 5/20/25 4:38 PM, Jan Beulich wrote:
> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>> Implement p2m_set_allocation() to construct p2m pages pool for guests
>> based on required number of pages.
>>
>> This is implemented by:
>> - Adding a `struct paging_domain` which contains a freelist, a
>>    counter variable and a spinlock to `struct arch_domain` to
>>    indicate the free p2m pages and the number of p2m total pages in
>>    the p2m pages pool.
>> - Adding a helper `p2m_set_allocation` to set the p2m pages pool
>>    size. This helper should be called before allocating memory for
>>    a guest and is called from domain_p2m_set_allocation(), the latter
>>    is a part of common dom0less code.
>>
>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
> As already indicated in reply to patch 2, I expect this pool will want
> introducing ahead of that.
>
>> --- a/xen/arch/riscv/p2m.c
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -1,4 +1,12 @@
>>   #include <xen/domain_page.h>
>> +/*
>> + * Because of general_preempt_check() from xen/sched.h which uses
>> + * local_events_need_delivery() but latter is declared in <asm/event.h>.
>> + * Thereby it is needed to icnlude <xen/event.h> here before xen/sched.h.
>> + *
>> + * Shouldn't be xen/event.h be included in <xen/sched.h>?
>> + */
>> +#include <xen/event.h>
> The question doesn't belong here; such could be put in the post-commit-
> message area. And the answer depends on what dependency you found missing.

It is needed for local_events_need_delivery() which is used by general_preempt_check()
in p2m_set_allocation(). Otherwise, the following issue will occur:

In file included from ././include/xen/config.h:17,
                  from <command-line>:
arch/riscv/p2m.c: In function 'p2m_set_allocation':
./include/xen/sched.h:941:36: error: implicit declaration of function 'local_events_need_delivery' [-Werror=implicit-function-declaration]
   941 |         (!is_idle_vcpu(current) && local_events_need_delivery())    \
       |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/xen/compiler.h:26:43: note: in definition of macro 'unlikely'
    26 | #define unlikely(x)   __builtin_expect(!!(x),0)
       |                                           ^
arch/riscv/p2m.c:244:27: note: in expansion of macro 'general_preempt_check'
   244 |         if ( preempted && general_preempt_check() )
       |                           ^~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

>
>> @@ -166,3 +176,60 @@ int p2m_init(struct domain *d)
>>   
>>       return 0;
>>   }
>> +
>> +/*
>> + * Set the pool of pages to the required number of pages.
>> + * Returns 0 for success, non-zero for failure.
>> + * Call with d->arch.paging.lock held.
>> + */
>> +int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>> +{
>> +    struct page_info *pg;
>> +
>> +    ASSERT(spin_is_locked(&d->arch.paging.lock));
>> +
>> +    for ( ; ; )
>> +    {
>> +        if ( d->arch.paging.p2m_total_pages < pages )
>> +        {
>> +            /* Need to allocate more memory from domheap */
>> +            pg = alloc_domheap_page(d, MEMF_no_owner);
>> +            if ( pg == NULL )
>> +            {
>> +                printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>> +                return -ENOMEM;
>> +            }
>> +            ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>> +                d->arch.paging.p2m_total_pages + 1;
> Looks like you copied this from Arm, but this code is bogus: Using
> ACCESS_ONCE() just on the lhs is pretty pointless. Once also used on the
> rhs the expression can easily become
>
>                  ACCESS_ONCE(d->arch.paging.p2m_total_pages) += 1;
>
> or even
>
>                  ACCESS_ONCE(d->arch.paging.p2m_total_pages)++;
>
> .
>
>> +            page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>> +        }
>> +        else if ( d->arch.paging.p2m_total_pages > pages )
>> +        {
>> +            /* Need to return memory to domheap */
>> +            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
>> +            if( pg )
>> +            {
>> +                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>> +                    d->arch.paging.p2m_total_pages - 1;
> Same here then, obviously.
>
>> +                free_domheap_page(pg);
>> +            }
>> +            else
>> +            {
>> +                printk(XENLOG_ERR
>> +                       "Failed to free P2M pages, P2M freelist is empty.\n");
>> +                return -ENOMEM;
>> +            }
>> +        }
>> +        else
>> +            break;
>> +
>> +        /* Check to see if we need to yield and try again */
>> +        if ( preempted && general_preempt_check() )
> While it's this way on both Arm and x86, I wonder how useful it is
> to check on every iteration, especially when freeing pages back to the
> buddy allocator.

IIUC, but a preemption request could happen for both cases. And destroying of
a domain could also consume long time and so not to block hypervisor if something
more urgent should be handled it could be also have this check for the case of
freeng pages back to the buddy allocator.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 6408 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures
  2025-05-20 15:04   ` Jan Beulich
@ 2025-05-23 10:48     ` Oleksii Kurochko
  0 siblings, 0 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-23 10:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2925 bytes --]


On 5/20/25 5:04 PM, Jan Beulich wrote:
> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>> Refactor pte_t to be a union which hold page table entry plus
>> pt_t and pt_walk_t structures to simpilfy p2m functions.
> Is this really simplifying things? I really view ...
>
>> Also, introduce some helpers which are using pt_walk_t.
> ... these helpers as confusing things, by using the wrong part of the
> union relative to what their names are. (I'll re-phrase this some at
> the bottom.)
>
> With the union it's also unclear to me how to know which part of the
> union is the one that's valid to use, when there's no auxiliary info
> available.

Everything is valid to use and depends on the context if it is convenient
or not. It is hard to come up with a rule when and what should be used.

>
>> --- a/xen/arch/riscv/include/asm/page.h
>> +++ b/xen/arch/riscv/include/asm/page.h
>> @@ -99,15 +99,67 @@
>>   
>>   #endif
>>   
>> -/* Page Table entry */
>>   typedef struct {
>> +    unsigned long v:1;
>> +    unsigned long r:1;
>> +    unsigned long w:1;
>> +    unsigned long x:1;
>> +    unsigned long u:1;
>> +    unsigned long g:1;
>> +    unsigned long a:1;
>> +    unsigned long d:1;
>> +    unsigned long rsw:2;
>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>> +    unsigned long ppn0:9;
>> +    unsigned long ppn1:9;
>> +    unsigned long ppn2:26;
>> +    unsigned long rsw2:7;
>> +    unsigned long pbmt:2;
>> +    unsigned long n:1;
>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>> +    unsigned long ppn0:9;
>> +    unsigned long ppn1:9;
>> +    unsigned long ppn2:9;
>> +    unsigned long ppn3:17;
>> +    unsigned long rsw2:7;
>> +    unsigned long pbmt:2;
>> +    unsigned long n:1;
>> +#else
> Imo you want to settle on whether you want to use bitfields or #define-s
> to manipulate page table entries. Using a mix is going to be confusing
> (or worse).

Generally, I am okay to use something one.
But technically it is the same things from result point of view, just
different is access of a field by using a union or do a bit manipulation operations.

>
>> +#error "Add proper bits for SATP_MODE"
>> +#endif
>> +} pt_t;
>> +
>> +typedef struct {
>> +    unsigned long rsw:10;
>> +#if RV_STAGE1_MODE == SATP_MODE_SV39 || RV_STAGE1_MODE == SATP_MODE_SV48
>> +    unsigned long ppn: 44;
> Nit: Why's there a blank after the colon here when there's none anywhere else?
>
>> +static inline void pte_set_mfn(pte_t *pte, mfn_t mfn)
>> +{
>> +    pte->walk.ppn = mfn_x(mfn);
>> +}
>> +
>> +static inline mfn_t pte_get_mfn(pte_t pte)
>> +{
>> +    return _mfn(pte.walk.ppn);
>> +}
> Following to my remark at the top - if you do it this way, what use are the
> ppn<N> fields?

|ppn<N>| isn't actually used; it was added only to follow the PTE format specified
in the architecture spec. Technically,|ppn<N>| could be merged into|ppn|,
but IMO, keeping|ppn<N>| improves self-documentation of the page table format.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 4414 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification
  2025-05-20 15:11   ` Jan Beulich
@ 2025-05-23 11:34     ` Oleksii Kurochko
  2025-06-02 11:12       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-23 11:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]


On 5/20/25 5:11 PM, Jan Beulich wrote:
> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/include/asm/p2m.h
>> +++ b/xen/arch/riscv/include/asm/p2m.h
>> @@ -80,8 +80,36 @@ struct p2m_domain {
>>   typedef enum {
>>       p2m_invalid = 0,    /* Nothing mapped here */
>>       p2m_ram_rw,         /* Normal read/write domain RAM */
>> +    p2m_ram_ro,         /* Read-only; writes are silently dropped */
> This is pretty special a type, which imo better wouldn't be introduced
> without there being proper support for it. (I don't suppose RISC-V
> hardware alone can effect this type?)

It is possible to make ro by using r, w, x bits of page table entry in the
same way Arm does that:
     case p2m_ram_ro:
         e->p2m.xn = 0;
         e->p2m.write = 0;
         break;

>
>> +    p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
>> +    p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
>> +    p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
> Aiui you took these from Arm. Looking at its sole use, I'm not convinced
> it's used correctly. If it is, the same comment as for p2m_ram_ro above
> would apply here, too.

p2m_mmio_direct_dev - this one is defintely needed as it is used for device
pass through to guest domain to map device's MMIO. It seems to me like it is
correctly used.

Others we don't really use now in private branches but it seems like they could be
useful, so I added them now.

I can drop them now and return back if such functionality which uses them will be
introduced for RISC-V, and at that moment I think it will be
more clear if it is used correctly or not.
Right now, I am not sure if it is.

~ Oleksii


[-- Attachment #2: Type: text/html, Size: 2529 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality
  2025-05-20 15:16   ` Jan Beulich
@ 2025-05-23 11:47     ` Oleksii Kurochko
  0 siblings, 0 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-05-23 11:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2166 bytes --]


On 5/20/25 5:16 PM, Jan Beulich wrote:
> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>> These utilities are needed for building and managing RISC-V guest page
>> tables and MMIO mappings by using functions map_regions_p2mt() and
>> guest_physmap_add_entry().
>>
>> To implement p2m mapping functionality the following is introduced:
>> - Define P2M root level/order and entry count.
>> - Introdude radix type for p2m types as it isn't enough free bits in pte
>>    and the helpers (p2m_type_radix_{get,set}()) to deal with them.
>> - Introduce p2m_is_*() helpers() as  pte_is_*() helpers are checking
>>    the valid bit set in the PTE but we have to check p2m_type instead
>>    (look at the comment above p2m_is_valid() for some details).
> May I suggest to name them at least p2me_is_*() then, as they check entries
> rather than entire P2Ms? Same perhaps elsewhere.

Sure, I will handle that during a work on v2.

>
>> - Introduce helper to set p2m's pte permission: p2m_set_permissions().
>> - Introduce helper to create p2m entry based on mfn, p2m_type_t and
>>    p2m_access_t.
>> - Introduce helper to generate table entry with correct attributes:
>>    page_to_p2m_table().
>> - Introduce p2m page allocation function: p2m_alloc_page().
>> - Introduce functions to write/remove p2m's entries: p2m_{write,remove}_pte().
>> - Introduce function to allocate p2m table: p2m_create_table().
>> - Introduce functions used to free p2m entry.
>> - Introduce function for table walking: p2m_next_level().
>> - Introduce function to insert an entry in the p2m (p2m_set_entry()).
>> - Introduce superpage splitting: p2m_split_superpage()).
>> - Introduce page table type defines (PGT_{none,writable_page}, etc).
>>
>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
>> ---
>>   xen/arch/riscv/include/asm/mm.h   |  32 +-
>>   xen/arch/riscv/include/asm/p2m.h  |  17 +-
>>   xen/arch/riscv/include/asm/page.h |  11 +
>>   xen/arch/riscv/p2m.c              | 780 ++++++++++++++++++++++++++++++
>>   4 files changed, 829 insertions(+), 11 deletions(-)
> It's imo too many things you do in one go here.

I will split to smaller patches.

Thanks.

~ Oleksii


[-- Attachment #2: Type: text/html, Size: 2999 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-05-23  9:44         ` Oleksii Kurochko
@ 2025-06-02 11:04           ` Jan Beulich
  2025-06-05 14:10             ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-06-02 11:04 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.05.2025 11:44, Oleksii Kurochko wrote:
> On 5/22/25 6:09 PM, Jan Beulich wrote:
>> On 22.05.2025 17:53, Oleksii Kurochko wrote:
>>> On 5/20/25 3:37 PM, Jan Beulich wrote:
>>>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>>>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>>>>> +{
>>>>> +    struct page_info *page;
>>>>> +
>>>>> +    /*
>>>>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>>>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>>>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>>>> +     * and must be aligned to a 16-KiB boundary.
>>>>> +     */
>>>>> +    page = alloc_domheap_pages(NULL, 2, 0);
>>>> Shouldn't this allocation come from the domain's P2M pool (which is yet
>>>> to be introduced)?
>>> First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
>>> table allocation.
>>>
>>> p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
>>> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].
>>>
>>> When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
>>> is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
>>> using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
>>> so domain should be created first.
>> Yet that is part of my point: This allocation should be against the domain,
>> so it is properly accounted. What's the problem with allocating the root
>> table when the pools is being created / filled?
> 
> I can't use pages from pool for root table as they aren't properly aligned.
> 
> At the moment, creation of p2m pool looks like:
>   int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>   {
>       struct page_info *pg;
> 
>       ASSERT(spin_is_locked(&d->arch.paging.lock));
> 
>       for ( ; ; )
>       {
>           if ( d->arch.paging.p2m_total_pages < pages )
>           {
>               /* Need to allocate more memory from domheap */
>               pg = alloc_domheap_page(d, MEMF_no_owner);
>               if ( pg == NULL )
>               {
>                   printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>                   return -ENOMEM;
>               }
>               ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>                   d->arch.paging.p2m_total_pages + 1;
>               page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>           }
>           ...
>       }
> 
>       return 0;
>   }
> alloc_domheap_page(d, MEMF_no_owner) allocates page table with order 0, so 4k-aligned page table.
> But if I needed 16k for root table and it should be 16k-aligned then I still have to use
> alloc_domheap_pages(NULL, 2, 0);
> 
> Or do you mean that I have to something like:
>   int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>   {
>       struct page_info *pg;
>   
>       ASSERT(spin_is_locked(&d->arch.paging.lock));
>   
> +    if ( !d->arch.p2m.root )
> +    {
> +        unsigned int order = get_order_from_bytes(KB(16));
> +        unsigned int nr_pages = _AC(1,U) << order;
> +        /*
> +        * As mentioned in the Priviliged Architecture Spec (version 20240411)
> +        * As explained in Section 18.5.1, for the paged virtual-memory schemes
> +        * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
> +        * and must be aligned to a 16-KiB boundary.
> +        */
> +        d->arch.p2m.root = alloc_domheap_pages(d, order, MEMF_no_owner);
> +        if ( d->arch.p2m.root == NULL )
> +            panic("root page table hasn't been allocated\n");
> +
> +        clear_and_clean_page(d->arch.p2m.root);
> +
> +        /* TODO: do I need TLB flush here? */
> +
> +        ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
> +            d->arch.paging.p2m_total_pages + nr_pages;
> +    }
> +
> ...
> }

Neither. I was thinking of you taking 4 pages off the pool in exchange for the
order-2 allocation. Primarily to get the memory accounting right (or at least
closer to reality).

>>>>> +{
>>>>> +    unsigned long ppn;
>>>>> +    unsigned long hgatp_mode;
>>>>> +
>>>>> +    ppn = PFN_DOWN(page_to_maddr(page_info)) & HGATP_PPN;
>>>>> +
>>>>> +    /* ASID (VMID) not supported yet */
>>>>> +
>>>>> +#if RV_STAGE1_MODE == SATP_MODE_SV39
>>>>> +    hgatp_mode = HGATP_MODE_SV39X4;
>>>>> +#elif RV_STAGE1_MODE == SATP_MODE_SV48
>>>>> +    hgatp_mode = HGATP_MODE_SV48X4;
>>>>> +#else
>>>>> +    #error "add HGATP_MODE"
>>>> As before, please have the # of pre-processor directives in the first column.
>>>>
>>>>> +#endif
>>>>> +
>>>>> +    return ppn | (hgatp_mode << HGATP_MODE_SHIFT);
>>>> Use MASK_INSR()?
>>> Do you mean MASK_INSR(hgatp_mode, HGATP_MODE_MASK)?
>>> If yes, then I didn't get what is the point then?
>> The point is that generally ..._SHIFT is redundant when you also have
>> ..._MASK; that's what MASK_EXTR() and MASK_INSR() leverage.
> 
> At the moment, there is no mask for HGATP_MODE so if to use *_MASK then I
> have to introduce it if it better to have *_MASK instead of *_SHIFT.

Perhaps best to do so then.

>>>>> +static int p2m_alloc_table(struct domain *d)
>>>>> +{
>>>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>>> +
>>>>> +    p2m->root = p2m_allocate_root(d);
>>>>> +    if ( !p2m->root )
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>>>>> +
>>>>> +    /*
>>>>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>>>>> +     * before using it.
>>>>> +     */
>>>>> +    p2m_write_lock(p2m);
>>>>> +    p2m_force_tlb_flush_sync(p2m);
>>>>> +    p2m_write_unlock(p2m);
>>>> While Andrew directed you towards a better model in general, it won't be
>>>> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
>>>> want to do a single global flush e.g. when VMIDs wrap around. That'll be
>>>> fewer global flushes than one per VM creation.
>>> I am not sure that I get a phrase 'VMIDs wrap around'.
>> You have to allocate them somehow. Typically you'll use the next one available.
>> At some point you will need to start over, searching from the beginning. Prior
>> to that now allocation of a new one will require any flush, as none of them
>> had be in use before (after boot or the last such flush).
> 
> Thanks. Now I get your point.
> 
> Won't be better to do TLB flushing during destroying of a domain so then we will
> be sure that TLBs connected to freed VMID aren't present in TLB anymore?

That's an option, but will result in more flushes. Furthermore there may be
reasons to change the VMID for a domain while it's running.

> IIUC, it will work only if VMID is used, right?

Well, anything VMID related is of course only relevant when VMIDs are in use.

> In case if VMID isn't used, probably we can drop flushing here and do a flush
> during booting, right?

That'll be too little flushing?

> Won't be enough to flushing of guest TLB only during context switch?

"only" is interesting here. Context switches are a relatively frequent
operation, which in addition you want to be fast. If a flush is necessary
there for correctness (e.g. when VMIDs aren't in use), you have to do it
there. But if you can flush less frequently without violating correctness,
you'd almost always want to use such an opportunity.

>>> I am going to implement, p2m_force_tlb_flush_sync() as:
>>>    static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>>    {
>>>      ...
>>>        sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
>>>      ...
>>>    }
>>>
>>> With such implementation if the guest didn't run on any pCPU(s) yet
>>> then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
>>> as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
>>> I am not sure that it is a good idea as I can't find a guarantee in the spec
>>> that TLB will be empty during boot time.
>> If in doubt, do one global flush while booting.
> 
> By booting you mean somewhere in continue_new_vcpu()?

I don't particularly mean any specific place. However, continue_new_vcpu()
(by its name) isn't involved in bringing up Xen, is it?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests
  2025-05-23 10:27     ` Oleksii Kurochko
@ 2025-06-02 11:08       ` Jan Beulich
  2025-06-05 14:22         ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-06-02 11:08 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.05.2025 12:27, Oleksii Kurochko wrote:
> On 5/20/25 4:38 PM, Jan Beulich wrote:
>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/p2m.c
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -1,4 +1,12 @@
>>>   #include <xen/domain_page.h>
>>> +/*
>>> + * Because of general_preempt_check() from xen/sched.h which uses
>>> + * local_events_need_delivery() but latter is declared in <asm/event.h>.
>>> + * Thereby it is needed to icnlude <xen/event.h> here before xen/sched.h.
>>> + *
>>> + * Shouldn't be xen/event.h be included in <xen/sched.h>?
>>> + */
>>> +#include <xen/event.h>
>> The question doesn't belong here; such could be put in the post-commit-
>> message area. And the answer depends on what dependency you found missing.
> 
> It is needed for local_events_need_delivery() which is used by general_preempt_check()
> in p2m_set_allocation(). Otherwise, the following issue will occur:
> 
> In file included from ././include/xen/config.h:17,
>                   from <command-line>:
> arch/riscv/p2m.c: In function 'p2m_set_allocation':
> ./include/xen/sched.h:941:36: error: implicit declaration of function 'local_events_need_delivery' [-Werror=implicit-function-declaration]
>    941 |         (!is_idle_vcpu(current) && local_events_need_delivery())    \
>        |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/xen/compiler.h:26:43: note: in definition of macro 'unlikely'
>     26 | #define unlikely(x)   __builtin_expect(!!(x),0)
>        |                                           ^
> arch/riscv/p2m.c:244:27: note: in expansion of macro 'general_preempt_check'
>    244 |         if ( preempted && general_preempt_check() )
>        |                           ^~~~~~~~~~~~~~~~~~~~~
> cc1: all warnings being treated as errors

In which case my answer to your question is "No". Others may take a different
perspective. (xen/sched.h being included virtually everywhere, imo we want to
avoid adding dependencies there which aren't strictly necessary to keep things
building.)

>>> @@ -166,3 +176,60 @@ int p2m_init(struct domain *d)
>>>   
>>>       return 0;
>>>   }
>>> +
>>> +/*
>>> + * Set the pool of pages to the required number of pages.
>>> + * Returns 0 for success, non-zero for failure.
>>> + * Call with d->arch.paging.lock held.
>>> + */
>>> +int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>> +{
>>> +    struct page_info *pg;
>>> +
>>> +    ASSERT(spin_is_locked(&d->arch.paging.lock));
>>> +
>>> +    for ( ; ; )
>>> +    {
>>> +        if ( d->arch.paging.p2m_total_pages < pages )
>>> +        {
>>> +            /* Need to allocate more memory from domheap */
>>> +            pg = alloc_domheap_page(d, MEMF_no_owner);
>>> +            if ( pg == NULL )
>>> +            {
>>> +                printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>>> +                return -ENOMEM;
>>> +            }
>>> +            ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>> +                d->arch.paging.p2m_total_pages + 1;
>> Looks like you copied this from Arm, but this code is bogus: Using
>> ACCESS_ONCE() just on the lhs is pretty pointless. Once also used on the
>> rhs the expression can easily become
>>
>>                  ACCESS_ONCE(d->arch.paging.p2m_total_pages) += 1;
>>
>> or even
>>
>>                  ACCESS_ONCE(d->arch.paging.p2m_total_pages)++;
>>
>> .
>>
>>> +            page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>>> +        }
>>> +        else if ( d->arch.paging.p2m_total_pages > pages )
>>> +        {
>>> +            /* Need to return memory to domheap */
>>> +            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
>>> +            if( pg )
>>> +            {
>>> +                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>> +                    d->arch.paging.p2m_total_pages - 1;
>> Same here then, obviously.
>>
>>> +                free_domheap_page(pg);
>>> +            }
>>> +            else
>>> +            {
>>> +                printk(XENLOG_ERR
>>> +                       "Failed to free P2M pages, P2M freelist is empty.\n");
>>> +                return -ENOMEM;
>>> +            }
>>> +        }
>>> +        else
>>> +            break;
>>> +
>>> +        /* Check to see if we need to yield and try again */
>>> +        if ( preempted && general_preempt_check() )
>> While it's this way on both Arm and x86, I wonder how useful it is
>> to check on every iteration, especially when freeing pages back to the
>> buddy allocator.
> 
> IIUC, but a preemption request could happen for both cases. And destroying of
> a domain could also consume long time and so not to block hypervisor if something
> more urgent should be handled it could be also have this check for the case of
> freeng pages back to the buddy allocator.

The question wasn't whether to check, but how frequently. The check itself is
consuming processing time, too, so one generally wants to balance the number
of checks against the size of the resulting time window without any check.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification
  2025-05-23 11:34     ` Oleksii Kurochko
@ 2025-06-02 11:12       ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2025-06-02 11:12 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.05.2025 13:34, Oleksii Kurochko wrote:
> On 5/20/25 5:11 PM, Jan Beulich wrote:
>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/p2m.h
>>> +++ b/xen/arch/riscv/include/asm/p2m.h
>>> @@ -80,8 +80,36 @@ struct p2m_domain {
>>>   typedef enum {
>>>       p2m_invalid = 0,    /* Nothing mapped here */
>>>       p2m_ram_rw,         /* Normal read/write domain RAM */
>>> +    p2m_ram_ro,         /* Read-only; writes are silently dropped */
>> This is pretty special a type, which imo better wouldn't be introduced
>> without there being proper support for it. (I don't suppose RISC-V
>> hardware alone can effect this type?)
> 
> It is possible to make ro by using r, w, x bits of page table entry in the
> same way Arm does that:
>      case p2m_ram_ro:
>          e->p2m.xn = 0;
>          e->p2m.write = 0;
>          break;

That takes care of the r/o aspect, yes, but not of the "writes are silently
dropped" one.

>>> +    p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
>>> +    p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
>>> +    p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
>> Aiui you took these from Arm. Looking at its sole use, I'm not convinced
>> it's used correctly. If it is, the same comment as for p2m_ram_ro above
>> would apply here, too.
> 
> p2m_mmio_direct_dev - this one is defintely needed as it is used for device
> pass through to guest domain to map device's MMIO. It seems to me like it is
> correctly used.

My earlier comment was mainly about p2m_map_foreign_ro.

> Others we don't really use now in private branches but it seems like they could be
> useful, so I added them now.
> 
> I can drop them now and return back if such functionality which uses them will be
> introduced for RISC-V, and at that moment I think it will be
> more clear if it is used correctly or not.

Indeed. And maybe it'll just be p2m_map_foreign, as we have it on x86.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-06-02 11:04           ` Jan Beulich
@ 2025-06-05 14:10             ` Oleksii Kurochko
  2025-06-05 14:19               ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 14:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 9663 bytes --]


On 6/2/25 1:04 PM, Jan Beulich wrote:
> On 23.05.2025 11:44, Oleksii Kurochko wrote:
>> On 5/22/25 6:09 PM, Jan Beulich wrote:
>>> On 22.05.2025 17:53, Oleksii Kurochko wrote:
>>>> On 5/20/25 3:37 PM, Jan Beulich wrote:
>>>>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>>>>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>>>>>> +{
>>>>>> +    struct page_info *page;
>>>>>> +
>>>>>> +    /*
>>>>>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>>>>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>>>>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>>>>> +     * and must be aligned to a 16-KiB boundary.
>>>>>> +     */
>>>>>> +    page = alloc_domheap_pages(NULL, 2, 0);
>>>>> Shouldn't this allocation come from the domain's P2M pool (which is yet
>>>>> to be introduced)?
>>>> First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
>>>> table allocation.
>>>>
>>>> p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
>>>> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].
>>>>
>>>> When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
>>>> is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
>>>> using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
>>>> so domain should be created first.
>>> Yet that is part of my point: This allocation should be against the domain,
>>> so it is properly accounted. What's the problem with allocating the root
>>> table when the pools is being created / filled?
>> I can't use pages from pool for root table as they aren't properly aligned.
>>
>> At the moment, creation of p2m pool looks like:
>>    int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>    {
>>        struct page_info *pg;
>>
>>        ASSERT(spin_is_locked(&d->arch.paging.lock));
>>
>>        for ( ; ; )
>>        {
>>            if ( d->arch.paging.p2m_total_pages < pages )
>>            {
>>                /* Need to allocate more memory from domheap */
>>                pg = alloc_domheap_page(d, MEMF_no_owner);
>>                if ( pg == NULL )
>>                {
>>                    printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>>                    return -ENOMEM;
>>                }
>>                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>                    d->arch.paging.p2m_total_pages + 1;
>>                page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>>            }
>>            ...
>>        }
>>
>>        return 0;
>>    }
>> alloc_domheap_page(d, MEMF_no_owner) allocates page table with order 0, so 4k-aligned page table.
>> But if I needed 16k for root table and it should be 16k-aligned then I still have to use
>> alloc_domheap_pages(NULL, 2, 0);
>>
>> Or do you mean that I have to something like:
>>    int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>    {
>>        struct page_info *pg;
>>    
>>        ASSERT(spin_is_locked(&d->arch.paging.lock));
>>    
>> +    if ( !d->arch.p2m.root )
>> +    {
>> +        unsigned int order = get_order_from_bytes(KB(16));
>> +        unsigned int nr_pages = _AC(1,U) << order;
>> +        /*
>> +        * As mentioned in the Priviliged Architecture Spec (version 20240411)
>> +        * As explained in Section 18.5.1, for the paged virtual-memory schemes
>> +        * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>> +        * and must be aligned to a 16-KiB boundary.
>> +        */
>> +        d->arch.p2m.root = alloc_domheap_pages(d, order, MEMF_no_owner);
>> +        if ( d->arch.p2m.root == NULL )
>> +            panic("root page table hasn't been allocated\n");
>> +
>> +        clear_and_clean_page(d->arch.p2m.root);
>> +
>> +        /* TODO: do I need TLB flush here? */
>> +
>> +        ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>> +            d->arch.paging.p2m_total_pages + nr_pages;
>> +    }
>> +
>> ...
>> }
> Neither. I was thinking of you taking 4 pages off the pool in exchange for the
> order-2 allocation. Primarily to get the memory accounting right (or at least
> closer to reality).

Do you mean that I have to call 4 times page_list_remove_head(), something like
that:
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -215,6 +215,44 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
          }
      }
  
+    if ( !d->arch.p2m.root )
+    {
+        unsigned int order = get_order_from_bytes(KB(16));
+        unsigned int nr_pages = _AC(1,U) << order;
+
+        if ( ACCESS_ONCE(d->arch.paging.p2m_total_pages) < nr_pages )
+            panic("Specify more xen,domain-p2m-mem-mb\n");
+
+        /*
+         * As mentioned in the Priviliged Architecture Spec (version 20240411)
+         * As explained in Section 18.5.1, for the paged virtual-memory schemes
+         * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
+         * and must be aligned to a 16-KiB boundary.
+         */
+        d->arch.p2m.root = alloc_domheap_pages(NULL, order, 0);
+        if (  d->arch.p2m.root == NULL )
+            panic("failed to allocate p2m root page table\n");
+
+        for ( unsigned int i = 0; i < nr_pages; i++ )
+        {
+            clear_and_clean_page(d->arch.p2m.root + i);
+
+            /* Return memory to domheap */
+            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
+            if( pg )
+            {
+                ACCESS_ONCE(d->arch.paging.p2m_total_pages)--;
+                free_domheap_page(pg);
+            }
+            else
+            {
+                printk(XENLOG_ERR
+                       "Failed to free P2M pages, P2M freelist is empty.\n");
+                return -ENOMEM;
+            }
+        }
+    }
+
      return 0;
  }

>>>>>> +static int p2m_alloc_table(struct domain *d)
>>>>>> +{
>>>>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>>>> +
>>>>>> +    p2m->root = p2m_allocate_root(d);
>>>>>> +    if ( !p2m->root )
>>>>>> +        return -ENOMEM;
>>>>>> +
>>>>>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>>>>>> +     * before using it.
>>>>>> +     */
>>>>>> +    p2m_write_lock(p2m);
>>>>>> +    p2m_force_tlb_flush_sync(p2m);
>>>>>> +    p2m_write_unlock(p2m);
>>>>> While Andrew directed you towards a better model in general, it won't be
>>>>> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
>>>>> want to do a single global flush e.g. when VMIDs wrap around. That'll be
>>>>> fewer global flushes than one per VM creation.
>>>> I am not sure that I get a phrase 'VMIDs wrap around'.
>>> You have to allocate them somehow. Typically you'll use the next one available.
>>> At some point you will need to start over, searching from the beginning. Prior
>>> to that now allocation of a new one will require any flush, as none of them
>>> had be in use before (after boot or the last such flush).
>> Thanks. Now I get your point.
>>
>> Won't be better to do TLB flushing during destroying of a domain so then we will
>> be sure that TLBs connected to freed VMID aren't present in TLB anymore?
> That's an option, but will result in more flushes. Furthermore there may be
> reasons to change the VMID for a domain while it's running.
>
>> IIUC, it will work only if VMID is used, right?
> Well, anything VMID related is of course only relevant when VMIDs are in use.
>
>> In case if VMID isn't used, probably we can drop flushing here and do a flush
>> during booting, right?
> That'll be too little flushing?

I meant that instead of having TLB flush in p2m_alloc_table() we could have a one flush
during booting. And of course, we still should have flush on each context switch.

>
>> Won't be enough to flushing of guest TLB only during context switch?
> "only" is interesting here. Context switches are a relatively frequent
> operation, which in addition you want to be fast. If a flush is necessary
> there for correctness (e.g. when VMIDs aren't in use), you have to do it
> there. But if you can flush less frequently without violating correctness,
> you'd almost always want to use such an opportunity.

Then it is better to introduce VMID now, it seems it's only one place where
it should be set, when hgatp is initialized.

Does Xen have some framework to work with VMID?

>
>>>> I am going to implement, p2m_force_tlb_flush_sync() as:
>>>>     static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>>>     {
>>>>       ...
>>>>         sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
>>>>       ...
>>>>     }
>>>>
>>>> With such implementation if the guest didn't run on any pCPU(s) yet
>>>> then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
>>>> as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
>>>> I am not sure that it is a good idea as I can't find a guarantee in the spec
>>>> that TLB will be empty during boot time.
>>> If in doubt, do one global flush while booting.
>> By booting you mean somewhere in continue_new_vcpu()?
> I don't particularly mean any specific place. However, continue_new_vcpu()
> (by its name) isn't involved in bringing up Xen, is it?
>
No, it isn't. By booting here I meant a boot of a guest domain, not Xen itself.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 12511 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-06-05 14:10             ` Oleksii Kurochko
@ 2025-06-05 14:19               ` Jan Beulich
  2025-06-05 15:28                 ` Oleksii Kurochko
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2025-06-05 14:19 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 05.06.2025 16:10, Oleksii Kurochko wrote:
> 
> On 6/2/25 1:04 PM, Jan Beulich wrote:
>> On 23.05.2025 11:44, Oleksii Kurochko wrote:
>>> On 5/22/25 6:09 PM, Jan Beulich wrote:
>>>> On 22.05.2025 17:53, Oleksii Kurochko wrote:
>>>>> On 5/20/25 3:37 PM, Jan Beulich wrote:
>>>>>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>>>>>> +static struct page_info *p2m_get_clean_page(struct domain *d)
>>>>>>> +{
>>>>>>> +    struct page_info *page;
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>>>>>> +     * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>>>>>> +     * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>>>>>> +     * and must be aligned to a 16-KiB boundary.
>>>>>>> +     */
>>>>>>> +    page = alloc_domheap_pages(NULL, 2, 0);
>>>>>> Shouldn't this allocation come from the domain's P2M pool (which is yet
>>>>>> to be introduced)?
>>>>> First, I will drop p2m_get_clean_page() as it will be used only for p2m root page
>>>>> table allocation.
>>>>>
>>>>> p2m_init() is called by domain_create() [->arch_domain_create()->p2m_init()] from create_domUs():
>>>>> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/device-tree/dom0less-build.c?ref_type=heads#L984].
>>>>>
>>>>> When p2m_init() is called, p2m pool isn't ready and domain isn't created yet. Last one
>>>>> is also crucial for usage of p2m pool as p2m pool belongs to domain and thereby it is
>>>>> using alloc_domheap_page(d, ...) (Not NULL as for allocation of p2m root table above),
>>>>> so domain should be created first.
>>>> Yet that is part of my point: This allocation should be against the domain,
>>>> so it is properly accounted. What's the problem with allocating the root
>>>> table when the pools is being created / filled?
>>> I can't use pages from pool for root table as they aren't properly aligned.
>>>
>>> At the moment, creation of p2m pool looks like:
>>>    int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>>    {
>>>        struct page_info *pg;
>>>
>>>        ASSERT(spin_is_locked(&d->arch.paging.lock));
>>>
>>>        for ( ; ; )
>>>        {
>>>            if ( d->arch.paging.p2m_total_pages < pages )
>>>            {
>>>                /* Need to allocate more memory from domheap */
>>>                pg = alloc_domheap_page(d, MEMF_no_owner);
>>>                if ( pg == NULL )
>>>                {
>>>                    printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>>>                    return -ENOMEM;
>>>                }
>>>                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>>                    d->arch.paging.p2m_total_pages + 1;
>>>                page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>>>            }
>>>            ...
>>>        }
>>>
>>>        return 0;
>>>    }
>>> alloc_domheap_page(d, MEMF_no_owner) allocates page table with order 0, so 4k-aligned page table.
>>> But if I needed 16k for root table and it should be 16k-aligned then I still have to use
>>> alloc_domheap_pages(NULL, 2, 0);
>>>
>>> Or do you mean that I have to something like:
>>>    int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>>    {
>>>        struct page_info *pg;
>>>    
>>>        ASSERT(spin_is_locked(&d->arch.paging.lock));
>>>    
>>> +    if ( !d->arch.p2m.root )
>>> +    {
>>> +        unsigned int order = get_order_from_bytes(KB(16));
>>> +        unsigned int nr_pages = _AC(1,U) << order;
>>> +        /*
>>> +        * As mentioned in the Priviliged Architecture Spec (version 20240411)
>>> +        * As explained in Section 18.5.1, for the paged virtual-memory schemes
>>> +        * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
>>> +        * and must be aligned to a 16-KiB boundary.
>>> +        */
>>> +        d->arch.p2m.root = alloc_domheap_pages(d, order, MEMF_no_owner);
>>> +        if ( d->arch.p2m.root == NULL )
>>> +            panic("root page table hasn't been allocated\n");
>>> +
>>> +        clear_and_clean_page(d->arch.p2m.root);
>>> +
>>> +        /* TODO: do I need TLB flush here? */
>>> +
>>> +        ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>> +            d->arch.paging.p2m_total_pages + nr_pages;
>>> +    }
>>> +
>>> ...
>>> }
>> Neither. I was thinking of you taking 4 pages off the pool in exchange for the
>> order-2 allocation. Primarily to get the memory accounting right (or at least
>> closer to reality).
> 
> Do you mean that I have to call 4 times page_list_remove_head(), something like
> that:
> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -215,6 +215,44 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>           }
>       }
>   
> +    if ( !d->arch.p2m.root )
> +    {
> +        unsigned int order = get_order_from_bytes(KB(16));
> +        unsigned int nr_pages = _AC(1,U) << order;
> +
> +        if ( ACCESS_ONCE(d->arch.paging.p2m_total_pages) < nr_pages )
> +            panic("Specify more xen,domain-p2m-mem-mb\n");
> +
> +        /*
> +         * As mentioned in the Priviliged Architecture Spec (version 20240411)
> +         * As explained in Section 18.5.1, for the paged virtual-memory schemes
> +         * (Sv32x4, Sv39x4, Sv48x4, and Sv57x4), the root page table is 16 KiB
> +         * and must be aligned to a 16-KiB boundary.
> +         */
> +        d->arch.p2m.root = alloc_domheap_pages(NULL, order, 0);

Imo you'd better not use NULL here, but instead pass MEMF_no_owner. See
respective x86 code. I also think you want to do the freeing first, and
only then do this allocation, such that ...

> +        if (  d->arch.p2m.root == NULL )
> +            panic("failed to allocate p2m root page table\n");
> +
> +        for ( unsigned int i = 0; i < nr_pages; i++ )
> +        {
> +            clear_and_clean_page(d->arch.p2m.root + i);
> +
> +            /* Return memory to domheap */
> +            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
> +            if( pg )
> +            {
> +                ACCESS_ONCE(d->arch.paging.p2m_total_pages)--;
> +                free_domheap_page(pg);
> +            }
> +            else
> +            {
> +                printk(XENLOG_ERR
> +                       "Failed to free P2M pages, P2M freelist is empty.\n");
> +                return -ENOMEM;

... this path will not have eaten more memory than was given back.

>>>>>>> +static int p2m_alloc_table(struct domain *d)
>>>>>>> +{
>>>>>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>>>>> +
>>>>>>> +    p2m->root = p2m_allocate_root(d);
>>>>>>> +    if ( !p2m->root )
>>>>>>> +        return -ENOMEM;
>>>>>>> +
>>>>>>> +    p2m->hgatp = hgatp_from_page_info(p2m->root);
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * Make sure that all TLBs corresponding to the new VMID are flushed
>>>>>>> +     * before using it.
>>>>>>> +     */
>>>>>>> +    p2m_write_lock(p2m);
>>>>>>> +    p2m_force_tlb_flush_sync(p2m);
>>>>>>> +    p2m_write_unlock(p2m);
>>>>>> While Andrew directed you towards a better model in general, it won't be
>>>>>> usable here then, as the guest didn't run on any pCPU(s) yet. Imo you
>>>>>> want to do a single global flush e.g. when VMIDs wrap around. That'll be
>>>>>> fewer global flushes than one per VM creation.
>>>>> I am not sure that I get a phrase 'VMIDs wrap around'.
>>>> You have to allocate them somehow. Typically you'll use the next one available.
>>>> At some point you will need to start over, searching from the beginning. Prior
>>>> to that now allocation of a new one will require any flush, as none of them
>>>> had be in use before (after boot or the last such flush).
>>> Thanks. Now I get your point.
>>>
>>> Won't be better to do TLB flushing during destroying of a domain so then we will
>>> be sure that TLBs connected to freed VMID aren't present in TLB anymore?
>> That's an option, but will result in more flushes. Furthermore there may be
>> reasons to change the VMID for a domain while it's running.
>>
>>> IIUC, it will work only if VMID is used, right?
>> Well, anything VMID related is of course only relevant when VMIDs are in use.
>>
>>> In case if VMID isn't used, probably we can drop flushing here and do a flush
>>> during booting, right?
>> That'll be too little flushing?
> 
> I meant that instead of having TLB flush in p2m_alloc_table() we could have a one flush
> during booting. And of course, we still should have flush on each context switch.

Yet as said - context switches are likely too frequent for having an
unconditional flush there (if it can be avoided).

>>> Won't be enough to flushing of guest TLB only during context switch?
>> "only" is interesting here. Context switches are a relatively frequent
>> operation, which in addition you want to be fast. If a flush is necessary
>> there for correctness (e.g. when VMIDs aren't in use), you have to do it
>> there. But if you can flush less frequently without violating correctness,
>> you'd almost always want to use such an opportunity.
> 
> Then it is better to introduce VMID now, it seems it's only one place where
> it should be set, when hgatp is initialized.
> 
> Does Xen have some framework to work with VMID?

That's all arch-specific, I think.

>>>>> I am going to implement, p2m_force_tlb_flush_sync() as:
>>>>>     static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>>>>     {
>>>>>       ...
>>>>>         sbi_remote_hfence_gvma(d->dirty_cpumask, 0, 0);
>>>>>       ...
>>>>>     }
>>>>>
>>>>> With such implementation if the guest didn't run on any pCPU(s) yet
>>>>> then d->dirty_cpumask is empty, then sbi_remote_hfence_gvma() will do nothing
>>>>> as hmask will be NULL (https://gitlab.com/xen-project/people/olkur/xen/-/blob/staging/xen/arch/riscv/sbi.c?ref_type=heads#L238).
>>>>> I am not sure that it is a good idea as I can't find a guarantee in the spec
>>>>> that TLB will be empty during boot time.
>>>> If in doubt, do one global flush while booting.
>>> By booting you mean somewhere in continue_new_vcpu()?
>> I don't particularly mean any specific place. However, continue_new_vcpu()
>> (by its name) isn't involved in bringing up Xen, is it?
>>
> No, it isn't. By booting here I meant a boot of a guest domain, not Xen itself.

Please don't call this "booting", but "starting of a guest" (or "launching" or
some such). When you originally said "booting" I thought RISC-V wouldn't
guarantee clean TLBs when being booted, and hence suggested to cover for this
by doing a single flush during (Xen) boot. Looks like this may not be needed
then, simply because of the misunderstanding.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests
  2025-06-02 11:08       ` Jan Beulich
@ 2025-06-05 14:22         ` Oleksii Kurochko
  0 siblings, 0 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 5724 bytes --]


On 6/2/25 1:08 PM, Jan Beulich wrote:
> On 23.05.2025 12:27, Oleksii Kurochko wrote:
>> On 5/20/25 4:38 PM, Jan Beulich wrote:
>>> On 09.05.2025 17:57, Oleksii Kurochko wrote:
>>>> --- a/xen/arch/riscv/p2m.c
>>>> +++ b/xen/arch/riscv/p2m.c
>>>> @@ -1,4 +1,12 @@
>>>>    #include <xen/domain_page.h>
>>>> +/*
>>>> + * Because of general_preempt_check() from xen/sched.h which uses
>>>> + * local_events_need_delivery() but latter is declared in <asm/event.h>.
>>>> + * Thereby it is needed to icnlude <xen/event.h> here before xen/sched.h.
>>>> + *
>>>> + * Shouldn't be xen/event.h be included in <xen/sched.h>?
>>>> + */
>>>> +#include <xen/event.h>
>>> The question doesn't belong here; such could be put in the post-commit-
>>> message area. And the answer depends on what dependency you found missing.
>> It is needed for local_events_need_delivery() which is used by general_preempt_check()
>> in p2m_set_allocation(). Otherwise, the following issue will occur:
>>
>> In file included from ././include/xen/config.h:17,
>>                    from <command-line>:
>> arch/riscv/p2m.c: In function 'p2m_set_allocation':
>> ./include/xen/sched.h:941:36: error: implicit declaration of function 'local_events_need_delivery' [-Werror=implicit-function-declaration]
>>     941 |         (!is_idle_vcpu(current) && local_events_need_delivery())    \
>>         |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~
>> ./include/xen/compiler.h:26:43: note: in definition of macro 'unlikely'
>>      26 | #define unlikely(x)   __builtin_expect(!!(x),0)
>>         |                                           ^
>> arch/riscv/p2m.c:244:27: note: in expansion of macro 'general_preempt_check'
>>     244 |         if ( preempted && general_preempt_check() )
>>         |                           ^~~~~~~~~~~~~~~~~~~~~
>> cc1: all warnings being treated as errors
> In which case my answer to your question is "No". Others may take a different
> perspective. (xen/sched.h being included virtually everywhere, imo we want to
> avoid adding dependencies there which aren't strictly necessary to keep things
> building.)

Okay, then I will just update the comment and leave inclusion of xen/event.h in riscv/p2mc.

>
>>>> @@ -166,3 +176,60 @@ int p2m_init(struct domain *d)
>>>>    
>>>>        return 0;
>>>>    }
>>>> +
>>>> +/*
>>>> + * Set the pool of pages to the required number of pages.
>>>> + * Returns 0 for success, non-zero for failure.
>>>> + * Call with d->arch.paging.lock held.
>>>> + */
>>>> +int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>>> +{
>>>> +    struct page_info *pg;
>>>> +
>>>> +    ASSERT(spin_is_locked(&d->arch.paging.lock));
>>>> +
>>>> +    for ( ; ; )
>>>> +    {
>>>> +        if ( d->arch.paging.p2m_total_pages < pages )
>>>> +        {
>>>> +            /* Need to allocate more memory from domheap */
>>>> +            pg = alloc_domheap_page(d, MEMF_no_owner);
>>>> +            if ( pg == NULL )
>>>> +            {
>>>> +                printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
>>>> +                return -ENOMEM;
>>>> +            }
>>>> +            ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>>> +                d->arch.paging.p2m_total_pages + 1;
>>> Looks like you copied this from Arm, but this code is bogus: Using
>>> ACCESS_ONCE() just on the lhs is pretty pointless. Once also used on the
>>> rhs the expression can easily become
>>>
>>>                   ACCESS_ONCE(d->arch.paging.p2m_total_pages) += 1;
>>>
>>> or even
>>>
>>>                   ACCESS_ONCE(d->arch.paging.p2m_total_pages)++;
>>>
>>> .
>>>
>>>> +            page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
>>>> +        }
>>>> +        else if ( d->arch.paging.p2m_total_pages > pages )
>>>> +        {
>>>> +            /* Need to return memory to domheap */
>>>> +            pg = page_list_remove_head(&d->arch.paging.p2m_freelist);
>>>> +            if( pg )
>>>> +            {
>>>> +                ACCESS_ONCE(d->arch.paging.p2m_total_pages) =
>>>> +                    d->arch.paging.p2m_total_pages - 1;
>>> Same here then, obviously.
>>>
>>>> +                free_domheap_page(pg);
>>>> +            }
>>>> +            else
>>>> +            {
>>>> +                printk(XENLOG_ERR
>>>> +                       "Failed to free P2M pages, P2M freelist is empty.\n");
>>>> +                return -ENOMEM;
>>>> +            }
>>>> +        }
>>>> +        else
>>>> +            break;
>>>> +
>>>> +        /* Check to see if we need to yield and try again */
>>>> +        if ( preempted && general_preempt_check() )
>>> While it's this way on both Arm and x86, I wonder how useful it is
>>> to check on every iteration, especially when freeing pages back to the
>>> buddy allocator.
>> IIUC, but a preemption request could happen for both cases. And destroying of
>> a domain could also consume long time and so not to block hypervisor if something
>> more urgent should be handled it could be also have this check for the case of
>> freeng pages back to the buddy allocator.
> The question wasn't whether to check, but how frequently. The check itself is
> consuming processing time, too, so one generally wants to balance the number
> of checks against the size of the resulting time window without any check.

So you are expecting something like this:
         /* Check to see if we need to yield and try again */
         if ( !(d->arch.paging.p2m_total_pages % 30) && preempted &&
              general_preempt_check() )
         {
             *preempted = true;
             return -ERESTART;
         }
It still requires, at least, one check, but this check isn't so long as
general_preempt_check() could be.

~ Oleksii

[-- Attachment #2: Type: text/html, Size: 7102 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization
  2025-06-05 14:19               ` Jan Beulich
@ 2025-06-05 15:28                 ` Oleksii Kurochko
  0 siblings, 0 replies; 31+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 15:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]


On 6/5/25 4:19 PM, Jan Beulich wrote:
>>>> Won't be enough to flushing of guest TLB only during context switch?
>>> "only" is interesting here. Context switches are a relatively frequent
>>> operation, which in addition you want to be fast. If a flush is necessary
>>> there for correctness (e.g. when VMIDs aren't in use), you have to do it
>>> there. But if you can flush less frequently without violating correctness,
>>> you'd almost always want to use such an opportunity.
>> Then it is better to introduce VMID now, it seems it's only one place where
>> it should be set, when hgatp is initialized.
>>
>> Does Xen have some framework to work with VMID?
> That's all arch-specific, I think.

Probably, I used incorrect words to express what I want. I wrote about allocation/freeing
of VMIDs. Basically something similar to what Arm has (|p2m_vmid_allocator_init(), ||p2m_alloc_vmid(), ||p2m_free_vmid|):
   https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/arm/p2m.c?ref_type=heads#L271

It seems like it is not very arch-specific, at the moment.

But, likely, RISC-V will need to do an update|p2m_alloc_vmid(), which will do p2m's TLB flush when overflow of VMIDs 
happen. ~ Oleksii |

[-- Attachment #2: Type: text/html, Size: 2085 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-06-05 15:28 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-09 15:57 [PATCH v1 0/6] xen/riscv: introduce p2m functionality Oleksii Kurochko
2025-05-09 15:57 ` [PATCH v1 1/6] xen/riscv: add inclusion of xen/bitops.h to asm/cmpxchg.h Oleksii Kurochko
2025-05-09 16:00   ` Andrew Cooper
2025-05-09 15:57 ` [PATCH v1 2/6] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
2025-05-09 16:14   ` Andrew Cooper
2025-05-12  9:24     ` Oleksii Kurochko
2025-05-12  9:33       ` Oleksii Kurochko
2025-05-20 13:47       ` Jan Beulich
2025-05-20 13:37   ` Jan Beulich
2025-05-22 15:53     ` Oleksii Kurochko
2025-05-22 16:09       ` Jan Beulich
2025-05-23  9:44         ` Oleksii Kurochko
2025-06-02 11:04           ` Jan Beulich
2025-06-05 14:10             ` Oleksii Kurochko
2025-06-05 14:19               ` Jan Beulich
2025-06-05 15:28                 ` Oleksii Kurochko
2025-05-09 15:57 ` [PATCH v1 3/6] xen/riscv: construct the P2M pages pool for guests Oleksii Kurochko
2025-05-20 14:38   ` Jan Beulich
2025-05-23 10:27     ` Oleksii Kurochko
2025-06-02 11:08       ` Jan Beulich
2025-06-05 14:22         ` Oleksii Kurochko
2025-05-09 15:57 ` [PATCH v1 4/6] xen/riscv: define pt_t and pt_walk_t structures Oleksii Kurochko
2025-05-20 15:04   ` Jan Beulich
2025-05-23 10:48     ` Oleksii Kurochko
2025-05-09 15:57 ` [PATCH v1 5/6] xen/riscv: add new p2m types and helper macros for type classification Oleksii Kurochko
2025-05-20 15:11   ` Jan Beulich
2025-05-23 11:34     ` Oleksii Kurochko
2025-06-02 11:12       ` Jan Beulich
2025-05-09 15:57 ` [PATCH v1 6/6] xen/riscv: implement p2m mapping functionality Oleksii Kurochko
2025-05-20 15:16   ` Jan Beulich
2025-05-23 11:47     ` Oleksii Kurochko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.