* [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS
@ 2026-03-23 16:29 Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn() Oleksii Kurochko
                   ` (10 more replies)
  0 siblings, 11 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk, Timothy Pearson, Rahul Singh


Introduce the pieces needed to enable the DOMAIN_BUILD_HELPERS config
option for RISC-V.

This patch series is generally independent of [1], but depending on which
patches go in first, there could be some merge conflicts.

[1] https://lore.kernel.org/xen-devel/cover.1773419622.git.oleksii.kurochko@gmail.com/

CI tests: https://gitlab.com/xen-project/people/olkur/xen/-/pipelines/2403222832

---
Changes in v2:
 - Address the comments from ML.
 - Introduce some new patches to make the dom0less solution more
   architecture-independent in terms of terminology.
 - Minor fixes.
---

Oleksii Kurochko (11):
  xen/riscv: implement get_page_from_gfn()
  xen: return proper type for guest access functions
  xen/riscv: implement copy_to_guest_phys()
  xen/dom0less: rename kernel_zimage_probe() to kernel_image_probe()
  xen/riscv: add kernel loading support
  xen: move declaration of fw_unreserved_regions() to common header
  xen: move domain_use_host_layout() to common code
  xen: rename p2m_ipa_bits to p2m_gpa_bits
  xen/riscv: introduce p2m_gpa_bits
  xen/riscv: add definition of guest RAM banks
  xen/riscv: enable DOMAIN_BUILD_HELPERS

 xen/arch/arm/domain_build.c               |  12 +-
 xen/arch/arm/domctl.c                     |   2 +-
 xen/arch/arm/guestcopy.c                  |  24 ++--
 xen/arch/arm/include/asm/domain.h         |  14 --
 xen/arch/arm/include/asm/guest_access.h   |  18 +--
 xen/arch/arm/include/asm/p2m.h            |   4 +-
 xen/arch/arm/include/asm/setup.h          |   3 -
 xen/arch/arm/kernel.c                     |  48 +++----
 xen/arch/arm/mmu/p2m.c                    |  18 +--
 xen/arch/arm/p2m.c                        |   6 +-
 xen/arch/ppc/include/asm/guest_access.h   |  10 +-
 xen/arch/riscv/Kconfig                    |   1 +
 xen/arch/riscv/Makefile                   |   2 +
 xen/arch/riscv/guestcopy.c                | 116 ++++++++++++++++
 xen/arch/riscv/include/asm/config.h       |  13 ++
 xen/arch/riscv/include/asm/guest_access.h |  13 +-
 xen/arch/riscv/include/asm/p2m.h          |  18 +--
 xen/arch/riscv/kernel.c                   | 158 ++++++++++++++++++++++
 xen/arch/riscv/p2m.c                      |  63 ++++++++-
 xen/arch/riscv/stubs.c                    |   8 +-
 xen/common/device-tree/domain-build.c     |   2 +-
 xen/common/device-tree/kernel.c           |   2 +-
 xen/common/domain.c                       |   8 +-
 xen/drivers/passthrough/arm/ipmmu-vmsa.c  |   4 +-
 xen/drivers/passthrough/arm/smmu-v3.c     |   2 +-
 xen/drivers/passthrough/arm/smmu.c        |   2 +-
 xen/include/public/arch-riscv.h           |  16 +++
 xen/include/xen/bootinfo.h                |   4 +
 xen/include/xen/domain.h                  |  16 +++
 xen/include/xen/fdt-domain-build.h        |   8 +-
 xen/include/xen/fdt-kernel.h              |   4 +-
 31 files changed, 499 insertions(+), 120 deletions(-)
 create mode 100644 xen/arch/riscv/guestcopy.c
 create mode 100644 xen/arch/riscv/kernel.c

-- 
2.53.0




* [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn()
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-26 13:50   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 02/11] xen: return proper type for guest access functions Oleksii Kurochko
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Provide a RISC-V implementation of get_page_from_gfn(), matching the
semantics used by other architectures.

For translated guests, this is implemented as a wrapper around
p2m_get_page_from_gfn(). For DOMID_XEN, which is not auto-translated,
provide a 1:1 RAM/MMIO mapping and perform the required validation and
reference counting.
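
The two lookup paths described above can be sketched in isolation as
follows. This is only a rough illustration and not the code from this
patch: the ex_* types, the is_xen flag and the fixed-size frame table are
hypothetical stand-ins for struct domain, the dom_xen comparison and the
real frame table.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_MAX_MFN 8UL /* hypothetical size of the toy frame table */

struct ex_page { unsigned int refcount; };

struct ex_domain {
    bool is_xen;                   /* stand-in for "d == dom_xen" */
    unsigned long p2m[EX_MAX_MFN]; /* toy gfn -> mfn table */
};

static struct ex_page ex_frames[EX_MAX_MFN];

static struct ex_page *ex_get_page_from_gfn(struct ex_domain *d,
                                            unsigned long gfn)
{
    unsigned long mfn;

    if ( gfn >= EX_MAX_MFN )
        return NULL;

    /* Translated guests go through the p2m; the Xen domain sees 1:1. */
    mfn = d->is_xen ? gfn : d->p2m[gfn];

    /* Validate the frame before handing it out. */
    if ( mfn >= EX_MAX_MFN )
        return NULL;

    /* Take a reference; the caller is expected to drop it later. */
    ex_frames[mfn].refcount++;

    return &ex_frames[mfn];
}
```

In this sketch a failed lookup takes no reference, so the caller only has
to drop a reference when a non-NULL page is returned.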

The function is implemented out-of-line rather than as a static inline,
to avoid header ordering issues where struct domain is incomplete when
asm/p2m.h is included, leading to build failures:
  In file included from ./arch/riscv/include/asm/domain.h:10,
                   from ./include/xen/domain.h:16,
                   from ./include/xen/sched.h:11,
                   from ./include/xen/event.h:12,
                   from common/cpu.c:3:
  ./arch/riscv/include/asm/p2m.h: In function 'get_page_from_gfn':
  ./arch/riscv/include/asm/p2m.h:50:33: error: invalid use of undefined type 'struct domain'
     50 | #define p2m_get_hostp2m(d) (&(d)->arch.p2m)
        |                                 ^~
  ./arch/riscv/include/asm/p2m.h:180:38: note: in expansion of macro 'p2m_get_hostp2m'
    180 |         return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t);
        |                                      ^~~~~~~~~~~~~~~
  make[2]: *** [Rules.mk:253: common/cpu.o] Error 1
  make[1]: *** [build.mk:72: common] Error 2
  make: *** [Makefile:623: xen] Error 2
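
For readers less familiar with this class of failure, the pattern can be
reproduced outside of Xen roughly as below. The demo_* names are
hypothetical; the point is only that a prototype is satisfied by an
incomplete type, while a function body that dereferences the pointer needs
the full definition to be visible.

```c
#include <stddef.h>

/*
 * "Header" part: only forward declarations are visible, as is the case
 * for struct domain at the point where asm/p2m.h gets included.
 */
struct demo_domain;
struct demo_page;

/* A prototype works fine with incomplete types. */
struct demo_page *demo_get_page(struct demo_domain *d, unsigned long gfn);

/*
 * A static inline defined here and touching d->pages would fail with
 * "invalid use of undefined type", as in the log above.
 */

/* ".c file" part: the full definitions are visible by now. */
struct demo_page { unsigned long gfn; };
struct demo_domain { struct demo_page pages[4]; };

struct demo_page *demo_get_page(struct demo_domain *d, unsigned long gfn)
{
    /* Dereferencing d is valid because struct demo_domain is complete. */
    if ( gfn >= 4 )
        return NULL;

    return &d->pages[gfn];
}
```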

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Align implementation with Arm's get_page_from_gfn().
 - Update the first comment about DOMID_XEN to say that it isn't a "normal"
   domain instead of saying that it is not auto-translated.
 - Drop footer after commit message.
---
 xen/arch/riscv/include/asm/p2m.h |  8 ++------
 xen/arch/riscv/p2m.c             | 29 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
index 60f27f9b347e..54ea67990f06 100644
--- a/xen/arch/riscv/include/asm/p2m.h
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -164,12 +164,8 @@ typedef unsigned int p2m_query_t;
 #define P2M_ALLOC    (1u<<0)   /* Populate PoD and paged-out entries */
 #define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
 
-static inline struct page_info *get_page_from_gfn(
-    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
-{
-    BUG_ON("unimplemented");
-    return NULL;
-}
+struct page_info *get_page_from_gfn(struct domain *d, unsigned long gfn,
+                                    p2m_type_t *t, p2m_query_t q);
 
 static inline void memory_type_changed(struct domain *d)
 {
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index 89e5db606fc8..11beaeead5ac 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -1534,3 +1534,32 @@ void p2m_handle_vmenter(void)
      * won't be reused until need_flush is set to true.
      */
 }
+
+struct page_info *get_page_from_gfn(struct domain *d, unsigned long gfn,
+                                    p2m_type_t *t, p2m_query_t q)
+{
+    struct page_info *page;
+    p2m_type_t p2mt;
+
+    /* Special case for DOMID_XEN as it isn't "normal" domain */
+    if ( likely(d != dom_xen) )
+        return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t);
+
+    if ( !t )
+        t = &p2mt;
+
+    *t = p2m_invalid;
+
+    /* DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the page */
+    page = mfn_to_page(_mfn(gfn));
+
+    if ( !mfn_valid(_mfn(gfn)) || !get_page(page, d) )
+        return NULL;
+
+    if ( page->u.inuse.type_info & PGT_writable_page )
+        *t = p2m_ram_rw;
+    else
+        BUG_ON("unimplemented. p2m_ram_ro hasn't been introduced yet");
+
+    return page;
+}
-- 
2.53.0




* [PATCH v2 02/11] xen: return proper type for guest access functions
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn() Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-26 13:56   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys() Oleksii Kurochko
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné,
	Timothy Pearson, Alistair Francis, Connor Davis

The copy_to_guest_phys_cb function pointer declaration was based on Arm
code. However, guest access functions use copy_guest(), which should
return an unsigned int: it returns either 0 or len, which is an unsigned
int, so returning unsigned long does not make sense.

Update other functions that use copy_guest() to return unsigned int, to
match its return type.

Also update the guest access functions of other architectures, as their
declarations/definitions are likely copied from the Arm implementation,
to keep everything in sync.
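
As a rough illustration of the return convention, the sketch below uses a
callback shaped like copy_to_guest_phys_cb. All *_sketch names are
hypothetical stand-ins (a byte array plays the role of guest memory); the
only point is that the result is either 0 or a count bounded by len, so it
always fits in unsigned int.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t paddr_sketch_t; /* stand-in for paddr_t */

struct domain_sketch { unsigned char ram[64]; }; /* toy guest memory */

/* Returns the number of bytes NOT copied, i.e. 0 on success. */
typedef unsigned int (*copy_cb_sketch)(struct domain_sketch *d,
                                       paddr_sketch_t gpa,
                                       void *buf,
                                       unsigned int len);

static unsigned int copy_to_ram_sketch(struct domain_sketch *d,
                                       paddr_sketch_t gpa,
                                       void *buf,
                                       unsigned int len)
{
    /* Out of range: nothing is copied, so the remaining count is len. */
    if ( gpa > sizeof(d->ram) || len > sizeof(d->ram) - gpa )
        return len;

    memcpy(&d->ram[gpa], buf, len);

    return 0;
}

/* A caller only needs to distinguish zero from non-zero. */
static int load_blob_sketch(struct domain_sketch *d, copy_cb_sketch cb,
                            paddr_sketch_t gpa, void *buf, unsigned int len)
{
    unsigned int left = cb(d, gpa, buf, len);

    return left ? -1 : 0;
}
```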

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - New patch.
---
 xen/arch/arm/guestcopy.c                  | 24 +++++++++++------------
 xen/arch/arm/include/asm/guest_access.h   | 18 ++++++++---------
 xen/arch/ppc/include/asm/guest_access.h   | 10 +++++-----
 xen/arch/riscv/include/asm/guest_access.h |  6 +++---
 xen/arch/riscv/stubs.c                    |  8 ++++----
 xen/include/xen/fdt-domain-build.h        |  8 ++++----
 6 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index fdb06422b8e9..11ad80320f4c 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -53,8 +53,8 @@ static struct page_info *translate_get_page(copy_info_t info, uint64_t addr,
     return page;
 }
 
-static unsigned long copy_guest(void *buf, uint64_t addr, unsigned int len,
-                                copy_info_t info, unsigned int flags)
+static unsigned int copy_guest(void *buf, uint64_t addr, unsigned int len,
+                               copy_info_t info, unsigned int flags)
 {
     /* XXX needs to handle faults */
     unsigned int offset = addr & ~PAGE_MASK;
@@ -107,36 +107,36 @@ static unsigned long copy_guest(void *buf, uint64_t addr, unsigned int len,
     return 0;
 }
 
-unsigned long raw_copy_to_guest(void *to, const void *from, unsigned int len)
+unsigned int raw_copy_to_guest(void *to, const void *from, unsigned int len)
 {
     return copy_guest((void *)from, (vaddr_t)to, len,
                       GVA_INFO(current), COPY_to_guest | COPY_linear);
 }
 
-unsigned long raw_copy_to_guest_flush_dcache(void *to, const void *from,
-                                             unsigned int len)
+unsigned int raw_copy_to_guest_flush_dcache(void *to, const void *from,
+                                            unsigned int len)
 {
     return copy_guest((void *)from, (vaddr_t)to, len, GVA_INFO(current),
                       COPY_to_guest | COPY_flush_dcache | COPY_linear);
 }
 
-unsigned long raw_clear_guest(void *to, unsigned int len)
+unsigned int raw_clear_guest(void *to, unsigned int len)
 {
     return copy_guest(NULL, (vaddr_t)to, len, GVA_INFO(current),
                       COPY_to_guest | COPY_linear);
 }
 
-unsigned long raw_copy_from_guest(void *to, const void __user *from,
-                                  unsigned int len)
+unsigned int raw_copy_from_guest(void *to, const void __user *from,
+                                 unsigned int len)
 {
     return copy_guest(to, (vaddr_t)from, len, GVA_INFO(current),
                       COPY_from_guest | COPY_linear);
 }
 
-unsigned long copy_to_guest_phys_flush_dcache(struct domain *d,
-                                              paddr_t gpa,
-                                              void *buf,
-                                              unsigned int len)
+unsigned int copy_to_guest_phys_flush_dcache(struct domain *d,
+                                             paddr_t gpa,
+                                             void *buf,
+                                             unsigned int len)
 {
     return copy_guest(buf, gpa, len, GPA_INFO(d),
                       COPY_to_guest | COPY_ipa | COPY_flush_dcache);
diff --git a/xen/arch/arm/include/asm/guest_access.h b/xen/arch/arm/include/asm/guest_access.h
index 18c88b70d7ec..a1a4b1c36269 100644
--- a/xen/arch/arm/include/asm/guest_access.h
+++ b/xen/arch/arm/include/asm/guest_access.h
@@ -4,17 +4,17 @@
 #include <xen/errno.h>
 #include <xen/sched.h>
 
-unsigned long raw_copy_to_guest(void *to, const void *from, unsigned int len);
-unsigned long raw_copy_to_guest_flush_dcache(void *to, const void *from,
-                                             unsigned int len);
-unsigned long raw_copy_from_guest(void *to, const void *from, unsigned int len);
-unsigned long raw_clear_guest(void *to, unsigned int len);
+unsigned int raw_copy_to_guest(void *to, const void *from, unsigned int len);
+unsigned int raw_copy_to_guest_flush_dcache(void *to, const void *from,
+                                            unsigned int len);
+unsigned int raw_copy_from_guest(void *to, const void *from, unsigned int len);
+unsigned int raw_clear_guest(void *to, unsigned int len);
 
 /* Copy data to guest physical address, then clean the region. */
-unsigned long copy_to_guest_phys_flush_dcache(struct domain *d,
-                                              paddr_t gpa,
-                                              void *buf,
-                                              unsigned int len);
+unsigned int copy_to_guest_phys_flush_dcache(struct domain *d,
+                                             paddr_t gpa,
+                                             void *buf,
+                                             unsigned int len);
 
 int access_guest_memory_by_gpa(struct domain *d, paddr_t gpa, void *buf,
                                uint32_t size, bool is_write);
diff --git a/xen/arch/ppc/include/asm/guest_access.h b/xen/arch/ppc/include/asm/guest_access.h
index 654693191106..922848032604 100644
--- a/xen/arch/ppc/include/asm/guest_access.h
+++ b/xen/arch/ppc/include/asm/guest_access.h
@@ -6,34 +6,34 @@
 
 /* TODO */
 
-static inline unsigned long raw_copy_to_guest(
+static inline unsigned int raw_copy_to_guest(
     void *to,
     const void *from,
     unsigned int len)
 {
     BUG_ON("unimplemented");
 }
-static inline unsigned long raw_copy_to_guest_flush_dcache(
+static inline unsigned int raw_copy_to_guest_flush_dcache(
     void *to,
     const void *from,
     unsigned int len)
 {
     BUG_ON("unimplemented");
 }
-static inline unsigned long raw_copy_from_guest(
+static inline unsigned int raw_copy_from_guest(
     void *to,
     const void *from,
     unsigned int len)
 {
     BUG_ON("unimplemented");
 }
-static inline unsigned long raw_clear_guest(void *to, unsigned int len)
+static inline unsigned int raw_clear_guest(void *to, unsigned int len)
 {
     BUG_ON("unimplemented");
 }
 
 /* Copy data to guest physical address, then clean the region. */
-static inline unsigned long copy_to_guest_phys_flush_dcache(
+static inline unsigned int copy_to_guest_phys_flush_dcache(
     struct domain *d,
     paddr_t gpa,
     void *buf,
diff --git a/xen/arch/riscv/include/asm/guest_access.h b/xen/arch/riscv/include/asm/guest_access.h
index 7cd51fbbdead..3f4c68e4da20 100644
--- a/xen/arch/riscv/include/asm/guest_access.h
+++ b/xen/arch/riscv/include/asm/guest_access.h
@@ -2,9 +2,9 @@
 #ifndef ASM__RISCV__GUEST_ACCESS_H
 #define ASM__RISCV__GUEST_ACCESS_H
 
-unsigned long raw_copy_to_guest(void *to, const void *from, unsigned len);
-unsigned long raw_copy_from_guest(void *to, const void *from, unsigned len);
-unsigned long raw_clear_guest(void *to, unsigned int len);
+unsigned int raw_copy_to_guest(void *to, const void *from, unsigned len);
+unsigned int raw_copy_from_guest(void *to, const void *from, unsigned len);
+unsigned int raw_clear_guest(void *to, unsigned int len);
 
 #define __raw_copy_to_guest raw_copy_to_guest
 #define __raw_copy_from_guest raw_copy_from_guest
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index acbb5b9123ea..d1f78b7c59fa 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -201,13 +201,13 @@ int __init parse_arch_dom0_param(const char *s, const char *e)
 
 /* guestcopy.c */
 
-unsigned long raw_copy_to_guest(void *to, const void *from, unsigned int len)
+unsigned int raw_copy_to_guest(void *to, const void *from, unsigned int len)
 {
     BUG_ON("unimplemented");
 }
 
-unsigned long raw_copy_from_guest(void *to, const void __user *from,
-                                  unsigned int len)
+unsigned int raw_copy_from_guest(void *to, const void __user *from,
+                                 unsigned int len)
 {
     BUG_ON("unimplemented");
 }
@@ -266,7 +266,7 @@ void udelay(unsigned long usecs)
 
 /* guest_access.h */
 
-static inline unsigned long raw_clear_guest(void *to, unsigned int len)
+static inline unsigned int raw_clear_guest(void *to, unsigned int len)
 {
     BUG_ON("unimplemented");
 }
diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
index 886a85381651..194d69303f56 100644
--- a/xen/include/xen/fdt-domain-build.h
+++ b/xen/include/xen/fdt-domain-build.h
@@ -44,10 +44,10 @@ static inline int get_allocation_size(paddr_t size)
     return get_order_from_bytes(size + 1) - 1;
 }
 
-typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
-                                               paddr_t gpa,
-                                               void *buf,
-                                               unsigned int len);
+typedef unsigned int (*copy_to_guest_phys_cb)(struct domain *d,
+                                              paddr_t gpa,
+                                              void *buf,
+                                              unsigned int len);
 
 void initrd_load(struct kernel_info *kinfo,
                  copy_to_guest_phys_cb cb);
-- 
2.53.0




* [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys()
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn() Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 02/11] xen: return proper type for guest access functions Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 14:24   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 04/11] xen/dom0less: rename kernel_zimage_probe() to kernel_image_probe() Oleksii Kurochko
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce copy_to_guest_phys() for RISC-V, based on the Arm implementation.

Add a generic copy_guest() helper for copying to and from guest physical
(and, in the future, potentially virtual) addresses, and implement
translate_get_page() to translate a guest physical address into a struct
page_info via the domain p2m.

Compared to the Arm code:
- Drop COPY_flush_dcache, as no such use cases exist on RISC-V.
- Do not implement the linear mapping case, which is currently unused.
- Use PAGE_OFFSET() to initialize the local offset variable in copy_guest().
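
The page-by-page copying done by copy_guest() can be sketched standalone
as below. This is a simplified model under stated assumptions, not the
patch code: sk_map_page() is a hypothetical stand-in for
translate_get_page() plus __map_domain_page(), the page size is shrunk to
16 bytes, and only the copy-to-guest direction is shown.

```c
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE      16U /* deliberately tiny, hypothetical page size */
#define SK_NR_PAGES       4U
#define SK_PAGE_OFFSET(a) ((unsigned int)((a) & (SK_PAGE_SIZE - 1)))

static unsigned char sk_guest_ram[SK_NR_PAGES][SK_PAGE_SIZE];

/* Returns the start of the page backing addr, or NULL if unmapped. */
static unsigned char *sk_map_page(uint64_t addr)
{
    uint64_t pfn = addr / SK_PAGE_SIZE;

    return pfn < SK_NR_PAGES ? sk_guest_ram[pfn] : NULL;
}

/*
 * Copy buf to the toy guest memory; a NULL buf zeroes the range instead.
 * Returns 0 on success, or the number of bytes left uncopied.
 */
static unsigned int sk_copy_to_guest(uint64_t addr, const void *buf,
                                     unsigned int len)
{
    unsigned int offset = SK_PAGE_OFFSET(addr);
    const unsigned char *src = buf;

    while ( len )
    {
        /* Never cross a page boundary within one chunk. */
        unsigned int size = SK_PAGE_SIZE - offset;
        unsigned char *p = sk_map_page(addr);

        if ( size > len )
            size = len;

        if ( !p )
            return len;

        if ( src )
        {
            memcpy(p + offset, src, size);
            src += size;
        }
        else
            memset(p + offset, 0, size);

        len -= size;
        addr += size;

        /* Every chunk after the first starts on a page boundary. */
        offset = 0;
    }

    return 0;
}
```

With 16-byte pages, copying 20 bytes to address 10 is split into a 6-byte
chunk finishing the first page and a 14-byte chunk at the start of the
second one.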

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Use BIT() instead of open-coding.
 - Rename COPY_ipa to COPY_gpa.
 - Rename COPY_linear to COPY_gva.
 - Use BUG_ON(linear) instead of if (linear) + BUG_ON.
 - Rename arg linear to gva for translate_get_page().
 - Update translate_get_page() to properly handle the write argument.
 - Return unsigned int for copy_guest() and copy_to_guest_phys() as
   len function parameter is only 'unsigned int'.
 - Reformat function arguments for alignment
---
 xen/arch/riscv/Makefile                   |   1 +
 xen/arch/riscv/guestcopy.c                | 116 ++++++++++++++++++++++
 xen/arch/riscv/include/asm/guest_access.h |   7 ++
 3 files changed, 124 insertions(+)
 create mode 100644 xen/arch/riscv/guestcopy.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 6b3f3ed90bdb..6d3c822409b8 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -4,6 +4,7 @@ obj-y += domain.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-y += entry.o
 obj-$(CONFIG_HAS_EX_TABLE) += extables.o
+obj-y += guestcopy.o
 obj-y += imsic.o
 obj-y += intc.o
 obj-y += irq.o
diff --git a/xen/arch/riscv/guestcopy.c b/xen/arch/riscv/guestcopy.c
new file mode 100644
index 000000000000..d774a90bff92
--- /dev/null
+++ b/xen/arch/riscv/guestcopy.c
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/domain_page.h>
+#include <xen/page-size.h>
+#include <xen/sched.h>
+#include <xen/string.h>
+
+#include <asm/guest_access.h>
+
+#define COPY_from_guest     0U
+#define COPY_to_guest       BIT(0, U)
+#define COPY_gpa            0U
+#define COPY_gva            BIT(1, U)
+
+typedef union
+{
+    struct
+    {
+        struct vcpu *v;
+    } gva;
+
+    struct
+    {
+        struct domain *d;
+    } gpa;
+} copy_info_t;
+
+#define GVA_INFO(vcpu) ((copy_info_t) { .gva = { vcpu } })
+#define GPA_INFO(domain) ((copy_info_t) { .gpa = { domain } })
+
+static struct page_info *translate_get_page(copy_info_t info, uint64_t addr,
+                                            bool gva, bool write)
+{
+    p2m_type_t p2mt;
+    struct page_info *page;
+
+    /*
+     * Not implemented yet.
+     *
+     * If gva == true, the operation will likely require a struct vcpu
+     * rather than just a struct domain. For this reason copy_info_t is
+     * already passed here instead of only struct domain.
+     */
+    BUG_ON(gva);
+
+    page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), &p2mt, P2M_ALLOC);
+
+    if ( !page )
+        return NULL;
+
+    if ( write ? p2mt != p2m_ram_rw : !p2m_is_ram(p2mt) )
+    {
+        put_page(page);
+        return NULL;
+    }
+
+    return page;
+}
+
+static unsigned int copy_guest(void *buf, uint64_t addr, unsigned int len,
+                               copy_info_t info, unsigned int flags)
+{
+    unsigned int offset = PAGE_OFFSET(addr);
+
+    BUILD_BUG_ON((sizeof(addr)) < sizeof(vaddr_t));
+    BUILD_BUG_ON((sizeof(addr)) < sizeof(paddr_t));
+
+    while ( len )
+    {
+        void *p;
+        unsigned int size = min(len, (unsigned int)PAGE_SIZE - offset);
+        struct page_info *page;
+
+        page = translate_get_page(info, addr, flags & COPY_gva,
+                                  flags & COPY_to_guest);
+        if ( page == NULL )
+            return len;
+
+        p = __map_domain_page(page);
+        p += offset;
+        if ( flags & COPY_to_guest )
+        {
+            /*
+             * buf will be NULL when the caller request to zero the
+             * guest memory.
+             */
+            if ( buf )
+                memcpy(p, buf, size);
+            else
+                memset(p, 0, size);
+        }
+        else
+            memcpy(buf, p, size);
+
+        unmap_domain_page(p - offset);
+        put_page(page);
+        len -= size;
+        buf += size;
+        addr += size;
+
+        /*
+         * After the first iteration, guest virtual address is correctly
+         * aligned to PAGE_SIZE.
+         */
+        offset = 0;
+    }
+
+    return 0;
+}
+
+unsigned int copy_to_guest_phys(struct domain *d, paddr_t gpa, void *buf,
+                                unsigned int len)
+{
+    return copy_guest(buf, gpa, len, GPA_INFO(d),
+                      COPY_to_guest | COPY_gpa);
+}
diff --git a/xen/arch/riscv/include/asm/guest_access.h b/xen/arch/riscv/include/asm/guest_access.h
index 3f4c68e4da20..f0a42745330e 100644
--- a/xen/arch/riscv/include/asm/guest_access.h
+++ b/xen/arch/riscv/include/asm/guest_access.h
@@ -2,6 +2,10 @@
 #ifndef ASM__RISCV__GUEST_ACCESS_H
 #define ASM__RISCV__GUEST_ACCESS_H
 
+#include <xen/types.h>
+
+struct domain;
+
 unsigned int raw_copy_to_guest(void *to, const void *from, unsigned len);
 unsigned int raw_copy_from_guest(void *to, const void *from, unsigned len);
 unsigned int raw_clear_guest(void *to, unsigned int len);
@@ -18,6 +22,9 @@ unsigned int raw_clear_guest(void *to, unsigned int len);
 #define guest_handle_okay(hnd, nr) (1)
 #define guest_handle_subrange_okay(hnd, first, last) (1)
 
+unsigned int copy_to_guest_phys(struct domain *d, paddr_t gpa, void *buf,
+                                unsigned int len);
+
 #endif /* ASM__RISCV__GUEST_ACCESS_H */
 /*
  * Local variables:
-- 
2.53.0




* [PATCH v2 04/11] xen/dom0less: rename kernel_zimage_probe() to kernel_image_probe()
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (2 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys() Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 05/11] xen/riscv: add kernel loading support Oleksii Kurochko
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné

The helper kernel_zimage_probe() is referenced from common code
(xen/common/device-tree/kernel.c), but its name is tied to the zImage
format, which is specific to Arm (among the architectures supported by Xen).

Other architectures supported by Xen, such as RISC-V, do not use the
zImage format and instead rely on other kernel image types (e.g. Image
or compressed Image variants: Image.gz, etc). Using "zimage" in the
name is therefore misleading in architecture-independent code.

Rename kernel_zimage_probe() to kernel_image_probe() and update the
associated structure field from "zimage" to "image" to reflect that the
code handles generic kernel images rather than the zImage format
specifically.

No functional change intended.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - new patch.
---
 xen/arch/arm/kernel.c           | 48 ++++++++++++++++-----------------
 xen/common/device-tree/kernel.c |  2 +-
 xen/include/xen/fdt-kernel.h    |  4 +--
 3 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
index 7544fd50a20f..3c613cdb233f 100644
--- a/xen/arch/arm/kernel.c
+++ b/xen/arch/arm/kernel.c
@@ -101,8 +101,8 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
     paddr_t load_addr;
 
 #ifdef CONFIG_ARM_64
-    if ( (info->arch.type == DOMAIN_64BIT) && (info->zimage.start == 0) )
-        return mem->bank[0].start + info->zimage.text_offset;
+    if ( (info->arch.type == DOMAIN_64BIT) && (info->image.start == 0) )
+        return mem->bank[0].start + info->image.text_offset;
 #endif
 
     /*
@@ -111,19 +111,19 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
      * and above 32MiB. Load it as high as possible within these
      * constraints, while also avoiding the DTB.
      */
-    if ( info->zimage.start == 0 )
+    if ( info->image.start == 0 )
     {
         paddr_t load_end;
 
         load_end = mem->bank[0].start + mem->bank[0].size;
         load_end = MIN(mem->bank[0].start + MB(128), load_end);
 
-        load_addr = load_end - info->zimage.len;
+        load_addr = load_end - info->image.len;
         /* Align to 2MB */
         load_addr &= ~((2 << 20) - 1);
     }
     else
-        load_addr = info->zimage.start;
+        load_addr = info->image.start;
 
     return load_addr;
 }
@@ -131,8 +131,8 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
 static void __init kernel_zimage_load(struct kernel_info *info)
 {
     paddr_t load_addr = kernel_zimage_place(info);
-    paddr_t paddr = info->zimage.kernel_addr;
-    paddr_t len = info->zimage.len;
+    paddr_t paddr = info->image.kernel_addr;
+    paddr_t len = info->image.len;
     void *kernel;
     int rc;
 
@@ -215,7 +215,7 @@ int __init kernel_uimage_probe(struct kernel_info *info,
         return -EOPNOTSUPP;
     }
 
-    info->zimage.start = be32_to_cpu(uimage.load);
+    info->image.start = be32_to_cpu(uimage.load);
     info->entry = be32_to_cpu(uimage.ep);
 
     /*
@@ -224,20 +224,20 @@ int __init kernel_uimage_probe(struct kernel_info *info,
      * independent image. That means Xen is free to load such an image at
      * any valid address.
      */
-    if ( info->zimage.start == 0 )
+    if ( info->image.start == 0 )
         printk(XENLOG_INFO
                "No load address provided. Xen will decide where to load it.\n");
     else
         printk(XENLOG_INFO
                "Provided load address: %"PRIpaddr" and entry address: %"PRIpaddr"\n",
-               info->zimage.start, info->entry);
+               info->image.start, info->entry);
 
     /*
      * If the image supports position independent execution, then user cannot
      * provide an entry point as Xen will load such an image at any appropriate
      * memory address. Thus, we need to return error.
      */
-    if ( (info->zimage.start == 0) && (info->entry != 0) )
+    if ( (info->image.start == 0) && (info->entry != 0) )
     {
         printk(XENLOG_ERR
                "Entry point cannot be non zero for PIE image.\n");
@@ -257,13 +257,13 @@ int __init kernel_uimage_probe(struct kernel_info *info,
         if ( rc )
             return rc;
 
-        info->zimage.kernel_addr = mod->start;
-        info->zimage.len = mod->size;
+        info->image.kernel_addr = mod->start;
+        info->image.len = mod->size;
     }
     else
     {
-        info->zimage.kernel_addr = addr + sizeof(uimage);
-        info->zimage.len = len;
+        info->image.kernel_addr = addr + sizeof(uimage);
+        info->image.len = len;
     }
 
     info->load = kernel_zimage_load;
@@ -289,7 +289,7 @@ int __init kernel_uimage_probe(struct kernel_info *info,
      * Thus, Xen uses uimage.load attribute to determine the load address and
      * zimage.text_offset is ignored.
      */
-    info->zimage.text_offset = 0;
+    info->image.text_offset = 0;
 #endif
 
     return 0;
@@ -338,10 +338,10 @@ static int __init kernel_zimage64_probe(struct kernel_info *info,
     if ( (end - start) > size )
         return -EINVAL;
 
-    info->zimage.kernel_addr = addr;
-    info->zimage.len = end - start;
-    info->zimage.text_offset = zimage.text_offset;
-    info->zimage.start = 0;
+    info->image.kernel_addr = addr;
+    info->image.len = end - start;
+    info->image.text_offset = zimage.text_offset;
+    info->image.start = 0;
 
     info->load = kernel_zimage_load;
 
@@ -389,10 +389,10 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
         }
     }
 
-    info->zimage.kernel_addr = addr;
+    info->image.kernel_addr = addr;
 
-    info->zimage.start = start;
-    info->zimage.len = end - start;
+    info->image.start = start;
+    info->image.len = end - start;
 
     info->load = kernel_zimage_load;
 
@@ -403,7 +403,7 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
     return 0;
 }
 
-int __init kernel_zimage_probe(struct kernel_info *info, paddr_t addr,
+int __init kernel_image_probe(struct kernel_info *info, paddr_t addr,
                                paddr_t size)
 {
     int rc;
diff --git a/xen/common/device-tree/kernel.c b/xen/common/device-tree/kernel.c
index 28096121a52d..cfa27464f0fc 100644
--- a/xen/common/device-tree/kernel.c
+++ b/xen/common/device-tree/kernel.c
@@ -235,7 +235,7 @@ int __init kernel_probe(struct kernel_info *info,
     if ( rc && rc != -EINVAL )
         return rc;
 
-    rc = kernel_zimage_probe(info, mod->start, mod->size);
+    rc = kernel_image_probe(info, mod->start, mod->size);
 
     return rc;
 }
diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
index 33a60597bb4d..2af3bd5f0722 100644
--- a/xen/include/xen/fdt-kernel.h
+++ b/xen/include/xen/fdt-kernel.h
@@ -56,7 +56,7 @@ struct kernel_info {
             paddr_t text_offset; /* 64-bit Image only */
 #endif
             paddr_t start; /* Must be 0 for 64-bit Image */
-        } zimage;
+        } image;
     };
 
 #if __has_include(<asm/kernel.h>)
@@ -122,7 +122,7 @@ void kernel_load(struct kernel_info *info);
 
 int kernel_decompress(struct boot_module *mod, uint32_t offset);
 
-int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
+int kernel_image_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
 
 /*
  * uImage isn't really used nowadays thereby leave kernel_uimage_probe()
-- 
2.53.0




* [PATCH v2 05/11] xen/riscv: add kernel loading support
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (3 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 04/11] xen/dom0less: rename kernel_zimage_probe() to kernel_image_probe() Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 14:47   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 06/11] xen: move declaration of fw_unreserved_regions() to common header Oleksii Kurochko
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce support for loading a Linux kernel Image (and its compressed
variants) on RISC-V.
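
As an illustration only (this is not the code from the patch), the header
check that the probe relies on can be sketched in standalone C. The helper
names load_le32() and riscv_image_magic_ok() and the buffer-based interface
are hypothetical; the offset 56 and the header size 64 are derived from the
field layout quoted from riscv/boot-image-header.rst in the diff below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values derived from the RISC-V boot image header layout. */
#define RISCV_IMAGE_HDR_SIZE   64u         /* size of the header in bytes */
#define RISCV_IMAGE_MAGIC2_OFF 56u         /* byte offset of the magic2 field */
#define RISCV_IMAGE_MAGIC2     0x05435352u /* "RSC\x05" read as little endian */

/* Read a 32-bit little-endian value independent of host endianness. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: true if buf starts with a plausible Image header. */
static bool riscv_image_magic_ok(const uint8_t *buf, size_t size)
{
    /* A buffer shorter than the header cannot hold a valid Image. */
    if ( size < RISCV_IMAGE_HDR_SIZE )
        return false;

    /* Only magic2 is checked, as the first magic field is deprecated. */
    return load_le32(buf + RISCV_IMAGE_MAGIC2_OFF) == RISCV_IMAGE_MAGIC2;
}
```

A caller would pass the first bytes of the boot module and treat a false
result the same way the probe treats -EINVAL.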

kernel_image_load() and place_modules() currently call panic() on
failure rather than returning an error. This is because the common
kernel_load() in common/device-tree/kernel.c does not expect a
return code. Handling errors gracefully would require a separate
refactor.

The implementation is based on the Xen Arm kernel loading code.
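
The placement arithmetic used by kernel_image_place() and place_modules()
can be sketched as standalone C, assuming RV64 (2MB kernel alignment) and a
position-independent Image. The struct layout and compute_layout() names
are hypothetical, and the sketch returns false where the patch calls
panic().

```c
#include <stdbool.h>
#include <stdint.h>

#define MB(x)         ((uint64_t)(x) << 20)
/* Round x up to a power-of-two alignment a. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

/* Hypothetical result type: guest-physical addresses of each blob. */
struct layout {
    uint64_t kernel;
    uint64_t dtb;
    uint64_t initrd;
};

/*
 * Mirror of the checks in the patch: the kernel goes at the aligned start
 * of the first bank, the DTB at the next 2MB boundary after the kernel and
 * the initrd right after the (2MB-rounded) DTB.
 */
static bool compute_layout(uint64_t bank_start, uint64_t bank_size,
                           uint64_t kernel_len, uint64_t dtb_size,
                           uint64_t initrd_size, struct layout *out)
{
    uint64_t kernbase = ROUNDUP(bank_start, MB(2));
    uint64_t kernend = kernbase + kernel_len;
    uint64_t dtb_len = ROUNDUP(dtb_size, MB(2));
    uint64_t initrd_len = ROUNDUP(initrd_size, MB(2));
    uint64_t kernsize = ROUNDUP(kernend, MB(2)) - kernbase;

    /* Everything has to fit into the first bank. */
    if ( dtb_len + initrd_len + kernsize > bank_size )
        return false;

    out->kernel = kernbase;
    out->dtb = ROUNDUP(kernend, MB(2));
    out->initrd = out->dtb + dtb_len;

    return true;
}
```

With a bank at 0x80000000, a kernel slightly larger than 3MB and a 64KB
DTB, this places the DTB at 0x80400000 and the initrd at 0x80600000.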

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - s/zimage/image as RISC-V doesn't support zImages, only Image and
   Image.{gz,...}.
 - Update the commit message.
---
 xen/arch/riscv/Makefile             |   1 +
 xen/arch/riscv/include/asm/config.h |  13 +++
 xen/arch/riscv/kernel.c             | 158 ++++++++++++++++++++++++++++
 3 files changed, 172 insertions(+)
 create mode 100644 xen/arch/riscv/kernel.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 6d3c822409b8..d90afc7ad23f 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -8,6 +8,7 @@ obj-y += guestcopy.o
 obj-y += imsic.o
 obj-y += intc.o
 obj-y += irq.o
+obj-y += kernel.o
 obj-y += mm.o
 obj-y += p2m.o
 obj-y += paging.o
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 0613de008b13..fd69057826e1 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -151,6 +151,19 @@
 extern unsigned long phys_offset; /* = load_start - XEN_VIRT_START */
 #endif
 
+/*
+ * KERNEL_LOAD_ADDR_ALIGNMENT is defined based on the "Kernel location"
+ * section of boot.rst:
+ * https://docs.kernel.org/arch/riscv/boot.html#kernel-location
+ */
+#if defined(CONFIG_RISCV_32)
+#define KERNEL_LOAD_ADDR_ALIGNMENT MB(4)
+#elif defined(CONFIG_RISCV_64)
+#define KERNEL_LOAD_ADDR_ALIGNMENT MB(2)
+#else
+#error "Define KERNEL_LOAD_ADDR_ALIGNMENT"
+#endif
+
 #endif /* ASM__RISCV__CONFIG_H */
 /*
  * Local variables:
diff --git a/xen/arch/riscv/kernel.c b/xen/arch/riscv/kernel.c
new file mode 100644
index 000000000000..be5b17dc22c3
--- /dev/null
+++ b/xen/arch/riscv/kernel.c
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/bug.h>
+#include <xen/compiler.h>
+#include <xen/errno.h>
+#include <xen/fdt-kernel.h>
+#include <xen/guest_access.h>
+#include <xen/init.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/types.h>
+#include <xen/vmap.h>
+
+#include <asm/setup.h>
+
+#define IMAGE64_MAGIC_V2 0x05435352 /* Magic number 2, le, "RSC\x05" */
+
+static void __init place_modules(struct kernel_info *info, paddr_t kernbase,
+                                 paddr_t kernend)
+{
+    const struct boot_module *mod = info->bd.initrd;
+
+    const paddr_t initrd_len = ROUNDUP(mod ? mod->size : 0, MB(2));
+    const paddr_t dtb_len = ROUNDUP(fdt_totalsize(info->fdt), MB(2));
+    const paddr_t modsize = initrd_len + dtb_len;
+
+    const paddr_t ramsize = info->mem.bank[0].size;
+    const paddr_t kernsize = ROUNDUP(kernend, MB(2)) - kernbase;
+
+    if ( modsize + kernsize > ramsize )
+        panic("Not enough memory in the first bank for the kernel+dtb+initrd\n");
+
+    info->dtb_paddr = ROUNDUP(kernend, MB(2));
+
+    info->initrd_paddr = info->dtb_paddr + dtb_len;
+}
+
+static paddr_t __init kernel_image_place(struct kernel_info *info)
+{
+    paddr_t load_addr;
+
+    /*
+     * At the moment, RISC-V's Linux kernel should be always position
+     * independent based on "Pre-MMU execution" of boot.rst:
+     *   https://docs.kernel.org/arch/riscv/boot.html#pre-mmu-execution
+     *
+     * But just for the case when RISC-V's Linux kernel isn't position
+     * independent it is needed to take load address from
+     * info->image.start.
+     *
+     * If `start` is zero, the Image is position independent. */
+    if ( likely(!info->image.start) )
+        /*
+         * According to boot.rst kernel load address should be properly
+         * aligned:
+         *   https://docs.kernel.org/arch/riscv/boot.html#kernel-location
+         */
+        load_addr = ROUNDUP(info->mem.bank[0].start, KERNEL_LOAD_ADDR_ALIGNMENT);
+    else
+        load_addr = info->image.start;
+
+    return load_addr;
+}
+
+static void __init kernel_image_load(struct kernel_info *info)
+{
+    int rc;
+    paddr_t load_addr = kernel_image_place(info);
+    paddr_t paddr = info->image.kernel_addr;
+    paddr_t len = info->image.len;
+    void *kernel;
+
+    info->entry = load_addr;
+
+    place_modules(info, load_addr, load_addr + len);
+
+    printk("Loading Image from %"PRIpaddr" to %"PRIpaddr"-%"PRIpaddr"\n",
+            paddr, load_addr, load_addr + len);
+
+    kernel = ioremap_wc(paddr, len);
+
+    if ( !kernel )
+        panic("Unable to map kernel\n");
+
+    /* Move kernel to proper location in guest phys map */
+    rc = copy_to_guest_phys(info->bd.d, load_addr, kernel, len);
+
+    if ( rc )
+        panic("Unable to copy kernel to proper guest location\n");
+
+    iounmap(kernel);
+}
+
+/* Check if the image is a 64-bit Image */
+static int __init kernel_image64_probe(struct kernel_info *info,
+                                       paddr_t addr, paddr_t size)
+{
+    /* riscv/boot-image-header.rst */
+    struct {
+        u32 code0;		  /* Executable code */
+        u32 code1;		  /* Executable code */
+        u64 text_offset;  /* Image load offset, little endian */
+        u64 image_size;	  /* Effective Image size, little endian */
+        u64 flags;		  /* kernel flags, little endian */
+        u32 version;	  /* Version of this header */
+        u32 res1;		  /* Reserved */
+        u64 res2;		  /* Reserved */
+        u64 magic;        /* Deprecated: Magic number, little endian, "RISCV" */
+        u32 magic2;       /* Magic number 2, little endian, "RSC\x05" */
+        u32 res3;		  /* Reserved for PE COFF offset */
+    } image;
+    uint64_t start, end;
+
+    if ( size < sizeof(image) )
+        return -EINVAL;
+
+    copy_from_paddr(&image, addr, sizeof(image));
+
+    /* Magic v1 is deprecated and may be removed.  Only use v2 */
+    if ( image.magic2 != IMAGE64_MAGIC_V2 )
+        return -EINVAL;
+
+    /* Currently there is no length in the header, so just use the size */
+    start = 0;
+    end = size;
+
+    /*
+     * Given the above this check is a bit pointless, but leave it
+     * here in case someone adds a length field in the future.
+     */
+    if ( (end - start) > size )
+        return -EINVAL;
+
+    info->image.kernel_addr = addr;
+    info->image.len = end - start;
+    info->image.text_offset = image.text_offset;
+    info->image.start = 0;
+
+    info->load = kernel_image_load;
+
+    return 0;
+}
+
+int __init kernel_image_probe(struct kernel_info *info, paddr_t addr,
+                              paddr_t size)
+{
+    int rc;
+
+#ifdef CONFIG_RISCV_64
+    rc = kernel_image64_probe(info, addr, size);
+    if ( rc < 0 )
+        panic("Failed to probe kernel image\n");
+#else
+    panic("Only RISC-V 64 is supported\n");
+#endif
+
+    return rc;
+}
-- 
2.53.0




* [PATCH v2 06/11] xen: move declaration of fw_unreserved_regions() to common header
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (4 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 05/11] xen/riscv: add kernel loading support Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 07/11] xen: move domain_use_host_layout() to common code Oleksii Kurochko
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné

Since the implementation of fw_unreserved_regions() is in common code, move
its declaration to xen/bootinfo.h.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Nothing changed. Only rebase.
---
 xen/arch/arm/include/asm/setup.h | 3 ---
 xen/include/xen/bootinfo.h       | 4 ++++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
index 899e33925ca4..0d29b46ea52b 100644
--- a/xen/arch/arm/include/asm/setup.h
+++ b/xen/arch/arm/include/asm/setup.h
@@ -43,9 +43,6 @@ int acpi_make_efi_nodes(void *fdt, struct membank tbl_add[]);
 void create_dom0(void);
 
 void discard_initial_modules(void);
-void fw_unreserved_regions(paddr_t s, paddr_t e,
-                           void (*cb)(paddr_t ps, paddr_t pe),
-                           unsigned int first);
 
 void init_pdx(void);
 void setup_mm(void);
diff --git a/xen/include/xen/bootinfo.h b/xen/include/xen/bootinfo.h
index f834f1957155..dbf492c2e36e 100644
--- a/xen/include/xen/bootinfo.h
+++ b/xen/include/xen/bootinfo.h
@@ -210,4 +210,8 @@ static inline struct membanks *membanks_xzalloc(unsigned int nr,
     return banks;
 }
 
+void fw_unreserved_regions(paddr_t s, paddr_t e,
+                           void (*cb)(paddr_t ps, paddr_t pe),
+                           unsigned int first);
+
 #endif /* XEN_BOOTINFO_H */
-- 
2.53.0




* [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (5 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 06/11] xen: move declaration of fw_unreserved_regions() to common header Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 15:13   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits Oleksii Kurochko
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné

domain_use_host_layout() is not really architecture-specific, so move it
from the Arm header to the common header xen/domain.h and provide a common
implementation in xen/common/domain.c. domain_use_host_layout() may also
be needed for x86 [1].

Turn the macro into a function to avoid header dependency issues. In
particular, the implementation depends on paging_mode_translate(), and
including xen/paging.h from xen/domain.h would introduce circular
dependencies via xen/sched.h, leading to compilation errors such as
implicit declarations of struct vcpu, struct domain, or other types
declared in xen/sched.h.

Adjust the implementation to take paging_mode_translate() into account
so it works correctly for all architectures, including x86. Some extra
details about the implementation can be found in [2] and [3].
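
The resulting predicate can be illustrated with a small standalone model.
struct mock_domain and mock_use_host_layout() are hypothetical stand-ins
for struct domain and the real helpers (is_domain_direct_mapped(),
paging_mode_translate(), is_hardware_domain()); only the boolean
expression is taken from this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three properties the predicate looks at. */
struct mock_domain {
    bool direct_mapped; /* models is_domain_direct_mapped(d) */
    bool translated;    /* models paging_mode_translate(d) */
    bool hardware;      /* models is_hardware_domain(d) */
};

/* Same expression as the new common domain_use_host_layout(). */
static bool mock_use_host_layout(const struct mock_domain *d)
{
    return d->direct_mapped || (d->translated && d->hardware);
}
```

In this model a PV hardware domain (translated == false) yields false,
while an auto-translated hardware domain or any direct-mapped domain
yields true.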

Also, inclusion of asm/p2m.h is dropped as xen/paging.h already includes
it.

[1] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2602161038120.359097@ubuntu-linux-20-04-desktop/
[2] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2602271742400.3148344@ubuntu-linux-20-04-desktop/
[3] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2602271750190.3148344@ubuntu-linux-20-04-desktop/

Suggested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Drop the ifdef around the definition of domain_use_host_layout(), as a
   generic version was suggested. It could be brought back when a real
   use case for it appears.
 - Add Suggested-by: and update the commit message.
 - Make domain_use_host_layout() a function instead of a macro to
   avoid circular header dependencies. See the commit message for
   more details.
---
 xen/arch/arm/include/asm/domain.h | 14 --------------
 xen/common/domain.c               |  8 +++++++-
 xen/include/xen/domain.h          | 16 ++++++++++++++++
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 758ad807e461..1a04fe658c97 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -29,20 +29,6 @@ enum domain_type {
 #define is_64bit_domain(d) (0)
 #endif
 
-/*
- * Is the domain using the host memory layout?
- *
- * Direct-mapped domain will always have the RAM mapped with GFN == MFN.
- * To avoid any trouble finding space, it is easier to force using the
- * host memory layout.
- *
- * The hardware domain will use the host layout regardless of
- * direct-mapped because some OS may rely on a specific address ranges
- * for the devices.
- */
-#define domain_use_host_layout(d) (is_domain_direct_mapped(d) || \
-                                   is_hardware_domain(d))
-
 struct vtimer {
     struct vcpu *v;
     int irq;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ab910fcf9306..87a6a17575f9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -26,6 +26,7 @@
 #include <xen/hypercall.h>
 #include <xen/delay.h>
 #include <xen/shutdown.h>
+#include <xen/paging.h>
 #include <xen/percpu.h>
 #include <xen/multicall.h>
 #include <xen/rcupdate.h>
@@ -35,7 +36,6 @@
 #include <xen/argo.h>
 #include <xen/llc-coloring.h>
 #include <xen/xvmalloc.h>
-#include <asm/p2m.h>
 #include <asm/processor.h>
 #include <public/sched.h>
 #include <public/sysctl.h>
@@ -2544,6 +2544,12 @@ void thaw_domains(void)
 
 #endif /* CONFIG_SYSTEM_SUSPEND */
 
+bool domain_use_host_layout(struct domain *d)
+{
+    return is_domain_direct_mapped(d) ||
+           (paging_mode_translate(d) && is_hardware_domain(d));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 93c0fd00c1d7..68fb1acd4083 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -62,6 +62,22 @@ void domid_free(domid_t domid);
 #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
 #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
 
+/*
+ * Is the auto-translated domain using the host memory layout?
+ *
+ * domain_use_host_layout() is always False for PV guests.
+ *
+ * Direct-mapped domains (autotranslated domains with memory allocated
+ * contiguously and mapped 1:1 so that GFN == MFN) are always using the
+ * host memory layout to avoid address clashes.
+ *
+ * The hardware domain will use the host layout (regardless of
+ * direct-mapped) because some OS may rely on a specific address ranges
+ * for the devices. PV Dom0, like any other PV guests, has
+ * domain_use_host_layout() returning False.
+ */
+bool domain_use_host_layout(struct domain *d);
+
 /*
  * Arch-specifics.
  */
-- 
2.53.0




* [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (6 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 07/11] xen: move domain_use_host_layout() to common code Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 15:16   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits Oleksii Kurochko
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Rahul Singh, Jan Beulich

The IPA terminology is Arm-specific, so rename p2m_ipa_bits to
p2m_gpa_bits to use architecture-neutral naming.

No functional changes.

Reported-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v2:
 - New patch
---
 xen/arch/arm/domain_build.c              | 12 ++++++------
 xen/arch/arm/domctl.c                    |  2 +-
 xen/arch/arm/include/asm/p2m.h           |  4 ++--
 xen/arch/arm/mmu/p2m.c                   | 18 +++++++++---------
 xen/arch/arm/p2m.c                       |  6 +++---
 xen/common/device-tree/domain-build.c    |  2 +-
 xen/drivers/passthrough/arm/ipmmu-vmsa.c |  4 ++--
 xen/drivers/passthrough/arm/smmu-v3.c    |  2 +-
 xen/drivers/passthrough/arm/smmu.c       |  2 +-
 9 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e8795745ddc7..38ab41ec6b19 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -744,7 +744,7 @@ static int __init find_memory_holes(const struct kernel_info *kinfo,
 
     /* Start with maximum possible addressable physical memory range */
     start = 0;
-    end = (1ULL << p2m_ipa_bits) - 1;
+    end = (1ULL << p2m_gpa_bits) - 1;
     res = rangeset_add_range(mem_holes, PFN_DOWN(start), PFN_DOWN(end));
     if ( res )
     {
@@ -815,7 +815,7 @@ static int __init find_memory_holes(const struct kernel_info *kinfo,
     }
 
     start = 0;
-    end = (1ULL << p2m_ipa_bits) - 1;
+    end = (1ULL << p2m_gpa_bits) - 1;
     res = rangeset_report_ranges(mem_holes, PFN_DOWN(start), PFN_DOWN(end),
                                  add_ext_regions,  ext_regions);
     if ( res )
@@ -849,7 +849,7 @@ static int __init find_domU_holes(const struct kernel_info *kinfo,
 
         start = ROUNDUP(bankbase[i] + kinfo_mem->bank[i].size, SZ_2M);
 
-        bankend = ~0ULL >> (64 - p2m_ipa_bits);
+        bankend = ~0ULL >> (64 - p2m_gpa_bits);
         bankend = min(bankend, bankbase[i] + banksize[i] - 1);
 
         if ( bankend > start )
@@ -881,7 +881,7 @@ static int __init find_domU_holes(const struct kernel_info *kinfo,
     }
 
     res = rangeset_report_ranges(mem_holes, 0,
-                                 PFN_DOWN((1ULL << p2m_ipa_bits) - 1),
+                                 PFN_DOWN((1ULL << p2m_gpa_bits) - 1),
                                  add_ext_regions, ext_regions);
     if ( res )
         ext_regions->nr_banks = 0;
@@ -907,7 +907,7 @@ static unsigned int __init count_ranges(struct rangeset *r)
 {
     unsigned int cnt = 0;
 
-    (void) rangeset_report_ranges(r, 0, PFN_DOWN((1ULL << p2m_ipa_bits) - 1),
+    (void) rangeset_report_ranges(r, 0, PFN_DOWN((1ULL << p2m_gpa_bits) - 1),
                                   count, &cnt);
 
     return cnt;
@@ -972,7 +972,7 @@ static int __init find_host_extended_regions(const struct kernel_info *kinfo,
         }
 
         rangeset_report_ranges(kinfo->xen_reg_assigned, 0,
-                               PFN_DOWN((1ULL << p2m_ipa_bits) - 1),
+                               PFN_DOWN((1ULL << p2m_gpa_bits) - 1),
                                rangeset_to_membank, xen_reg);
     }
 
diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index ad914c915f81..d8db595ab348 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -23,7 +23,7 @@ void arch_get_domain_info(const struct domain *d,
     /* All ARM domains use hardware assisted paging. */
     info->flags |= XEN_DOMINF_hap;
 
-    info->gpaddr_bits = p2m_ipa_bits;
+    info->gpaddr_bits = p2m_gpa_bits;
 }
 
 static int handle_vuart_init(struct domain *d, 
diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h
index 010ce8c9ebbd..b15b57aa32bd 100644
--- a/xen/arch/arm/include/asm/p2m.h
+++ b/xen/arch/arm/include/asm/p2m.h
@@ -12,7 +12,7 @@
 #define paddr_bits PADDR_BITS
 
 /* Holds the bit size of IPAs in p2m tables.  */
-extern unsigned int p2m_ipa_bits;
+extern unsigned int p2m_gpa_bits;
 
 #define MAX_VMID_8_BIT  (1UL << 8)
 #define MAX_VMID_16_BIT (1UL << 16)
@@ -186,7 +186,7 @@ static inline bool arch_acquire_resource_check(struct domain *d)
 }
 
 /*
- * Helper to restrict "p2m_ipa_bits" according the external entity
+ * Helper to restrict "p2m_gpa_bits" according the external entity
  * (e.g. IOMMU) requirements.
  *
  * Each corresponding driver should report the maximum IPA bits
diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 51abf3504fcf..08871c61b812 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -1734,11 +1734,11 @@ void __init setup_virt_paging(void)
     } t0sz_32;
 #else
     /*
-     * Restrict "p2m_ipa_bits" if needed. As P2M table is always configured
+     * Restrict "p2m_gpa_bits" if needed. As P2M table is always configured
      * with IPA bits == PA bits, compare against "pabits".
      */
-    if ( pa_range_info[system_cpuinfo.mm64.pa_range].pabits < p2m_ipa_bits )
-        p2m_ipa_bits = pa_range_info[system_cpuinfo.mm64.pa_range].pabits;
+    if ( pa_range_info[system_cpuinfo.mm64.pa_range].pabits < p2m_gpa_bits )
+        p2m_gpa_bits = pa_range_info[system_cpuinfo.mm64.pa_range].pabits;
 
     /*
      * cpu info sanitization made sure we support 16bits VMID only if all
@@ -1748,10 +1748,10 @@ void __init setup_virt_paging(void)
         max_vmid = MAX_VMID_16_BIT;
 #endif
 
-    /* Choose suitable "pa_range" according to the resulted "p2m_ipa_bits". */
+    /* Choose suitable "pa_range" according to the resulted "p2m_gpa_bits". */
     for ( i = 0; i < ARRAY_SIZE(pa_range_info); i++ )
     {
-        if ( p2m_ipa_bits == pa_range_info[i].pabits )
+        if ( p2m_gpa_bits == pa_range_info[i].pabits )
         {
             pa_range = i;
             break;
@@ -1760,7 +1760,7 @@ void __init setup_virt_paging(void)
 
     /* Check if we found the associated entry in the array */
     if ( pa_range >= ARRAY_SIZE(pa_range_info) || !pa_range_info[pa_range].pabits )
-        panic("%u-bit P2M is not supported\n", p2m_ipa_bits);
+        panic("%u-bit P2M is not supported\n", p2m_gpa_bits);
 
 #ifdef CONFIG_ARM_64
     val |= VTCR_PS(pa_range);
@@ -1778,14 +1778,14 @@ void __init setup_virt_paging(void)
     p2m_root_level = 2 - pa_range_info[pa_range].sl0;
 
 #ifdef CONFIG_ARM_64
-    p2m_ipa_bits = 64 - pa_range_info[pa_range].t0sz;
+    p2m_gpa_bits = 64 - pa_range_info[pa_range].t0sz;
 #else
     t0sz_32.val = pa_range_info[pa_range].t0sz;
-    p2m_ipa_bits = 32 - t0sz_32.val;
+    p2m_gpa_bits = 32 - t0sz_32.val;
 #endif
 
     printk("P2M: %d-bit IPA with %d-bit PA and %d-bit VMID\n",
-           p2m_ipa_bits,
+           p2m_gpa_bits,
            pa_range_info[pa_range].pabits,
            ( MAX_VMID == MAX_VMID_16_BIT ) ? 16 : 8);
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index fb03978a19af..5564e7d3c1db 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -19,7 +19,7 @@ unsigned int __read_mostly max_vmid = MAX_VMID_8_BIT;
  * Set to the maximum configured support for IPA bits, so the number of IPA bits can be
  * restricted by external entity (e.g. IOMMU).
  */
-unsigned int __read_mostly p2m_ipa_bits = PADDR_BITS;
+unsigned int __read_mostly p2m_gpa_bits = PADDR_BITS;
 
 /* Unlock the flush and do a P2M TLB flush if necessary */
 void p2m_write_unlock(struct p2m_domain *p2m)
@@ -603,8 +603,8 @@ void __init p2m_restrict_ipa_bits(unsigned int ipa_bits)
      * Calculate the minimum of the maximum IPA bits that any external entity
      * can support.
      */
-    if ( ipa_bits < p2m_ipa_bits )
-        p2m_ipa_bits = ipa_bits;
+    if ( ipa_bits < p2m_gpa_bits )
+        p2m_gpa_bits = ipa_bits;
 }
 
 /*
diff --git a/xen/common/device-tree/domain-build.c b/xen/common/device-tree/domain-build.c
index 6708c9dd66e6..362da1cae780 100644
--- a/xen/common/device-tree/domain-build.c
+++ b/xen/common/device-tree/domain-build.c
@@ -220,7 +220,7 @@ int __init find_unallocated_memory(const struct kernel_info *kinfo,
     }
 
     start = 0;
-    end = (1ULL << p2m_ipa_bits) - 1;
+    end = (1ULL << p2m_gpa_bits) - 1;
     res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
                                  cb, free_regions);
     if ( res )
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index ea9fa9ddf3ce..e2b4c95dcc67 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -575,11 +575,11 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
 
     /*
      * TTBCR
-     * We use long descriptors and allocate the whole "p2m_ipa_bits" IPA space
+     * We use long descriptors and allocate the whole "p2m_gpa_bits" IPA space
      * to TTBR0. Use 4KB page granule. Start page table walks at first level.
      * Always bypass stage 1 translation.
      */
-    tsz0 = (64 - p2m_ipa_bits) << IMTTBCR_TSZ0_SHIFT;
+    tsz0 = (64 - p2m_gpa_bits) << IMTTBCR_TSZ0_SHIFT;
     ipmmu_ctx_write_root(domain, IMTTBCR, IMTTBCR_EAE | IMTTBCR_PMB |
                          IMTTBCR_SL0_LVL_1 | tsz0);
 
diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index bf153227dbd9..9e86cd7b0ad0 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -1202,7 +1202,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 		return -EINVAL;
 	}
 
-	vtcr->tsz = 64 - p2m_ipa_bits;
+	vtcr->tsz = 64 - p2m_gpa_bits;
 	vtcr->sl = 2 - P2M_ROOT_LEVEL;
 
 	arm_lpae_s2_cfg.vttbr  = page_to_maddr(smmu_domain->d->arch.p2m.root);
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 22d306d0cb80..fa28fd7db79c 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -1276,7 +1276,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 			 * Xen: The IOMMU share the page-tables with the P2M
 			 * which may have restrict the size further.
 			 */
-			reg |= (64 - p2m_ipa_bits) << TTBCR_T0SZ_SHIFT;
+			reg |= (64 - p2m_gpa_bits) << TTBCR_T0SZ_SHIFT;
 
 			switch (smmu->s2_output_size) {
 			case 32:
-- 
2.53.0




* [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (7 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 15:34   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks Oleksii Kurochko
  2026-03-23 16:29 ` [PATCH v2 11/11] xen/riscv: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

p2m_gpa_bits is used by common/device-tree/domain-build.c, so when
CONFIG_DOMAIN_BUILD_HELPERS=y p2m_gpa_bits must be properly defined, as
it is used to find unused regions.

Introduce default_gstage_mode so that p2m_gpa_bits can be limited before
p2m_init() is called; limiting it there would be too late.

Limit p2m_gpa_bits in guest_mm_init(), as the default G-stage MMU mode
may cover fewer address bits than the IOMMU supports, in which case
p2m_gpa_bits must be restricted further so that the dom0less code uses
the correct number of GPA bits.
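
As a worked example of the arithmetic, the GPA width of a G-stage mode can
be computed as sketched below. The sketch assumes 4KB pages
(PAGE_SHIFT == 12), 9 bits translated per level (PAGETABLE_ORDER == 9) and
a zero-based root level as stored in paging_levels (2 for Sv39x4). The
helper names are hypothetical and not part of this patch.

```c
#include <stdint.h>

#define PAGE_SHIFT      12u /* assumed: 4KB pages */
#define PAGETABLE_ORDER 9u  /* assumed: 9 bits translated per level */
#define ROOT_EXTRA_BITS 2u  /* the "x4" widening of the root table */

/* GPA width for a mode whose zero-based root level is root_level. */
static unsigned int gstage_gpa_bits(unsigned int root_level)
{
    return PAGE_SHIFT + PAGETABLE_ORDER * (root_level + 1) + ROOT_EXTRA_BITS;
}

/* Keep the smaller of the current width and a new limit. */
static unsigned int restrict_gpa_bits(unsigned int current, unsigned int limit)
{
    return limit < current ? limit : current;
}

/* Highest guest physical address representable with gpa_bits bits. */
static uint64_t max_gpa(unsigned int gpa_bits)
{
    return (UINT64_C(1) << gpa_bits) - 1;
}
```

For Sv39x4 this gives 12 + 9 * 3 + 2 = 41 bits, so an initial value of 59
(Sv57x4) is reduced to 41 and the highest GPA becomes 0x1ffffffffff.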

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - New patch.
---
 xen/arch/riscv/include/asm/p2m.h | 10 ++++++++--
 xen/arch/riscv/p2m.c             | 34 ++++++++++++++++++++++++++++----
 2 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
index 54ea67990f06..76b30af8dacb 100644
--- a/xen/arch/riscv/include/asm/p2m.h
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -32,10 +32,13 @@
  */
 #define P2M_LEVEL_ORDER(lvl) XEN_PT_LEVEL_ORDER(lvl)
 
-#define P2M_ROOT_EXTRA_BITS(p2m, lvl) (2 * ((lvl) == P2M_ROOT_LEVEL(p2m)))
+#define P2M_ROOT_EXTRA_BITS 2
+
+#define P2M_LEVEL_EXTRA_BITS(p2m, lvl) \
+    (P2M_ROOT_EXTRA_BITS * ((lvl) == P2M_ROOT_LEVEL(p2m)))
 
 #define P2M_PAGETABLE_ENTRIES(p2m, lvl) \
-    (BIT(PAGETABLE_ORDER + P2M_ROOT_EXTRA_BITS(p2m, lvl), UL))
+    (BIT(PAGETABLE_ORDER + P2M_LEVEL_EXTRA_BITS(p2m, lvl), UL))
 
 #define P2M_TABLE_OFFSET(p2m, lvl) (P2M_PAGETABLE_ENTRIES(p2m, lvl) - 1UL)
 
@@ -44,6 +47,9 @@
 #define P2M_LEVEL_MASK(p2m, lvl) \
     (P2M_TABLE_OFFSET(p2m, lvl) << P2M_GFN_LEVEL_SHIFT(lvl))
 
+/* Holds the bit size of GPAs in p2m tables */
+extern unsigned int p2m_gpa_bits;
+
 #define paddr_bits PADDR_BITS
 
 /* Get host p2m table */
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index 11beaeead5ac..cd682d6586c7 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -51,6 +51,24 @@ static struct gstage_mode_desc __ro_after_init max_gstage_mode = {
     .name = "Bare",
 };
 
+static struct gstage_mode_desc __ro_after_init default_gstage_mode = {
+    .mode = HGATP_MODE_SV39X4,
+    .paging_levels = 2,
+    .name = "Sv39x4",
+};
+
+/*
+ * Set to the maximum configured support for GPA bits, so the number of GPA
+ * bits can be restricted by an external entity (e.g. IOMMU) and the
+ * restriction must happen before the call of guest_mm_init().
+ *
+ * The widest G-stage mode defined by the RISC-V specification is Sv57x4,
+ * which yields 59-bit GPAs: Sv57 maps 57-bit VAs onto 56-bit PAs (PADDR_BITS),
+ * and the G-stage "x4" extension widens the address space by a further 2 bits,
+ * hence PADDR_BITS + 1 + P2M_ROOT_EXTRA_BITS.
+ */
+unsigned int __ro_after_init p2m_gpa_bits = PADDR_BITS + P2M_ROOT_EXTRA_BITS + 1;
+
 static void p2m_free_page(struct p2m_domain *p2m, struct page_info *pg);
 
 static inline void p2m_free_metadata_page(struct p2m_domain *p2m,
@@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
 
 void __init guest_mm_init(void)
 {
+    unsigned int gpa_bits;
+    unsigned int paging_levels = default_gstage_mode.paging_levels;
+
     gstage_mode_detect();
 
+    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);
+
     vmid_init();
 
     /*
@@ -226,6 +249,11 @@ void __init guest_mm_init(void)
      * so it could be that we polluted local TLB so flush all guest TLB.
      */
     local_hfence_gvma_all();
+
+    gpa_bits = P2M_GFN_LEVEL_SHIFT(paging_levels + 1) + P2M_ROOT_EXTRA_BITS;
+
+    if ( gpa_bits < p2m_gpa_bits )
+        p2m_gpa_bits = gpa_bits;
 }
 
 /*
@@ -363,9 +391,7 @@ int p2m_init(struct domain *d)
 #endif
 
     /* TODO: don't hardcode used for a domain g-stage mode. */
-    p2m->mode.mode = HGATP_MODE_SV39X4;
-    p2m->mode.paging_levels = 2;
-    safe_strcpy(p2m->mode.name, "Sv39x4");
+    p2m->mode = default_gstage_mode;
 
     return 0;
 }
@@ -1304,7 +1330,7 @@ static mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 {
     unsigned int level = P2M_ROOT_LEVEL(p2m);
     unsigned int gfn_limit_bits =
-        P2M_LEVEL_ORDER(level + 1) + P2M_ROOT_EXTRA_BITS(p2m, level);
+        P2M_LEVEL_ORDER(level + 1) + P2M_LEVEL_EXTRA_BITS(p2m, level);
     pte_t entry, *table;
     int rc;
     mfn_t mfn = INVALID_MFN;
-- 
2.53.0
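
The gpa_bits arithmetic in guest_mm_init() above can be checked in
isolation. What follows is a minimal sketch, not Xen code: PAGE_SHIFT,
PAGETABLE_ORDER and the helper gstage_gpa_bits() are stand-ins assumed
here (4 KiB pages, 9 bits per level, paging_levels counted from zero as
in default_gstage_mode). The values 3 and 4 for Sv48x4 and Sv57x4 are
an assumption that follows the same convention.

```c
#include <assert.h>

/* Stand-ins for the Xen definitions; values assumed for illustration. */
#define PAGE_SHIFT          12  /* 4 KiB pages */
#define PAGETABLE_ORDER     9   /* 512 entries per non-root table */
#define P2M_ROOT_EXTRA_BITS 2   /* the "x4" widening of the root table */

/* Hypothetical helper mirroring the gpa_bits computation in the patch. */
unsigned int gstage_gpa_bits(unsigned int paging_levels)
{
    /* paging_levels is zero-based: 2 for Sv39x4, 3 for Sv48x4, 4 for Sv57x4. */
    unsigned int levels = paging_levels + 1;

    assert(levels >= 3 && levels <= 5);

    return PAGE_SHIFT + PAGETABLE_ORDER * levels + P2M_ROOT_EXTRA_BITS;
}
```

With these assumptions the default Sv39x4 mode gives 41 bits, and the
widest mode gives 59 bits, which is the same figure as
PADDR_BITS + P2M_ROOT_EXTRA_BITS + 1 with a 56-bit PADDR_BITS.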




* [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (8 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  2026-03-30 15:51   ` Jan Beulich
  2026-03-23 16:29 ` [PATCH v2 11/11] xen/riscv: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
  10 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini

The dom0less solution uses defined RAM banks as compile-time constants,
so introduce macros to describe guest RAM banks.

The reason for 2 banks is that there is typically always a use case for
low memory under 4 GB, but the bank under 4 GB ends up being small because
there are other things under 4 GB it can conflict with (interrupt
controller, PCI BARs, etc.). So a second bank is added above that MMIO
region (starting at 8 GiB) to provide the remaining RAM; the gap between
the two banks also exercises code paths handling discontiguous memory.
For Sv32 guests (34-bit GPA, 16 GiB addressable), bank0 provides 2 GB
(2–4 GB) and the first 8 GB of bank1 (8–16 GB) is accessible.

Extended regions are useful for RISC-V: they could be used to provide a
"space" for Linux to map grant mappings.

Despite the fact that for every guest MMU mode the GPA could be up
to 56 bits wide (except Sv32 whose GPA is 34 bits), the combined size
of both banks is limited to 1018 GB as it is more than enough for most
use cases.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - New patch.
---
 xen/include/public/arch-riscv.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/xen/include/public/arch-riscv.h b/xen/include/public/arch-riscv.h
index 360d8e6871ba..f14ff4c2d14e 100644
--- a/xen/include/public/arch-riscv.h
+++ b/xen/include/public/arch-riscv.h
@@ -50,6 +50,22 @@ typedef uint64_t xen_ulong_t;
 
 #if defined(__XEN__) || defined(__XEN_TOOLS__)
 
+#define GUEST_RAM_BANKS   2
+
+/*
+ * The way to find the extended regions (to be exposed to the guest as unused
+ * address space) relies on the fact that the regions reserved for the RAM
+ * below are big enough to also accommodate such regions.
+ */
+#define GUEST_RAM0_BASE   xen_mk_ullong(0x80000000) /* 2GB of low RAM @ 2GB */
+#define GUEST_RAM0_SIZE   xen_mk_ullong(0x80000000)
+
+#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 1016 GB of RAM @ 8GB */
+#define GUEST_RAM1_SIZE   xen_mk_ullong(0xFE00000000)
+
+#define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
+#define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
+
 struct vcpu_guest_context {
 };
 typedef struct vcpu_guest_context vcpu_guest_context_t;
-- 
2.53.0
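
The sizes quoted in the commit message can be verified from the macro
values alone. This is a small sketch under stated assumptions: the
constants are copied from the patch, while GiB(), check_bank_layout()
and bank_bytes_reachable() are hypothetical helpers introduced only
for this check.

```c
#include <assert.h>
#include <stdint.h>

#define GiB(x) ((uint64_t)(x) << 30)

/* Values copied from the GUEST_RAM* macros in the patch. */
#define GUEST_RAM0_BASE UINT64_C(0x80000000)
#define GUEST_RAM0_SIZE UINT64_C(0x80000000)
#define GUEST_RAM1_BASE UINT64_C(0x0200000000)
#define GUEST_RAM1_SIZE UINT64_C(0xFE00000000)

/* Hypothetical helper: bytes of [base, base + size) below 2^gpa_bits. */
uint64_t bank_bytes_reachable(uint64_t base, uint64_t size,
                              unsigned int gpa_bits)
{
    uint64_t limit = UINT64_C(1) << gpa_bits;

    if ( base >= limit )
        return 0;

    return (size < limit - base) ? size : limit - base;
}

/* Hypothetical self-check of the totals stated in the commit message. */
void check_bank_layout(void)
{
    /* 2 GiB + 1016 GiB = 1018 GiB in total. */
    assert(GUEST_RAM0_SIZE + GUEST_RAM1_SIZE == GiB(1018));
    /* Bank 1 ends exactly at the 1 TiB boundary. */
    assert(GUEST_RAM1_BASE + GUEST_RAM1_SIZE == GiB(1024));
}
```

With a 34-bit GPA, bank_bytes_reachable() reports 2 GiB for bank 0 and
8 GiB for bank 1, matching the Sv32 figures in the commit message.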




* [PATCH v2 11/11] xen/riscv: enable DOMAIN_BUILD_HELPERS
  2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
                   ` (9 preceding siblings ...)
  2026-03-23 16:29 ` [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks Oleksii Kurochko
@ 2026-03-23 16:29 ` Oleksii Kurochko
  10 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-23 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Everything is ready to enable DOMAIN_BUILD_HELPERS, which is necessary
for dom0less common code. So enable it.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Move introduction of p2m_ipa_bits to separate patch. Also:
   - rename it to p2m_gpa_bits to follow more arch-neutral
     naming.
   - use __ro_after_init for p2m_gpa_bits;
   - initialize p2m_gpa_bits in guest_mm_init and update if necessary
     in p2m_init().
 - Move to separate patch introduction of guest banks constants.
---
 xen/arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index a5e87c1757f7..41426c205292 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -1,5 +1,6 @@
 config RISCV
 	def_bool y
+	select DOMAIN_BUILD_HELPERS
 	select FUNCTION_ALIGNMENT_16B
 	select GENERIC_BUG_FRAME
 	select GENERIC_UART_INIT
-- 
2.53.0




* Re: [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn()
  2026-03-23 16:29 ` [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn() Oleksii Kurochko
@ 2026-03-26 13:50   ` Jan Beulich
  2026-03-30 13:40     ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-26 13:50 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -1534,3 +1534,32 @@ void p2m_handle_vmenter(void)
>       * won't be reused until need_flush is set to true.
>       */
>  }
> +
> +struct page_info *get_page_from_gfn(struct domain *d, unsigned long gfn,
> +                                    p2m_type_t *t, p2m_query_t q)
> +{
> +    struct page_info *page;
> +    p2m_type_t p2mt;
> +
> +    /* Special case for DOMID_XEN as it isn't "normal" domain */
> +    if ( likely(d != dom_xen) )
> +        return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t);

Comments usually apply to immediately following code. When that's not
the case (as it is here), the comment either wants moving or wording
accordingly.

> +    if ( !t )
> +        t = &p2mt;
> +
> +    *t = p2m_invalid;
> +
> +    /* DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the page */

As before - I don't think implying any kind of translation (even 1:1) is
correct for system domains.

> +    page = mfn_to_page(_mfn(gfn));

This, strictly speaking, is UB until ...

> +    if ( !mfn_valid(_mfn(gfn)) || !get_page(page, d) )

... the mfn_valid() check succeeded. Yes, Arm code has it like this, but
I can only repeat that you want to carefully inspect any code you copy.

> +        return NULL;
> +
> +    if ( page->u.inuse.type_info & PGT_writable_page )
> +        *t = p2m_ram_rw;
> +    else
> +        BUG_ON("unimplemented. p2m_ram_ro hasn't been introduced yet");
> +
> +    return page;
> +}

Finally, what doesn't become clear at all is why dom_xen needs special
casing. ISTR that when looking at the Arm code in the context of reviewing
v1, I spotted why Arm has this special case. Maybe I'm misremembering, as
now I can't spot it again / anymore. Yet whatever the reason there may not
apply at all to RISC-V.

Jan
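
The ordering point about mfn_to_page() and mfn_valid() can be shown
with a toy model. This is only a sketch: frame_table, toy_mfn_valid()
and toy_get_page() are hypothetical stand-ins, not the Xen functions,
and the real get_page() reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy frame table standing in for Xen's; all names are hypothetical. */
#define NR_FRAMES 4

struct page {
    int in_use;
};

struct page frame_table[NR_FRAMES];

int toy_mfn_valid(unsigned long mfn)
{
    return mfn < NR_FRAMES;
}

struct page *toy_get_page(unsigned long mfn)
{
    /*
     * Validate first: forming &frame_table[mfn] for an out-of-range
     * mfn is already undefined behaviour, even if never dereferenced.
     */
    if ( !toy_mfn_valid(mfn) )
        return NULL;

    return &frame_table[mfn];
}
```

Here toy_get_page(1) returns &frame_table[1], while an out-of-range
mfn returns NULL without any pointer having been computed from it.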



* Re: [PATCH v2 02/11] xen: return proper type for guest access functions
  2026-03-23 16:29 ` [PATCH v2 02/11] xen: return proper type for guest access functions Oleksii Kurochko
@ 2026-03-26 13:56   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-03-26 13:56 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, Timothy Pearson,
	Alistair Francis, Connor Davis, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> --- a/xen/include/xen/fdt-domain-build.h
> +++ b/xen/include/xen/fdt-domain-build.h
> @@ -44,10 +44,10 @@ static inline int get_allocation_size(paddr_t size)
>      return get_order_from_bytes(size + 1) - 1;
>  }
>  
> -typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
> -                                               paddr_t gpa,
> -                                               void *buf,
> -                                               unsigned int len);
> +typedef unsigned int (*copy_to_guest_phys_cb)(struct domain *d,
> +                                              paddr_t gpa,
> +                                              void *buf,
> +                                              unsigned int len);

When making this change, did you look at the use sites of this type? If
so, did it not occur to you that initrd-s can be pretty much arbitrarily
large, i.e. in particular be larger than 4Gb? IOW I think there's a
truncation bug to be fixed in Arm / DT code.

Jan
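
The truncation concern can be reproduced outside Xen. The sketch below
assumes a 32-bit unsigned int (true for the usual ABIs, and asserted in
the code); len_as_seen_by_callback() is a hypothetical stand-in for
passing a 64-bit size to a callback whose length parameter is
unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a callback taking an unsigned int length. */
unsigned int len_as_seen_by_callback(uint64_t size)
{
    unsigned int len;

    /* The sketch only makes sense where unsigned int is 32 bits wide. */
    assert(sizeof(unsigned int) == 4);

    /* The implicit conversion keeps only the low 32 bits of size. */
    len = size;

    return len;
}
```

An initrd of 4 GiB + 4 KiB would be seen as just 4 KiB, so only a
small part of it would be copied and no error would be reported.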



* Re: [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn()
  2026-03-26 13:50   ` Jan Beulich
@ 2026-03-30 13:40     ` Oleksii Kurochko
  2026-03-30 14:04       ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-30 13:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel



On 3/26/26 2:50 PM, Jan Beulich wrote:
> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/p2m.c
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -1534,3 +1534,32 @@ void p2m_handle_vmenter(void)
>>        * won't be reused until need_flush is set to true.
>>        */
>>   }
>> +
>> +struct page_info *get_page_from_gfn(struct domain *d, unsigned long gfn,
>> +                                    p2m_type_t *t, p2m_query_t q)
>> +{
>> +    struct page_info *page;
>> +    p2m_type_t p2mt;
>> +
>> +    /* Special case for DOMID_XEN as it isn't "normal" domain */
>> +    if ( likely(d != dom_xen) )
>> +        return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t);
> 
> Comments usually apply to immediately following code. When that's not
> the case (as it is here), the comment either wants moving or wording
> accordingly.

I will move it after the if() statement.

> 
>> +    if ( !t )
>> +        t = &p2mt;
>> +
>> +    *t = p2m_invalid;
>> +
>> +    /* DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the page */
> 
> As before - I don't think implying any kind of translation (even 1:1) is
> correct for system domains.

I will rephrase that to:
  "DOM_XEN has no stage-2 translation at all, so the gfn argument is 
treated directly as an mfn"

> 
>> +    page = mfn_to_page(_mfn(gfn));
> 
> This, strictly speaking, is UB until ...
> 
>> +    if ( !mfn_valid(_mfn(gfn)) || !get_page(page, d) )
> 
> ... the mfn_valid() check succeeded. Yes, Arm code has it like this, but
> I can only repeat that you want to carefully inspect any code you copy.
> 
>> +        return NULL;
>> +
>> +    if ( page->u.inuse.type_info & PGT_writable_page )
>> +        *t = p2m_ram_rw;
>> +    else
>> +        BUG_ON("unimplemented. p2m_ram_ro hasn't been introduced yet");
>> +
>> +    return page;
>> +}
> 
> Finally, what doesn't become clear at all is why dom_xen needs special
> casing. ISTR that when looking at the Arm code in the context of reviewing
> v1, I spotted why Arm has this special case. Maybe I'm misremembering, as
> now I can't spot it again / anymore. Yet whatever the reason there may not
> apply at all to RISC-V.

IIUC, Arm has this special case for DOMID_XEN because it is used to
share pages belonging to the hypervisor (for example, trace buffers).
Since trace buffers are part of common code, the same will be true
for RISC-V.

~ Oleksii




* Re: [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn()
  2026-03-30 13:40     ` Oleksii Kurochko
@ 2026-03-30 14:04       ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 14:04 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 30.03.2026 15:40, Oleksii Kurochko wrote:
> On 3/26/26 2:50 PM, Jan Beulich wrote:
>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>> +    if ( page->u.inuse.type_info & PGT_writable_page )
>>> +        *t = p2m_ram_rw;
>>> +    else
>>> +        BUG_ON("unimplemented. p2m_ram_ro hasn't been introduced yet");
>>> +
>>> +    return page;
>>> +}
>>
>> Finally, what doesn't become clear at all is why dom_xen needs special
>> casing. ISTR that when looking at the Arm code in the context of reviewing
>> v1, I spotted why Arm has this special case. Maybe I'm misremembering, as
>> now I can't spot it again / anymore. Yet whatever the reason there may not
>> apply at all to RISC-V.
> 
> IIUC, Arm has this special case for DOMID_XEN because it is used to
> share pages belonging to the hypervisor (for example, trace buffers).
> Since trace buffers are part of common code, the same will be true
> for RISC-V.

Ah yes, share_xen_page_with_privileged_guests() is what I didn't spot this
time round. But then you also need to implement the XENMAPSPACE_gmfn_foreign
case of xenmem_add_to_physmap_one() for this code to actually be reachable.
IOW either you make crystal clear (by way of commentary) why the case wants
dealing with, or both parts get introduced together (thus making their
connection obvious).

Jan



* Re: [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys()
  2026-03-23 16:29 ` [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys() Oleksii Kurochko
@ 2026-03-30 14:24   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 14:24 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> Introduce copy_to_guest_phys() for RISC-V, based on the Arm implementation.
> 
> Add a generic copy_guest() helper for copying to and from guest physical
> (and potentially virtual addresses in the future), and implement
> translate_get_page() to translate a guest physical address into a struct
> page_info via the domain p2m.
> 
> Compared to the Arm code:
> - Drop COPY_flush_dcache(), as no such use cases exist on RISC-V.
> - Do not implement the linear mapping case, which is currently unused.
> - Use PAGE_OFFSET() to initialize the local offset variable in copy_guest().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
preferably with ...

> +static unsigned int copy_guest(void *buf, uint64_t addr, unsigned int len,
> +                               copy_info_t info, unsigned int flags)
> +{
> +    unsigned int offset = PAGE_OFFSET(addr);
> +
> +    BUILD_BUG_ON((sizeof(addr)) < sizeof(vaddr_t));
> +    BUILD_BUG_ON((sizeof(addr)) < sizeof(paddr_t));
> +
> +    while ( len )
> +    {
> +        void *p;
> +        unsigned int size = min(len, (unsigned int)PAGE_SIZE - offset);
> +        struct page_info *page;
> +
> +        page = translate_get_page(info, addr, flags & COPY_gva,
> +                                  flags & COPY_to_guest);
> +        if ( page == NULL )

... this consistent ("!page") with ...

> +            return len;
> +
> +        p = __map_domain_page(page);
> +        p += offset;
> +        if ( flags & COPY_to_guest )
> +        {
> +            /*
> +             * buf will be NULL when the caller request to zero the
> +             * guest memory.
> +             */
> +            if ( buf )

... this.

> +                memcpy(p, buf, size);
> +            else
> +                memset(p, 0, size);
> +        }
> +        else
> +            memcpy(buf, p, size);
> +
> +        unmap_domain_page(p - offset);

It doesn't look like the subtracting of "offset" would be needed here. Any
pointer into the correct page will do.

Jan



* Re: [PATCH v2 05/11] xen/riscv: add kernel loading support
  2026-03-23 16:29 ` [PATCH v2 05/11] xen/riscv: add kernel loading support Oleksii Kurochko
@ 2026-03-30 14:47   ` Jan Beulich
       [not found]     ` <05b1bc67-bbed-412e-881e-a3fb2c2d873b@gmail.com>
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 14:47 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/Makefile
> +++ b/xen/arch/riscv/Makefile
> @@ -8,6 +8,7 @@ obj-y += guestcopy.o
>  obj-y += imsic.o
>  obj-y += intc.o
>  obj-y += irq.o
> +obj-y += kernel.o

kernel.init.o, like Arm has it?

> --- a/xen/arch/riscv/include/asm/config.h
> +++ b/xen/arch/riscv/include/asm/config.h
> @@ -151,6 +151,19 @@
>  extern unsigned long phys_offset; /* = load_start - XEN_VIRT_START */
>  #endif
>  
> +/*
> + * KERNEL_LOAD_ADDR_ALIGNMENT is defined based on paragraph of
> + * "Kernel location" of boot.rst:
> + * https://docs.kernel.org/arch/riscv/boot.html#kernel-location
> + */
> +#if defined(CONFIG_RISCV_32)
> +#define KERNEL_LOAD_ADDR_ALIGNMENT MB(4)
> +#elif defined(CONFIG_RISCV_64)
> +#define KERNEL_LOAD_ADDR_ALIGNMENT MB(2)
> +#else
> +#error "Define KERNEL_LOAD_ADDR_ALIGNMENT"
> +#endif

But that's Linux-specific. You want to be able to load other OS kernels,
I suppose? The needed alignment should be a property of the kernel image,
suitably conveyed to the loader.

Is Arm similarly capable of loading only Linux images? What about in
particular XTF?

> --- /dev/null
> +++ b/xen/arch/riscv/kernel.c
> @@ -0,0 +1,158 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/bug.h>
> +#include <xen/compiler.h>
> +#include <xen/errno.h>
> +#include <xen/fdt-kernel.h>
> +#include <xen/guest_access.h>
> +#include <xen/init.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/types.h>
> +#include <xen/vmap.h>
> +
> +#include <asm/setup.h>
> +
> +#define IMAGE64_MAGIC_V2 0x05435352 /* Magic number 2, le, "RSC\x05" */
> +
> +static void __init place_modules(struct kernel_info *info, paddr_t kernbase,
> +                                 paddr_t kernend)
> +{
> +    const struct boot_module *mod = info->bd.initrd;
> +
> +    const paddr_t initrd_len = ROUNDUP(mod ? mod->size : 0, MB(2));
> +    const paddr_t dtb_len = ROUNDUP(fdt_totalsize(info->fdt), MB(2));
> +    const paddr_t modsize = initrd_len + dtb_len;
> +
> +    const paddr_t ramsize = info->mem.bank[0].size;
> +    const paddr_t kernsize = ROUNDUP(kernend, MB(2)) - kernbase;
> +
> +    if ( modsize + kernsize > ramsize )
> +        panic("Not enough memory in the first bank for the kernel+dtb+initrd\n");
> +
> +    info->dtb_paddr = ROUNDUP(kernend, MB(2));
> +
> +    info->initrd_paddr = info->dtb_paddr + dtb_len;
> +}

Where are all of the MB(2) coming from in here? Do they mean to be
KERNEL_LOAD_ADDR_ALIGNMENT?

Also, how come all of this is limited to the first memory bank?

> +static paddr_t __init kernel_image_place(struct kernel_info *info)
> +{
> +    paddr_t load_addr;
> +
> +    /*
> +     * At the moment, RISC-V's Linux kernel should be always position
> +     * independent based on "Per-MMU execution" of boot.rst:
> +     *   https://docs.kernel.org/arch/riscv/boot.html#pre-mmu-execution
> +     *
> +     * But just for the case when RISC-V's Linux kernel isn't position
> +     * independent it is needed to take load address from
> +     * info->image.start.
> +     *
> +     * If `start` is zero, the Image is position independent. */
> +    if ( likely(!info->image.start) )
> +        /*
> +         * According to boot.rst kernel load address should be properly
> +         * aligned:
> +         *   https://docs.kernel.org/arch/riscv/boot.html#kernel-location
> +         */
> +        load_addr = ROUNDUP(info->mem.bank[0].start, KERNEL_LOAD_ADDR_ALIGNMENT);
> +    else
> +        load_addr = info->image.start;
> +
> +    return load_addr;
> +}

*info doesn't look to be altered here, so likely the parameter wants to
be pointer-to-const.

> +static void __init kernel_image_load(struct kernel_info *info)
> +{
> +    int rc;
> +    paddr_t load_addr = kernel_image_place(info);
> +    paddr_t paddr = info->image.kernel_addr;
> +    paddr_t len = info->image.len;
> +    void *kernel;
> +
> +    info->entry = load_addr;

What if this is outside of memory bank 0 (as is possible when
info->image.start is non-zero).

> +    place_modules(info, load_addr, load_addr + len);
> +
> +    printk("Loading Image from %"PRIpaddr" to %"PRIpaddr"-%"PRIpaddr"\n",
> +            paddr, load_addr, load_addr + len);
> +
> +    kernel = ioremap_wc(paddr, len);

ioremap_cache()?

> +/* Check if the image is a 64-bit Image */
> +static int __init kernel_image64_probe(struct kernel_info *info,
> +                                       paddr_t addr, paddr_t size)
> +{
> +    /* riscv/boot-image-header.rst */
> +    struct {
> +        u32 code0;		  /* Executable code */
> +        u32 code1;		  /* Executable code */
> +        u64 text_offset;  /* Image load offset, little endian */
> +        u64 image_size;	  /* Effective Image size, little endian */
> +        u64 flags;		  /* kernel flags, little endian */
> +        u32 version;	  /* Version of this header */
> +        u32 res1;		  /* Reserved */
> +        u64 res2;		  /* Reserved */
> +        u64 magic;        /* Deprecated: Magic number, little endian, "RISCV" */
> +        u32 magic2;       /* Magic number 2, little endian, "RSC\x05" */
> +        u32 res3;		  /* Reserved for PE COFF offset */

uint<N>_t throughout, please. And no use of hard tabs.

> +    } image;
> +    uint64_t start, end;
> +
> +    if ( size < sizeof(image) )
> +        return -EINVAL;
> +
> +    copy_from_paddr(&image, addr, sizeof(image));
> +
> +    /* Magic v1 is deprecated and may be removed.  Only use v2 */
> +    if ( image.magic2 != IMAGE64_MAGIC_V2 )
> +        return -EINVAL;

This doesn't look to be endian-ness-agnostic.

> +    /* Currently there is no length in the header, so just use the size */
> +    start = 0;
> +    end = size;

What's image_size then?

> +    /*
> +     * Given the above this check is a bit pointless, but leave it
> +     * here in case someone adds a length field in the future.
> +     */
> +    if ( (end - start) > size )
> +        return -EINVAL;
> +
> +    info->image.kernel_addr = addr;
> +    info->image.len = end - start;
> +    info->image.text_offset = image.text_offset;

This again doesn't look to be endian-ness-agnostic.

Jan
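
One way to make the magic check independent of host endianness is to
decode the header field byte by byte. This is a minimal sketch, not the
patch's code: read_le32() and has_image64_magic_v2() are hypothetical
helpers, and in-tree code would more likely rely on existing byte-order
conversion helpers instead.

```c
#include <assert.h>
#include <stdint.h>

#define IMAGE64_MAGIC_V2 UINT32_C(0x05435352) /* "RSC\x05", little endian */

/* Hypothetical helper: decode a little-endian 32-bit field byte by byte. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the four bytes at p hold the v2 magic, on any host. */
int has_image64_magic_v2(const uint8_t *p)
{
    assert(p);

    return read_le32(p) == IMAGE64_MAGIC_V2;
}
```

The same approach would apply to the 64-bit text_offset and image_size
fields, which the header also defines as little endian.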



* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-23 16:29 ` [PATCH v2 07/11] xen: move domain_use_host_layout() to common code Oleksii Kurochko
@ 2026-03-30 15:13   ` Jan Beulich
       [not found]     ` <57581b7d-cb9f-444c-9321-63b2fc3d09f0@gmail.com>
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 15:13 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> domain_use_host_layout() is not really architecture-specific, so move it
> from the Arm header to the common header xen/domain.h and provide a common
> implementation in xen/common/domain.c. domain_use_host_layout() potentially
> is needed for x86 [1].

No matter that this may indeed be true, ...

> Turn the macro into a function to avoid header dependency issues.

... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.

> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>  
>  #endif /* CONFIG_SYSTEM_SUSPEND */
>  
> +bool domain_use_host_layout(struct domain *d)
> +{
> +    return is_domain_direct_mapped(d) ||
> +           (paging_mode_translate(d) && is_hardware_domain(d));
> +}

The placement of paging_mode_translate() doesn't match ...

> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>  #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>  #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>  
> +/*
> + * Is the auto-translated domain using the host memory layout?
> + *
> + * domain_use_host_layout() is always False for PV guests.

... the description of the function.

Further, the first sentence above suggests the caller has to check
paging_mode_translate() before calling, which as per the implementation
clearly isn't the intention.

> + * Direct-mapped domains (autotranslated domains with memory allocated
> + * contiguously and mapped 1:1 so that GFN == MFN) are always using the
> + * host memory layout to avoid address clashes.
> + *
> + * The hardware domain will use the host layout (regardless of
> + * direct-mapped) because some OS may rely on a specific address ranges
> + * for the devices. PV Dom0, like any other PV guests, has
> + * domain_use_host_layout() returning False.
> + */
> +bool domain_use_host_layout(struct domain *d);

Like about any other predicate, its parameter wants to be pointer-to-
const. (And I really shouldn't need to repeat this to you every other
time.)

Jan
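
The two remarks about the predicate (parameter constness and where the
translation check sits) can be illustrated with a toy model. This is
only one possible reading of the review, sketched under assumptions:
struct toy_domain and toy_use_host_layout() are hypothetical and do not
reflect how Xen stores or queries these properties.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy domain; the fields are hypothetical stand-ins for Xen's predicates. */
struct toy_domain {
    bool direct_mapped;
    bool translated;
    bool hardware;
};

/* The predicate only inspects the domain, hence pointer-to-const. */
bool toy_use_host_layout(const struct toy_domain *d)
{
    assert(d);

    /* Translation is checked first, so a PV-like domain always yields false. */
    return d->translated && (d->direct_mapped || d->hardware);
}
```

With the check arranged this way the result matches a description that
says the predicate is always false for PV guests, and callers do not
need to test for translation themselves.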



* Re: [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits
  2026-03-23 16:29 ` [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits Oleksii Kurochko
@ 2026-03-30 15:16   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 15:16 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Rahul Singh,
	xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> The IPA terminology is Arm-specific, so rename p2m_ipa_bits to
> p2m_gpa_bits to use architecture-neutral naming.

This desire is limited to xen/common/device-tree/, which could do with
saying. I don't know whether Arm folks mind the renaming and the involved
churn. An alternative may be to have

#define p2m_gpa_bits p2m_ipa_bits

in a suitable Arm header.

> No functional changes.
> 
> Reported-by: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v2:
>  - New patch

Missing your S-o-b.

Jan



* Re: [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-03-23 16:29 ` [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits Oleksii Kurochko
@ 2026-03-30 15:34   ` Jan Beulich
  2026-03-31 16:02     ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 15:34 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> p2m_gpa_bits is used by common/device-tree/domain-build.c thereby when
> CONFIG_DOMAIN_BUILD_HELPERS=y it is necessary to have p2m_gpa_bits properly
> defined as it is going to be used to find unused regions.
> 
> Introduce default_gstage_mode to have ability to limit p2m_gpa_bits before
> p2m_init() is being called as it will be too late.

This is a somewhat strange way of describing things. Of course you want to
establish globals before doing any per-domain setup.

> Limit p2m_gpa_bits in guest_mm_init() as it could be that default G-stage
> MMU mode uses less VA wide bits than IOMMU,

How does a VA come into play here? And what is "less VA wide bits"?

> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -51,6 +51,24 @@ static struct gstage_mode_desc __ro_after_init max_gstage_mode = {
>      .name = "Bare",
>  };
>  
> +static struct gstage_mode_desc __ro_after_init default_gstage_mode = {
> +    .mode = HGATP_MODE_SV39X4,
> +    .paging_levels = 2,
> +    .name = "Sv39x4",
> +};
> +
> +/*
> + * Set to the maximum configured support for GPA bits, so the number of GPA
> + * bits can be restricted by an external entity (e.g. IOMMU) and the
> + * restriction must happen before the call of guest_mm_init().

DYM before p2m_init()? Because you do the calculation in the named
function.

> + * The widest G-stage mode defined by the RISC-V specification is Sv57x4,
> + * which yields 59-bit GPAs: Sv57 maps 57-bit VAs onto 56-bit PAs (PADDR_BITS),
> + * and the G-stage "x4" extension widens the address space by a further 2 bits,
> + * hence PADDR_BITS + 1 + P2M_ROOT_EXTRA_BITS.
> + */

I fear I don't follow. I can't explain the +1 at all. And adding in
P2M_ROOT_EXTRA_BITS seems wrong too: Whatever the width of output of
guest paging _is_ the width of input to stage-2 paging. There's no way
for a guest to encode 2 extra bits.

> @@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
>  
>  void __init guest_mm_init(void)
>  {
> +    unsigned int gpa_bits;
> +    unsigned int paging_levels = default_gstage_mode.paging_levels;

Deriving a global from a default, when ...

>      gstage_mode_detect();
>  
> +    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);

... the default isn't the maximum possible, isn't going to fly.

Jan



* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-03-23 16:29 ` [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks Oleksii Kurochko
@ 2026-03-30 15:51   ` Jan Beulich
  2026-03-31 16:14     ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-30 15:51 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 23.03.2026 17:29, Oleksii Kurochko wrote:
> The dom0less solution uses defined RAM banks as compile-time constants,
> so introduce macros to describe guest RAM banks.
> 
> The reason for 2 banks is that there is typically always a use case for
> low memory under 4 GB, but the bank under 4 GB ends up being small because
> there are other things under 4 GB it can conflict with (interrupt
> controller, PCI BARs, etc.).

Fixed layouts like the one you suggest come with (potentially severe)
downsides. For example, what if more than 2GB of MMIO space is needed
for non-64-bit BARs? Further, assuming that the space 4G...8G is what
you expect 64-bit BARs to be put into, what if there's a device with a
4G BAR? It'll eat up that entire space, requiring everything else to
fit in the 2G you reserve below 4G.

> So a second bank is added above that MMIO
> region (starting at 8 GiB) to provide the remaining RAM; the gap between
> the two banks also exercises code paths handling discontiguous memory.
> For Sv32 guests (34-bit GPA, 16 GiB addressable), bank0 provides 2 GB
> (2–4 GB) and the first 8 GB of bank1 (8–16 GB) is accessible.
> 
> Extended regions are useful for RISC-V: they could be used to provide a
> "space" for Linux to map grant mappings.
> 
> Despite the fact that for every guest MMU mode the GPA could be up
> to 56 bits wide (except Sv32 whose GPA is 34 bits), the combined size
> of both banks is limited to 1018 GB as it is more than enough for most
> use cases.

Okay, more memory can be made available by (later) adding an optional
3rd bank.
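
The figures in the quoted commit message can be sanity-checked with a small
standalone sketch. It is not part of the patch: the bank values are copied
from the quoted hunk, while GiB() and bank_bytes_below() are made-up helper
names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define GiB(n) ((uint64_t)(n) << 30)

/* Values copied from the quoted patch (plain literals, no xen_mk_ullong()). */
#define GUEST_RAM0_BASE 0x80000000ULL
#define GUEST_RAM0_SIZE 0x80000000ULL
#define GUEST_RAM1_BASE 0x0200000000ULL
#define GUEST_RAM1_SIZE 0xFE00000000ULL

/*
 * Hypothetical helper: number of bytes of [base, base + size) that lie
 * below the 2^gpa_bits limit. Assumes gpa_bits < 64.
 */
static uint64_t bank_bytes_below(uint64_t base, uint64_t size,
                                 unsigned int gpa_bits)
{
    uint64_t limit = 1ULL << gpa_bits;
    uint64_t end = base + size;

    if ( base >= limit )
        return 0;

    return (end < limit ? end : limit) - base;
}
```

With a 34-bit GPA limit the helper reports 2 GiB usable in bank 0 and 8 GiB
in bank 1, and the two banks together end exactly at 1 TiB.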

> --- a/xen/include/public/arch-riscv.h
> +++ b/xen/include/public/arch-riscv.h
> @@ -50,6 +50,22 @@ typedef uint64_t xen_ulong_t;
>  
>  #if defined(__XEN__) || defined(__XEN_TOOLS__)
>  
> +#define GUEST_RAM_BANKS   2
> +
> +/*
> + * The way to find the extended regions (to be exposed to the guest as unused
> + * address space) relies on the fact that the regions reserved for the RAM
> + * below are big enough to also accommodate such regions.
> + */
> +#define GUEST_RAM0_BASE   xen_mk_ullong(0x80000000) /* 2GB of low RAM @ 2GB */
> +#define GUEST_RAM0_SIZE   xen_mk_ullong(0x80000000)

Connecting this with my comment on the earlier patch regarding kernel, initrd,
and DTB fitting in bank 0: How's that going to work with a huge kernel and/or
initrd (I expect DTBs can't grow very large)?

> +#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 1016 GB of RAM @ 8GB */
> +#define GUEST_RAM1_SIZE   xen_mk_ullong(0xFE00000000)
> +
> +#define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
> +#define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }

Why's this needed in the public header?

Jan



* Re: [PATCH v2 05/11] xen/riscv: add kernel loading support
       [not found]     ` <05b1bc67-bbed-412e-881e-a3fb2c2d873b@gmail.com>
@ 2026-03-31 15:14       ` Jan Beulich
       [not found]         ` <a0efb7a6-4854-4fe5-bbf4-2561f25d7133@gmail.com>
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-31 15:14 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 31.03.2026 16:30, Oleksii Kurochko wrote:
> On 3/30/26 4:47 PM, Jan Beulich wrote:
>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/config.h
>>> +++ b/xen/arch/riscv/include/asm/config.h
>>> @@ -151,6 +151,19 @@
>>>   extern unsigned long phys_offset; /* = load_start - XEN_VIRT_START */
>>>   #endif
>>>   
>>> +/*
>>> + * KERNEL_LOAD_ADDR_ALIGNMENT is defined based on paragraph of
>>> + * "Kernel location" of boot.rst:
>>> + * https://docs.kernel.org/arch/riscv/boot.html#kernel-location
>>> + */
>>> +#if defined(CONFIG_RISCV_32)
>>> +#define KERNEL_LOAD_ADDR_ALIGNMENT MB(4)
>>> +#elif defined(CONFIG_RISCV_64)
>>> +#define KERNEL_LOAD_ADDR_ALIGNMENT MB(2)
>>> +#else
>>> +#error "Define KERNEL_LOAD_ADDR_ALIGNMENT"
>>> +#endif
>>
>> But that's Linux-specific. You want to be able to loader other OS kernels,
>> I suppose? The needed alignment should be a property of the kernel image,
>> suitably conveyed to the loader.
> 
> Then likely some updates will be needed...
> 
>>
>> Is Arm similarly capable of loading only Linux images? What about in
>> particular XTF?
> 
> ... they pretend to be a Linux kernel zImage:
> 
> https://gitlab.com/xen-project/fusa/xtf/-/commit/dec72d83291d6782b3f41df66987c8a25eac422f#line_9f6eadcd7_A42
> 
> and in the case of XTF:
>      /* Magic number used to identify this is an ARM Linux zImage */
>      .word   ZIMAGE_MAGIC_NUMBER
>      /* The address the zImage starts at (0 = relocatable) */
>      .word   0
>      /* The address the zImage ends at */
>      .word   (_end - _start)
> 
> zImage.start is set to 0 so KERNEL_LOAD_ADDR_ALIGNMENT won't be applied 
> and load address from DTS's kernel node will be taken.
> 
> Another example I have in mind is Zephyr OS, which also uses the Image
> protocol by enabling CONFIG_AARCH64_IMAGE_HEADER. So Xen can boot it too.

ISTR Andrew saying that he'd really like to be able to use plain ELF.
Anyway, if Linux Image is clearly stated as the only thing presently
supported, that's perhaps okay for the time being.
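
For reference, the effect of the quoted KERNEL_LOAD_ADDR_ALIGNMENT values
(2MB for RV64, 4MB for RV32) on a candidate load address can be sketched as
below. This is only an illustration: MB() and align_up() are stand-ins
written for this sketch, not the helpers the patch actually uses.

```c
#include <assert.h>
#include <stdint.h>

#define MB(n) ((uint64_t)(n) << 20)

/*
 * Round addr up to the next multiple of align.
 * Assumes align is a non-zero power of two.
 */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged; anything else
moves up to the next 2MB (or 4MB) boundary.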

>>> +static void __init kernel_image_load(struct kernel_info *info)
>>> +{
>>> +    int rc;
>>> +    paddr_t load_addr = kernel_image_place(info);
>>> +    paddr_t paddr = info->image.kernel_addr;
>>> +    paddr_t len = info->image.len;
>>> +    void *kernel;
>>> +
>>> +    info->entry = load_addr;
>>
>> What if this is outside of memory bank 0 (as is possible when
>> info->image.start is non-zero).
> 
> It will be an issue and panic() will occur in place_modules().

Will it? I just looked again, and I can't help thinking that it won't.

>>> +    /* Currently there is no length in the header, so just use the size */
>>> +    start = 0;
>>> +    end = size;
>>
>> What's image_size then?
> 
> The comment is incorrect: the length is present in the header, but it is the
> effective length, which isn't equal to the size of the binary and is actually
> bigger than the binary size.
> 
> So here we want to use 'size' as it is a size of binary itself.

What is "effective length"? That sounds a little like e.g. .bss extending
past the (file) image, yet such would need taking into account for allocation
(but not for reading in / copying over).

Jan



* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
       [not found]     ` <57581b7d-cb9f-444c-9321-63b2fc3d09f0@gmail.com>
@ 2026-03-31 15:53       ` Jan Beulich
  2026-03-31 16:32         ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-03-31 15:53 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel

On 31.03.2026 17:20, Oleksii Kurochko wrote:
> On 3/30/26 5:13 PM, Jan Beulich wrote:
>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>> domain_use_host_layout() is not really architecture-specific, so move it
>>> from the Arm header to the common header xen/domain.h and provide a common
>>> implementation in xen/common/domain.c. domain_use_host_layout() potentially
>>> is needed for x86 [1].
>>
>> No matter that this may indeed be true, ...
>>
>>> Turn the macro into a function to avoid header dependency issues.
>>
>> ... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.
> 
> Do we have some deviation tag for such cases when the code temporarily
> isn't used?

I'm sorry, but it'll take me about as long as you to find out. I wonder
about "temporary" though: Do you have a clear understanding as to when
that will change?

>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>   
>>>   #endif /* CONFIG_SYSTEM_SUSPEND */
>>>   
>>> +bool domain_use_host_layout(struct domain *d)
>>> +{
>>> +    return is_domain_direct_mapped(d) ||
>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>> +}
>>
>> The placement of paging_mode_translate() doesn't match ...
>>
>>> --- a/xen/include/xen/domain.h
>>> +++ b/xen/include/xen/domain.h
>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>   #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>   #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>   
>>> +/*
>>> + * Is the auto-translated domain using the host memory layout?
>>> + *
>>> + * domain_use_host_layout() is always False for PV guests.
>>
>> ... the description of the function.
> 
> But why the placement should be different?

If you focus on auto-translated, then imo paging_mode_translate()
better would guard everything.

> is_domain_direct_mapped() is false for PV guests (and for other guest 
> types on x86).
> 
> So domain_use_host_layout() fully depends on
> paging_mode_translate(d) && is_hardware_domain(d), where
> paging_mode_translate() is false for a PV guest.
> Thereby domain_use_host_layout() is false too.
> 
>>
>> Further, the first sentence above suggests the caller has to check
>> paging_mode_translate() before calling, which as per the implementation
>> clearly isn't the intention.
> 
> Sorry, I don't follow you here.

By starting the comment with "Is the auto-translated domain using", you
imply the caller checked for that aspect already. At least the way I
read it.

Jan



* Re: [PATCH v2 05/11] xen/riscv: add kernel loading support
       [not found]         ` <a0efb7a6-4854-4fe5-bbf4-2561f25d7133@gmail.com>
@ 2026-03-31 15:56           ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-03-31 15:56 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 31.03.2026 17:36, Oleksii Kurochko wrote:
> On 3/31/26 5:14 PM, Jan Beulich wrote:
>> On 31.03.2026 16:30, Oleksii Kurochko wrote:
>>> On 3/30/26 4:47 PM, Jan Beulich wrote:
>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>> +    /* Currently there is no length in the header, so just use the size */
>>>>> +    start = 0;
>>>>> +    end = size;
>>>>
>>>> What's image_size then?
>>>
>>> The comment is incorrect: the length is present in the header, but it is the
>>> effective length, which isn't equal to the size of the binary and is actually
>>> bigger than the binary size.
>>>
>>> So here we want to use 'size' as it is a size of binary itself.
>>
>> What is "effective length"? That sounds a little like e.g. .bss extending
>> past the (file) image, yet such would need taking into account for allocation
>> (but not for reading in / copying over).
> 
> Yes, correct.
> Effective length is how much memory the image needs when loaded and 
> running. So it includes .bss (and similar sections) that are not stored 
> in the file but need zeroed memory at runtime. So:
>   size = actual bytes in the binary file
>   image_size (from header) = total memory the kernel occupies at runtime 
>   (larger, includes BSS)
> 
> So I think that:
>      start = 0;
>      end = size;
> could be dropped altogether. Then:
>    info->image.len = size;
> 
> Then in kernel_image_load, pass load_addr + info->image.effective_size 
> to place_modules instead of load_addr + len.

Not really - for loading you need to know how much to load/copy. For
allocation/placement you need to know the overall size.
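
The distinction can be illustrated with a minimal standalone sketch. It
assumes a simplified image description; struct image_desc, placement_size()
and load_image() are hypothetical names, and zeroing the tail in the loader
is a choice made for this sketch rather than something the thread settles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the per-image bookkeeping. */
struct image_desc {
    uint64_t file_size;   /* bytes actually present in the binary */
    uint64_t image_size;  /* bytes occupied at run time (incl. .bss) */
};

/* Size to reserve when placing the kernel (and anything put after it). */
static uint64_t placement_size(const struct image_desc *img)
{
    return img->image_size > img->file_size ? img->image_size
                                            : img->file_size;
}

/*
 * Copy only the bytes present in the file, then clear the remainder of the
 * reserved area. dst must provide at least placement_size(img) bytes.
 */
static void load_image(void *dst, const void *src,
                       const struct image_desc *img)
{
    memcpy(dst, src, img->file_size);
    memset((char *)dst + img->file_size, 0,
           placement_size(img) - img->file_size);
}
```

Here the copy length comes from file_size alone, while anything placed after
the kernel has to start beyond placement_size().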

Jan



* Re: [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-03-30 15:34   ` Jan Beulich
@ 2026-03-31 16:02     ` Oleksii Kurochko
  2026-04-01  6:07       ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-31 16:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel



On 3/30/26 5:34 PM, Jan Beulich wrote:
> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>> p2m_gpa_bits is used by common/device-tree/domain-build.c thereby when
>> CONFIG_DOMAIN_BUILD_HELPERS=y it is necessary to have p2m_gpa_bits properly
>> defined as it is going to be used to find unused regions.
>>
>> Introduce default_gstage_mode to have ability to limit p2m_gpa_bits before
>> p2m_init() is being called as it will be too late.
> 
> This is a somewhat strange way of describing things. Of course you want to
> establish globals before doing any per-domain setup.

Then I will drop that sentence now and avoid similar ones in the future.

> 
>> Limit p2m_gpa_bits in guest_mm_init() as it could be that default G-stage
>> MMU mode uses less VA wide bits than IOMMU,
> 
> How does a VA come into play here?

It is what the spec uses, for example:
  Figure 108. Sv39x4 virtual address (guest physical address).

I can just use GPA.

> And what is "less VA wide bits"?

They could be configured to different modes: the IOMMU to, let's say, Sv39 and
the MMU to Sv48, so the IOMMU could work with 39-bit GPAs, but the MMU with
48-bit GPAs.


> 
>> --- a/xen/arch/riscv/p2m.c
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -51,6 +51,24 @@ static struct gstage_mode_desc __ro_after_init max_gstage_mode = {
>>       .name = "Bare",
>>   };
>>   
>> +static struct gstage_mode_desc __ro_after_init default_gstage_mode = {
>> +    .mode = HGATP_MODE_SV39X4,
>> +    .paging_levels = 2,
>> +    .name = "Sv39x4",
>> +};
>> +
>> +/*
>> + * Set to the maximum configured support for GPA bits, so the number of GPA
>> + * bits can be restricted by an external entity (e.g. IOMMU) and the
>> + * restriction must happen before the call of guest_mm_init().
> 
> DYM before p2m_init()? Because you do the calculation in the named
> function.

Yes, before p2m_init(). As with your note on the commit message, this part
could probably be dropped too.

> 
>> + * The widest G-stage mode defined by the RISC-V specification is Sv57x4,
>> + * which yields 59-bit GPAs: Sv57 maps 57-bit VAs onto 56-bit PAs (PADDR_BITS),
>> + * and the G-stage "x4" extension widens the address space by a further 2 bits,
>> + * hence PADDR_BITS + 1 + P2M_ROOT_EXTRA_BITS.
>> + */
> 
> I fear I don't follow. I can't explain the +1 at all.

Agree, +1 should be dropped. I think I mistakenly interpreted PADDR_BITS
as the highest possible bit set, so 55 instead of 56.

> And adding in
> P2M_ROOT_EXTRA_BITS seems wrong too: Whatever the width of output of
> guest paging _is_ the width of input to stage-2 paging. There's no way
> for a guest to encode 2 extra bits.

Agree, PADDR_BITS should be enough here to be used as initializer.
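
To make the arithmetic concrete, here is a small sketch of the widths under
discussion. It assumes PADDR_BITS is 56 as stated in the quoted comment;
gstage_input_bits() and usable_gpa_bits() are hypothetical names that do not
appear in the patch.

```c
#include <assert.h>

#define PADDR_BITS 56 /* as stated in the quoted comment */

/* G-stage input width of an SvNNx4 mode: NN bits plus 2 extra root bits. */
static unsigned int gstage_input_bits(unsigned int va_bits)
{
    return va_bits + 2;
}

/*
 * Hypothetical: width a guest can actually produce, i.e. the G-stage input
 * width capped by the width of the guest's own paging output.
 */
static unsigned int usable_gpa_bits(unsigned int va_bits)
{
    unsigned int bits = gstage_input_bits(va_bits);

    return bits < PADDR_BITS ? bits : PADDR_BITS;
}
```

Under these assumptions Sv39x4 gives 41 bits and Sv48x4 gives 50 bits, while
Sv57x4's 59-bit input is capped at 56.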


> 
>> @@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
>>   
>>   void __init guest_mm_init(void)
>>   {
>> +    unsigned int gpa_bits;
>> +    unsigned int paging_levels = default_gstage_mode.paging_levels;
> 
> Deriving a global from a default, when ...
> 
>>       gstage_mode_detect();
>>   
>> +    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);
> 
> ... the default isn't the maximum possible, isn't going to fly.

I didn't get you here.

If we want Xen to use Sv39 for the G-stage, we want to limit the guest's
56-bit GPA to a 39-bit GPA, rather than to the maximum G-stage mode
supported by the h/w.

~ Oleksii






* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-03-30 15:51   ` Jan Beulich
@ 2026-03-31 16:14     ` Oleksii Kurochko
  2026-04-01  6:17       ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-31 16:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 3/30/26 5:51 PM, Jan Beulich wrote:
> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>> The dom0less solution uses defined RAM banks as compile-time constants,
>> so introduce macros to describe guest RAM banks.
>>
>> The reason for 2 banks is that there is typically always a use case for
>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>> there are other things under 4 GB it can conflict with (interrupt
>> controller, PCI BARs, etc.).
> 
> Fixed layouts like the one you suggest come with (potentially severe)
> downsides. For example, what if more than 2GB of MMIO space is needed
> for non-64-bit BARs? 

That is where RAM usually starts on RISC-V boards, so I expect that the 2GB
before the start of RAM is enough for MMIO space.

To answer your question: it will be an issue, or it will also use some
space before the banks, no?

I don't know how to do that better at the moment.

> Further, assuming that the space 4G...8G is what
> you expect 64-bit BARs to be put into, what if there's a device with a
> 4G BAR? It'll eat up that entire space, requiring everything else to
> fit in the 2G you reserve below 4G.

I assume that such big devices could use high memory without any issue.

> 
>> So a second bank is added above that MMIO
>> region (starting at 8 GiB) to provide the remaining RAM; the gap between
>> the two banks also exercises code paths handling discontiguous memory.
>> For Sv32 guests (34-bit GPA, 16 GiB addressable), bank0 provides 2 GB
>> (2–4 GB) and the first 8 GB of bank1 (8–16 GB) is accessible.
>>
>> Extended regions are useful for RISC-V: they could be used to provide a
>> "space" for Linux to map grant mappings.
>>
>> Despite the fact that for every guest MMU mode the GPA could be up
>> to 56 bits wide (except Sv32 whose GPA is 34 bits), the combined size
>> of both banks is limited to 1018 GB as it is more than enough for most
>> use cases.
> 
> Okay, more memory can be made available by (later) adding an optional
> 3rd bank.
> 
>> --- a/xen/include/public/arch-riscv.h
>> +++ b/xen/include/public/arch-riscv.h
>> @@ -50,6 +50,22 @@ typedef uint64_t xen_ulong_t;
>>   
>>   #if defined(__XEN__) || defined(__XEN_TOOLS__)
>>   
>> +#define GUEST_RAM_BANKS   2
>> +
>> +/*
>> + * The way to find the extended regions (to be exposed to the guest as unused
>> + * address space) relies on the fact that the regions reserved for the RAM
>> + * below are big enough to also accommodate such regions.
>> + */
>> +#define GUEST_RAM0_BASE   xen_mk_ullong(0x80000000) /* 2GB of low RAM @ 2GB */
>> +#define GUEST_RAM0_SIZE   xen_mk_ullong(0x80000000)
> 
> Connecting this with my comment on the earlier patch regarding kernel, initrd,
> and DTB fitting in bank 0: How's that going to work with a huge kernel and/or
> initrd (I expect DTBs can't grow very large)?

The short answer is that it won't, but is an initrd usually that big?

The DTB is limited to 2MB, IIRC. So it isn't expected to grow too much...

As I mentioned in the reply to the earlier patch, I agree that we could
leave bank0 for the kernel and put everything else into bank1.

Moreover, I can try to put the kernel in bank1, as I don't see any place at
the moment where it would be a problem for the RISC-V Linux kernel to be in
high memory.


> 
>> +#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 1016 GB of RAM @ 8GB */
>> +#define GUEST_RAM1_SIZE   xen_mk_ullong(0xFE00000000)
>> +
>> +#define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>> +#define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
> 
> Why's this needed in the public header?

The xl toolstack could use them, so I expected that what the toolstack will
use should live in this header.

~ Oleksii



* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-31 15:53       ` Jan Beulich
@ 2026-03-31 16:32         ` Oleksii Kurochko
  2026-03-31 19:49           ` Oleksii Kurochko
  2026-04-01  5:58           ` Jan Beulich
  0 siblings, 2 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-31 16:32 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel



On 3/31/26 5:53 PM, Jan Beulich wrote:
> On 31.03.2026 17:20, Oleksii Kurochko wrote:
>> On 3/30/26 5:13 PM, Jan Beulich wrote:
>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>> domain_use_host_layout() is not really architecture-specific, so move it
>>>> from the Arm header to the common header xen/domain.h and provide a common
>>>> implementation in xen/common/domain.c. domain_use_host_layout() potentially
>>>> is needed for x86 [1].
>>>
>>> No matter that this may indeed be true, ...
>>>
>>>> Turn the macro into a function to avoid header dependency issues.
>>>
>>> ... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.
>>
>> Do we have some deviation tag for such cases when the code temporarily
>> isn't used?
> 
> I'm sorry, but it'll take me about as long as you to find out.

Sure, I will take a look. I just thought that maybe you have a solution 
already just in your head.

> I wonder
> about "temporary" though: Do you have a clear understanding as to when
> that will change?

No, I don't. As Stefano mentioned, they will need this function one day.
Another option is to use ifndef x86 or ifdef DOM0_LESS, and then when
someone really needs it on x86, this ifdef will be dropped. I don't
know if that is a better solution.

It seems the best solution is still to try to declare this function
as a macro.

> 
>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>    
>>>>    #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>    
>>>> +bool domain_use_host_layout(struct domain *d)
>>>> +{
>>>> +    return is_domain_direct_mapped(d) ||
>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>> +}
>>>
>>> The placement of paging_mode_translate() doesn't match ...
>>>
>>>> --- a/xen/include/xen/domain.h
>>>> +++ b/xen/include/xen/domain.h
>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>    #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>    #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>    
>>>> +/*
>>>> + * Is the auto-translated domain using the host memory layout?
>>>> + *
>>>> + * domain_use_host_layout() is always False for PV guests.
>>>
>>> ... the description of the function.
>>
>> But why the placement should be different?
> 
> If you focus on auto-translated, then imo paging_mode_translate()
> better would guard everything.

Then it makes sense to do it in the following way:
  bool domain_use_host_layout(struct domain *d)
  {
-    return is_domain_direct_mapped(d) ||
-           (paging_mode_translate(d) && is_hardware_domain(d));
+    return paging_mode_translate(d) &&
+           (is_domain_direct_mapped(d) || is_hardware_domain(d));
  }
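
For comparison, the two forms of the predicate can be checked exhaustively
with a standalone sketch. struct dom_props is a made-up flattening of the
three Xen predicates into plain booleans; the sketch says nothing about
which combinations can actually occur for a real domain.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the three properties being tested. */
struct dom_props {
    bool direct_mapped;   /* stands in for is_domain_direct_mapped(d) */
    bool translate;       /* stands in for paging_mode_translate(d) */
    bool hwdom;           /* stands in for is_hardware_domain(d) */
};

/* Form from the v2 patch. */
static bool use_host_layout_v2(const struct dom_props *p)
{
    return p->direct_mapped || (p->translate && p->hwdom);
}

/* Form with paging_mode_translate() guarding everything. */
static bool use_host_layout_guarded(const struct dom_props *p)
{
    return p->translate && (p->direct_mapped || p->hwdom);
}

/* Count the input combinations on which the two forms disagree. */
static unsigned int count_differences(void)
{
    unsigned int n = 0;

    for ( unsigned int i = 0; i < 8; i++ )
    {
        struct dom_props p = { i & 1, i & 2, i & 4 };

        if ( use_host_layout_v2(&p) != use_host_layout_guarded(&p) )
        {
            /* Only a direct-mapped, non-translated input gets here. */
            assert(p.direct_mapped && !p.translate);
            n++;
        }
    }

    return n;
}
```

The two forms disagree only when direct_mapped is set and translate is
clear, so the rewrite changes behaviour only if such a domain can exist.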

> 
>> is_domain_direct_mapped() is false for PV guests (and for other guest
>> types on x86).
>>
>> So domain_use_host_layout() fully depends on
>> paging_mode_translate(d) && is_hardware_domain(d), where
>> paging_mode_translate() is false for a PV guest.
>> Thereby domain_use_host_layout() is false too.
>>
>>>
>>> Further, the first sentence above suggests the caller has to check
>>> paging_mode_translate() before calling, which as per the implementation
>>> clearly isn't the intention.
>>
>> Sorry, I don't follow you here.
> 
> By starting the comment with "Is the auto-translated domain using", you
> imply the caller checked for that aspect already. At least the way I
> read it.

My understanding was that it is an explanation of what the function is checking.

~ Oleksii





* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-31 16:32         ` Oleksii Kurochko
@ 2026-03-31 19:49           ` Oleksii Kurochko
  2026-04-01  5:59             ` Jan Beulich
  2026-04-01  5:58           ` Jan Beulich
  1 sibling, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-03-31 19:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel



On 3/31/26 6:32 PM, Oleksii Kurochko wrote:
>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>    #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>> +{
>>>>> +    return is_domain_direct_mapped(d) ||
>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>> +}
>>>>
>>>> The placement of paging_mode_translate() doesn't match ...
>>>>
>>>>> --- a/xen/include/xen/domain.h
>>>>> +++ b/xen/include/xen/domain.h
>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>    #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>    #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>> +/*
>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>> + *
>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>
>>>> ... the description of the function.
>>>
>>> But why the placement should be different?
>>
>> If you focus on auto-translated, then imo paging_mode_translate()
>> better would guard everything.
> 
> Then it makes sense to do it in the following way:
>   bool domain_use_host_layout(struct domain *d)
>   {
> -    return is_domain_direct_mapped(d) ||
> -           (paging_mode_translate(d) && is_hardware_domain(d));
> +    return paging_mode_translate(d) &&
> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>   }

This is not really correct.

~ Oleksii



* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-31 16:32         ` Oleksii Kurochko
  2026-03-31 19:49           ` Oleksii Kurochko
@ 2026-04-01  5:58           ` Jan Beulich
  2026-04-01 14:38             ` Oleksii Kurochko
  1 sibling, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01  5:58 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel

On 31.03.2026 18:32, Oleksii Kurochko wrote:
> On 3/31/26 5:53 PM, Jan Beulich wrote:
>> On 31.03.2026 17:20, Oleksii Kurochko wrote:
>>> On 3/30/26 5:13 PM, Jan Beulich wrote:
>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>> domain_use_host_layout() is not really architecture-specific, so move it
>>>>> from the Arm header to the common header xen/domain.h and provide a common
>>>>> implementation in xen/common/domain.c. domain_use_host_layout() potentially
>>>>> is needed for x86 [1].
>>>>
>>>> No matter that this may indeed be true, ...
>>>>
>>>>> Turn the macro into a function to avoid header dependency issues.
>>>>
>>>> ... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.
>>>
>>> Do we have some deviation tag for such cases when the code temporarily
>>> isn't used?
>>
>> I'm sorry, but it'll take me about as long as you to find out.
> 
> Sure, I will take a look. I just thought that maybe you have a solution 
> already just in your head.

Well, I do: Don't make this an out-of-line function.

>   I wonder
>> about "temporary" though: Do you have a clear understanding as to when
>> that will change?
> 
> No, I don't. As Stefano mentioned they will need this function one day. 
> Another option we could use ifndef x86 or ifdef DOM0_LESS and then when 
> someone will really need it on x86, this ifdef will be dropped. I don't 
> know if it is better solution.
> 
> It seems like the best one solution will still make a try to make 
> declare this function as macro.

Or an inline function. There's nothing ...

>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>    
>>>>>    #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>>    
>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>> +{
>>>>> +    return is_domain_direct_mapped(d) ||
>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>> +}
>>>>
>>>> The placement of paging_mode_translate() doesn't match ...
>>>>
>>>>> --- a/xen/include/xen/domain.h
>>>>> +++ b/xen/include/xen/domain.h
>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>    #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>    #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>>    
>>>>> +/*
>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>> + *
>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>
>>>> ... the description of the function.
>>>
>>> But why the placement should be different?
>>
>> If you focus on auto-translated, then imo paging_mode_translate()
>> better would guard everything.
> 
> Then it makes sense to do it in the following way:
>   bool domain_use_host_layout(struct domain *d)
>   {
> -    return is_domain_direct_mapped(d) ||
> -           (paging_mode_translate(d) && is_hardware_domain(d));
> +    return paging_mode_translate(d) &&
> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>   }

... in here which clearly speaks against doing so. And yes, this is what I
was asking for (with the function parameter also suitably constified).

>>> So domain_use_host_layout() fully depends on
>>> paging_mode_translate(d) && is_hardware_domain(d), where
>>> paging_mode_translate() is false for a PV guest.
>>> Thereby domain_use_host_layout() is false too.
>>>
>>>>
>>>> Further, the first sentence above suggests the caller has to check
>>>> paging_mode_translate() before calling, which as per the implementation
>>>> clearly isn't the intention.
>>>
>>> Sorry, I don't follow you here.
>>
>> By starting the comment with "Is the auto-translated domain using", you
>> imply the caller checked for that aspect already. At least the way I
>> read it.
> 
> My understanding was that it is an explanation of what the function is checking.

For that you'd want to omit "auto-translated" from the first sentence, imo.

Jan



* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-03-31 19:49           ` Oleksii Kurochko
@ 2026-04-01  5:59             ` Jan Beulich
  2026-04-01 14:44               ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01  5:59 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel

On 31.03.2026 21:49, Oleksii Kurochko wrote:
> On 3/31/26 6:32 PM, Oleksii Kurochko wrote:
>>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>>    #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>>> +{
>>>>>> +    return is_domain_direct_mapped(d) ||
>>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>>> +}
>>>>>
>>>>> The placement of paging_mode_translate() doesn't match ...
>>>>>
>>>>>> --- a/xen/include/xen/domain.h
>>>>>> +++ b/xen/include/xen/domain.h
>>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>>    #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>>    #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>>> +/*
>>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>>> + *
>>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>>
>>>>> ... the description of the function.
>>>>
>>>> But why the placement should be different?
>>>
>>> If you focus on auto-translated, then imo paging_mode_translate()
>>> better would guard everything.
>>
>> Then it make sense to do in the following way:
>>   bool domain_use_host_layout(struct domain *d)
>>   {
>> -    return is_domain_direct_mapped(d) ||
>> -           (paging_mode_translate(d) && is_hardware_domain(d));
>> +    return paging_mode_translate(d) &&
>> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>>   }
> 
> This is not really correct.

... because of ... ? (After all, then the comment isn't correct either.)

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-03-31 16:02     ` Oleksii Kurochko
@ 2026-04-01  6:07       ` Jan Beulich
  2026-04-01 13:50         ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01  6:07 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 31.03.2026 18:02, Oleksii Kurochko wrote:
> On 3/30/26 5:34 PM, Jan Beulich wrote:
>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>> p2m_gpa_bits is used by common/device-tree/domain-build.c thereby when
>>> CONFIG_DOMAIN_BUILD_HELPERS=y it is necessary to have p2m_gpa_bits properly
>>> defined as it is going to be used to find unused regions.
>>>
>>> Introduce default_gstage_mode to have ability to limit p2m_gpa_bits before
>>> p2m_init() is being called as it will be too late.
>>
>> This is a somewhat strange way of describing things. Of course you want to
>> establish globals before doing any per-domain setup.
> 
> Then I will drop that sentence now and avoid similar in the future.
> 
>>> Limit p2m_gpa_bits in guest_mm_init() as it could be that default G-stage
>>> MMU mode uses less VA wide bits than IOMMU,
>>
>> How does a VA come into play here?
> 
> It is what spec uses, for example:
>   Figure 108. Sv39x4 virtual address (guest physical address).

Note the difference between what you quote and what your sentence said:
You used VA entirely unqualified. Yes, please ...

> I can just use GPA.

... use GPA whenever you mean one. Using VA for two distinct purposes
is simply confusing. Even the qualifying by the mode is only of limited
help imo, as the casual reader may not be fluent in those modes and
their acronyms.

>> And what is "less VA wide bits"?
> 
> They could be configured to different modes: IOMMU lets say Sv39 and MMU 
> - Sv48, so IOMMU could work with 39-bit GPA, but MMU - with 48-bit GPAs.

I guessed as much, but this wants wording differently. E.g. "... uses
fewer GPA bits than ...".
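
The "fewer GPA bits" wording can be illustrated with a minimal sketch. It
assumes, as in the example quoted above, that the CPU's G-stage MMU and the
IOMMU may each be configured for a different mode, and it takes the two
widths from that example purely as inputs. usable_gpa_bits() is a
hypothetical helper, not an existing Xen function.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of Xen): an address is only usable as a
 * GPA if both the CPU's G-stage translation and the IOMMU can represent
 * it, so the usable width is the smaller of the two.
 */
static unsigned int usable_gpa_bits(unsigned int mmu_gpa_bits,
                                    unsigned int iommu_gpa_bits)
{
    return mmu_gpa_bits < iommu_gpa_bits ? mmu_gpa_bits : iommu_gpa_bits;
}

/* Self-check using the widths mentioned in the discussion. */
void usable_gpa_bits_selfcheck(void)
{
    assert(usable_gpa_bits(48, 39) == 39);
}
```

Whichever side is narrower determines the limit, regardless of argument
order.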

>>> @@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
>>>   
>>>   void __init guest_mm_init(void)
>>>   {
>>> +    unsigned int gpa_bits;
>>> +    unsigned int paging_levels = default_gstage_mode.paging_levels;
>>
>> Deriving a global from a default, when ...
>>
>>>       gstage_mode_detect();
>>>   
>>> +    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);
>>
>> ... the default isn't the maximum possible, isn't going to fly.
> 
> I didn't get you here.
> 
> If we want Xen uses Sv39 for G-stage, we want to limit guest's 56-bit 
> GPA to 39-bit GPA, but not the maximum supported by h/w mode for G-stage 
> mode.

I can only repeat what I thought I had got across already on an earlier
series of yours: What mode a guest is going to use is going to be a guest
property. The default mode therefore isn't the only mode that may be used
at runtime.

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-03-31 16:14     ` Oleksii Kurochko
@ 2026-04-01  6:17       ` Jan Beulich
  2026-04-01 13:57         ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01  6:17 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 31.03.2026 18:14, Oleksii Kurochko wrote:
> On 3/30/26 5:51 PM, Jan Beulich wrote:
>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>> so introduce macros to describe guest RAM banks.
>>>
>>> The reason for 2 banks is that there is typically always a use case for
>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>> there are other things under 4 GB it can conflict with (interrupt
>>> controller, PCI BARs, etc.).
>>
>> Fixed layouts like the one you suggest come with (potentially severe)
>> downsides. For example, what if more than 2Gb of MMIO space are needed
>> for non-64-bit BARs? 
> 
> It looks where usually RAM on RISC-V boards start, so I expect that 2gb 
> before RAM start is enough for MMIO space.

Likely in the common case. Board designers aren't constrained by this,
though (aiui). Whereas you set in stone a single, fixed layout.

Arm maintainers - since a similar fixed layout is used there iirc,
could you chime in here, please?

> Answering your question it will be an issue or it will also use some 
> space before banks, no?

I fear I don't understand what you're trying to tell me.

> Further, assuming that the space 4G...8G is what
>> you expect 64-bit BARs to be put into, what if there's a device with a
>> 4G BAR? It'll eat up that entire space, requiring everything else to
>> fit in the 2G you reserve below 4G.
> 
> I assume that such big devices could use high memory without any issue.

Well, I could go (almost) arbitrarily low with individual BAR size,
merely increasing the number of BARs accordingly. Assuming 2G BARs are
64-bit capable is likely fine. Maybe the same is true for 1G and 512M
ones as well. Yet at some size the assumption will break.

IMO RAM layout wants establishing dynamically based on the MMIO needs
of a guest.

>>> --- a/xen/include/public/arch-riscv.h
>>> +++ b/xen/include/public/arch-riscv.h
>>> @@ -50,6 +50,22 @@ typedef uint64_t xen_ulong_t;
>>>   
>>>   #if defined(__XEN__) || defined(__XEN_TOOLS__)
>>>   
>>> +#define GUEST_RAM_BANKS   2
>>> +
>>> +/*
>>> + * The way to find the extended regions (to be exposed to the guest as unused
>>> + * address space) relies on the fact that the regions reserved for the RAM
>>> + * below are big enough to also accommodate such regions.
>>> + */
>>> +#define GUEST_RAM0_BASE   xen_mk_ullong(0x80000000) /* 2GB of low RAM @ 2GB */
>>> +#define GUEST_RAM0_SIZE   xen_mk_ullong(0x80000000)
>>
>> Connecting this with my comment on the earlier patch regarding kernel, initrd,
>> and DTB fitting in bank 0: How's that going to work with a huge kernel and/or
>> initrd (I expect DTBs can't grow very large)?
> 
> The short answer it won't, but does initrd usually so big?

Not usually, but nothing keeps it from being of arbitrary size.

> DTB is limited to 2MB, IIRC. So it isn't expect to grow to much...
> 
> As I mentioned in the reply to earlier patch, I agree that we could 
> leave bank0 for kernel and all other put to bank1.

Kernels can also be arbitrarily large.

> Even more I can try to put kernel in ban1 as I don't see any place at 
> the moment where it will be a problem for RISC-V Linux kernel to be in 
> high memory.

Yes, the fewer restrictions from the beginning, the fewer worries later.

>>> +#define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 1016 GB of RAM @ 8GB */
>>> +#define GUEST_RAM1_SIZE   xen_mk_ullong(0xFE00000000)
>>> +
>>> +#define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>>> +#define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
>>
>> Why's this needed in the public header?
> 
> xl toolstack could use them so I expected what toolstack will use to 
> live in this header.

But these last two #define-s are merely convenience definitions. They
even prescribe a certain data layout in order to be usable. I don't
think anything like this should be put in the public headers.

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-04-01  6:07       ` Jan Beulich
@ 2026-04-01 13:50         ` Oleksii Kurochko
  2026-04-01 13:57           ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-01 13:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel



On 4/1/26 8:07 AM, Jan Beulich wrote:
>>>> @@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
>>>>    
>>>>    void __init guest_mm_init(void)
>>>>    {
>>>> +    unsigned int gpa_bits;
>>>> +    unsigned int paging_levels = default_gstage_mode.paging_levels;
>>> Deriving a global from a default, when ...
>>>
>>>>        gstage_mode_detect();
>>>>    
>>>> +    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);
>>> ... the default isn't the maximum possible, isn't going to fly.
>> I didn't get you here.
>>
>> If we want Xen uses Sv39 for G-stage, we want to limit guest's 56-bit
>> GPA to 39-bit GPA, but not the maximum supported by h/w mode for G-stage
>> mode.
> I can only repeat what I thought I had got across already on an earlier
> series of yours: What mode a guest is going to use is going to be a guest
> property. The default mode therefore isn't the only mode that may be used
> at runtime.

I remember that, but I don't really understand what is wrong with the
ASSERT() as it stands. It would be changed or dropped altogether once the
property you are talking about is introduced.

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits
  2026-04-01 13:50         ` Oleksii Kurochko
@ 2026-04-01 13:57           ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-04-01 13:57 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 01.04.2026 15:50, Oleksii Kurochko wrote:
> On 4/1/26 8:07 AM, Jan Beulich wrote:
>>>>> @@ -191,8 +209,13 @@ static void __init gstage_mode_detect(void)
>>>>>    
>>>>>    void __init guest_mm_init(void)
>>>>>    {
>>>>> +    unsigned int gpa_bits;
>>>>> +    unsigned int paging_levels = default_gstage_mode.paging_levels;
>>>> Deriving a global from a default, when ...

This earlier comment may have been placed a little unhelpfully. The global
talked about is p2m_gpa_bits. IOW ...

>>>>>        gstage_mode_detect();
>>>>>    
>>>>> +    ASSERT(default_gstage_mode.paging_levels <= max_gstage_mode.paging_levels);
>>>> ... the default isn't the maximum possible, isn't going to fly.
>>> I didn't get you here.
>>>
>>> If we want Xen uses Sv39 for G-stage, we want to limit guest's 56-bit
>>> GPA to 39-bit GPA, but not the maximum supported by h/w mode for G-stage
>>> mode.
>> I can only repeat what I thought I had got across already on an earlier
>> series of yours: What mode a guest is going to use is going to be a guest
>> property. The default mode therefore isn't the only mode that may be used
>> at runtime.
> 
> I remember that, but i don't really understand what is wrong now with 
> the ASSERT(). It should be changed or dropped at all when this property 
> you are talking about will be introduced.

... the comment wasn't about the assertion itself, but the mere existence
of default_gstage_mode. Imo you really don't want to redo this later, but
maintain per-guest settings per guest right away, even if for the time
being all guests may end up with identical settings.
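
A minimal sketch of what keeping the setting per guest could look like.
Everything here is an assumption for illustration: struct guest_gstage and
gstage_gpa_bits() are hypothetical names, not code from the series, and the
width formula reflects the "x4" G-stage modes, where the root page table is
four times as large and thus contributes 2 extra address bits.

```c
#include <assert.h>

/* Stand-in for a per-domain record; Xen's real structures differ. */
struct guest_gstage {
    unsigned int paging_levels;   /* 3 = Sv39x4, 4 = Sv48x4, 5 = Sv57x4 */
};

/*
 * GPA width of an "x4" G-stage mode: 12 page-offset bits, 9 bits per
 * paging level, plus 2 bits for the enlarged root table.
 */
static unsigned int gstage_gpa_bits(const struct guest_gstage *g)
{
    return 12 + 9 * g->paging_levels + 2;
}

/* Two guests may carry different settings, or identical ones for now. */
void gstage_gpa_bits_selfcheck(void)
{
    struct guest_gstage sv39x4 = { .paging_levels = 3 };
    struct guest_gstage sv48x4 = { .paging_levels = 4 };

    assert(gstage_gpa_bits(&sv39x4) == 41);
    assert(gstage_gpa_bits(&sv48x4) == 50);
}
```

With the value stored per guest, a single global default is no longer the
only source for the address width.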

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-01  6:17       ` Jan Beulich
@ 2026-04-01 13:57         ` Oleksii Kurochko
  2026-04-01 14:22           ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-01 13:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 4/1/26 8:17 AM, Jan Beulich wrote:
> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>> so introduce macros to describe guest RAM banks.
>>>>
>>>> The reason for 2 banks is that there is typically always a use case for
>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>> there are other things under 4 GB it can conflict with (interrupt
>>>> controller, PCI BARs, etc.).
>>> Fixed layouts like the one you suggest come with (potentially severe)
>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>> for non-64-bit BARs?
>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>> before RAM start is enough for MMIO space.
> Likely in the common case. Board designers aren't constrained by this,
> though (aiui). Whereas you set in stone a single, fixed layout.
> 
> Arm maintainers - since a similar fixed layout is used there iirc,
> could you chime in here, please?
> 
>> Answering your question it will be an issue or it will also use some
>> space before banks, no?
> I fear I don't understand what you're trying to tell me.

I meant that there is also some space between the banks, and it is pretty
big; it could be used for MMIO, including for non-64-bit BARs.

> 
>> Further, assuming that the space 4G...8G is what
>>> you expect 64-bit BARs to be put into, what if there's a device with a
>>> 4G BAR? It'll eat up that entire space, requiring everything else to
>>> fit in the 2G you reserve below 4G.
>> I assume that such big devices could use high memory without any issue.
> Well, I could go (almost) arbitrarily low with individual BAR size,
> merely increasing the number of BARs accordingly. Assuming 2G BARs are
> 64-bit capable is likely fine. Maybe the same is true for 1G and 512M
> ones as well. Yet a some size the assumption will break.
> 
> IMO RAM layout wants establishing dynamically based on the MMIO needs
> of a guest.

I have this in my TODO.

But the current implementation of dom0less requires RAM banks to be
defined at compile time.

Can we proceed with the currently suggested approach, followed by an update
of the dom0less code to work with a dynamically allocated RAM layout?

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-01 13:57         ` Oleksii Kurochko
@ 2026-04-01 14:22           ` Jan Beulich
  2026-04-01 14:53             ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01 14:22 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 01.04.2026 15:57, Oleksii Kurochko wrote:
> On 4/1/26 8:17 AM, Jan Beulich wrote:
>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>> so introduce macros to describe guest RAM banks.
>>>>>
>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>> controller, PCI BARs, etc.).
>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>> for non-64-bit BARs?
>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>> before RAM start is enough for MMIO space.
>> Likely in the common case. Board designers aren't constrained by this,
>> though (aiui). Whereas you set in stone a single, fixed layout.
>>
>> Arm maintainers - since a similar fixed layout is used there iirc,
>> could you chime in here, please?
>>
>>> Answering your question it will be an issue or it will also use some
>>> space before banks, no?
>> I fear I don't understand what you're trying to tell me.
> 
> I meant that there is also some space between banks and pretty big which 
> could be used for MMIO which could be used for non-64-bit BARs.

I don't follow: Bank 0 extends to 4G. There's no space above it, below
bank 1, which could be used for non-64-bit BARs.

>>> Further, assuming that the space 4G...8G is what
>>>> you expect 64-bit BARs to be put into, what if there's a device with a
>>>> 4G BAR? It'll eat up that entire space, requiring everything else to
>>>> fit in the 2G you reserve below 4G.
>>> I assume that such big devices could use high memory without any issue.
>> Well, I could go (almost) arbitrarily low with individual BAR size,
>> merely increasing the number of BARs accordingly. Assuming 2G BARs are
>> 64-bit capable is likely fine. Maybe the same is true for 1G and 512M
>> ones as well. Yet a some size the assumption will break.
>>
>> IMO RAM layout wants establishing dynamically based on the MMIO needs
>> of a guest.
> 
> I have this in my TODO.
> 
> But with the current implementation of dom0less it requires to have RAM 
> banks defined in compile time.

Oh well.

> Can we process with the current suggested way with the following update 
> of dom0less code to work with dynamically allocated RAM layout?

If you want me to ack such, the limitations will need clearly calling out
as such (and why it needs doing like this). Further the public interface
wants leaving as tidy as possible, as removing stuff from there is
usually not a straightforward thing to do. Ideally, no part of this would
be encoded into the public headers, if at all possible.

You also may recall that I have reservations towards this work targeting
dom0less alone. Yet that's likely okay(ish) as long as this is the mutual
understanding of interested parties (and again clearly expressed in
relevant places).

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-04-01  5:58           ` Jan Beulich
@ 2026-04-01 14:38             ` Oleksii Kurochko
  2026-04-01 14:42               ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-01 14:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel



On 4/1/26 7:58 AM, Jan Beulich wrote:
> On 31.03.2026 18:32, Oleksii Kurochko wrote:
>> On 3/31/26 5:53 PM, Jan Beulich wrote:
>>> On 31.03.2026 17:20, Oleksii Kurochko wrote:
>>>> On 3/30/26 5:13 PM, Jan Beulich wrote:
>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>> domain_use_host_layout() is not really architecture-specific, so move it
>>>>>> from the Arm header to the common header xen/domain.h and provide a common
>>>>>> implementation in xen/common/domain.c. domain_use_host_layout() potentially
>>>>>> is needed for x86 [1].
>>>>> No matter that this may indeed be true, ...
>>>>>
>>>>>> Turn the macro into a function to avoid header dependency issues.
>>>>> ... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.
>>>> Do we have some deviation tag for such cases when the code temporary
>>>> isn't used?
>>> I'm sorry, but it'll take me about as long as you to find out.
>> Sure, I will take a look. I just thought that maybe you have a solution
>> already just in your head.
> Well, I do: Don't make this an out-of-line function.
> 
>>    I wonder
>>> about "temporary" though: Do you have a clear understanding as to when
>>> that will change?
>> No, I don't. As Stefano mentioned they will need this function one day.
>> Another option we could use ifndef x86 or ifdef DOM0_LESS and then when
>> someone will really need it on x86, this ifdef will be dropped. I don't
>> know if it is better solution.
>>
>> It seems like the best one solution will still make a try to make
>> declare this function as macro.
> Or an inline function. There's nothing ...
> 
>>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>>     
>>>>>>     #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>>>     
>>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>>> +{
>>>>>> +    return is_domain_direct_mapped(d) ||
>>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>>> +}
>>>>> The placement of paging_mode_translate() doesn't match ...
>>>>>
>>>>>> --- a/xen/include/xen/domain.h
>>>>>> +++ b/xen/include/xen/domain.h
>>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>>     #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>>     #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>>>     
>>>>>> +/*
>>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>>> + *
>>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>> ... the description of the function.
>>>> But why the placement should be different?
>>> If you focus on auto-translated, then imo paging_mode_translate()
>>> better would guard everything.
>> Then it make sense to do in the following way:
>>    bool domain_use_host_layout(struct domain *d)
>>    {
>> -    return is_domain_direct_mapped(d) ||
>> -           (paging_mode_translate(d) && is_hardware_domain(d));
>> +    return paging_mode_translate(d) &&
>> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>>    }
> ... in here which clearly speaks against doing so. And yes, this is what I
> was asking for (with the function parameter also suitably constified).

I expect that with an inline function in xen/domain.h the compiler will
want paging_mode_translate() to be explicitly defined, so including
xen/paging.h will be needed; but possibly I am wrong that it will be
needed in the case of an inline function.

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-04-01 14:38             ` Oleksii Kurochko
@ 2026-04-01 14:42               ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2026-04-01 14:42 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel

On 01.04.2026 16:38, Oleksii Kurochko wrote:
> 
> 
> On 4/1/26 7:58 AM, Jan Beulich wrote:
>> On 31.03.2026 18:32, Oleksii Kurochko wrote:
>>> On 3/31/26 5:53 PM, Jan Beulich wrote:
>>>> On 31.03.2026 17:20, Oleksii Kurochko wrote:
>>>>> On 3/30/26 5:13 PM, Jan Beulich wrote:
>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>> domain_use_host_layout() is not really architecture-specific, so move it
>>>>>>> from the Arm header to the common header xen/domain.h and provide a common
>>>>>>> implementation in xen/common/domain.c. domain_use_host_layout() potentially
>>>>>>> is needed for x86 [1].
>>>>>> No matter that this may indeed be true, ...
>>>>>>
>>>>>>> Turn the macro into a function to avoid header dependency issues.
>>>>>> ... this introduces unreachable code on x86, i.e. a Misra rule 2.1 violation.
>>>>> Do we have some deviation tag for such cases when the code temporary
>>>>> isn't used?
>>>> I'm sorry, but it'll take me about as long as you to find out.
>>> Sure, I will take a look. I just thought that maybe you have a solution
>>> already just in your head.
>> Well, I do: Don't make this an out-of-line function.
>>
>>>    I wonder
>>>> about "temporary" though: Do you have a clear understanding as to when
>>>> that will change?
>>> No, I don't. As Stefano mentioned they will need this function one day.
>>> Another option we could use ifndef x86 or ifdef DOM0_LESS and then when
>>> someone will really need it on x86, this ifdef will be dropped. I don't
>>> know if it is better solution.
>>>
>>> It seems like the best one solution will still make a try to make
>>> declare this function as macro.
>> Or an inline function. There's nothing ...
>>
>>>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>>>     
>>>>>>>     #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>>>>     
>>>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>>>> +{
>>>>>>> +    return is_domain_direct_mapped(d) ||
>>>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>>>> +}
>>>>>> The placement of paging_mode_translate() doesn't match ...
>>>>>>
>>>>>>> --- a/xen/include/xen/domain.h
>>>>>>> +++ b/xen/include/xen/domain.h
>>>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>>>     #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>>>     #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>>>>     
>>>>>>> +/*
>>>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>>>> + *
>>>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>>> ... the description of the function.
>>>>> But why the placement should be different?
>>>> If you focus on auto-translated, then imo paging_mode_translate()
>>>> better would guard everything.
>>> Then it make sense to do in the following way:
>>>    bool domain_use_host_layout(struct domain *d)
>>>    {
>>> -    return is_domain_direct_mapped(d) ||
>>> -           (paging_mode_translate(d) && is_hardware_domain(d));
>>> +    return paging_mode_translate(d) &&
>>> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>>>    }
>> ... in here which clearly speaks against doing so. And yes, this is what I
>> was asking for (with the function parameter also suitably constified).
> 
> I expect that with an inline function in xen/domain.h compiler will want 
> have paging_mode_translate() be explicitly defined, so an inclusion of 
> xen/paging.h will be needed, but likely I am wrong that it will be 
> needed it the case of inline function.

I don't think you're wrong with that. What I don't understand is why it
needs to be xen/domain.h where the function would be placed. If need be
you could make an entirely new header, where (presumably) you're not
going to have any include recursion concerns.
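
A minimal sketch of the inline form with a constified parameter, as it might
appear in such a separate header. The struct and its fields are stand-ins so
that the fragment compiles on its own; in Xen the three conditions would be
paging_mode_translate(), is_domain_direct_mapped() and is_hardware_domain().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for Xen's struct domain; the real definition differs. */
struct domain {
    bool translate;       /* models paging_mode_translate(d) */
    bool direct_mapped;   /* models is_domain_direct_mapped(d) */
    bool hardware;        /* models is_hardware_domain(d) */
};

/* Is the domain auto-translated and using the host memory layout? */
static inline bool domain_use_host_layout(const struct domain *d)
{
    return d->translate && (d->direct_mapped || d->hardware);
}

/* A non-translated (PV-like) domain never uses the host layout. */
void domain_use_host_layout_selfcheck(void)
{
    const struct domain pv_hwdom = { false, false, true };

    assert(!domain_use_host_layout(&pv_hwdom));
}
```

Being inline, the function produces no out-of-line code in builds that never
call it.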

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/11] xen: move domain_use_host_layout() to common code
  2026-04-01  5:59             ` Jan Beulich
@ 2026-04-01 14:44               ` Oleksii Kurochko
  0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-01 14:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	Anthony PERARD, Roger Pau Monné, xen-devel



On 4/1/26 7:59 AM, Jan Beulich wrote:
> On 31.03.2026 21:49, Oleksii Kurochko wrote:
>> On 3/31/26 6:32 PM, Oleksii Kurochko wrote:
>>>>>>> @@ -2544,6 +2544,12 @@ void thaw_domains(void)
>>>>>>>     #endif /* CONFIG_SYSTEM_SUSPEND */
>>>>>>> +bool domain_use_host_layout(struct domain *d)
>>>>>>> +{
>>>>>>> +    return is_domain_direct_mapped(d) ||
>>>>>>> +           (paging_mode_translate(d) && is_hardware_domain(d));
>>>>>>> +}
>>>>>>
>>>>>> The placement of paging_mode_translate() doesn't match ...
>>>>>>
>>>>>>> --- a/xen/include/xen/domain.h
>>>>>>> +++ b/xen/include/xen/domain.h
>>>>>>> @@ -62,6 +62,22 @@ void domid_free(domid_t domid);
>>>>>>>     #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
>>>>>>>     #define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
>>>>>>> +/*
>>>>>>> + * Is the auto-translated domain using the host memory layout?
>>>>>>> + *
>>>>>>> + * domain_use_host_layout() is always False for PV guests.
>>>>>>
>>>>>> ... the description of the function.
>>>>>
>>>>> But why the placement should be different?
>>>>
>>>> If you focus on auto-translated, then imo paging_mode_translate()
>>>> better would guard everything.
>>>
>>> Then it make sense to do in the following way:
>>>    bool domain_use_host_layout(struct domain *d)
>>>    {
>>> -    return is_domain_direct_mapped(d) ||
>>> -           (paging_mode_translate(d) && is_hardware_domain(d));
>>> +    return paging_mode_translate(d) &&
>>> +           (is_domain_direct_mapped(d) || is_hardware_domain(d));
>>>    }
>>
>> This is not really correct.
> 
> ... because of ... ? (After all, then the comment isn't correct either.)

I thought it could break what Arm had before when
paging_mode_translate() is false, but it is always true for Arm, so with
paging_mode_translate() being true, the new definition is equivalent to
what it had before. So it looks good.
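
The equivalence can be checked exhaustively with a small standalone sketch.
The function names are made up for the illustration, and the three
predicates are modelled as plain booleans rather than Xen's real macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: direct-mapped, or a translated hardware domain. */
static bool old_form(bool translate, bool direct_mapped, bool hardware)
{
    return direct_mapped || (translate && hardware);
}

/* Proposed form: translation guards everything. */
static bool new_form(bool translate, bool direct_mapped, bool hardware)
{
    return translate && (direct_mapped || hardware);
}

/* True if both forms agree for every input with the given translate value. */
static bool forms_agree(bool translate)
{
    for ( int dm = 0; dm <= 1; dm++ )
        for ( int hw = 0; hw <= 1; hw++ )
            if ( old_form(translate, dm, hw) != new_form(translate, dm, hw) )
                return false;

    return true;
}

void forms_agree_selfcheck(void)
{
    assert(forms_agree(true));    /* the Arm case: always translated */
    assert(!forms_agree(false));  /* differs for a direct-mapped domain */
}
```

The two forms only diverge for a non-translated, direct-mapped domain, where
the original returns true and the proposed one returns false.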

~ Oleksii


> 
> Jan



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-01 14:22           ` Jan Beulich
@ 2026-04-01 14:53             ` Oleksii Kurochko
  2026-04-01 15:10               ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-01 14:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 4/1/26 4:22 PM, Jan Beulich wrote:
> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>
>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>> controller, PCI BARs, etc.).
>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>> for non-64-bit BARs?
>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>> before RAM start is enough for MMIO space.
>>> Likely in the common case. Board designers aren't constrained by this,
>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>
>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>> could you chime in here, please?
>>>
>>>> Answering your question it will be an issue or it will also use some
>>>> space before banks, no?
>>> I fear I don't understand what you're trying to tell me.
>> I meant that there is also some space between banks and pretty big which
>> could be used for MMIO which could be used for non-64-bit BARs.
> I don't follow: Bank 0 extends to 4G. There's no space above it, below
> bank 1, which could be use for non-64-bit BARs.

So we have two banks:
bank[0] -> [0x80000000, 0x100000000)
bank[1] -> [0x0200000000, 0x10000000000)

So I think we have some space between them, [0x100000000, 0x0200000000),
i.e. 4 GB, to be used for non-64-bit BARs.

And we also have another 2 GB before bank[0].
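
The ranges listed above can be recomputed from the macros in the patch with
a short sketch. The base and size values are copied from the patch, with
xen_mk_ullong() replaced by a plain ULL suffix; the helper names are
hypothetical. The last helper encodes one extra assumption worth stating:
a non-64-bit BAR can only decode addresses below 4 GiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the patch; xen_mk_ullong() replaced by a ULL suffix. */
#define GUEST_RAM0_BASE 0x80000000ULL
#define GUEST_RAM0_SIZE 0x80000000ULL
#define GUEST_RAM1_BASE 0x0200000000ULL
#define GUEST_RAM1_SIZE 0xFE00000000ULL

#define GiB (1ULL << 30)

static uint64_t bank0_end(void) { return GUEST_RAM0_BASE + GUEST_RAM0_SIZE; }
static uint64_t bank1_end(void) { return GUEST_RAM1_BASE + GUEST_RAM1_SIZE; }

/* Size of the hole between the end of bank 0 and the start of bank 1. */
static uint64_t gap_size(void) { return GUEST_RAM1_BASE - bank0_end(); }

/* Does any part of that hole lie below 4 GiB? */
static bool gap_below_4g(void) { return bank0_end() < 4 * GiB; }

void bank_layout_selfcheck(void)
{
    assert(bank0_end() == 0x100000000ULL);    /* 4 GiB */
    assert(bank1_end() == 0x10000000000ULL);  /* 1 TiB */
    assert(gap_size() == 4 * GiB);
    assert(!gap_below_4g());
}
```

The hole between the banks is 4 GiB in size but starts exactly at the 4 GiB
boundary, so only the 2 GiB below bank[0] lies under 4 GiB.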

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-01 14:53             ` Oleksii Kurochko
@ 2026-04-01 15:10               ` Jan Beulich
  2026-04-06 15:43                 ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-01 15:10 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 01.04.2026 16:53, Oleksii Kurochko wrote:
> 
> 
> On 4/1/26 4:22 PM, Jan Beulich wrote:
>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>
>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>> controller, PCI BARs, etc.).
>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>> for non-64-bit BARs?
>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>> before RAM start is enough for MMIO space.
>>>> Likely in the common case. Board designers aren't constrained by this,
>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>
>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>> could you chime in here, please?
>>>>
>>>>> Answering your question it will be an issue or it will also use some
>>>>> space before banks, no?
>>>> I fear I don't understand what you're trying to tell me.
>>> I meant that there is also some space between banks and pretty big which
>>> could be used for MMIO which could be used for non-64-bit BARs.
>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>> bank 1, which could be use for non-64-bit BARs.
> 
> So we have two banks:
> bank[0] -> [0x80000000, 0x100000000)
> bank[1] -> [0x0200000000, 0x10000000000)
> 
> So i think we have some space between them [0x100000000, 0x0200000000) 
> -> 4gb to be used for non-64-bit BARs.

But a non-64-bit BAR needs to be assigned an address below 0x100000000?

> And also we have another 2gb before bank[0].

Yes, but I talked about that before.
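
As an illustration of the constraint on non-64-bit BARs: a 32-bit memory
BAR can only hold a 32-bit address, so the whole window has to sit below
0x100000000. A minimal sketch of such a check, with a hypothetical helper
name that is not part of Xen:

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR_4G UINT64_C(0x100000000)

/*
 * Hypothetical helper: true if [base, base + size) lies entirely below
 * 4 GiB and can therefore be programmed into a 32-bit memory BAR.
 */
static inline bool fits_32bit_bar(uint64_t base, uint64_t size)
{
    return size != 0 && base < ADDR_4G && size <= ADDR_4G - base;
}
```

Under this check any window placed in [0x100000000, 0x0200000000) is
rejected, whatever its size.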

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-01 15:10               ` Jan Beulich
@ 2026-04-06 15:43                 ` Oleksii Kurochko
  2026-04-07  6:23                   ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-06 15:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 4/1/26 5:10 PM, Jan Beulich wrote:
> On 01.04.2026 16:53, Oleksii Kurochko wrote:
>>
>>
>> On 4/1/26 4:22 PM, Jan Beulich wrote:
>>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>>
>>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>>> controller, PCI BARs, etc.).
>>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>>> for non-64-bit BARs?
>>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>>> before RAM start is enough for MMIO space.
>>>>> Likely in the common case. Board designers aren't constrained by this,
>>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>>
>>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>>> could you chime in here, please?
>>>>>
>>>>>> Answering your question it will be an issue or it will also use some
>>>>>> space before banks, no?
>>>>> I fear I don't understand what you're trying to tell me.
>>>> I meant that there is also some space between banks and pretty big which
>>>> could be used for MMIO which could be used for non-64-bit BARs.
>>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>>> bank 1, which could be use for non-64-bit BARs.
>>
>> So we have two banks:
>> bank[0] -> [0x80000000, 0x100000000)
>> bank[1] -> [0x0200000000, 0x10000000000)
>>
>> So i think we have some space between them [0x100000000, 0x0200000000)
>> -> 4gb to be used for non-64-bit BARs.
> 
> But a non-64-bit BAR need to be assigned an address below 0x100000000?

Right. I had in mind that RV32 guests use Sv32x4, which can translate a
34-bit GPA into a 34-bit MPA, and I automatically applied that to 32-bit
BARs...

I can keep the first 4 GB for MMIO and start bank[0] at 4 GB, as the 34-bit
MPA address space is more than enough to cover the 2 GB reserved for bank[0]
above 4 GB.
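
A rough sketch of the address-width argument: a 34-bit guest physical
address space spans 16 GiB, so a 2 GiB bank starting at 4 GiB fits, while
the same bank would not fit in a 32-bit space. The helper and macro names
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SV32X4_GPA_BITS 34u  /* Sv32x4 handles 34-bit guest physical addresses */

/*
 * Hypothetical helper: true if [base, base + size) fits into a guest
 * physical address space of gpa_bits bits. Assumes gpa_bits < 64.
 */
static inline bool gpa_range_fits(uint64_t base, uint64_t size,
                                  unsigned int gpa_bits)
{
    uint64_t limit = UINT64_C(1) << gpa_bits;

    return base < limit && size <= limit - base;
}
```

This only covers what the G-stage translation can address; it says
nothing about which addresses the guest itself can generate.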

~ Oleksii




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-06 15:43                 ` Oleksii Kurochko
@ 2026-04-07  6:23                   ` Jan Beulich
  2026-04-07  8:54                     ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-07  6:23 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 06.04.2026 17:43, Oleksii Kurochko wrote:
> 
> 
> On 4/1/26 5:10 PM, Jan Beulich wrote:
>> On 01.04.2026 16:53, Oleksii Kurochko wrote:
>>>
>>>
>>> On 4/1/26 4:22 PM, Jan Beulich wrote:
>>>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>>>
>>>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>>>> controller, PCI BARs, etc.).
>>>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>>>> for non-64-bit BARs?
>>>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>>>> before RAM start is enough for MMIO space.
>>>>>> Likely in the common case. Board designers aren't constrained by this,
>>>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>>>
>>>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>>>> could you chime in here, please?
>>>>>>
>>>>>>> Answering your question it will be an issue or it will also use some
>>>>>>> space before banks, no?
>>>>>> I fear I don't understand what you're trying to tell me.
>>>>> I meant that there is also some space between banks and pretty big which
>>>>> could be used for MMIO which could be used for non-64-bit BARs.
>>>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>>>> bank 1, which could be use for non-64-bit BARs.
>>>
>>> So we have two banks:
>>> bank[0] -> [0x80000000, 0x100000000)
>>> bank[1] -> [0x0200000000, 0x10000000000)
>>>
>>> So i think we have some space between them [0x100000000, 0x0200000000)
>>> -> 4gb to be used for non-64-bit BARs.
>>
>> But a non-64-bit BAR need to be assigned an address below 0x100000000?
> 
> Right, I had in mind that RV32 uses for guest Sv32x4 which could 
> translate 34-bit GPA into 34-bit MPA and automatically applied that to 
> 32-bit BAR...
> 
> I can keep first 4gb for MMIO purpose and start bank[0] at 4gb as 34 MPA 
> address space is more then enough to cover reserved 2gb of bank[0] after 
> 4gb.

Yet having no memory below 4G won't work for guests wanting to run in bare
mode? Don't guests even start up in bare mode (and hence 32-bit ones need
to have some of their memory below 4G in all cases)?
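
To illustrate the concern: with satp in Bare mode there is no first-stage
translation, so a 32-bit guest can only generate guest physical addresses
below 4 GiB. A minimal sketch, with a hypothetical helper name, of how
much of a bank such a guest could reach:

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes of the bank [base, end) that an
 * RV32 guest running with satp in Bare mode can address, i.e. the part
 * of the bank below 4 GiB. Assumes base <= end.
 */
static inline uint64_t rv32_bare_reachable(uint64_t base, uint64_t end)
{
    const uint64_t limit = UINT64_C(1) << 32;

    if ( base >= limit )
        return 0;

    return (end < limit ? end : limit) - base;
}
```

For a bank starting at 4 GiB this yields 0, which is the case being
questioned here.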

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-07  6:23                   ` Jan Beulich
@ 2026-04-07  8:54                     ` Oleksii Kurochko
  2026-04-07  9:09                       ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-07  8:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 4/7/26 8:23 AM, Jan Beulich wrote:
> On 06.04.2026 17:43, Oleksii Kurochko wrote:
>>
>>
>> On 4/1/26 5:10 PM, Jan Beulich wrote:
>>> On 01.04.2026 16:53, Oleksii Kurochko wrote:
>>>>
>>>>
>>>> On 4/1/26 4:22 PM, Jan Beulich wrote:
>>>>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>>>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>>>>
>>>>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>>>>> controller, PCI BARs, etc.).
>>>>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>>>>> for non-64-bit BARs?
>>>>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>>>>> before RAM start is enough for MMIO space.
>>>>>>> Likely in the common case. Board designers aren't constrained by this,
>>>>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>>>>
>>>>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>>>>> could you chime in here, please?
>>>>>>>
>>>>>>>> Answering your question it will be an issue or it will also use some
>>>>>>>> space before banks, no?
>>>>>>> I fear I don't understand what you're trying to tell me.
>>>>>> I meant that there is also some space between banks and pretty big which
>>>>>> could be used for MMIO which could be used for non-64-bit BARs.
>>>>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>>>>> bank 1, which could be use for non-64-bit BARs.
>>>>
>>>> So we have two banks:
>>>> bank[0] -> [0x80000000, 0x100000000)
>>>> bank[1] -> [0x0200000000, 0x10000000000)
>>>>
>>>> So i think we have some space between them [0x100000000, 0x0200000000)
>>>> -> 4gb to be used for non-64-bit BARs.
>>>
>>> But a non-64-bit BAR need to be assigned an address below 0x100000000?
>>
>> Right, I had in mind that RV32 uses for guest Sv32x4 which could
>> translate 34-bit GPA into 34-bit MPA and automatically applied that to
>> 32-bit BAR...
>>
>> I can keep first 4gb for MMIO purpose and start bank[0] at 4gb as 34 MPA
>> address space is more then enough to cover reserved 2gb of bank[0] after
>> 4gb.
> 
> Yet having no memory below 4G won't work for guests wanting to run in bare
> mode? Don't guests even start up in bare mode (and hence 32-bit ones need
> to have some of their memory below 4G in all cases)?

I thought about such a use case but decided that no one would want to run a
guest in bare mode, which is why we have:
     if ( max_gstage_mode.mode == HGATP_MODE_OFF )
         panic("Xen expects that G-stage won't be Bare mode\n");

That is probably a wrong assumption, and we really want to support Bare mode
for guests too. Let me know if I have to drop the panic above...

Then it isn't clear what the best layout would be, given the current
limitation that guest RAM must be a compile-time constant for the dom0less
solution.
It looks to me that reserving 2 GB for MMIO and 2 GB for guest RAM
is fair enough.
As an option, 3 GB for MMIO and 1 GB for guest RAM would be enough, as only
Bare mode will have such a small amount of RAM; for other modes part of
bank[1] could be used.
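
The two options can be compared with a small sketch. It assumes, as in
the layout discussed so far, that the MMIO window sits at the bottom of
the address space and that low RAM follows it up to 4 GiB; the struct and
function names are hypothetical.

```c
#include <stdint.h>

#define ADDR_4G UINT64_C(0x100000000)

/* Hypothetical description of how the first 4 GiB is divided. */
struct low_split {
    uint64_t mmio_size;  /* MMIO window: [0, mmio_size) */
    uint64_t ram_base;   /* low RAM starts right after the MMIO window */
    uint64_t ram_size;   /* low RAM ends at 4 GiB */
};

/* Derive the low RAM bank from the size reserved for MMIO. */
static inline struct low_split split_low_4g(uint64_t mmio_size)
{
    struct low_split s = { mmio_size, mmio_size, 0 };

    if ( mmio_size < ADDR_4G )
        s.ram_size = ADDR_4G - mmio_size;

    return s;
}
```

With 2 GiB reserved for MMIO, low RAM is [0x80000000, 0x100000000); with
3 GiB reserved it shrinks to [0xC0000000, 0x100000000).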

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-07  8:54                     ` Oleksii Kurochko
@ 2026-04-07  9:09                       ` Jan Beulich
  2026-04-07  9:19                         ` Oleksii Kurochko
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2026-04-07  9:09 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel

On 07.04.2026 10:54, Oleksii Kurochko wrote:
> 
> 
> On 4/7/26 8:23 AM, Jan Beulich wrote:
>> On 06.04.2026 17:43, Oleksii Kurochko wrote:
>>>
>>>
>>> On 4/1/26 5:10 PM, Jan Beulich wrote:
>>>> On 01.04.2026 16:53, Oleksii Kurochko wrote:
>>>>>
>>>>>
>>>>> On 4/1/26 4:22 PM, Jan Beulich wrote:
>>>>>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>>>>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>>>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>>>>>
>>>>>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>>>>>> controller, PCI BARs, etc.).
>>>>>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>>>>>> for non-64-bit BARs?
>>>>>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>>>>>> before RAM start is enough for MMIO space.
>>>>>>>> Likely in the common case. Board designers aren't constrained by this,
>>>>>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>>>>>
>>>>>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>>>>>> could you chime in here, please?
>>>>>>>>
>>>>>>>>> Answering your question it will be an issue or it will also use some
>>>>>>>>> space before banks, no?
>>>>>>>> I fear I don't understand what you're trying to tell me.
>>>>>>> I meant that there is also some space between banks and pretty big which
>>>>>>> could be used for MMIO which could be used for non-64-bit BARs.
>>>>>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>>>>>> bank 1, which could be use for non-64-bit BARs.
>>>>>
>>>>> So we have two banks:
>>>>> bank[0] -> [0x80000000, 0x100000000)
>>>>> bank[1] -> [0x0200000000, 0x10000000000)
>>>>>
>>>>> So i think we have some space between them [0x100000000, 0x0200000000)
>>>>> -> 4gb to be used for non-64-bit BARs.
>>>>
>>>> But a non-64-bit BAR need to be assigned an address below 0x100000000?
>>>
>>> Right, I had in mind that RV32 uses for guest Sv32x4 which could
>>> translate 34-bit GPA into 34-bit MPA and automatically applied that to
>>> 32-bit BAR...
>>>
>>> I can keep first 4gb for MMIO purpose and start bank[0] at 4gb as 34 MPA
>>> address space is more then enough to cover reserved 2gb of bank[0] after
>>> 4gb.
>>
>> Yet having no memory below 4G won't work for guests wanting to run in bare
>> mode? Don't guests even start up in bare mode (and hence 32-bit ones need
>> to have some of their memory below 4G in all cases)?
> 
> I thought about such use case but decided that no one will want to run 
> guest in bare mode and that is why we have:
>      if ( max_gstage_mode.mode == HGATP_MODE_OFF )
>          panic("Xen expects that G-stage won't be Bare mode\n");

How does HGATP matter here? We're talking of guest physical address space
layout, and hence it's SATP which matters.

> Probably it is wrong assumption and we really want to support Bare mode 
> for guest too. Let me know if I have to drop the panic above...
> 
> Then it isn't clear what will be the best layout for the current 
> limitation that guest RAM should be compile-time constant for dom0less 
> solution.
> It looks to me that giving 2gb reserved for MMIO and 2gb for guest RAM 
> is fair enough.
> As an option 3gb for MMIO and 1gb for guest RAM will be enough as only 
> Bare model will have such small amount of RAM, for other modes part of 
> bank[1] could be used.

All of which only supports my take that you don't want to make guest
memory layout an ABI property. Using compile-time determined banks for
now may be okay(ish), but in the longer run things will want determining
dynamically (or specifying via per-guest config).
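
Purely as an illustration of what a per-guest description could look
like, here is a sketch in which the banks are data that gets validated
rather than compile-time constants. None of these names exist in Xen;
the structure, the bank limit and the validation rules are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define GUEST_MAX_BANKS 4u  /* hypothetical limit */

/* Hypothetical per-guest RAM description. */
struct guest_ram_bank {
    uint64_t base;
    uint64_t size;
};

struct guest_layout {
    unsigned int nr_banks;
    struct guest_ram_bank bank[GUEST_MAX_BANKS];
};

/*
 * Accept a layout only if it has at least one bank and all banks are
 * non-empty, sorted by base address, non-overlapping and free of
 * address wrap-around.
 */
static inline bool guest_layout_valid(const struct guest_layout *l)
{
    uint64_t prev_end = 0;
    unsigned int i;

    if ( l->nr_banks == 0 || l->nr_banks > GUEST_MAX_BANKS )
        return false;

    for ( i = 0; i < l->nr_banks; i++ )
    {
        const struct guest_ram_bank *b = &l->bank[i];

        if ( b->size == 0 || b->base < prev_end ||
             b->size > UINT64_MAX - b->base )
            return false;

        prev_end = b->base + b->size;
    }

    return true;
}
```

Such a structure could be filled in from per-guest configuration or
computed at domain creation time; which of the two is preferable is left
open here.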

Jan


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks
  2026-04-07  9:09                       ` Jan Beulich
@ 2026-04-07  9:19                         ` Oleksii Kurochko
  0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2026-04-07  9:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel



On 4/7/26 11:09 AM, Jan Beulich wrote:
> On 07.04.2026 10:54, Oleksii Kurochko wrote:
>>
>>
>> On 4/7/26 8:23 AM, Jan Beulich wrote:
>>> On 06.04.2026 17:43, Oleksii Kurochko wrote:
>>>>
>>>>
>>>> On 4/1/26 5:10 PM, Jan Beulich wrote:
>>>>> On 01.04.2026 16:53, Oleksii Kurochko wrote:
>>>>>>
>>>>>>
>>>>>> On 4/1/26 4:22 PM, Jan Beulich wrote:
>>>>>>> On 01.04.2026 15:57, Oleksii Kurochko wrote:
>>>>>>>> On 4/1/26 8:17 AM, Jan Beulich wrote:
>>>>>>>>> On 31.03.2026 18:14, Oleksii Kurochko wrote:
>>>>>>>>>> On 3/30/26 5:51 PM, Jan Beulich wrote:
>>>>>>>>>>> On 23.03.2026 17:29, Oleksii Kurochko wrote:
>>>>>>>>>>>> The dom0less solution uses defined RAM banks as compile-time constants,
>>>>>>>>>>>> so introduce macros to describe guest RAM banks.
>>>>>>>>>>>>
>>>>>>>>>>>> The reason for 2 banks is that there is typically always a use case for
>>>>>>>>>>>> low memory under 4 GB, but the bank under 4 GB ends up being small because
>>>>>>>>>>>> there are other things under 4 GB it can conflict with (interrupt
>>>>>>>>>>>> controller, PCI BARs, etc.).
>>>>>>>>>>> Fixed layouts like the one you suggest come with (potentially severe)
>>>>>>>>>>> downsides. For example, what if more than 2Gb of MMIO space are needed
>>>>>>>>>>> for non-64-bit BARs?
>>>>>>>>>> It looks where usually RAM on RISC-V boards start, so I expect that 2gb
>>>>>>>>>> before RAM start is enough for MMIO space.
>>>>>>>>> Likely in the common case. Board designers aren't constrained by this,
>>>>>>>>> though (aiui). Whereas you set in stone a single, fixed layout.
>>>>>>>>>
>>>>>>>>> Arm maintainers - since a similar fixed layout is used there iirc,
>>>>>>>>> could you chime in here, please?
>>>>>>>>>
>>>>>>>>>> Answering your question it will be an issue or it will also use some
>>>>>>>>>> space before banks, no?
>>>>>>>>> I fear I don't understand what you're trying to tell me.
>>>>>>>> I meant that there is also some space between banks and pretty big which
>>>>>>>> could be used for MMIO which could be used for non-64-bit BARs.
>>>>>>> I don't follow: Bank 0 extends to 4G. There's no space above it, below
>>>>>>> bank 1, which could be use for non-64-bit BARs.
>>>>>>
>>>>>> So we have two banks:
>>>>>> bank[0] -> [0x80000000, 0x100000000)
>>>>>> bank[1] -> [0x0200000000, 0x10000000000)
>>>>>>
>>>>>> So i think we have some space between them [0x100000000, 0x0200000000)
>>>>>> -> 4gb to be used for non-64-bit BARs.
>>>>>
>>>>> But a non-64-bit BAR need to be assigned an address below 0x100000000?
>>>>
>>>> Right, I had in mind that RV32 uses for guest Sv32x4 which could
>>>> translate 34-bit GPA into 34-bit MPA and automatically applied that to
>>>> 32-bit BAR...
>>>>
>>>> I can keep first 4gb for MMIO purpose and start bank[0] at 4gb as 34 MPA
>>>> address space is more then enough to cover reserved 2gb of bank[0] after
>>>> 4gb.
>>>
>>> Yet having no memory below 4G won't work for guests wanting to run in bare
>>> mode? Don't guests even start up in bare mode (and hence 32-bit ones need
>>> to have some of their memory below 4G in all cases)?
>>
>> I thought about such use case but decided that no one will want to run
>> guest in bare mode and that is why we have:
>>       if ( max_gstage_mode.mode == HGATP_MODE_OFF )
>>           panic("Xen expects that G-stage won't be Bare mode\n");
> 
> How does HGATP matter here? We're talking of guest physical address space
> layout, and hence it's SATP which matters.

oh, right, it is hgatp.

> 
>> Probably it is wrong assumption and we really want to support Bare mode
>> for guest too. Let me know if I have to drop the panic above...
>>
>> Then it isn't clear what will be the best layout for the current
>> limitation that guest RAM should be compile-time constant for dom0less
>> solution.
>> It looks to me that giving 2gb reserved for MMIO and 2gb for guest RAM
>> is fair enough.
>> As an option 3gb for MMIO and 1gb for guest RAM will be enough as only
>> Bare model will have such small amount of RAM, for other modes part of
>> bank[1] could be used.
> 
> All of which only supports my take that you don't want to make guest
> memory layout an ABI property. Using compile-time determined banks for
> now may be okay(ish), but in the longer run things will want determining
> dynamically (or specifying via per-guest config).

I've already planned to move it to an arch-specific header instead of the
ABI header.

Thanks.

~ Oleksii


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2026-04-07  9:20 UTC | newest]

Thread overview: 47+ messages
2026-03-23 16:29 [PATCH v2 00/11] RISCV: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
2026-03-23 16:29 ` [PATCH v2 01/11] xen/riscv: implement get_page_from_gfn() Oleksii Kurochko
2026-03-26 13:50   ` Jan Beulich
2026-03-30 13:40     ` Oleksii Kurochko
2026-03-30 14:04       ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 02/11] xen: return proper type for guest access functions Oleksii Kurochko
2026-03-26 13:56   ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 03/11] xen/riscv: implement copy_to_guest_phys() Oleksii Kurochko
2026-03-30 14:24   ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 04/11] xen/dom0less: rename kernel_zimage_probe() to kernel_image_probe() Oleksii Kurochko
2026-03-23 16:29 ` [PATCH v2 05/11] xen/riscv: add kernel loading support Oleksii Kurochko
2026-03-30 14:47   ` Jan Beulich
     [not found]     ` <05b1bc67-bbed-412e-881e-a3fb2c2d873b@gmail.com>
2026-03-31 15:14       ` Jan Beulich
     [not found]         ` <a0efb7a6-4854-4fe5-bbf4-2561f25d7133@gmail.com>
2026-03-31 15:56           ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 06/11] xen: move declaration of fw_unreserved_regions() to common header Oleksii Kurochko
2026-03-23 16:29 ` [PATCH v2 07/11] xen: move domain_use_host_layout() to common code Oleksii Kurochko
2026-03-30 15:13   ` Jan Beulich
     [not found]     ` <57581b7d-cb9f-444c-9321-63b2fc3d09f0@gmail.com>
2026-03-31 15:53       ` Jan Beulich
2026-03-31 16:32         ` Oleksii Kurochko
2026-03-31 19:49           ` Oleksii Kurochko
2026-04-01  5:59             ` Jan Beulich
2026-04-01 14:44               ` Oleksii Kurochko
2026-04-01  5:58           ` Jan Beulich
2026-04-01 14:38             ` Oleksii Kurochko
2026-04-01 14:42               ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 08/11] xen: rename p2m_ipa_bits to p2m_gpa_bits Oleksii Kurochko
2026-03-30 15:16   ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 09/11] xen/riscv: introduce p2m_gpa_bits Oleksii Kurochko
2026-03-30 15:34   ` Jan Beulich
2026-03-31 16:02     ` Oleksii Kurochko
2026-04-01  6:07       ` Jan Beulich
2026-04-01 13:50         ` Oleksii Kurochko
2026-04-01 13:57           ` Jan Beulich
2026-03-23 16:29 ` [PATCH v2 10/11] xen/riscv: add definition of guest RAM banks Oleksii Kurochko
2026-03-30 15:51   ` Jan Beulich
2026-03-31 16:14     ` Oleksii Kurochko
2026-04-01  6:17       ` Jan Beulich
2026-04-01 13:57         ` Oleksii Kurochko
2026-04-01 14:22           ` Jan Beulich
2026-04-01 14:53             ` Oleksii Kurochko
2026-04-01 15:10               ` Jan Beulich
2026-04-06 15:43                 ` Oleksii Kurochko
2026-04-07  6:23                   ` Jan Beulich
2026-04-07  8:54                     ` Oleksii Kurochko
2026-04-07  9:09                       ` Jan Beulich
2026-04-07  9:19                         ` Oleksii Kurochko
2026-03-23 16:29 ` [PATCH v2 11/11] xen/riscv: enable DOMAIN_BUILD_HELPERS Oleksii Kurochko
