* [PATCH v3 1/8] xen/arm: drop declaration of handle_device_interrupts()
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 16:22 ` [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common Oleksii Kurochko
` (6 subsequent siblings)
7 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
There is no definition of handle_device_interrupts(), so its
declaration can be dropped.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
---
Change in v3:
- Update commit message
- Add Reviewed-by: Michal Orzel <michal.orzel@amd.com>.
---
xen/arch/arm/include/asm/domain_build.h | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index 17619c875d..378c10cc98 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -28,17 +28,6 @@ void evtchn_allocate(struct domain *d);
unsigned int get_allocation_size(paddr_t size);
-/*
- * handle_device_interrupts retrieves the interrupts configuration from
- * a device tree node and maps those interrupts to the target domain.
- *
- * Returns:
- * < 0 error
- * 0 success
- */
-int handle_device_interrupts(struct domain *d, struct dt_device_node *dev,
- bool need_mapping);
-
/*
* Helper to write an interrupts with the GIC format
* This code is assuming the irq is an PPI.
--
2.49.0
* [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
2025-05-02 16:22 ` [PATCH v3 1/8] xen/arm: drop declaration of handle_device_interrupts() Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 17:55 ` Stefano Stabellini
2025-05-02 16:22 ` [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code Oleksii Kurochko
` (5 subsequent siblings)
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Move some parts of Arm's Dom0Less code to be reused by other architectures.
At the moment, RISC-V is going to reuse these parts.
Move dom0less-build.h from the Arm-specific directory to asm-generic,
as this header is expected to be the same across architectures, with
some updates: add declarations of construct_domU() and
arch_create_domUs(), as there are some parts which are still
architecture-specific.
Introduce HAS_DOM0LESS to let an architecture enable the generic Dom0less
code.
Relocate the CONFIG_DOM0LESS_BOOT option to common code and add
"depends on HAS_DOM0LESS" so that builds are not broken for architectures
which don't support it. In particular, this avoids having to provide
stubs for construct_domU() and arch_create_domUs() when a *-randconfig
build sets CONFIG_DOM0LESS_BOOT=y.
Move is_dom0less_mode() function to the common code, as it depends on
boot modules that are already part of the common code.
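The check performed by is_dom0less_mode() can be modelled outside of Xen
roughly as below. This is only a sketch: `struct bootmodule_model`,
`enum bootmod_kind_model` and `is_dom0less_mode_model()` are hypothetical
stand-ins for the real boot module bookkeeping, and only the rule itself (no
dom0 kernel and at least one domU kernel) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's boot module bookkeeping. */
enum bootmod_kind_model { MOD_KERNEL, MOD_RAMDISK, MOD_OTHER };

struct bootmodule_model {
    enum bootmod_kind_model kind;
    bool domU;   /* true if the module belongs to a domU */
};

/* True only if no dom0 kernel is present and at least one domU kernel is. */
static bool is_dom0less_mode_model(const struct bootmodule_model *mods,
                                   size_t nr_mods)
{
    bool domU_found = false;
    size_t i;

    assert(mods != NULL || nr_mods == 0);

    for ( i = 0; i < nr_mods; i++ )
    {
        /* Only kernel modules matter for the decision. */
        if ( mods[i].kind != MOD_KERNEL )
            continue;

        if ( !mods[i].domU )
            return false;   /* a dom0 kernel rules out dom0less mode */

        domU_found = true;
    }

    return domU_found;
}
```

An empty module list, or one that contains a dom0 kernel, yields false in
this model; a list holding only domU kernels (plus any non-kernel modules)
yields true.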
Move create_domUs() function to the common code with some updates:
- Add arch_create_domUs() to cover parsing of arch-specific features,
for example, SVE (Scalable Vector Extension), which exists only on Arm.
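The split between common code and the arch hook could be sketched as
follows. This is a standalone model, not Xen code: `struct createdomain_cfg`,
`struct dt_node_model`, the `CFG_FLAG_*` values, `DEFAULT_NR_SPIS` and both
`*_model()` functions are simplified stand-ins invented for illustration.
Only the call order (common defaults and properties first, then the arch
hook, then domain creation) is taken from the patch. In this sketch the hook
ORs its flags in, so anything the common code set earlier is preserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for Xen's real types and flag values (hypothetical). */
#define CFG_FLAG_HVM        (1U << 0)
#define CFG_FLAG_HAP        (1U << 1)
#define CFG_FLAG_XS_DOMAIN  (1U << 2)

struct createdomain_cfg {
    uint32_t flags;
    uint32_t max_evtchn_port;
    uint32_t arch_nr_spis;   /* models an arch-only field such as arch.nr_spis */
};

/* Models a device tree node: only the properties this sketch looks at. */
struct dt_node_model {
    bool is_xenstore;
    bool has_nr_spis;
    uint32_t nr_spis;
};

#define DEFAULT_NR_SPIS 32U  /* hypothetical default, not VGIC_DEF_NR_SPIS */

/* Arch hook: fills in only what the common code cannot know about. */
static void arch_create_domUs_model(const struct dt_node_model *node,
                                    struct createdomain_cfg *cfg)
{
    assert(node != NULL && cfg != NULL);

    /* OR the flags in so that common-code settings are kept. */
    cfg->flags |= CFG_FLAG_HVM | CFG_FLAG_HAP;
    cfg->arch_nr_spis = node->has_nr_spis ? node->nr_spis : DEFAULT_NR_SPIS;
}

/* Common part: generic defaults and properties first, then the arch hook. */
static struct createdomain_cfg build_cfg_model(const struct dt_node_model *node)
{
    struct createdomain_cfg cfg = {0};

    cfg.max_evtchn_port = 1023;

    if ( node->is_xenstore )
    {
        cfg.flags |= CFG_FLAG_XS_DOMAIN;
        cfg.max_evtchn_port = UINT32_MAX;
    }

    arch_create_domUs_model(node, &cfg);

    return cfg;
}
```

An architecture without any extra properties would provide an (almost) empty
hook in this model, while the common part stays unchanged.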
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
It was suggested to change dom0less to predefined domus or similar
(https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
I decided to keep the dom0less name for now, as renaming would require changes in a lot of places,
including CI tests, and IMO it could be done in a separate patch.
If it is necessary to do it now, I'll be happy to do that in the next version of this
patch series.
---
Changes in v3:
- Move changes connected to the patch "xen/arm: dom0less delay xenstore initialization"
to common.
Also, some necessary parts for the mentioned patches were moved
to common (such as alloc_xenstore_evtchn(), ... ).
Not all changes are moved, changes connected to alloc_xenstore_params() and
construct_domU() will be moved in the following patches of this patch series.
- Move parsing of capabilities property to common code.
- Align parsing of "passthrough", "multiboot,device-tree" properties with staging.
- Drop arch_xen_domctl_createdomain().
- Add 'select HAS_DEVICE_TREE' for config HAS_DOM0LESS.
- Add empty lines after license in the top of newly added files.
- s/arch_create_domus/arch_create_domUs.
- Add footer below commit message regarding the naming of dom0less.
---
Changes in v2:
- Convert 'depends on Arm' to 'depends on HAS_DOM0LESS' for
CONFIG_DOM0LESS_BOOT.
- Change 'default Arm' to 'default y' for CONFIG_DOM0LESS_BOOT as there is
dependency on HAS_DOM0LESS.
- Introduce HAS_DOM0LESS and enable it for Arm.
- Update the commit message.
---
xen/arch/arm/Kconfig | 9 +-
xen/arch/arm/dom0less-build.c | 371 ++++------------------
xen/arch/arm/include/asm/Makefile | 1 +
xen/arch/arm/include/asm/dom0less-build.h | 34 --
xen/common/Kconfig | 13 +
xen/common/device-tree/Makefile | 1 +
xen/common/device-tree/dom0less-build.c | 283 +++++++++++++++++
xen/include/asm-generic/dom0less-build.h | 49 +++
8 files changed, 404 insertions(+), 357 deletions(-)
delete mode 100644 xen/arch/arm/include/asm/dom0less-build.h
create mode 100644 xen/common/device-tree/dom0less-build.c
create mode 100644 xen/include/asm-generic/dom0less-build.h
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index da8a406f5a..d0e0a7753c 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -15,6 +15,7 @@ config ARM
select GENERIC_UART_INIT
select HAS_ALTERNATIVE if HAS_VMAP
select HAS_DEVICE_TREE
+ select HAS_DOM0LESS
select HAS_STACK_PROTECTOR
select HAS_UBSAN
@@ -120,14 +121,6 @@ config GICV2
Driver for the ARM Generic Interrupt Controller v2.
If unsure, say Y
-config DOM0LESS_BOOT
- bool "Dom0less boot support" if EXPERT
- default y
- help
- Dom0less boot support enables Xen to create and start domU guests during
- Xen boot without the need of a control domain (Dom0), which could be
- present anyway.
-
config GICV3
bool "GICv3 driver"
depends on !NEW_VGIC
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index a356fc94fc..ef49495d4f 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -22,48 +22,7 @@
#include <asm/static-memory.h>
#include <asm/static-shmem.h>
-#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
-
-static domid_t __initdata xs_domid = DOMID_INVALID;
-static bool __initdata need_xenstore;
-
-void __init set_xs_domain(struct domain *d)
-{
- xs_domid = d->domain_id;
- set_global_virq_handler(d, VIRQ_DOM_EXC);
-}
-
-bool __init is_dom0less_mode(void)
-{
- struct bootmodules *mods = &bootinfo.modules;
- struct bootmodule *mod;
- unsigned int i;
- bool dom0found = false;
- bool domUfound = false;
-
- /* Look into the bootmodules */
- for ( i = 0 ; i < mods->nr_mods ; i++ )
- {
- mod = &mods->module[i];
- /* Find if dom0 and domU kernels are present */
- if ( mod->kind == BOOTMOD_KERNEL )
- {
- if ( mod->domU == false )
- {
- dom0found = true;
- break;
- }
- else
- domUfound = true;
- }
- }
-
- /*
- * If there is no dom0 kernel but at least one domU, then we are in
- * dom0less mode
- */
- return ( !dom0found && domUfound );
-}
+bool __initdata need_xenstore;
#ifdef CONFIG_VGICV2
static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
@@ -686,25 +645,6 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
return -EINVAL;
}
-static int __init alloc_xenstore_evtchn(struct domain *d)
-{
- evtchn_alloc_unbound_t alloc;
- int rc;
-
- alloc.dom = d->domain_id;
- alloc.remote_dom = xs_domid;
- rc = evtchn_alloc_unbound(&alloc, 0);
- if ( rc )
- {
- printk("Failed allocating event channel for domain\n");
- return rc;
- }
-
- d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
-
- return 0;
-}
-
#define XENSTORE_PFN_OFFSET 1
static int __init alloc_xenstore_page(struct domain *d)
{
@@ -771,36 +711,6 @@ static int __init alloc_xenstore_params(struct kernel_info *kinfo)
return rc;
}
-static void __init initialize_domU_xenstore(void)
-{
- struct domain *d;
-
- if ( xs_domid == DOMID_INVALID )
- return;
-
- for_each_domain( d )
- {
- uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
- int rc;
-
- if ( gfn == 0 )
- continue;
-
- if ( is_xenstore_domain(d) )
- continue;
-
- rc = alloc_xenstore_evtchn(d);
- if ( rc < 0 )
- panic("%pd: Failed to allocate xenstore_evtchn\n", d);
-
- if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
- {
- ASSERT(gfn < UINT32_MAX);
- gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
- }
- }
-}
-
static void __init domain_vcpu_affinity(struct domain *d,
const struct dt_device_node *node)
{
@@ -906,8 +816,8 @@ static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
}
#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
-static int __init construct_domU(struct domain *d,
- const struct dt_device_node *node)
+int __init construct_domU(struct domain *d,
+ const struct dt_device_node *node)
{
struct kernel_info kinfo = KERNEL_INFO_INIT;
const char *dom0less_enhanced;
@@ -1009,246 +919,77 @@ static int __init construct_domU(struct domain *d,
return alloc_xenstore_params(&kinfo);
}
-void __init create_domUs(void)
+void __init arch_create_domUs(struct dt_device_node *node,
+ struct xen_domctl_createdomain *d_cfg,
+ unsigned int flags)
{
- struct dt_device_node *node;
- const char *dom0less_iommu;
- bool iommu = false;
- const struct dt_device_node *cpupool_node,
- *chosen = dt_find_node_by_path("/chosen");
- const char *llc_colors_str = NULL;
-
- BUG_ON(chosen == NULL);
- dt_for_each_child_node(chosen, node)
- {
- struct domain *d;
- struct xen_domctl_createdomain d_cfg = {
- .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
- .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
- /*
- * The default of 1023 should be sufficient for guests because
- * on ARM we don't bind physical interrupts to event channels.
- * The only use of the evtchn port is inter-domain communications.
- * 1023 is also the default value used in libxl.
- */
- .max_evtchn_port = 1023,
- .max_grant_frames = -1,
- .max_maptrack_frames = -1,
- .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
- };
- unsigned int flags = 0U;
- bool has_dtb = false;
- uint32_t val;
- int rc;
-
- if ( !dt_device_is_compatible(node, "xen,domain") )
- continue;
-
- if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
- panic("No more domain IDs available\n");
+ uint32_t val;
- if ( dt_property_read_u32(node, "capabilities", &val) )
- {
- if ( val & ~DOMAIN_CAPS_MASK )
- panic("Invalid capabilities (%"PRIx32")\n", val);
-
- if ( val & DOMAIN_CAPS_CONTROL )
- flags |= CDF_privileged;
-
- if ( val & DOMAIN_CAPS_HARDWARE )
- {
- if ( hardware_domain )
- panic("Only 1 hardware domain can be specified! (%pd)\n",
- hardware_domain);
-
- d_cfg.max_grant_frames = gnttab_dom0_frames();
- d_cfg.max_evtchn_port = -1;
- flags |= CDF_hardware;
- iommu = true;
- }
-
- if ( val & DOMAIN_CAPS_XENSTORE )
- {
- if ( xs_domid != DOMID_INVALID )
- panic("Only 1 xenstore domain can be specified! (%u)\n",
- xs_domid);
+ d_cfg->arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
+ d_cfg->flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap;
- d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
- d_cfg.max_evtchn_port = -1;
- }
- }
-
- if ( dt_find_property(node, "xen,static-mem", NULL) )
- {
- if ( llc_coloring_enabled )
- panic("LLC coloring and static memory are incompatible\n");
-
- flags |= CDF_staticmem;
- }
-
- if ( dt_property_read_bool(node, "direct-map") )
- {
- if ( !(flags & CDF_staticmem) )
- panic("direct-map is not valid for domain %s without static allocation.\n",
- dt_node_name(node));
-
- flags |= CDF_directmap;
- }
-
- if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
- panic("Missing property 'cpus' for domain %s\n",
- dt_node_name(node));
-
- if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
- {
- if ( flags & CDF_hardware )
- panic("Don't specify passthrough for hardware domain\n");
-
- if ( !strcmp(dom0less_iommu, "enabled") )
- iommu = true;
- }
-
- if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
- !iommu_enabled )
- panic("non-direct mapped hardware domain requires iommu\n");
-
- if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
- {
- if ( flags & CDF_hardware )
- panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
-
- has_dtb = true;
- }
-
- if ( iommu_enabled && (iommu || has_dtb) )
- d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
-
- if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
- {
- int vpl011_virq = GUEST_VPL011_SPI;
-
- d_cfg.arch.nr_spis = VGIC_DEF_NR_SPIS;
-
- /*
- * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
- * set, in which case it'll match the hardware.
- *
- * Since the domain is not yet created, we can't use
- * d->arch.vpl011.irq. So the logic to find the vIRQ has to
- * be hardcoded.
- * The logic here shall be consistent with the one in
- * domain_vpl011_init().
- */
- if ( flags & CDF_directmap )
- {
- vpl011_virq = serial_irq(SERHND_DTUART);
- if ( vpl011_virq < 0 )
- panic("Error getting IRQ number for this serial port %d\n",
- SERHND_DTUART);
- }
+ if ( !dt_property_read_u32(node, "nr_spis", &d_cfg->arch.nr_spis) )
+ {
+ int vpl011_virq = GUEST_VPL011_SPI;
- /*
- * vpl011 uses one emulated SPI. If vpl011 is requested, make
- * sure that we allocate enough SPIs for it.
- */
- if ( dt_property_read_bool(node, "vpl011") )
- d_cfg.arch.nr_spis = MAX(d_cfg.arch.nr_spis,
- vpl011_virq - 32 + 1);
- }
- else if ( flags & CDF_hardware )
- panic("nr_spis cannot be specified for hardware domain\n");
+ d_cfg->arch.nr_spis = VGIC_DEF_NR_SPIS;
- /* Get the optional property domain-cpupool */
- cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
- if ( cpupool_node )
+ /*
+ * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
+ * set, in which case it'll match the hardware.
+ *
+ * Since the domain is not yet created, we can't use
+ * d->arch.vpl011.irq. So the logic to find the vIRQ has to
+ * be hardcoded.
+ * The logic here shall be consistent with the one in
+ * domain_vpl011_init().
+ */
+ if ( flags & CDF_directmap )
{
- int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
- if ( pool_id < 0 )
- panic("Error getting cpupool id from domain-cpupool (%d)\n",
- pool_id);
- d_cfg.cpupool_id = pool_id;
+ vpl011_virq = serial_irq(SERHND_DTUART);
+ if ( vpl011_virq < 0 )
+ panic("Error getting IRQ number for this serial port %d\n",
+ SERHND_DTUART);
}
- if ( dt_property_read_u32(node, "max_grant_version", &val) )
- d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
+ /*
+ * vpl011 uses one emulated SPI. If vpl011 is requested, make
+ * sure that we allocate enough SPIs for it.
+ */
+ if ( dt_property_read_bool(node, "vpl011") )
+ d_cfg->arch.nr_spis = MAX(d_cfg->arch.nr_spis,
+ vpl011_virq - 32 + 1);
+ }
+ else if ( flags & CDF_hardware )
+ panic("nr_spis cannot be specified for hardware domain\n");
- if ( dt_property_read_u32(node, "max_grant_frames", &val) )
- {
- if ( val > INT32_MAX )
- panic("max_grant_frames (%"PRIu32") overflow\n", val);
- d_cfg.max_grant_frames = val;
- }
+ if ( dt_get_property(node, "sve", &val) )
+ {
+#ifdef CONFIG_ARM64_SVE
+ unsigned int sve_vl_bits;
+ bool ret = false;
- if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
+ if ( !val )
{
- if ( val > INT32_MAX )
- panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
- d_cfg.max_maptrack_frames = val;
+ /* Property found with no value, means max HW VL supported */
+ ret = sve_domctl_vl_param(-1, &sve_vl_bits);
}
-
- if ( dt_get_property(node, "sve", &val) )
+ else
{
-#ifdef CONFIG_ARM64_SVE
- unsigned int sve_vl_bits;
- bool ret = false;
-
- if ( !val )
- {
- /* Property found with no value, means max HW VL supported */
- ret = sve_domctl_vl_param(-1, &sve_vl_bits);
- }
+ if ( dt_property_read_u32(node, "sve", &val) )
+ ret = sve_domctl_vl_param(val, &sve_vl_bits);
else
- {
- if ( dt_property_read_u32(node, "sve", &val) )
- ret = sve_domctl_vl_param(val, &sve_vl_bits);
- else
- panic("Error reading 'sve' property\n");
- }
+ panic("Error reading 'sve' property\n");
+ }
- if ( ret )
- d_cfg.arch.sve_vl = sve_encode_vl(sve_vl_bits);
- else
- panic("SVE vector length error\n");
+ if ( ret )
+ d_cfg->arch.sve_vl = sve_encode_vl(sve_vl_bits);
+ else
+ panic("SVE vector length error\n");
#else
- panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
+ panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
#endif
- }
-
- dt_property_read_string(node, "llc-colors", &llc_colors_str);
- if ( !llc_coloring_enabled && llc_colors_str )
- panic("'llc-colors' found, but LLC coloring is disabled\n");
-
- /*
- * The variable max_init_domid is initialized with zero, so here it's
- * very important to use the pre-increment operator to call
- * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
- */
- d = domain_create(++max_init_domid, &d_cfg, flags);
- if ( IS_ERR(d) )
- panic("Error creating domain %s (rc = %ld)\n",
- dt_node_name(node), PTR_ERR(d));
-
- if ( llc_coloring_enabled &&
- (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
- panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
- dt_node_name(node), rc);
-
- d->is_console = true;
- dt_device_set_used_by(node, d->domain_id);
-
- rc = construct_domU(d, node);
- if ( rc )
- panic("Could not set up domain %s (rc = %d)\n",
- dt_node_name(node), rc);
-
- if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
- set_xs_domain(d);
}
-
- if ( need_xenstore && xs_domid == DOMID_INVALID )
- panic("xenstore requested, but xenstore domain not present\n");
-
- initialize_domU_xenstore();
}
/*
diff --git a/xen/arch/arm/include/asm/Makefile b/xen/arch/arm/include/asm/Makefile
index 4a4036c951..831c914cce 100644
--- a/xen/arch/arm/include/asm/Makefile
+++ b/xen/arch/arm/include/asm/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
generic-y += altp2m.h
generic-y += device.h
+generic-y += dom0less-build.h
generic-y += hardirq.h
generic-y += iocap.h
generic-y += paging.h
diff --git a/xen/arch/arm/include/asm/dom0less-build.h b/xen/arch/arm/include/asm/dom0less-build.h
deleted file mode 100644
index b0e41a1954..0000000000
--- a/xen/arch/arm/include/asm/dom0less-build.h
+++ /dev/null
@@ -1,34 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-
-#ifndef __ASM_DOM0LESS_BUILD_H_
-#define __ASM_DOM0LESS_BUILD_H_
-
-#include <xen/stdbool.h>
-
-#ifdef CONFIG_DOM0LESS_BOOT
-
-void create_domUs(void);
-bool is_dom0less_mode(void);
-void set_xs_domain(struct domain *d);
-
-#else /* !CONFIG_DOM0LESS_BOOT */
-
-static inline void create_domUs(void) {}
-static inline bool is_dom0less_mode(void)
-{
- return false;
-}
-static inline void set_xs_domain(struct domain *d) {}
-
-#endif /* CONFIG_DOM0LESS_BOOT */
-
-#endif /* __ASM_DOM0LESS_BUILD_H_ */
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index be28060716..be38abf9e1 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -12,6 +12,15 @@ config CORE_PARKING
bool
depends on NR_CPUS > 1
+config DOM0LESS_BOOT
+ bool "Dom0less boot support" if EXPERT
+ depends on HAS_DOM0LESS
+ default y
+ help
+ Dom0less boot support enables Xen to create and start domU guests during
+ Xen boot without the need of a control domain (Dom0), which could be
+ present anyway.
+
config GRANT_TABLE
bool "Grant table support" if EXPERT
default y
@@ -74,6 +83,10 @@ config HAS_DEVICE_TREE
bool
select LIBFDT
+config HAS_DOM0LESS
+ bool
+ select HAS_DEVICE_TREE
+
config HAS_DIT # Data Independent Timing
bool
diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
index 7c549be38a..f3dafc9b81 100644
--- a/xen/common/device-tree/Makefile
+++ b/xen/common/device-tree/Makefile
@@ -1,5 +1,6 @@
obj-y += bootfdt.init.o
obj-y += bootinfo.init.o
obj-y += device-tree.o
+obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
obj-y += intc.o
diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
new file mode 100644
index 0000000000..a01a8b6b1a
--- /dev/null
+++ b/xen/common/device-tree/dom0less-build.c
@@ -0,0 +1,283 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/bootfdt.h>
+#include <xen/device_tree.h>
+#include <xen/domain.h>
+#include <xen/err.h>
+#include <xen/event.h>
+#include <xen/grant_table.h>
+#include <xen/init.h>
+#include <xen/iommu.h>
+#include <xen/llc-coloring.h>
+#include <xen/sched.h>
+#include <xen/stdbool.h>
+#include <xen/types.h>
+
+#include <public/bootfdt.h>
+#include <public/domctl.h>
+#include <public/event_channel.h>
+
+#include <asm/dom0less-build.h>
+#include <asm/setup.h>
+
+static domid_t __initdata xs_domid = DOMID_INVALID;
+
+void __init set_xs_domain(struct domain *d)
+{
+ xs_domid = d->domain_id;
+ set_global_virq_handler(d, VIRQ_DOM_EXC);
+}
+
+bool __init is_dom0less_mode(void)
+{
+ struct bootmodules *mods = &bootinfo.modules;
+ struct bootmodule *mod;
+ unsigned int i;
+ bool dom0found = false;
+ bool domUfound = false;
+
+ /* Look into the bootmodules */
+ for ( i = 0 ; i < mods->nr_mods ; i++ )
+ {
+ mod = &mods->module[i];
+ /* Find if dom0 and domU kernels are present */
+ if ( mod->kind == BOOTMOD_KERNEL )
+ {
+ if ( mod->domU == false )
+ {
+ dom0found = true;
+ break;
+ }
+ else
+ domUfound = true;
+ }
+ }
+
+ /*
+ * If there is no dom0 kernel but at least one domU, then we are in
+ * dom0less mode
+ */
+ return ( !dom0found && domUfound );
+}
+
+static int __init alloc_xenstore_evtchn(struct domain *d)
+{
+ evtchn_alloc_unbound_t alloc;
+ int rc;
+
+ alloc.dom = d->domain_id;
+ alloc.remote_dom = xs_domid;
+ rc = evtchn_alloc_unbound(&alloc, 0);
+ if ( rc )
+ {
+ printk("Failed allocating event channel for domain\n");
+ return rc;
+ }
+
+ d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
+
+ return 0;
+}
+
+static void __init initialize_domU_xenstore(void)
+{
+ struct domain *d;
+
+ if ( xs_domid == DOMID_INVALID )
+ return;
+
+ for_each_domain( d )
+ {
+ uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
+ int rc;
+
+ if ( gfn == 0 )
+ continue;
+
+ if ( is_xenstore_domain(d) )
+ continue;
+
+ rc = alloc_xenstore_evtchn(d);
+ if ( rc < 0 )
+ panic("%pd: Failed to allocate xenstore_evtchn\n", d);
+
+ if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
+ {
+ ASSERT(gfn < UINT32_MAX);
+ gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
+ }
+ }
+}
+
+void __init create_domUs(void)
+{
+ struct dt_device_node *node;
+ const char *dom0less_iommu;
+ bool iommu = false;
+ const struct dt_device_node *cpupool_node,
+ *chosen = dt_find_node_by_path("/chosen");
+ const char *llc_colors_str = NULL;
+
+ BUG_ON(chosen == NULL);
+ dt_for_each_child_node(chosen, node)
+ {
+ struct domain *d;
+ struct xen_domctl_createdomain d_cfg = {0};
+ unsigned int flags = 0U;
+ bool has_dtb = false;
+ uint32_t val;
+ int rc;
+
+ if ( !dt_device_is_compatible(node, "xen,domain") )
+ continue;
+
+ if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
+ panic("No more domain IDs available\n");
+
+ d_cfg.max_evtchn_port = 1023;
+ d_cfg.max_grant_frames = -1;
+ d_cfg.max_maptrack_frames = -1;
+ d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version);
+
+ if ( dt_property_read_u32(node, "capabilities", &val) )
+ {
+ if ( val & ~DOMAIN_CAPS_MASK )
+ panic("Invalid capabilities (%"PRIx32")\n", val);
+
+ if ( val & DOMAIN_CAPS_CONTROL )
+ flags |= CDF_privileged;
+
+ if ( val & DOMAIN_CAPS_HARDWARE )
+ {
+ if ( hardware_domain )
+ panic("Only 1 hardware domain can be specified! (%pd)\n",
+ hardware_domain);
+
+ d_cfg.max_grant_frames = gnttab_dom0_frames();
+ d_cfg.max_evtchn_port = -1;
+ flags |= CDF_hardware;
+ iommu = true;
+ }
+
+ if ( val & DOMAIN_CAPS_XENSTORE )
+ {
+ if ( xs_domid != DOMID_INVALID )
+ panic("Only 1 xenstore domain can be specified! (%u)\n",
+ xs_domid);
+
+ d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
+ d_cfg.max_evtchn_port = -1;
+ }
+ }
+
+ if ( dt_find_property(node, "xen,static-mem", NULL) )
+ {
+ if ( llc_coloring_enabled )
+ panic("LLC coloring and static memory are incompatible\n");
+
+ flags |= CDF_staticmem;
+ }
+
+ if ( dt_property_read_bool(node, "direct-map") )
+ {
+ if ( !(flags & CDF_staticmem) )
+ panic("direct-map is not valid for domain %s without static allocation.\n",
+ dt_node_name(node));
+
+ flags |= CDF_directmap;
+ }
+
+ if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
+ panic("Missing property 'cpus' for domain %s\n",
+ dt_node_name(node));
+
+ if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
+ {
+ if ( flags & CDF_hardware )
+ panic("Don't specify passthrough for hardware domain\n");
+
+ if ( !strcmp(dom0less_iommu, "enabled") )
+ iommu = true;
+ }
+
+ if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
+ !iommu_enabled )
+ panic("non-direct mapped hardware domain requires iommu\n");
+
+ if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
+ {
+ if ( flags & CDF_hardware )
+ panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
+
+ has_dtb = true;
+ }
+
+ if ( iommu_enabled && (iommu || has_dtb) )
+ d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
+
+ /* Get the optional property domain-cpupool */
+ cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
+ if ( cpupool_node )
+ {
+ int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
+ if ( pool_id < 0 )
+ panic("Error getting cpupool id from domain-cpupool (%d)\n",
+ pool_id);
+ d_cfg.cpupool_id = pool_id;
+ }
+
+ if ( dt_property_read_u32(node, "max_grant_version", &val) )
+ d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
+
+ if ( dt_property_read_u32(node, "max_grant_frames", &val) )
+ {
+ if ( val > INT32_MAX )
+ panic("max_grant_frames (%"PRIu32") overflow\n", val);
+ d_cfg.max_grant_frames = val;
+ }
+
+ if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
+ {
+ if ( val > INT32_MAX )
+ panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
+ d_cfg.max_maptrack_frames = val;
+ }
+
+ dt_property_read_string(node, "llc-colors", &llc_colors_str);
+ if ( !llc_coloring_enabled && llc_colors_str )
+ panic("'llc-colors' found, but LLC coloring is disabled\n");
+
+ arch_create_domUs(node, &d_cfg, flags);
+
+ /*
+ * The variable max_init_domid is initialized with zero, so here it's
+ * very important to use the pre-increment operator to call
+ * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
+ */
+ d = domain_create(++max_init_domid, &d_cfg, flags);
+ if ( IS_ERR(d) )
+ panic("Error creating domain %s (rc = %ld)\n",
+ dt_node_name(node), PTR_ERR(d));
+
+ if ( llc_coloring_enabled &&
+ (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
+ panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
+ dt_node_name(node), rc);
+
+ d->is_console = true;
+ dt_device_set_used_by(node, d->domain_id);
+
+ rc = construct_domU(d, node);
+ if ( rc )
+ panic("Could not set up domain %s (rc = %d)\n",
+ dt_node_name(node), rc);
+
+ if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
+ set_xs_domain(d);
+ }
+
+ if ( need_xenstore && xs_domid == DOMID_INVALID )
+ panic("xenstore requested, but xenstore domain not present\n");
+
+ initialize_domU_xenstore();
+}
diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
new file mode 100644
index 0000000000..5655571a66
--- /dev/null
+++ b/xen/include/asm-generic/dom0less-build.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_GENERIC_DOM0LESS_BUILD_H__
+#define __ASM_GENERIC_DOM0LESS_BUILD_H__
+
+#include <xen/stdbool.h>
+
+#ifdef CONFIG_DOM0LESS_BOOT
+
+#include <public/domctl.h>
+
+struct domain;
+struct dt_device_node;
+
+/* TODO: remove both when construct_domU() will be moved to common. */
+#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
+extern bool need_xenstore;
+
+void create_domUs(void);
+bool is_dom0less_mode(void);
+void set_xs_domain(struct domain *d);
+
+int construct_domU(struct domain *d, const struct dt_device_node *node);
+
+void arch_create_domUs(struct dt_device_node *node,
+ struct xen_domctl_createdomain *d_cfg,
+ unsigned int flags);
+
+#else /* !CONFIG_DOM0LESS_BOOT */
+
+static inline void create_domUs(void) {}
+static inline bool is_dom0less_mode(void)
+{
+ return false;
+}
+static inline void set_xs_domain(struct domain *d) {}
+
+#endif /* CONFIG_DOM0LESS_BOOT */
+
+#endif /* __ASM_GENERIC_DOM0LESS_BUILD_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
2.49.0
* Re: [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-02 16:22 ` [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common Oleksii Kurochko
@ 2025-05-02 17:55 ` Stefano Stabellini
2025-05-05 7:35 ` Oleksii Kurochko
2025-05-05 13:19 ` Oleksii Kurochko
0 siblings, 2 replies; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 17:55 UTC
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Move some parts of Arm's Dom0Less code to be reused by other architectures.
> At the moment, RISC-V is going to reuse these parts.
>
> Move dom0less-build.h from the Arm-specific directory to asm-generic,
> as this header is expected to be the same across architectures, with
> some updates: add declarations of construct_domU() and
> arch_create_domUs(), as there are some parts which are still
> architecture-specific.
>
> Introduce HAS_DOM0LESS to let an architecture enable the generic Dom0less
> code.
>
> Relocate the CONFIG_DOM0LESS_BOOT option to common code and add
> "depends on HAS_DOM0LESS" so that builds are not broken for architectures
> which don't support it. In particular, this avoids having to provide
> stubs for construct_domU() and arch_create_domUs() when a *-randconfig
> build sets CONFIG_DOM0LESS_BOOT=y.
>
> Move is_dom0less_mode() function to the common code, as it depends on
> boot modules that are already part of the common code.
>
> Move create_domUs() function to the common code with some updates:
> - Add arch_create_domUs() to cover parsing of arch-specific features,
> for example, SVE (Scalable Vector Extension), which exists only on Arm.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> It was suggested to change dom0less to predefined domus or similar
> (https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
>
> I decided to keep the dom0less name for now, as renaming would require changes in a lot of places,
> including CI tests, and IMO it could be done in a separate patch.
> If it is necessary to do it now, I'll be happy to do that in the next version of this
> patch series.
I think it is fine to use dom0less for now; it will make the code easier
to review, and it is not necessary to change the name at this point.
The patch looks good to me, except for a couple of minor suggestions I
have below. I could make them on commit. With those:
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> Changes in v3:
> - Move changes connected to the patch "xen/arm: dom0less delay xenstore initialization"
> to common.
> Also, some necessary parts for the mentioned patches were moved
> to common (such as alloc_xenstore_evtchn(), ... ).
> Not all changes are moved, changes connected to alloc_xenstore_params() and
> construct_domU() will be moved in the following patches of this patch series.
> - Move parsing of capabilities property to common code.
> - Align parsing of "passthrough", "multiboot,device-tree" properties with staging.
> - Drop arch_xen_domctl_createdomain().
> - Add 'select HAS_DEVICE_TREE' for config HAS_DOM0LESS.
> - Add empty lines after license in the top of newly added files.
> - s/arch_create_domus/arch_create_domUs.
> - Add footer below commit message regarding the naming of dom0less.
> ---
> Changes in v2:
> - Convert 'depends on Arm' to 'depends on HAS_DOM0LESS' for
> CONFIG_DOM0LESS_BOOT.
> - Change 'default Arm' to 'default y' for CONFIG_DOM0LESS_BOOT as there is
> dependency on HAS_DOM0LESS.
> - Introduce HAS_DOM0LESS and enable it for Arm.
> - Update the commit message.
> ---
> xen/arch/arm/Kconfig | 9 +-
> xen/arch/arm/dom0less-build.c | 371 ++++------------------
> xen/arch/arm/include/asm/Makefile | 1 +
> xen/arch/arm/include/asm/dom0less-build.h | 34 --
> xen/common/Kconfig | 13 +
> xen/common/device-tree/Makefile | 1 +
> xen/common/device-tree/dom0less-build.c | 283 +++++++++++++++++
> xen/include/asm-generic/dom0less-build.h | 49 +++
> 8 files changed, 404 insertions(+), 357 deletions(-)
> delete mode 100644 xen/arch/arm/include/asm/dom0less-build.h
> create mode 100644 xen/common/device-tree/dom0less-build.c
> create mode 100644 xen/include/asm-generic/dom0less-build.h
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index da8a406f5a..d0e0a7753c 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -15,6 +15,7 @@ config ARM
> select GENERIC_UART_INIT
> select HAS_ALTERNATIVE if HAS_VMAP
> select HAS_DEVICE_TREE
> + select HAS_DOM0LESS
> select HAS_STACK_PROTECTOR
> select HAS_UBSAN
>
> @@ -120,14 +121,6 @@ config GICV2
> Driver for the ARM Generic Interrupt Controller v2.
> If unsure, say Y
>
> -config DOM0LESS_BOOT
> - bool "Dom0less boot support" if EXPERT
> - default y
> - help
> - Dom0less boot support enables Xen to create and start domU guests during
> - Xen boot without the need of a control domain (Dom0), which could be
> - present anyway.
> -
> config GICV3
> bool "GICv3 driver"
> depends on !NEW_VGIC
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index a356fc94fc..ef49495d4f 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -22,48 +22,7 @@
> #include <asm/static-memory.h>
> #include <asm/static-shmem.h>
>
> -#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
> -
> -static domid_t __initdata xs_domid = DOMID_INVALID;
> -static bool __initdata need_xenstore;
> -
> -void __init set_xs_domain(struct domain *d)
> -{
> - xs_domid = d->domain_id;
> - set_global_virq_handler(d, VIRQ_DOM_EXC);
> -}
> -
> -bool __init is_dom0less_mode(void)
> -{
> - struct bootmodules *mods = &bootinfo.modules;
> - struct bootmodule *mod;
> - unsigned int i;
> - bool dom0found = false;
> - bool domUfound = false;
> -
> - /* Look into the bootmodules */
> - for ( i = 0 ; i < mods->nr_mods ; i++ )
> - {
> - mod = &mods->module[i];
> - /* Find if dom0 and domU kernels are present */
> - if ( mod->kind == BOOTMOD_KERNEL )
> - {
> - if ( mod->domU == false )
> - {
> - dom0found = true;
> - break;
> - }
> - else
> - domUfound = true;
> - }
> - }
> -
> - /*
> - * If there is no dom0 kernel but at least one domU, then we are in
> - * dom0less mode
> - */
> - return ( !dom0found && domUfound );
> -}
> +bool __initdata need_xenstore;
>
> #ifdef CONFIG_VGICV2
> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> @@ -686,25 +645,6 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> return -EINVAL;
> }
>
> -static int __init alloc_xenstore_evtchn(struct domain *d)
> -{
> - evtchn_alloc_unbound_t alloc;
> - int rc;
> -
> - alloc.dom = d->domain_id;
> - alloc.remote_dom = xs_domid;
> - rc = evtchn_alloc_unbound(&alloc, 0);
> - if ( rc )
> - {
> - printk("Failed allocating event channel for domain\n");
> - return rc;
> - }
> -
> - d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
> -
> - return 0;
> -}
> -
> #define XENSTORE_PFN_OFFSET 1
> static int __init alloc_xenstore_page(struct domain *d)
> {
> @@ -771,36 +711,6 @@ static int __init alloc_xenstore_params(struct kernel_info *kinfo)
> return rc;
> }
>
> -static void __init initialize_domU_xenstore(void)
> -{
> - struct domain *d;
> -
> - if ( xs_domid == DOMID_INVALID )
> - return;
> -
> - for_each_domain( d )
> - {
> - uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
> - int rc;
> -
> - if ( gfn == 0 )
> - continue;
> -
> - if ( is_xenstore_domain(d) )
> - continue;
> -
> - rc = alloc_xenstore_evtchn(d);
> - if ( rc < 0 )
> - panic("%pd: Failed to allocate xenstore_evtchn\n", d);
> -
> - if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
> - {
> - ASSERT(gfn < UINT32_MAX);
> - gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
> - }
> - }
> -}
> -
> static void __init domain_vcpu_affinity(struct domain *d,
> const struct dt_device_node *node)
> {
> @@ -906,8 +816,8 @@ static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
> }
> #endif /* CONFIG_ARCH_PAGING_MEMPOOL */
>
> -static int __init construct_domU(struct domain *d,
> - const struct dt_device_node *node)
> +int __init construct_domU(struct domain *d,
> + const struct dt_device_node *node)
> {
> struct kernel_info kinfo = KERNEL_INFO_INIT;
> const char *dom0less_enhanced;
> @@ -1009,246 +919,77 @@ static int __init construct_domU(struct domain *d,
> return alloc_xenstore_params(&kinfo);
> }
>
> -void __init create_domUs(void)
> +void __init arch_create_domUs(struct dt_device_node *node,
> + struct xen_domctl_createdomain *d_cfg,
> + unsigned int flags)
> {
> - struct dt_device_node *node;
> - const char *dom0less_iommu;
> - bool iommu = false;
> - const struct dt_device_node *cpupool_node,
> - *chosen = dt_find_node_by_path("/chosen");
> - const char *llc_colors_str = NULL;
> -
> - BUG_ON(chosen == NULL);
> - dt_for_each_child_node(chosen, node)
> - {
> - struct domain *d;
> - struct xen_domctl_createdomain d_cfg = {
> - .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> - .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> - /*
> - * The default of 1023 should be sufficient for guests because
> - * on ARM we don't bind physical interrupts to event channels.
> - * The only use of the evtchn port is inter-domain communications.
> - * 1023 is also the default value used in libxl.
> - */
> - .max_evtchn_port = 1023,
> - .max_grant_frames = -1,
> - .max_maptrack_frames = -1,
> - .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
> - };
> - unsigned int flags = 0U;
> - bool has_dtb = false;
> - uint32_t val;
> - int rc;
> -
> - if ( !dt_device_is_compatible(node, "xen,domain") )
> - continue;
> -
> - if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
> - panic("No more domain IDs available\n");
> + uint32_t val;
>
> - if ( dt_property_read_u32(node, "capabilities", &val) )
> - {
> - if ( val & ~DOMAIN_CAPS_MASK )
> - panic("Invalid capabilities (%"PRIx32")\n", val);
> -
> - if ( val & DOMAIN_CAPS_CONTROL )
> - flags |= CDF_privileged;
> -
> - if ( val & DOMAIN_CAPS_HARDWARE )
> - {
> - if ( hardware_domain )
> - panic("Only 1 hardware domain can be specified! (%pd)\n",
> - hardware_domain);
> -
> - d_cfg.max_grant_frames = gnttab_dom0_frames();
> - d_cfg.max_evtchn_port = -1;
> - flags |= CDF_hardware;
> - iommu = true;
> - }
> -
> - if ( val & DOMAIN_CAPS_XENSTORE )
> - {
> - if ( xs_domid != DOMID_INVALID )
> - panic("Only 1 xenstore domain can be specified! (%u)\n",
> - xs_domid);
> + d_cfg->arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
> + d_cfg->flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap;
>
> - d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
> - d_cfg.max_evtchn_port = -1;
> - }
> - }
> -
> - if ( dt_find_property(node, "xen,static-mem", NULL) )
> - {
> - if ( llc_coloring_enabled )
> - panic("LLC coloring and static memory are incompatible\n");
> -
> - flags |= CDF_staticmem;
> - }
> -
> - if ( dt_property_read_bool(node, "direct-map") )
> - {
> - if ( !(flags & CDF_staticmem) )
> - panic("direct-map is not valid for domain %s without static allocation.\n",
> - dt_node_name(node));
> -
> - flags |= CDF_directmap;
> - }
> -
> - if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
> - panic("Missing property 'cpus' for domain %s\n",
> - dt_node_name(node));
> -
> - if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
> - {
> - if ( flags & CDF_hardware )
> - panic("Don't specify passthrough for hardware domain\n");
> -
> - if ( !strcmp(dom0less_iommu, "enabled") )
> - iommu = true;
> - }
> -
> - if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
> - !iommu_enabled )
> - panic("non-direct mapped hardware domain requires iommu\n");
> -
> - if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
> - {
> - if ( flags & CDF_hardware )
> - panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
> -
> - has_dtb = true;
> - }
> -
> - if ( iommu_enabled && (iommu || has_dtb) )
> - d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> -
> - if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
> - {
> - int vpl011_virq = GUEST_VPL011_SPI;
> -
> - d_cfg.arch.nr_spis = VGIC_DEF_NR_SPIS;
> -
> - /*
> - * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
> - * set, in which case it'll match the hardware.
> - *
> - * Since the domain is not yet created, we can't use
> - * d->arch.vpl011.irq. So the logic to find the vIRQ has to
> - * be hardcoded.
> - * The logic here shall be consistent with the one in
> - * domain_vpl011_init().
> - */
> - if ( flags & CDF_directmap )
> - {
> - vpl011_virq = serial_irq(SERHND_DTUART);
> - if ( vpl011_virq < 0 )
> - panic("Error getting IRQ number for this serial port %d\n",
> - SERHND_DTUART);
> - }
> + if ( !dt_property_read_u32(node, "nr_spis", &d_cfg->arch.nr_spis) )
> + {
> + int vpl011_virq = GUEST_VPL011_SPI;
>
> - /*
> - * vpl011 uses one emulated SPI. If vpl011 is requested, make
> - * sure that we allocate enough SPIs for it.
> - */
> - if ( dt_property_read_bool(node, "vpl011") )
> - d_cfg.arch.nr_spis = MAX(d_cfg.arch.nr_spis,
> - vpl011_virq - 32 + 1);
> - }
> - else if ( flags & CDF_hardware )
> - panic("nr_spis cannot be specified for hardware domain\n");
> + d_cfg->arch.nr_spis = VGIC_DEF_NR_SPIS;
>
> - /* Get the optional property domain-cpupool */
> - cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
> - if ( cpupool_node )
> + /*
> + * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
> + * set, in which case it'll match the hardware.
> + *
> + * Since the domain is not yet created, we can't use
> + * d->arch.vpl011.irq. So the logic to find the vIRQ has to
> + * be hardcoded.
> + * The logic here shall be consistent with the one in
> + * domain_vpl011_init().
> + */
> + if ( flags & CDF_directmap )
> {
> - int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
> - if ( pool_id < 0 )
> - panic("Error getting cpupool id from domain-cpupool (%d)\n",
> - pool_id);
> - d_cfg.cpupool_id = pool_id;
> + vpl011_virq = serial_irq(SERHND_DTUART);
> + if ( vpl011_virq < 0 )
> + panic("Error getting IRQ number for this serial port %d\n",
> + SERHND_DTUART);
> }
>
> - if ( dt_property_read_u32(node, "max_grant_version", &val) )
> - d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
> + /*
> + * vpl011 uses one emulated SPI. If vpl011 is requested, make
> + * sure that we allocate enough SPIs for it.
> + */
> + if ( dt_property_read_bool(node, "vpl011") )
> + d_cfg->arch.nr_spis = MAX(d_cfg->arch.nr_spis,
> + vpl011_virq - 32 + 1);
> + }
> + else if ( flags & CDF_hardware )
> + panic("nr_spis cannot be specified for hardware domain\n");
>
> - if ( dt_property_read_u32(node, "max_grant_frames", &val) )
> - {
> - if ( val > INT32_MAX )
> - panic("max_grant_frames (%"PRIu32") overflow\n", val);
> - d_cfg.max_grant_frames = val;
> - }
> + if ( dt_get_property(node, "sve", &val) )
> + {
> +#ifdef CONFIG_ARM64_SVE
> + unsigned int sve_vl_bits;
> + bool ret = false;
>
> - if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
> + if ( !val )
> {
> - if ( val > INT32_MAX )
> - panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
> - d_cfg.max_maptrack_frames = val;
> + /* Property found with no value, means max HW VL supported */
> + ret = sve_domctl_vl_param(-1, &sve_vl_bits);
> }
> -
> - if ( dt_get_property(node, "sve", &val) )
> + else
> {
> -#ifdef CONFIG_ARM64_SVE
> - unsigned int sve_vl_bits;
> - bool ret = false;
> -
> - if ( !val )
> - {
> - /* Property found with no value, means max HW VL supported */
> - ret = sve_domctl_vl_param(-1, &sve_vl_bits);
> - }
> + if ( dt_property_read_u32(node, "sve", &val) )
> + ret = sve_domctl_vl_param(val, &sve_vl_bits);
> else
> - {
> - if ( dt_property_read_u32(node, "sve", &val) )
> - ret = sve_domctl_vl_param(val, &sve_vl_bits);
> - else
> - panic("Error reading 'sve' property\n");
> - }
> + panic("Error reading 'sve' property\n");
> + }
>
> - if ( ret )
> - d_cfg.arch.sve_vl = sve_encode_vl(sve_vl_bits);
> - else
> - panic("SVE vector length error\n");
> + if ( ret )
> + d_cfg->arch.sve_vl = sve_encode_vl(sve_vl_bits);
> + else
> + panic("SVE vector length error\n");
> #else
> - panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
> + panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
> #endif
> - }
> -
> - dt_property_read_string(node, "llc-colors", &llc_colors_str);
> - if ( !llc_coloring_enabled && llc_colors_str )
> - panic("'llc-colors' found, but LLC coloring is disabled\n");
> -
> - /*
> - * The variable max_init_domid is initialized with zero, so here it's
> - * very important to use the pre-increment operator to call
> - * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
> - */
> - d = domain_create(++max_init_domid, &d_cfg, flags);
> - if ( IS_ERR(d) )
> - panic("Error creating domain %s (rc = %ld)\n",
> - dt_node_name(node), PTR_ERR(d));
> -
> - if ( llc_coloring_enabled &&
> - (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
> - panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
> - dt_node_name(node), rc);
> -
> - d->is_console = true;
> - dt_device_set_used_by(node, d->domain_id);
> -
> - rc = construct_domU(d, node);
> - if ( rc )
> - panic("Could not set up domain %s (rc = %d)\n",
> - dt_node_name(node), rc);
> -
> - if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
> - set_xs_domain(d);
> }
> -
> - if ( need_xenstore && xs_domid == DOMID_INVALID )
> - panic("xenstore requested, but xenstore domain not present\n");
> -
> - initialize_domU_xenstore();
> }
>
> /*
> diff --git a/xen/arch/arm/include/asm/Makefile b/xen/arch/arm/include/asm/Makefile
> index 4a4036c951..831c914cce 100644
> --- a/xen/arch/arm/include/asm/Makefile
> +++ b/xen/arch/arm/include/asm/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0-only
> generic-y += altp2m.h
> generic-y += device.h
> +generic-y += dom0less-build.h
> generic-y += hardirq.h
> generic-y += iocap.h
> generic-y += paging.h
> diff --git a/xen/arch/arm/include/asm/dom0less-build.h b/xen/arch/arm/include/asm/dom0less-build.h
> deleted file mode 100644
> index b0e41a1954..0000000000
> --- a/xen/arch/arm/include/asm/dom0less-build.h
> +++ /dev/null
> @@ -1,34 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0-only */
> -
> -#ifndef __ASM_DOM0LESS_BUILD_H_
> -#define __ASM_DOM0LESS_BUILD_H_
> -
> -#include <xen/stdbool.h>
> -
> -#ifdef CONFIG_DOM0LESS_BOOT
> -
> -void create_domUs(void);
> -bool is_dom0less_mode(void);
> -void set_xs_domain(struct domain *d);
> -
> -#else /* !CONFIG_DOM0LESS_BOOT */
> -
> -static inline void create_domUs(void) {}
> -static inline bool is_dom0less_mode(void)
> -{
> - return false;
> -}
> -static inline void set_xs_domain(struct domain *d) {}
> -
> -#endif /* CONFIG_DOM0LESS_BOOT */
> -
> -#endif /* __ASM_DOM0LESS_BUILD_H_ */
> -
> -/*
> - * Local variables:
> - * mode: C
> - * c-file-style: "BSD"
> - * c-basic-offset: 4
> - * indent-tabs-mode: nil
> - * End:
> - */
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index be28060716..be38abf9e1 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -12,6 +12,15 @@ config CORE_PARKING
> bool
> depends on NR_CPUS > 1
>
> +config DOM0LESS_BOOT
> + bool "Dom0less boot support" if EXPERT
> + depends on HAS_DOM0LESS
I think it is better to also add here:
depends on HAS_DEVICE_TREE
and ...
> + default y
> + help
> + Dom0less boot support enables Xen to create and start domU guests during
> + Xen boot without the need of a control domain (Dom0), which could be
> + present anyway.
> +
> config GRANT_TABLE
> bool "Grant table support" if EXPERT
> default y
> @@ -74,6 +83,10 @@ config HAS_DEVICE_TREE
> bool
> select LIBFDT
>
> +config HAS_DOM0LESS
> + bool
> + select HAS_DEVICE_TREE
... remove select HAS_DEVICE_TREE from here, to reduce the complexity of
the dependencies.
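
Taken literally, the two suggestions above would leave the options looking
roughly like the fragment below. This is only a sketch of the proposal as
worded in this review, not the committed result; the prompt and help text
are copied from the patch, and everything else in xen/common/Kconfig is
assumed unchanged.

```
config DOM0LESS_BOOT
	bool "Dom0less boot support" if EXPERT
	depends on HAS_DOM0LESS
	depends on HAS_DEVICE_TREE
	default y
	help
	  Dom0less boot support enables Xen to create and start domU guests during
	  Xen boot without the need of a control domain (Dom0), which could be
	  present anyway.

config HAS_DOM0LESS
	bool
```

With this shape, an architecture that selects HAS_DOM0LESS but not
HAS_DEVICE_TREE simply does not get DOM0LESS_BOOT offered, instead of
having HAS_DEVICE_TREE forced on by a select.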
> config HAS_DIT # Data Independent Timing
> bool
>
> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
> index 7c549be38a..f3dafc9b81 100644
> --- a/xen/common/device-tree/Makefile
> +++ b/xen/common/device-tree/Makefile
> @@ -1,5 +1,6 @@
> obj-y += bootfdt.init.o
> obj-y += bootinfo.init.o
> obj-y += device-tree.o
> +obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
> obj-y += intc.o
> diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
> new file mode 100644
> index 0000000000..a01a8b6b1a
> --- /dev/null
> +++ b/xen/common/device-tree/dom0less-build.c
> @@ -0,0 +1,283 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/domain.h>
> +#include <xen/err.h>
> +#include <xen/event.h>
> +#include <xen/grant_table.h>
> +#include <xen/init.h>
> +#include <xen/iommu.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/sched.h>
> +#include <xen/stdbool.h>
> +#include <xen/types.h>
> +
> +#include <public/bootfdt.h>
> +#include <public/domctl.h>
> +#include <public/event_channel.h>
> +
> +#include <asm/dom0less-build.h>
> +#include <asm/setup.h>
> +
> +static domid_t __initdata xs_domid = DOMID_INVALID;
> +
> +void __init set_xs_domain(struct domain *d)
> +{
> + xs_domid = d->domain_id;
> + set_global_virq_handler(d, VIRQ_DOM_EXC);
> +}
> +
> +bool __init is_dom0less_mode(void)
> +{
> + struct bootmodules *mods = &bootinfo.modules;
> + struct bootmodule *mod;
> + unsigned int i;
> + bool dom0found = false;
> + bool domUfound = false;
> +
> + /* Look into the bootmodules */
> + for ( i = 0 ; i < mods->nr_mods ; i++ )
> + {
> + mod = &mods->module[i];
> + /* Find if dom0 and domU kernels are present */
> + if ( mod->kind == BOOTMOD_KERNEL )
> + {
> + if ( mod->domU == false )
> + {
> + dom0found = true;
> + break;
> + }
> + else
> + domUfound = true;
> + }
> + }
> +
> + /*
> + * If there is no dom0 kernel but at least one domU, then we are in
> + * dom0less mode
> + */
> + return ( !dom0found && domUfound );
> +}
> +
> +static int __init alloc_xenstore_evtchn(struct domain *d)
> +{
> + evtchn_alloc_unbound_t alloc;
> + int rc;
> +
> + alloc.dom = d->domain_id;
> + alloc.remote_dom = xs_domid;
> + rc = evtchn_alloc_unbound(&alloc, 0);
> + if ( rc )
> + {
> + printk("Failed allocating event channel for domain\n");
> + return rc;
> + }
> +
> + d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
> +
> + return 0;
> +}
> +
> +static void __init initialize_domU_xenstore(void)
> +{
> + struct domain *d;
> +
> + if ( xs_domid == DOMID_INVALID )
> + return;
> +
> + for_each_domain( d )
> + {
> + uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
> + int rc;
> +
> + if ( gfn == 0 )
> + continue;
> +
> + if ( is_xenstore_domain(d) )
> + continue;
> +
> + rc = alloc_xenstore_evtchn(d);
> + if ( rc < 0 )
> + panic("%pd: Failed to allocate xenstore_evtchn\n", d);
> +
> + if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
> + {
> + ASSERT(gfn < UINT32_MAX);
> + gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
> + }
> + }
> +}
> +
> +void __init create_domUs(void)
> +{
> + struct dt_device_node *node;
> + const char *dom0less_iommu;
> + bool iommu = false;
> + const struct dt_device_node *cpupool_node,
> + *chosen = dt_find_node_by_path("/chosen");
> + const char *llc_colors_str = NULL;
> +
> + BUG_ON(chosen == NULL);
> + dt_for_each_child_node(chosen, node)
> + {
> + struct domain *d;
> + struct xen_domctl_createdomain d_cfg = {0};
> + unsigned int flags = 0U;
> + bool has_dtb = false;
> + uint32_t val;
> + int rc;
> +
> + if ( !dt_device_is_compatible(node, "xen,domain") )
> + continue;
> +
> + if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
> + panic("No more domain IDs available\n");
> +
> + d_cfg.max_evtchn_port = 1023;
> + d_cfg.max_grant_frames = -1;
> + d_cfg.max_maptrack_frames = -1;
> + d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version);
> +
> + if ( dt_property_read_u32(node, "capabilities", &val) )
> + {
> + if ( val & ~DOMAIN_CAPS_MASK )
> + panic("Invalid capabilities (%"PRIx32")\n", val);
> +
> + if ( val & DOMAIN_CAPS_CONTROL )
> + flags |= CDF_privileged;
> +
> + if ( val & DOMAIN_CAPS_HARDWARE )
> + {
> + if ( hardware_domain )
> + panic("Only 1 hardware domain can be specified! (%pd)\n",
> + hardware_domain);
> +
> + d_cfg.max_grant_frames = gnttab_dom0_frames();
> + d_cfg.max_evtchn_port = -1;
> + flags |= CDF_hardware;
> + iommu = true;
> + }
> +
> + if ( val & DOMAIN_CAPS_XENSTORE )
> + {
> + if ( xs_domid != DOMID_INVALID )
> + panic("Only 1 xenstore domain can be specified! (%u)\n",
> + xs_domid);
> +
> + d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
> + d_cfg.max_evtchn_port = -1;
> + }
> + }
> +
> + if ( dt_find_property(node, "xen,static-mem", NULL) )
> + {
> + if ( llc_coloring_enabled )
> + panic("LLC coloring and static memory are incompatible\n");
> +
> + flags |= CDF_staticmem;
> + }
> +
> + if ( dt_property_read_bool(node, "direct-map") )
> + {
> + if ( !(flags & CDF_staticmem) )
> + panic("direct-map is not valid for domain %s without static allocation.\n",
> + dt_node_name(node));
> +
> + flags |= CDF_directmap;
> + }
> +
> + if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
> + panic("Missing property 'cpus' for domain %s\n",
> + dt_node_name(node));
> +
> + if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
> + {
> + if ( flags & CDF_hardware )
> + panic("Don't specify passthrough for hardware domain\n");
> +
> + if ( !strcmp(dom0less_iommu, "enabled") )
> + iommu = true;
> + }
> +
> + if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
> + !iommu_enabled )
> + panic("non-direct mapped hardware domain requires iommu\n");
> +
> + if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
> + {
> + if ( flags & CDF_hardware )
> + panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
> +
> + has_dtb = true;
> + }
> +
> + if ( iommu_enabled && (iommu || has_dtb) )
> + d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> +
> + /* Get the optional property domain-cpupool */
> + cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
> + if ( cpupool_node )
> + {
> + int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
> + if ( pool_id < 0 )
> + panic("Error getting cpupool id from domain-cpupool (%d)\n",
> + pool_id);
> + d_cfg.cpupool_id = pool_id;
> + }
> +
> + if ( dt_property_read_u32(node, "max_grant_version", &val) )
> + d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
> +
> + if ( dt_property_read_u32(node, "max_grant_frames", &val) )
> + {
> + if ( val > INT32_MAX )
> + panic("max_grant_frames (%"PRIu32") overflow\n", val);
> + d_cfg.max_grant_frames = val;
> + }
> +
> + if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
> + {
> + if ( val > INT32_MAX )
> + panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
> + d_cfg.max_maptrack_frames = val;
> + }
> +
> + dt_property_read_string(node, "llc-colors", &llc_colors_str);
> + if ( !llc_coloring_enabled && llc_colors_str )
> + panic("'llc-colors' found, but LLC coloring is disabled\n");
> +
> + arch_create_domUs(node, &d_cfg, flags);
> +
> + /*
> + * The variable max_init_domid is initialized with zero, so here it's
> + * very important to use the pre-increment operator to call
> + * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
> + */
> + d = domain_create(++max_init_domid, &d_cfg, flags);
> + if ( IS_ERR(d) )
> + panic("Error creating domain %s (rc = %ld)\n",
> + dt_node_name(node), PTR_ERR(d));
> +
> + if ( llc_coloring_enabled &&
> + (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
> + panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
> + dt_node_name(node), rc);
> +
> + d->is_console = true;
> + dt_device_set_used_by(node, d->domain_id);
> +
> + rc = construct_domU(d, node);
> + if ( rc )
> + panic("Could not set up domain %s (rc = %d)\n",
> + dt_node_name(node), rc);
> +
> + if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
> + set_xs_domain(d);
> + }
> +
> + if ( need_xenstore && xs_domid == DOMID_INVALID )
> + panic("xenstore requested, but xenstore domain not present\n");
> +
> + initialize_domU_xenstore();
> +}
> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
> new file mode 100644
> index 0000000000..5655571a66
> --- /dev/null
> +++ b/xen/include/asm-generic/dom0less-build.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_GENERIC_DOM0LESS_BUILD_H__
> +#define __ASM_GENERIC_DOM0LESS_BUILD_H__
> +
> +#include <xen/stdbool.h>
> +
> +#ifdef CONFIG_DOM0LESS_BOOT
> +
> +#include <public/domctl.h>
> +
> +struct domain;
This declaration needs to be out of the #ifdef CONFIG_DOM0LESS_BOOT
because...
> +struct dt_device_node;
> +
> +/* TODO: remove both when construct_domU() will be moved to common. */
> +#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
> +extern bool need_xenstore;
> +
> +void create_domUs(void);
> +bool is_dom0less_mode(void);
> +void set_xs_domain(struct domain *d);
> +
> +int construct_domU(struct domain *d, const struct dt_device_node *node);
> +
> +void arch_create_domUs(struct dt_device_node *node,
> + struct xen_domctl_createdomain *d_cfg,
> + unsigned int flags);
> +
> +#else /* !CONFIG_DOM0LESS_BOOT */
> +
> +static inline void create_domUs(void) {}
> +static inline bool is_dom0less_mode(void)
> +{
> + return false;
> +}
> +static inline void set_xs_domain(struct domain *d) {}
... of this usage of struct domain *d.
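
A minimal standalone sketch of the layout being asked for, assuming nothing
beyond standard C: the forward declaration sits above the #ifdef so that
both the real prototypes and the inline stubs can name struct domain. It is
not the actual Xen header; demo_boot_check() is a hypothetical caller added
only to exercise the stubs, and the sketch is meant to be built without
CONFIG_DOM0LESS_BOOT defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Forward declaration kept outside the #ifdef: both the prototypes and
 * the inline stubs below mention struct domain in a parameter list.
 */
struct domain;

#ifdef CONFIG_DOM0LESS_BOOT

bool is_dom0less_mode(void);
void set_xs_domain(struct domain *d);

#else /* !CONFIG_DOM0LESS_BOOT */

static inline bool is_dom0less_mode(void)
{
    return false;
}

/* No-op stub: the pointer is never dereferenced, so an incomplete type suffices. */
static inline void set_xs_domain(struct domain *d)
{
    (void)d;
}

#endif /* CONFIG_DOM0LESS_BOOT */

/* Hypothetical caller, present only to exercise the stubs. */
static inline bool demo_boot_check(void)
{
    set_xs_domain(NULL);
    return is_dom0less_mode();
}
```

If the declaration stays inside the #ifdef, the stub branch introduces
struct domain for the first time inside a parameter list; GCC warns that
such a declaration is not visible outside the function, which fails a build
that treats warnings as errors unless some earlier include happens to
declare the type.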
> +#endif /* CONFIG_DOM0LESS_BOOT */
> +
> +#endif /* __ASM_GENERIC_DOM0LESS_BUILD_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.49.0
>
* Re: [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-02 17:55 ` Stefano Stabellini
@ 2025-05-05 7:35 ` Oleksii Kurochko
2025-05-05 8:35 ` Orzel, Michal
2025-05-05 13:19 ` Oleksii Kurochko
1 sibling, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 7:35 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 7:55 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Move some parts of Arm's Dom0Less code to be reused by other architectures.
>> At the moment, RISC-V is going to reuse these parts.
>>
>> Move dom0less-build.h from the Arm-specific directory to asm-generic,
>> as this header is expected to be the same across architectures, with
>> some updates: add declarations of construct_domU() and
>> arch_create_domUs(), as some parts are still architecture-specific.
>>
>> Introduce HAS_DOM0LESS to provide the ability to enable generic Dom0less
>> code for an architecture.
>>
>> Relocate the CONFIG_DOM0LESS configuration to common code, adding
>> "depends on HAS_DOM0LESS" so as not to break builds for architectures
>> which don't support the CONFIG_DOM0LESS config. In particular, this avoids
>> having to provide stubs for construct_domU() and arch_create_domUs()
>> in the case of *-randconfig, which may set CONFIG_DOM0LESS=y.
>>
>> Move is_dom0less_mode() function to the common code, as it depends on
>> boot modules that are already part of the common code.
>>
>> Move create_domUs() function to the common code with some updates:
>> - Add arch_create_domUs() to cover parsing of arch-specific features,
>> for example, SVE (Scalable Vector Extension) exists only on Arm.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> It was suggested to change dom0less to predefined domus or similar
>> (https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
>>
>> I decided to keep the dom0less name for now, as renaming would require changes in a lot of places,
>> including CI tests, and IMO it could be done in a separate patch.
>> If it is necessary to do it now, I'll be happy to do that in the next version of this
>> patch series.
> I think it is fine to use dom0less for now, it will make the code easier
> to review and it is not necessary to change the name at this point.
>
> The patch looks good to me, except for a couple of minor suggestions I
> have below. I could make them on commit. With those:
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Thanks.
I will apply the suggestions below (unless they have already been committed by the time I start preparing the new version of the patch series).
~ Oleksii
>
>
>> ---
>> Changes in v3:
>> - Move changes connected to the patch "xen/arm: dom0less delay xenstore initialization"
>> to common.
>> Also, some necessary parts for the mentioned patches were moved
>> to common (such as alloc_xenstore_evtchn(), ... ).
>> Not all changes are moved, changes connected to alloc_xenstore_params() and
>> construct_domU() will be moved in the following patches of this patch series.
>> - Move parsing of capabilities property to common code.
>> - Align parsing of "passthrough", "multiboot,device-tree" properties with staging.
>> - Drop arch_xen_domctl_createdomain().
>> - Add 'select HAS_DEVICE_TREE' for config HAS_DOM0LESS.
>> - Add empty lines after license in the top of newly added files.
>> - s/arch_create_domus/arch_create_domUs.
>> - Add footer below commit message regarding the naming of dom0less.
>> ---
>> Changes in v2:
>> - Convert 'depends on Arm' to 'depends on HAS_DOM0LESS' for
>> CONFIG_DOM0LESS_BOOT.
>> - Change 'default Arm' to 'default y' for CONFIG_DOM0LESS_BOOT as there is
>> dependency on HAS_DOM0LESS.
>> - Introduce HAS_DOM0LESS and enable it for Arm.
>> - Update the commit message.
>> ---
>> xen/arch/arm/Kconfig | 9 +-
>> xen/arch/arm/dom0less-build.c | 371 ++++------------------
>> xen/arch/arm/include/asm/Makefile | 1 +
>> xen/arch/arm/include/asm/dom0less-build.h | 34 --
>> xen/common/Kconfig | 13 +
>> xen/common/device-tree/Makefile | 1 +
>> xen/common/device-tree/dom0less-build.c | 283 +++++++++++++++++
>> xen/include/asm-generic/dom0less-build.h | 49 +++
>> 8 files changed, 404 insertions(+), 357 deletions(-)
>> delete mode 100644 xen/arch/arm/include/asm/dom0less-build.h
>> create mode 100644 xen/common/device-tree/dom0less-build.c
>> create mode 100644 xen/include/asm-generic/dom0less-build.h
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index da8a406f5a..d0e0a7753c 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -15,6 +15,7 @@ config ARM
>> select GENERIC_UART_INIT
>> select HAS_ALTERNATIVE if HAS_VMAP
>> select HAS_DEVICE_TREE
>> + select HAS_DOM0LESS
>> select HAS_STACK_PROTECTOR
>> select HAS_UBSAN
>>
>> @@ -120,14 +121,6 @@ config GICV2
>> Driver for the ARM Generic Interrupt Controller v2.
>> If unsure, say Y
>>
>> -config DOM0LESS_BOOT
>> - bool "Dom0less boot support" if EXPERT
>> - default y
>> - help
>> - Dom0less boot support enables Xen to create and start domU guests during
>> - Xen boot without the need of a control domain (Dom0), which could be
>> - present anyway.
>> -
>> config GICV3
>> bool "GICv3 driver"
>> depends on !NEW_VGIC
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index a356fc94fc..ef49495d4f 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -22,48 +22,7 @@
>> #include <asm/static-memory.h>
>> #include <asm/static-shmem.h>
>>
>> -#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
>> -
>> -static domid_t __initdata xs_domid = DOMID_INVALID;
>> -static bool __initdata need_xenstore;
>> -
>> -void __init set_xs_domain(struct domain *d)
>> -{
>> - xs_domid = d->domain_id;
>> - set_global_virq_handler(d, VIRQ_DOM_EXC);
>> -}
>> -
>> -bool __init is_dom0less_mode(void)
>> -{
>> - struct bootmodules *mods = &bootinfo.modules;
>> - struct bootmodule *mod;
>> - unsigned int i;
>> - bool dom0found = false;
>> - bool domUfound = false;
>> -
>> - /* Look into the bootmodules */
>> - for ( i = 0 ; i < mods->nr_mods ; i++ )
>> - {
>> - mod = &mods->module[i];
>> - /* Find if dom0 and domU kernels are present */
>> - if ( mod->kind == BOOTMOD_KERNEL )
>> - {
>> - if ( mod->domU == false )
>> - {
>> - dom0found = true;
>> - break;
>> - }
>> - else
>> - domUfound = true;
>> - }
>> - }
>> -
>> - /*
>> - * If there is no dom0 kernel but at least one domU, then we are in
>> - * dom0less mode
>> - */
>> - return ( !dom0found && domUfound );
>> -}
>> +bool __initdata need_xenstore;
>>
>> #ifdef CONFIG_VGICV2
>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>> @@ -686,25 +645,6 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> return -EINVAL;
>> }
>>
>> -static int __init alloc_xenstore_evtchn(struct domain *d)
>> -{
>> - evtchn_alloc_unbound_t alloc;
>> - int rc;
>> -
>> - alloc.dom = d->domain_id;
>> - alloc.remote_dom = xs_domid;
>> - rc = evtchn_alloc_unbound(&alloc, 0);
>> - if ( rc )
>> - {
>> - printk("Failed allocating event channel for domain\n");
>> - return rc;
>> - }
>> -
>> - d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
>> -
>> - return 0;
>> -}
>> -
>> #define XENSTORE_PFN_OFFSET 1
>> static int __init alloc_xenstore_page(struct domain *d)
>> {
>> @@ -771,36 +711,6 @@ static int __init alloc_xenstore_params(struct kernel_info *kinfo)
>> return rc;
>> }
>>
>> -static void __init initialize_domU_xenstore(void)
>> -{
>> - struct domain *d;
>> -
>> - if ( xs_domid == DOMID_INVALID )
>> - return;
>> -
>> - for_each_domain( d )
>> - {
>> - uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
>> - int rc;
>> -
>> - if ( gfn == 0 )
>> - continue;
>> -
>> - if ( is_xenstore_domain(d) )
>> - continue;
>> -
>> - rc = alloc_xenstore_evtchn(d);
>> - if ( rc < 0 )
>> - panic("%pd: Failed to allocate xenstore_evtchn\n", d);
>> -
>> - if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
>> - {
>> - ASSERT(gfn < UINT32_MAX);
>> - gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
>> - }
>> - }
>> -}
>> -
>> static void __init domain_vcpu_affinity(struct domain *d,
>> const struct dt_device_node *node)
>> {
>> @@ -906,8 +816,8 @@ static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
>> }
>> #endif /* CONFIG_ARCH_PAGING_MEMPOOL */
>>
>> -static int __init construct_domU(struct domain *d,
>> - const struct dt_device_node *node)
>> +int __init construct_domU(struct domain *d,
>> + const struct dt_device_node *node)
>> {
>> struct kernel_info kinfo = KERNEL_INFO_INIT;
>> const char *dom0less_enhanced;
>> @@ -1009,246 +919,77 @@ static int __init construct_domU(struct domain *d,
>> return alloc_xenstore_params(&kinfo);
>> }
>>
>> -void __init create_domUs(void)
>> +void __init arch_create_domUs(struct dt_device_node *node,
>> + struct xen_domctl_createdomain *d_cfg,
>> + unsigned int flags)
>> {
>> - struct dt_device_node *node;
>> - const char *dom0less_iommu;
>> - bool iommu = false;
>> - const struct dt_device_node *cpupool_node,
>> - *chosen = dt_find_node_by_path("/chosen");
>> - const char *llc_colors_str = NULL;
>> -
>> - BUG_ON(chosen == NULL);
>> - dt_for_each_child_node(chosen, node)
>> - {
>> - struct domain *d;
>> - struct xen_domctl_createdomain d_cfg = {
>> - .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>> - .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>> - /*
>> - * The default of 1023 should be sufficient for guests because
>> - * on ARM we don't bind physical interrupts to event channels.
>> - * The only use of the evtchn port is inter-domain communications.
>> - * 1023 is also the default value used in libxl.
>> - */
>> - .max_evtchn_port = 1023,
>> - .max_grant_frames = -1,
>> - .max_maptrack_frames = -1,
>> - .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
>> - };
>> - unsigned int flags = 0U;
>> - bool has_dtb = false;
>> - uint32_t val;
>> - int rc;
>> -
>> - if ( !dt_device_is_compatible(node, "xen,domain") )
>> - continue;
>> -
>> - if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
>> - panic("No more domain IDs available\n");
>> + uint32_t val;
>>
>> - if ( dt_property_read_u32(node, "capabilities", &val) )
>> - {
>> - if ( val & ~DOMAIN_CAPS_MASK )
>> - panic("Invalid capabilities (%"PRIx32")\n", val);
>> -
>> - if ( val & DOMAIN_CAPS_CONTROL )
>> - flags |= CDF_privileged;
>> -
>> - if ( val & DOMAIN_CAPS_HARDWARE )
>> - {
>> - if ( hardware_domain )
>> - panic("Only 1 hardware domain can be specified! (%pd)\n",
>> - hardware_domain);
>> -
>> - d_cfg.max_grant_frames = gnttab_dom0_frames();
>> - d_cfg.max_evtchn_port = -1;
>> - flags |= CDF_hardware;
>> - iommu = true;
>> - }
>> -
>> - if ( val & DOMAIN_CAPS_XENSTORE )
>> - {
>> - if ( xs_domid != DOMID_INVALID )
>> - panic("Only 1 xenstore domain can be specified! (%u)\n",
>> - xs_domid);
>> + d_cfg->arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
>> + d_cfg->flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap;
>>
>> - d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
>> - d_cfg.max_evtchn_port = -1;
>> - }
>> - }
>> -
>> - if ( dt_find_property(node, "xen,static-mem", NULL) )
>> - {
>> - if ( llc_coloring_enabled )
>> - panic("LLC coloring and static memory are incompatible\n");
>> -
>> - flags |= CDF_staticmem;
>> - }
>> -
>> - if ( dt_property_read_bool(node, "direct-map") )
>> - {
>> - if ( !(flags & CDF_staticmem) )
>> - panic("direct-map is not valid for domain %s without static allocation.\n",
>> - dt_node_name(node));
>> -
>> - flags |= CDF_directmap;
>> - }
>> -
>> - if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
>> - panic("Missing property 'cpus' for domain %s\n",
>> - dt_node_name(node));
>> -
>> - if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
>> - {
>> - if ( flags & CDF_hardware )
>> - panic("Don't specify passthrough for hardware domain\n");
>> -
>> - if ( !strcmp(dom0less_iommu, "enabled") )
>> - iommu = true;
>> - }
>> -
>> - if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
>> - !iommu_enabled )
>> - panic("non-direct mapped hardware domain requires iommu\n");
>> -
>> - if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
>> - {
>> - if ( flags & CDF_hardware )
>> - panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
>> -
>> - has_dtb = true;
>> - }
>> -
>> - if ( iommu_enabled && (iommu || has_dtb) )
>> - d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>> -
>> - if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
>> - {
>> - int vpl011_virq = GUEST_VPL011_SPI;
>> -
>> - d_cfg.arch.nr_spis = VGIC_DEF_NR_SPIS;
>> -
>> - /*
>> - * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
>> - * set, in which case it'll match the hardware.
>> - *
>> - * Since the domain is not yet created, we can't use
>> - * d->arch.vpl011.irq. So the logic to find the vIRQ has to
>> - * be hardcoded.
>> - * The logic here shall be consistent with the one in
>> - * domain_vpl011_init().
>> - */
>> - if ( flags & CDF_directmap )
>> - {
>> - vpl011_virq = serial_irq(SERHND_DTUART);
>> - if ( vpl011_virq < 0 )
>> - panic("Error getting IRQ number for this serial port %d\n",
>> - SERHND_DTUART);
>> - }
>> + if ( !dt_property_read_u32(node, "nr_spis", &d_cfg->arch.nr_spis) )
>> + {
>> + int vpl011_virq = GUEST_VPL011_SPI;
>>
>> - /*
>> - * vpl011 uses one emulated SPI. If vpl011 is requested, make
>> - * sure that we allocate enough SPIs for it.
>> - */
>> - if ( dt_property_read_bool(node, "vpl011") )
>> - d_cfg.arch.nr_spis = MAX(d_cfg.arch.nr_spis,
>> - vpl011_virq - 32 + 1);
>> - }
>> - else if ( flags & CDF_hardware )
>> - panic("nr_spis cannot be specified for hardware domain\n");
>> + d_cfg->arch.nr_spis = VGIC_DEF_NR_SPIS;
>>
>> - /* Get the optional property domain-cpupool */
>> - cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
>> - if ( cpupool_node )
>> + /*
>> + * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is
>> + * set, in which case it'll match the hardware.
>> + *
>> + * Since the domain is not yet created, we can't use
>> + * d->arch.vpl011.irq. So the logic to find the vIRQ has to
>> + * be hardcoded.
>> + * The logic here shall be consistent with the one in
>> + * domain_vpl011_init().
>> + */
>> + if ( flags & CDF_directmap )
>> {
>> - int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
>> - if ( pool_id < 0 )
>> - panic("Error getting cpupool id from domain-cpupool (%d)\n",
>> - pool_id);
>> - d_cfg.cpupool_id = pool_id;
>> + vpl011_virq = serial_irq(SERHND_DTUART);
>> + if ( vpl011_virq < 0 )
>> + panic("Error getting IRQ number for this serial port %d\n",
>> + SERHND_DTUART);
>> }
>>
>> - if ( dt_property_read_u32(node, "max_grant_version", &val) )
>> - d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
>> + /*
>> + * vpl011 uses one emulated SPI. If vpl011 is requested, make
>> + * sure that we allocate enough SPIs for it.
>> + */
>> + if ( dt_property_read_bool(node, "vpl011") )
>> + d_cfg->arch.nr_spis = MAX(d_cfg->arch.nr_spis,
>> + vpl011_virq - 32 + 1);
>> + }
>> + else if ( flags & CDF_hardware )
>> + panic("nr_spis cannot be specified for hardware domain\n");
>>
>> - if ( dt_property_read_u32(node, "max_grant_frames", &val) )
>> - {
>> - if ( val > INT32_MAX )
>> - panic("max_grant_frames (%"PRIu32") overflow\n", val);
>> - d_cfg.max_grant_frames = val;
>> - }
>> + if ( dt_get_property(node, "sve", &val) )
>> + {
>> +#ifdef CONFIG_ARM64_SVE
>> + unsigned int sve_vl_bits;
>> + bool ret = false;
>>
>> - if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
>> + if ( !val )
>> {
>> - if ( val > INT32_MAX )
>> - panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
>> - d_cfg.max_maptrack_frames = val;
>> + /* Property found with no value, means max HW VL supported */
>> + ret = sve_domctl_vl_param(-1, &sve_vl_bits);
>> }
>> -
>> - if ( dt_get_property(node, "sve", &val) )
>> + else
>> {
>> -#ifdef CONFIG_ARM64_SVE
>> - unsigned int sve_vl_bits;
>> - bool ret = false;
>> -
>> - if ( !val )
>> - {
>> - /* Property found with no value, means max HW VL supported */
>> - ret = sve_domctl_vl_param(-1, &sve_vl_bits);
>> - }
>> + if ( dt_property_read_u32(node, "sve", &val) )
>> + ret = sve_domctl_vl_param(val, &sve_vl_bits);
>> else
>> - {
>> - if ( dt_property_read_u32(node, "sve", &val) )
>> - ret = sve_domctl_vl_param(val, &sve_vl_bits);
>> - else
>> - panic("Error reading 'sve' property\n");
>> - }
>> + panic("Error reading 'sve' property\n");
>> + }
>>
>> - if ( ret )
>> - d_cfg.arch.sve_vl = sve_encode_vl(sve_vl_bits);
>> - else
>> - panic("SVE vector length error\n");
>> + if ( ret )
>> + d_cfg->arch.sve_vl = sve_encode_vl(sve_vl_bits);
>> + else
>> + panic("SVE vector length error\n");
>> #else
>> - panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
>> + panic("'sve' property found, but CONFIG_ARM64_SVE not selected\n");
>> #endif
>> - }
>> -
>> - dt_property_read_string(node, "llc-colors", &llc_colors_str);
>> - if ( !llc_coloring_enabled && llc_colors_str )
>> - panic("'llc-colors' found, but LLC coloring is disabled\n");
>> -
>> - /*
>> - * The variable max_init_domid is initialized with zero, so here it's
>> - * very important to use the pre-increment operator to call
>> - * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
>> - */
>> - d = domain_create(++max_init_domid, &d_cfg, flags);
>> - if ( IS_ERR(d) )
>> - panic("Error creating domain %s (rc = %ld)\n",
>> - dt_node_name(node), PTR_ERR(d));
>> -
>> - if ( llc_coloring_enabled &&
>> - (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
>> - panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
>> - dt_node_name(node), rc);
>> -
>> - d->is_console = true;
>> - dt_device_set_used_by(node, d->domain_id);
>> -
>> - rc = construct_domU(d, node);
>> - if ( rc )
>> - panic("Could not set up domain %s (rc = %d)\n",
>> - dt_node_name(node), rc);
>> -
>> - if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
>> - set_xs_domain(d);
>> }
>> -
>> - if ( need_xenstore && xs_domid == DOMID_INVALID )
>> - panic("xenstore requested, but xenstore domain not present\n");
>> -
>> - initialize_domU_xenstore();
>> }
>>
>> /*
>> diff --git a/xen/arch/arm/include/asm/Makefile b/xen/arch/arm/include/asm/Makefile
>> index 4a4036c951..831c914cce 100644
>> --- a/xen/arch/arm/include/asm/Makefile
>> +++ b/xen/arch/arm/include/asm/Makefile
>> @@ -1,6 +1,7 @@
>> # SPDX-License-Identifier: GPL-2.0-only
>> generic-y += altp2m.h
>> generic-y += device.h
>> +generic-y += dom0less-build.h
>> generic-y += hardirq.h
>> generic-y += iocap.h
>> generic-y += paging.h
>> diff --git a/xen/arch/arm/include/asm/dom0less-build.h b/xen/arch/arm/include/asm/dom0less-build.h
>> deleted file mode 100644
>> index b0e41a1954..0000000000
>> --- a/xen/arch/arm/include/asm/dom0less-build.h
>> +++ /dev/null
>> @@ -1,34 +0,0 @@
>> -/* SPDX-License-Identifier: GPL-2.0-only */
>> -
>> -#ifndef __ASM_DOM0LESS_BUILD_H_
>> -#define __ASM_DOM0LESS_BUILD_H_
>> -
>> -#include <xen/stdbool.h>
>> -
>> -#ifdef CONFIG_DOM0LESS_BOOT
>> -
>> -void create_domUs(void);
>> -bool is_dom0less_mode(void);
>> -void set_xs_domain(struct domain *d);
>> -
>> -#else /* !CONFIG_DOM0LESS_BOOT */
>> -
>> -static inline void create_domUs(void) {}
>> -static inline bool is_dom0less_mode(void)
>> -{
>> - return false;
>> -}
>> -static inline void set_xs_domain(struct domain *d) {}
>> -
>> -#endif /* CONFIG_DOM0LESS_BOOT */
>> -
>> -#endif /* __ASM_DOM0LESS_BUILD_H_ */
>> -
>> -/*
>> - * Local variables:
>> - * mode: C
>> - * c-file-style: "BSD"
>> - * c-basic-offset: 4
>> - * indent-tabs-mode: nil
>> - * End:
>> - */
>> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
>> index be28060716..be38abf9e1 100644
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -12,6 +12,15 @@ config CORE_PARKING
>> bool
>> depends on NR_CPUS > 1
>>
>> +config DOM0LESS_BOOT
>> + bool "Dom0less boot support" if EXPERT
>> + depends on HAS_DOM0LESS
> I think it is better to also add here:
>
> depends on HAS_DEVICE_TREE
>
> and ...
>
>
>> + default y
>> + help
>> + Dom0less boot support enables Xen to create and start domU guests during
>> + Xen boot without the need of a control domain (Dom0), which could be
>> + present anyway.
>> +
>> config GRANT_TABLE
>> bool "Grant table support" if EXPERT
>> default y
>> @@ -74,6 +83,10 @@ config HAS_DEVICE_TREE
>> bool
>> select LIBFDT
>>
>> +config HAS_DOM0LESS
>> + bool
>> + select HAS_DEVICE_TREE
> ... remove select HAS_DEVICE_TREE from here, to reduce the complexity
> of the dependencies.
>
>
>> config HAS_DIT # Data Independent Timing
>> bool
>>
>> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
>> index 7c549be38a..f3dafc9b81 100644
>> --- a/xen/common/device-tree/Makefile
>> +++ b/xen/common/device-tree/Makefile
>> @@ -1,5 +1,6 @@
>> obj-y += bootfdt.init.o
>> obj-y += bootinfo.init.o
>> obj-y += device-tree.o
>> +obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
>> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
>> obj-y += intc.o
>> diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
>> new file mode 100644
>> index 0000000000..a01a8b6b1a
>> --- /dev/null
>> +++ b/xen/common/device-tree/dom0less-build.c
>> @@ -0,0 +1,283 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#include <xen/bootfdt.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/domain.h>
>> +#include <xen/err.h>
>> +#include <xen/event.h>
>> +#include <xen/grant_table.h>
>> +#include <xen/init.h>
>> +#include <xen/iommu.h>
>> +#include <xen/llc-coloring.h>
>> +#include <xen/sched.h>
>> +#include <xen/stdbool.h>
>> +#include <xen/types.h>
>> +
>> +#include <public/bootfdt.h>
>> +#include <public/domctl.h>
>> +#include <public/event_channel.h>
>> +
>> +#include <asm/dom0less-build.h>
>> +#include <asm/setup.h>
>> +
>> +static domid_t __initdata xs_domid = DOMID_INVALID;
>> +
>> +void __init set_xs_domain(struct domain *d)
>> +{
>> + xs_domid = d->domain_id;
>> + set_global_virq_handler(d, VIRQ_DOM_EXC);
>> +}
>> +
>> +bool __init is_dom0less_mode(void)
>> +{
>> + struct bootmodules *mods = &bootinfo.modules;
>> + struct bootmodule *mod;
>> + unsigned int i;
>> + bool dom0found = false;
>> + bool domUfound = false;
>> +
>> + /* Look into the bootmodules */
>> + for ( i = 0 ; i < mods->nr_mods ; i++ )
>> + {
>> + mod = &mods->module[i];
>> + /* Find if dom0 and domU kernels are present */
>> + if ( mod->kind == BOOTMOD_KERNEL )
>> + {
>> + if ( mod->domU == false )
>> + {
>> + dom0found = true;
>> + break;
>> + }
>> + else
>> + domUfound = true;
>> + }
>> + }
>> +
>> + /*
>> + * If there is no dom0 kernel but at least one domU, then we are in
>> + * dom0less mode
>> + */
>> + return ( !dom0found && domUfound );
>> +}
>> +
>> +static int __init alloc_xenstore_evtchn(struct domain *d)
>> +{
>> + evtchn_alloc_unbound_t alloc;
>> + int rc;
>> +
>> + alloc.dom = d->domain_id;
>> + alloc.remote_dom = xs_domid;
>> + rc = evtchn_alloc_unbound(&alloc, 0);
>> + if ( rc )
>> + {
>> + printk("Failed allocating event channel for domain\n");
>> + return rc;
>> + }
>> +
>> + d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
>> +
>> + return 0;
>> +}
>> +
>> +static void __init initialize_domU_xenstore(void)
>> +{
>> + struct domain *d;
>> +
>> + if ( xs_domid == DOMID_INVALID )
>> + return;
>> +
>> + for_each_domain( d )
>> + {
>> + uint64_t gfn = d->arch.hvm.params[HVM_PARAM_STORE_PFN];
>> + int rc;
>> +
>> + if ( gfn == 0 )
>> + continue;
>> +
>> + if ( is_xenstore_domain(d) )
>> + continue;
>> +
>> + rc = alloc_xenstore_evtchn(d);
>> + if ( rc < 0 )
>> + panic("%pd: Failed to allocate xenstore_evtchn\n", d);
>> +
>> + if ( gfn != XENSTORE_PFN_LATE_ALLOC && IS_ENABLED(CONFIG_GRANT_TABLE) )
>> + {
>> + ASSERT(gfn < UINT32_MAX);
>> + gnttab_seed_entry(d, GNTTAB_RESERVED_XENSTORE, xs_domid, gfn);
>> + }
>> + }
>> +}
>> +
>> +void __init create_domUs(void)
>> +{
>> + struct dt_device_node *node;
>> + const char *dom0less_iommu;
>> + bool iommu = false;
>> + const struct dt_device_node *cpupool_node,
>> + *chosen = dt_find_node_by_path("/chosen");
>> + const char *llc_colors_str = NULL;
>> +
>> + BUG_ON(chosen == NULL);
>> + dt_for_each_child_node(chosen, node)
>> + {
>> + struct domain *d;
>> + struct xen_domctl_createdomain d_cfg = {0};
>> + unsigned int flags = 0U;
>> + bool has_dtb = false;
>> + uint32_t val;
>> + int rc;
>> +
>> + if ( !dt_device_is_compatible(node, "xen,domain") )
>> + continue;
>> +
>> + if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
>> + panic("No more domain IDs available\n");
>> +
>> + d_cfg.max_evtchn_port = 1023;
>> + d_cfg.max_grant_frames = -1;
>> + d_cfg.max_maptrack_frames = -1;
>> + d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version);
>> +
>> + if ( dt_property_read_u32(node, "capabilities", &val) )
>> + {
>> + if ( val & ~DOMAIN_CAPS_MASK )
>> + panic("Invalid capabilities (%"PRIx32")\n", val);
>> +
>> + if ( val & DOMAIN_CAPS_CONTROL )
>> + flags |= CDF_privileged;
>> +
>> + if ( val & DOMAIN_CAPS_HARDWARE )
>> + {
>> + if ( hardware_domain )
>> + panic("Only 1 hardware domain can be specified! (%pd)\n",
>> + hardware_domain);
>> +
>> + d_cfg.max_grant_frames = gnttab_dom0_frames();
>> + d_cfg.max_evtchn_port = -1;
>> + flags |= CDF_hardware;
>> + iommu = true;
>> + }
>> +
>> + if ( val & DOMAIN_CAPS_XENSTORE )
>> + {
>> + if ( xs_domid != DOMID_INVALID )
>> + panic("Only 1 xenstore domain can be specified! (%u)\n",
>> + xs_domid);
>> +
>> + d_cfg.flags |= XEN_DOMCTL_CDF_xs_domain;
>> + d_cfg.max_evtchn_port = -1;
>> + }
>> + }
>> +
>> + if ( dt_find_property(node, "xen,static-mem", NULL) )
>> + {
>> + if ( llc_coloring_enabled )
>> + panic("LLC coloring and static memory are incompatible\n");
>> +
>> + flags |= CDF_staticmem;
>> + }
>> +
>> + if ( dt_property_read_bool(node, "direct-map") )
>> + {
>> + if ( !(flags & CDF_staticmem) )
>> + panic("direct-map is not valid for domain %s without static allocation.\n",
>> + dt_node_name(node));
>> +
>> + flags |= CDF_directmap;
>> + }
>> +
>> + if ( !dt_property_read_u32(node, "cpus", &d_cfg.max_vcpus) )
>> + panic("Missing property 'cpus' for domain %s\n",
>> + dt_node_name(node));
>> +
>> + if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) )
>> + {
>> + if ( flags & CDF_hardware )
>> + panic("Don't specify passthrough for hardware domain\n");
>> +
>> + if ( !strcmp(dom0less_iommu, "enabled") )
>> + iommu = true;
>> + }
>> +
>> + if ( (flags & CDF_hardware) && !(flags & CDF_directmap) &&
>> + !iommu_enabled )
>> + panic("non-direct mapped hardware domain requires iommu\n");
>> +
>> + if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
>> + {
>> + if ( flags & CDF_hardware )
>> + panic("\"multiboot,device-tree\" incompatible with hardware domain\n");
>> +
>> + has_dtb = true;
>> + }
>> +
>> + if ( iommu_enabled && (iommu || has_dtb) )
>> + d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>> +
>> + /* Get the optional property domain-cpupool */
>> + cpupool_node = dt_parse_phandle(node, "domain-cpupool", 0);
>> + if ( cpupool_node )
>> + {
>> + int pool_id = btcpupools_get_domain_pool_id(cpupool_node);
>> + if ( pool_id < 0 )
>> + panic("Error getting cpupool id from domain-cpupool (%d)\n",
>> + pool_id);
>> + d_cfg.cpupool_id = pool_id;
>> + }
>> +
>> + if ( dt_property_read_u32(node, "max_grant_version", &val) )
>> + d_cfg.grant_opts = XEN_DOMCTL_GRANT_version(val);
>> +
>> + if ( dt_property_read_u32(node, "max_grant_frames", &val) )
>> + {
>> + if ( val > INT32_MAX )
>> + panic("max_grant_frames (%"PRIu32") overflow\n", val);
>> + d_cfg.max_grant_frames = val;
>> + }
>> +
>> + if ( dt_property_read_u32(node, "max_maptrack_frames", &val) )
>> + {
>> + if ( val > INT32_MAX )
>> + panic("max_maptrack_frames (%"PRIu32") overflow\n", val);
>> + d_cfg.max_maptrack_frames = val;
>> + }
>> +
>> + dt_property_read_string(node, "llc-colors", &llc_colors_str);
>> + if ( !llc_coloring_enabled && llc_colors_str )
>> + panic("'llc-colors' found, but LLC coloring is disabled\n");
>> +
>> + arch_create_domUs(node, &d_cfg, flags);
>> +
>> + /*
>> + * The variable max_init_domid is initialized with zero, so here it's
>> + * very important to use the pre-increment operator to call
>> + * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
>> + */
>> + d = domain_create(++max_init_domid, &d_cfg, flags);
>> + if ( IS_ERR(d) )
>> + panic("Error creating domain %s (rc = %ld)\n",
>> + dt_node_name(node), PTR_ERR(d));
>> +
>> + if ( llc_coloring_enabled &&
>> + (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
>> + panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
>> + dt_node_name(node), rc);
>> +
>> + d->is_console = true;
>> + dt_device_set_used_by(node, d->domain_id);
>> +
>> + rc = construct_domU(d, node);
>> + if ( rc )
>> + panic("Could not set up domain %s (rc = %d)\n",
>> + dt_node_name(node), rc);
>> +
>> + if ( d_cfg.flags & XEN_DOMCTL_CDF_xs_domain )
>> + set_xs_domain(d);
>> + }
>> +
>> + if ( need_xenstore && xs_domid == DOMID_INVALID )
>> + panic("xenstore requested, but xenstore domain not present\n");
>> +
>> + initialize_domU_xenstore();
>> +}
>> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
>> new file mode 100644
>> index 0000000000..5655571a66
>> --- /dev/null
>> +++ b/xen/include/asm-generic/dom0less-build.h
>> @@ -0,0 +1,49 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#ifndef __ASM_GENERIC_DOM0LESS_BUILD_H__
>> +#define __ASM_GENERIC_DOM0LESS_BUILD_H__
>> +
>> +#include <xen/stdbool.h>
>> +
>> +#ifdef CONFIG_DOM0LESS_BOOT
>> +
>> +#include <public/domctl.h>
>> +
>> +struct domain;
> This declaration needs to be out of the #ifdef CONFIG_DOM0LESS_BOOT
> because...
>
>
>> +struct dt_device_node;
>> +
>> +/* TODO: remove both when construct_domU() will be moved to common. */
>> +#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
>> +extern bool need_xenstore;
>> +
>> +void create_domUs(void);
>> +bool is_dom0less_mode(void);
>> +void set_xs_domain(struct domain *d);
>> +
>> +int construct_domU(struct domain *d, const struct dt_device_node *node);
>> +
>> +void arch_create_domUs(struct dt_device_node *node,
>> + struct xen_domctl_createdomain *d_cfg,
>> + unsigned int flags);
>> +
>> +#else /* !CONFIG_DOM0LESS_BOOT */
>> +
>> +static inline void create_domUs(void) {}
>> +static inline bool is_dom0less_mode(void)
>> +{
>> + return false;
>> +}
>> +static inline void set_xs_domain(struct domain *d) {}
> ... of this usage of struct domain *d.
>
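A minimal sketch of the layout suggested above, written as a hypothetical stripped-down header rather than the actual patch. Only two of the functions are kept, <stdbool.h> stands in for <xen/stdbool.h>, and the (void)d cast is an addition to keep the stub warning-free. The point it illustrates is that the forward declaration of struct domain sits above the #ifdef, so the stub in the #else branch can name the type as well:

```c
/* Sketch only: hypothetical, reduced stand-in for dom0less-build.h. */
#include <stdbool.h>

/* Forward declaration visible to both branches of the #ifdef below. */
struct domain;

#ifdef CONFIG_DOM0LESS_BOOT

bool is_dom0less_mode(void);
void set_xs_domain(struct domain *d);

#else /* !CONFIG_DOM0LESS_BOOT */

static inline bool is_dom0less_mode(void)
{
    return false;
}
/* The stub still takes a struct domain *, hence the declaration above. */
static inline void set_xs_domain(struct domain *d) { (void)d; }

#endif /* CONFIG_DOM0LESS_BOOT */
```

Built without CONFIG_DOM0LESS_BOOT defined, the stubs are the ones that get compiled.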
>
>> +#endif /* CONFIG_DOM0LESS_BOOT */
>> +
>> +#endif /* __ASM_GENERIC_DOM0LESS_BUILD_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> --
>> 2.49.0
>>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-05 7:35 ` Oleksii Kurochko
@ 2025-05-05 8:35 ` Orzel, Michal
0 siblings, 0 replies; 30+ messages in thread
From: Orzel, Michal @ 2025-05-05 8:35 UTC (permalink / raw)
To: Oleksii Kurochko, Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Volodymyr Babchuk,
Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné
On 05/05/2025 09:35, Oleksii Kurochko wrote:
>
> On 5/2/25 7:55 PM, Stefano Stabellini wrote:
>> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>>> Move some parts of Arm's Dom0Less code to be reused by other architectures.
>>> At the moment, RISC-V is going to reuse these parts.
>>>
>>> Move dom0less-build.h from the Arm-specific directory to asm-generic
>>> as this header is expected to be the same across architectures, with
>>> some updates: add declarations of construct_domU() and
>>> arch_create_domUs(), as there are some parts which are still
>>> architecture-specific.
>>>
>>> Introduce HAS_DOM0LESS to provide ability to enable generic Dom0less
>>> code for an architecture.
>>>
>>> Relocate the CONFIG_DOM0LESS configuration to common code, adding
>>> "depends on HAS_DOM0LESS" so as not to break builds for architectures which
>>> don't support the CONFIG_DOM0LESS config. In particular, this avoids having
>>> to provide stubs for construct_domU() and arch_create_domUs()
>>> in case of *-randconfig, which may set CONFIG_DOM0LESS=y.
>>>
>>> Move is_dom0less_mode() function to the common code, as it depends on
>>> boot modules that are already part of the common code.
>>>
>>> Move create_domUs() function to the common code with some updates:
>>> - Add arch_create_domUs() to cover parsing of arch-specific features,
>>> for example, SVE (Scalable Vector Extension) exists only in Arm.
>>>
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>> ---
>>> It was suggested to change dom0less to predefined domus or similar
>>> (https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
>>>
>>> I decided to go with the dom0less name for now, as renaming would require changes in a lot of places,
>>> including CI tests, and IMO we could do that in a separate patch.
>>> If it is necessary to do it now, I'll be happy to do that in the next version of the current
>>> patch series.
>> I think it is fine to use dom0less for now, it will make the code easier
>> to review and it is not necessary to change the name at this point.
>>
>> The patch looks good to me, except for a couple of minor suggestions I
>> have below. I could make them on commit. With those:
>>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>
> Thanks.
>
> I will apply the suggestions below (unless they have already been committed by the time I start preparing the new version of the patch series).
NIT: please trim down your replies (unless you want to show the bigger context,
which is not necessary here)
I only skimmed through the patch and noticed you did not add the EMACS comment
block in dom0less-build.c. Please do.
~Michal
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-02 17:55 ` Stefano Stabellini
2025-05-05 7:35 ` Oleksii Kurochko
@ 2025-05-05 13:19 ` Oleksii Kurochko
2025-05-05 17:19 ` Stefano Stabellini
1 sibling, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 13:19 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 7:55 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Move some parts of Arm's Dom0Less code to be reused by other architectures.
>> At the moment, RISC-V is going to reuse these parts.
>>
>> Move dom0less-build.h from the Arm-specific directory to asm-generic
>> as this header is expected to be the same across architectures, with
>> some updates: add declarations of construct_domU() and
>> arch_create_domUs(), as there are some parts which are still
>> architecture-specific.
>>
>> Introduce HAS_DOM0LESS to provide ability to enable generic Dom0less
>> code for an architecture.
>>
>> Relocate the CONFIG_DOM0LESS configuration to common code, adding
>> "depends on HAS_DOM0LESS" so as not to break builds for architectures which
>> don't support the CONFIG_DOM0LESS config. In particular, this avoids having
>> to provide stubs for construct_domU() and arch_create_domUs()
>> in case of *-randconfig, which may set CONFIG_DOM0LESS=y.
>>
>> Move is_dom0less_mode() function to the common code, as it depends on
>> boot modules that are already part of the common code.
>>
>> Move create_domUs() function to the common code with some updates:
>> - Add arch_create_domUs() to cover parsing of arch-specific features,
>> for example, SVE (Scalable Vector Extension) exists only in Arm.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> It was suggested to change dom0less to predefined domus or similar
>> (https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
>>
>> I decided to go with the dom0less name for now, as renaming would require changes in a lot of places,
>> including CI tests, and IMO we could do that in a separate patch.
>> If it is necessary to do it now, I'll be happy to do that in the next version of the current
>> patch series.
> I think it is fine to use dom0less for now, it will make the code easier
> to review and it is not necessary to change the name at this point.
>
> The patch looks good to me, except for a couple of minor suggestions I
> have below. I could make them on commit. With those:
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
During the randconfig testing the following issue occurs:
common/device-tree/dom0less-build.c: In function 'create_domUs':
common/device-tree/dom0less-build.c:156:42: error: implicit declaration of function 'gnttab_dom0_frames'; did you mean 'gnttab_map_frame'? [-Werror=implicit-function-declaration]
156 | d_cfg.max_grant_frames = gnttab_dom0_frames();
| ^~~~~~~~~~~~~~~~~~
| gnttab_map_frame
common/device-tree/dom0less-build.c:156:42: error: nested extern declaration of 'gnttab_dom0_frames' [-Werror=nested-externs]
I fixed that by the following #ifdef-ing:
...
d_cfg.max_grant_frames = -1;
...
if ( dt_property_read_u32(node, "capabilities", &val) )
{
...
#ifdef CONFIG_GRANT_TABLE
d_cfg.max_grant_frames = gnttab_dom0_frames();
#endif
d_cfg.max_evtchn_port = -1;
flags |= CDF_hardware;
iommu = true;
}
Do you agree with such a fix?
If CONFIG_GRANT_TABLE=n, the initial value (d_cfg.max_grant_frames = -1;) will be used.
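The fallback behaviour can be illustrated with a minimal standalone sketch. It is not the Xen code: struct demo_domain_cfg, demo_gnttab_dom0_frames() and demo_hwdom_cfg() are hypothetical stand-ins, and the value 64 is made up. The only point is that the guarded assignment disappears when CONFIG_GRANT_TABLE is not defined, so the initial -1 survives.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of the domain config. */
struct demo_domain_cfg {
    int32_t max_grant_frames;
    int32_t max_evtchn_port;
};

/* Uncomment to model a build with CONFIG_GRANT_TABLE=y. */
/* #define CONFIG_GRANT_TABLE */

#ifdef CONFIG_GRANT_TABLE
/* Hypothetical replacement for gnttab_dom0_frames(); the value is made up. */
static int32_t demo_gnttab_dom0_frames(void)
{
    return 64;
}
#endif

static struct demo_domain_cfg demo_hwdom_cfg(void)
{
    /* Initial value; kept as-is when grant tables are compiled out. */
    struct demo_domain_cfg cfg = {
        .max_grant_frames = -1,
        .max_evtchn_port = 0,
    };

#ifdef CONFIG_GRANT_TABLE
    /* Only compiled (and only needs a declaration) with grant tables. */
    cfg.max_grant_frames = demo_gnttab_dom0_frames();
#endif
    cfg.max_evtchn_port = -1;

    return cfg;
}
```

With CONFIG_GRANT_TABLE left undefined, demo_hwdom_cfg() returns max_grant_frames == -1 and no reference to the grant-table helper is emitted, which is what avoids the implicit-declaration error.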
~ Oleksii
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common
2025-05-05 13:19 ` Oleksii Kurochko
@ 2025-05-05 17:19 ` Stefano Stabellini
0 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-05 17:19 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Stefano Stabellini, xen-devel, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Mon, 5 May 2025, Oleksii Kurochko wrote:
> On 5/2/25 7:55 PM, Stefano Stabellini wrote:
>
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Move some parts of Arm's Dom0Less code to be reused by other architectures.
> At the moment, RISC-V is going to reuse these parts.
>
> Move dom0less-build.h from the Arm-specific directory to asm-generic
> as this header is expected to be the same across architectures, with
> some updates: add declarations of construct_domU() and
> arch_create_domUs(), as some parts are still
> architecture-specific.
>
> Introduce HAS_DOM0LESS to provide ability to enable generic Dom0less
> code for an architecture.
>
> Relocate the CONFIG_DOM0LESS configuration to common code, adding
> "depends on HAS_DOM0LESS" to not break builds for architectures which
> don't support CONFIG_DOM0LESS. This is especially useful
> to avoid providing stubs for construct_domU() and arch_create_domUs()
> in case of *-randconfig, which may set CONFIG_DOM0LESS=y.
>
> Move is_dom0less_mode() function to the common code, as it depends on
> boot modules that are already part of the common code.
>
> Move create_domUs() function to the common code with some updates:
> - Add arch_create_domUs() to cover parsing of arch-specific features,
> for example, SVE (Scalable Vector Extension), which exists only on Arm.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> It was suggested to change dom0less to predefined domus or similar
> (https://lore.kernel.org/xen-devel/cd2a3644-c9c6-4e77-9491-2988703906c0@gmail.com/T/#m1d5e81e5f1faca98a3c51efe4f35af25010edbf0):
>
> I decided to go with the dom0less name for now, as renaming would require changes in a lot of places,
> including CI tests, and IMO we could do that in a separate patch.
> If it is necessary to do it now, I'll be happy to do that in the next version of the current
> patch series.
> I think it is fine to use dom0less for now, it will make the code easier
> to review and it is not necessary to change the name at this point.
>
> The patch looks good to me, except for a couple of minor suggestions I
> have below. I could make them on commit. With those:
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>
> During the randconfig testing the following issue occurs:
>
> common/device-tree/dom0less-build.c: In function 'create_domUs':
> common/device-tree/dom0less-build.c:156:42: error: implicit declaration of function 'gnttab_dom0_frames'; did you mean 'gnttab_map_frame'?
> [-Werror=implicit-function-declaration]
> 156 | d_cfg.max_grant_frames = gnttab_dom0_frames();
> | ^~~~~~~~~~~~~~~~~~
> | gnttab_map_frame
> common/device-tree/dom0less-build.c:156:42: error: nested extern declaration of 'gnttab_dom0_frames' [-Werror=nested-externs]
>
> I fixed that by the following #ifdef-ing:
> ...
> d_cfg.max_grant_frames = -1;
> ...
>
> if ( dt_property_read_u32(node, "capabilities", &val) )
> {
> ...
>
> #ifdef CONFIG_GRANT_TABLE
> d_cfg.max_grant_frames = gnttab_dom0_frames();
> #endif
> d_cfg.max_evtchn_port = -1;
> flags |= CDF_hardware;
> iommu = true;
> }
>
> Do you agree with such fix?
>
> If the CONFIG_GRANT_TABLE=n then the init value (d_cfg.max_grant_frames = -1;) will be used.
Yes, OK
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
2025-05-02 16:22 ` [PATCH v3 1/8] xen/arm: drop declaration of handle_device_interrupts() Oleksii Kurochko
2025-05-02 16:22 ` [PATCH v3 2/8] xen/common: dom0less: make some parts of Arm's CONFIG_DOM0LESS common Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 18:13 ` Stefano Stabellini
2025-05-05 9:08 ` Orzel, Michal
2025-05-02 16:22 ` [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h Oleksii Kurochko
` (4 subsequent siblings)
7 siblings, 2 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Move the following parts to common with the following changes:
- struct kernel_info:
- Create arch_kernel_info for arch specific kernel information.
At the moment, it contains domain_type for Arm.
- Rename vpl011 to vuart to have more generic name suitable for other archs.
- s/phandle_gic/phandle_intc to have more generic name suitable for other
archs.
- Make text_offset of zimage structure available for RISCV_64.
- Wrap the definition of KERNEL_INFO_SHM_MEM_INIT in `#ifndef KERNEL_INFO_SHM_MEM_INIT`
and the definition of KERNEL_INFO_INIT in `#ifndef KERNEL_INFO_INIT`, so that an
arch can override either macro in case it doesn't want to use the generic one.
- Move DOM0LESS_* macros to dom0less-build.h.
- Move all other parts of Arm's kernel.h to xen/fdt-kernel.h.
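The override mechanism described in the list above can be sketched as follows. This is only an illustration under simplified assumptions: struct demo_banks, struct demo_kernel_info and the bank counts (4, 32, 128) are hypothetical, and the "arch header" is modelled by a plain #define placed before the generic definitions.

```c
/* Hypothetical, simplified stand-ins for Xen's types; not the real layout. */
struct demo_banks {
    unsigned int max_banks;
};

struct demo_kernel_info {
    struct demo_banks mem;
    struct demo_banks shm_mem;
};

/* Models an arch header that does not want the generic initializer. */
#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.max_banks = 4,

/* Generic part: provide a default only if the arch did not define one. */
#ifndef KERNEL_INFO_SHM_MEM_INIT
#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.max_banks = 32,
#endif

#ifndef KERNEL_INFO_INIT
#define KERNEL_INFO_INIT        \
{                               \
    .mem.max_banks = 128,       \
    KERNEL_INFO_SHM_MEM_INIT    \
}
#endif

/* Builds a kernel_info-like object from whichever initializer won. */
static struct demo_kernel_info demo_make_kinfo(void)
{
    struct demo_kernel_info kinfo = KERNEL_INFO_INIT;

    return kinfo;
}
```

Because the arch-side definition comes first, the generic `#ifndef` branch is skipped and demo_make_kinfo() picks up the arch value for shm_mem while still using the generic KERNEL_INFO_INIT wrapper.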
Because of the changes in struct kernel_info, the corresponding parts of Arm's
code are updated.
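As a rough sketch of what this update means for callers: the arch-specific field moves into a nested arch member, so accesses change from kinfo->type to kinfo->arch.type. The types and helpers below (demo_kernel_info, demo_arch_kernel_info, demo_probe(), demo_is_64bit()) are hypothetical stand-ins, not the real Xen definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for Arm's enum domain_type. */
enum demo_domain_type { DEMO_DOMAIN_32BIT, DEMO_DOMAIN_64BIT };

/* What an arch's asm/kernel.h would provide after the split. */
struct demo_arch_kernel_info {
    enum demo_domain_type type;
};

/* Common part, shared by all architectures. */
struct demo_kernel_info {
    bool vuart;                         /* was vpl011 */
    unsigned int phandle_intc;          /* was phandle_gic */
    struct demo_arch_kernel_info arch;  /* arch-specific state */
};

/* Models a probe step recording the detected image type. */
static struct demo_kernel_info demo_probe(bool is_64bit)
{
    struct demo_kernel_info kinfo = { 0 };

    kinfo.arch.type = is_64bit ? DEMO_DOMAIN_64BIT : DEMO_DOMAIN_32BIT;

    return kinfo;
}

/* Callers now reach the type through the nested arch member. */
static bool demo_is_64bit(const struct demo_kernel_info *kinfo)
{
    return kinfo->arch.type == DEMO_DOMAIN_64BIT;
}
```

Keeping the arch state in one nested struct means common code never names an Arm-only field directly; only arch code touches kinfo->arch.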
As part of this patch the following clean up happens:
- Drop asm/setup.h from asm/kernel.h as nothing depends on it.
Add inclusion of asm/setup.h for a code which uses device_tree_get_reg() to
avoid compilation issues for CONFIG_STATIC_MEMORY and CONFIG_STATIC_SHM.
- Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
- Only resolving of merge conflicts.
---
Changes in v2:
- Introduce xen/fdt-kernel.h.
- Move DOM0LESS_* macros to dom0less-build.h.
- Move the rest in asm-generic/kernel.h to xen/fdt-kernel.h.
- Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
- Wrap by #if __has_include(....) the member of kernel_info structure:
struct arch_kernel_info arch.
- Update the commit message.
---
xen/arch/arm/acpi/domain_build.c | 2 +-
xen/arch/arm/dom0less-build.c | 31 +++---
xen/arch/arm/domain_build.c | 12 +-
xen/arch/arm/include/asm/domain_build.h | 2 +-
xen/arch/arm/include/asm/kernel.h | 126 +--------------------
xen/arch/arm/include/asm/static-memory.h | 2 +-
xen/arch/arm/include/asm/static-shmem.h | 2 +-
xen/arch/arm/kernel.c | 12 +-
xen/arch/arm/static-memory.c | 1 +
xen/arch/arm/static-shmem.c | 1 +
xen/common/device-tree/dt-overlay.c | 2 +-
xen/include/asm-generic/dom0less-build.h | 28 +++++
xen/include/xen/fdt-kernel.h | 133 +++++++++++++++++++++++
13 files changed, 199 insertions(+), 155 deletions(-)
create mode 100644 xen/include/xen/fdt-kernel.h
diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
index 2ce75543d0..f9ca8b47e5 100644
--- a/xen/arch/arm/acpi/domain_build.c
+++ b/xen/arch/arm/acpi/domain_build.c
@@ -10,6 +10,7 @@
*/
#include <xen/compile.h>
+#include <xen/fdt-kernel.h>
#include <xen/mm.h>
#include <xen/sched.h>
#include <xen/acpi.h>
@@ -18,7 +19,6 @@
#include <xen/device_tree.h>
#include <xen/libfdt/libfdt.h>
#include <acpi/actables.h>
-#include <asm/kernel.h>
#include <asm/domain_build.h>
/* Override macros from asm/page.h to make them work with mfn_t */
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index ef49495d4f..c0634dd61e 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/device_tree.h>
#include <xen/domain_page.h>
+#include <xen/fdt-kernel.h>
#include <xen/err.h>
#include <xen/event.h>
#include <xen/grant_table.h>
@@ -64,11 +65,11 @@ static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
if (res)
return res;
- res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
+ res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
if (res)
return res;
- res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
+ res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
if (res)
return res;
@@ -135,11 +136,11 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
if (res)
return res;
- res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
+ res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
if (res)
return res;
- res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
+ res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
if (res)
return res;
@@ -200,7 +201,7 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
return res;
res = fdt_property_cell(fdt, "interrupt-parent",
- kinfo->phandle_gic);
+ kinfo->phandle_intc);
if ( res )
return res;
@@ -486,10 +487,10 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d,
*/
if ( dt_node_cmp(name, "gic") == 0 )
{
- uint32_t phandle_gic = fdt_get_phandle(pfdt, node_next);
+ uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
- if ( phandle_gic != 0 )
- kinfo->phandle_gic = phandle_gic;
+ if ( phandle_intc != 0 )
+ kinfo->phandle_intc = phandle_intc;
continue;
}
@@ -532,7 +533,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
int addrcells, sizecells;
int ret, fdt_size = DOMU_DTB_SIZE;
- kinfo->phandle_gic = GUEST_PHANDLE_GIC;
+ kinfo->phandle_intc = GUEST_PHANDLE_GIC;
kinfo->gnttab_start = GUEST_GNTTAB_BASE;
kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
@@ -594,7 +595,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
/*
* domain_handle_dtb_bootmodule has to be called before the rest of
* the device tree is generated because it depends on the value of
- * the field phandle_gic.
+ * the field phandle_intc.
*/
if ( kinfo->dtb_bootmodule )
{
@@ -611,7 +612,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
if ( ret )
goto err;
- if ( kinfo->vpl011 )
+ if ( kinfo->vuart )
{
ret = -EINVAL;
#ifdef CONFIG_SBSA_VUART_CONSOLE
@@ -839,8 +840,8 @@ int __init construct_domU(struct domain *d,
printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
d->max_vcpus, mem);
- kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
- if ( kinfo.vpl011 && is_hardware_domain(d) )
+ kinfo.vuart = dt_property_read_bool(node, "vpl011");
+ if ( kinfo.vuart && is_hardware_domain(d) )
panic("hardware domain cannot specify vpl011\n");
rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
@@ -872,7 +873,7 @@ int __init construct_domU(struct domain *d,
#ifdef CONFIG_ARM_64
/* type must be set before allocate memory */
- d->arch.type = kinfo.type;
+ d->arch.type = kinfo.arch.type;
#endif
if ( is_hardware_domain(d) )
{
@@ -898,7 +899,7 @@ int __init construct_domU(struct domain *d,
* tree node in prepare_dtb_domU, so initialization on related variables
* shall be done first.
*/
- if ( kinfo.vpl011 )
+ if ( kinfo.vuart )
{
rc = domain_vpl011_init(d, NULL);
if ( rc < 0 )
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 270a6b97e4..8c7a054718 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/init.h>
#include <xen/compile.h>
+#include <xen/fdt-kernel.h>
#include <xen/lib.h>
#include <xen/llc-coloring.h>
#include <xen/mm.h>
@@ -20,7 +21,6 @@
#include <xen/vmap.h>
#include <xen/warning.h>
#include <asm/device.h>
-#include <asm/kernel.h>
#include <asm/setup.h>
#include <asm/tee/tee.h>
#include <asm/pci.h>
@@ -747,7 +747,7 @@ static int __init fdt_property_interrupts(const struct kernel_info *kinfo,
return res;
res = fdt_property_cell(kinfo->fdt, "interrupt-parent",
- kinfo->phandle_gic);
+ kinfo->phandle_intc);
return res;
}
@@ -2026,7 +2026,7 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
ASSERT(dt_host && (dt_host->sibling == NULL));
- kinfo->phandle_gic = dt_interrupt_controller->phandle;
+ kinfo->phandle_intc = dt_interrupt_controller->phandle;
fdt = device_tree_flattened;
new_size = fdt_totalsize(fdt) + DOM0_FDT_EXTRA_SIZE;
@@ -2196,13 +2196,13 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
#ifdef CONFIG_ARM_64
/* if aarch32 mode is not supported at EL1 do not allow 32-bit domain */
- if ( !(cpu_has_el1_32) && kinfo->type == DOMAIN_32BIT )
+ if ( !(cpu_has_el1_32) && kinfo->arch.type == DOMAIN_32BIT )
{
printk("Platform does not support 32-bit domain\n");
return -EINVAL;
}
- if ( is_sve_domain(d) && (kinfo->type == DOMAIN_32BIT) )
+ if ( is_sve_domain(d) && (kinfo->arch.type == DOMAIN_32BIT) )
{
printk("SVE is not available for 32-bit domain\n");
return -EINVAL;
@@ -2318,7 +2318,7 @@ int __init construct_hwdom(struct kernel_info *kinfo,
#ifdef CONFIG_ARM_64
/* type must be set before allocate_memory */
- d->arch.type = kinfo->type;
+ d->arch.type = kinfo->arch.type;
#endif
find_gnttab_region(d, kinfo);
if ( is_domain_direct_mapped(d) )
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index 378c10cc98..df1c0fe301 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -1,8 +1,8 @@
#ifndef __ASM_DOMAIN_BUILD_H__
#define __ASM_DOMAIN_BUILD_H__
+#include <xen/fdt-kernel.h>
#include <xen/sched.h>
-#include <asm/kernel.h>
typedef __be32 gic_interrupt_t[3];
typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
index bdc96f4c18..cfeab792c7 100644
--- a/xen/arch/arm/include/asm/kernel.h
+++ b/xen/arch/arm/include/asm/kernel.h
@@ -6,137 +6,15 @@
#ifndef __ARCH_ARM_KERNEL_H__
#define __ARCH_ARM_KERNEL_H__
-#include <xen/device_tree.h>
#include <asm/domain.h>
-#include <asm/setup.h>
-/*
- * List of possible features for dom0less domUs
- *
- * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
- * default features (excluding Xenstore) will be
- * available. Note that an OS *must* not rely on the
- * availability of Xen features if this is not set.
- * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
- * xenstore page allocation is done by Xen at
- * domain creation. This feature can't be
- * enabled without the DOM0LESS_ENHANCED_NO_XS.
- * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
- * xenstore page allocation will happen in
- * init-dom0less. This feature can't be enabled
- * without the DOM0LESS_ENHANCED_NO_XS.
- * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
- * default features (including Xenstore) will be
- * available. Note that an OS *must* not rely on the
- * availability of Xen features if this is not set.
- * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
- */
-#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
-#define DOM0LESS_XENSTORE BIT(1, U)
-#define DOM0LESS_XS_LEGACY BIT(2, U)
-#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
-#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
-
-struct kernel_info {
+struct arch_kernel_info
+{
#ifdef CONFIG_ARM_64
enum domain_type type;
#endif
-
- struct domain *d;
-
- void *fdt; /* flat device tree */
- paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
- struct meminfo mem;
-#ifdef CONFIG_STATIC_SHM
- struct shared_meminfo shm_mem;
-#endif
-
- /* kernel entry point */
- paddr_t entry;
-
- /* grant table region */
- paddr_t gnttab_start;
- paddr_t gnttab_size;
-
- /* boot blob load addresses */
- const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
- const char* cmdline;
- paddr_t dtb_paddr;
- paddr_t initrd_paddr;
-
- /* Enable pl011 emulation */
- bool vpl011;
-
- /* Enable/Disable PV drivers interfaces */
- uint16_t dom0less_feature;
-
- /* GIC phandle */
- uint32_t phandle_gic;
-
- /* loader to use for this kernel */
- void (*load)(struct kernel_info *info);
- /* loader specific state */
- union {
- struct {
- paddr_t kernel_addr;
- paddr_t len;
-#ifdef CONFIG_ARM_64
- paddr_t text_offset; /* 64-bit Image only */
-#endif
- paddr_t start; /* Must be 0 for 64-bit Image */
- } zimage;
- };
};
-static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
-{
- return container_of(&kinfo->mem.common, struct membanks, common);
-}
-
-static inline const struct membanks *
-kernel_info_get_mem_const(const struct kernel_info *kinfo)
-{
- return container_of(&kinfo->mem.common, const struct membanks, common);
-}
-
-#ifdef CONFIG_STATIC_SHM
-#define KERNEL_INFO_SHM_MEM_INIT \
- .shm_mem.common.max_banks = NR_SHMEM_BANKS, \
- .shm_mem.common.type = STATIC_SHARED_MEMORY,
-#else
-#define KERNEL_INFO_SHM_MEM_INIT
-#endif
-
-#define KERNEL_INFO_INIT \
-{ \
- .mem.common.max_banks = NR_MEM_BANKS, \
- .mem.common.type = MEMORY, \
- KERNEL_INFO_SHM_MEM_INIT \
-}
-
-/*
- * Probe the kernel to detemine its type and select a loader.
- *
- * Sets in info:
- * ->type
- * ->load hook, and sets loader specific variables ->zimage
- */
-int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
-
-/*
- * Loads the kernel into guest RAM.
- *
- * Expects to be set in info when called:
- * ->mem
- * ->fdt
- *
- * Sets in info:
- * ->entry
- * ->dtb_paddr
- * ->initrd_paddr
- */
-void kernel_load(struct kernel_info *info);
-
#endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
/*
diff --git a/xen/arch/arm/include/asm/static-memory.h b/xen/arch/arm/include/asm/static-memory.h
index 804166e541..a32a3c6553 100644
--- a/xen/arch/arm/include/asm/static-memory.h
+++ b/xen/arch/arm/include/asm/static-memory.h
@@ -3,8 +3,8 @@
#ifndef __ASM_STATIC_MEMORY_H_
#define __ASM_STATIC_MEMORY_H_
+#include <xen/fdt-kernel.h>
#include <xen/pfn.h>
-#include <asm/kernel.h>
#ifdef CONFIG_STATIC_MEMORY
diff --git a/xen/arch/arm/include/asm/static-shmem.h b/xen/arch/arm/include/asm/static-shmem.h
index 94eaa9d500..a4f853805a 100644
--- a/xen/arch/arm/include/asm/static-shmem.h
+++ b/xen/arch/arm/include/asm/static-shmem.h
@@ -3,8 +3,8 @@
#ifndef __ASM_STATIC_SHMEM_H_
#define __ASM_STATIC_SHMEM_H_
+#include <xen/fdt-kernel.h>
#include <xen/types.h>
-#include <asm/kernel.h>
#include <asm/setup.h>
#ifdef CONFIG_STATIC_SHM
diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
index 2647812e8e..f00fc388db 100644
--- a/xen/arch/arm/kernel.c
+++ b/xen/arch/arm/kernel.c
@@ -7,6 +7,7 @@
#include <xen/byteorder.h>
#include <xen/domain_page.h>
#include <xen/errno.h>
+#include <xen/fdt-kernel.h>
#include <xen/guest_access.h>
#include <xen/gunzip.h>
#include <xen/init.h>
@@ -16,6 +17,7 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/domain_build.h>
#include <asm/kernel.h>
#include <asm/setup.h>
@@ -101,7 +103,7 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
paddr_t load_addr;
#ifdef CONFIG_ARM_64
- if ( (info->type == DOMAIN_64BIT) && (info->zimage.start == 0) )
+ if ( (info->arch.type == DOMAIN_64BIT) && (info->zimage.start == 0) )
return mem->bank[0].start + info->zimage.text_offset;
#endif
@@ -371,10 +373,10 @@ static int __init kernel_uimage_probe(struct kernel_info *info,
switch ( uimage.arch )
{
case IH_ARCH_ARM:
- info->type = DOMAIN_32BIT;
+ info->arch.type = DOMAIN_32BIT;
break;
case IH_ARCH_ARM64:
- info->type = DOMAIN_64BIT;
+ info->arch.type = DOMAIN_64BIT;
break;
default:
printk(XENLOG_ERR "Unsupported uImage arch type %d\n", uimage.arch);
@@ -444,7 +446,7 @@ static int __init kernel_zimage64_probe(struct kernel_info *info,
info->load = kernel_zimage_load;
- info->type = DOMAIN_64BIT;
+ info->arch.type = DOMAIN_64BIT;
return 0;
}
@@ -496,7 +498,7 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
info->load = kernel_zimage_load;
#ifdef CONFIG_ARM_64
- info->type = DOMAIN_32BIT;
+ info->arch.type = DOMAIN_32BIT;
#endif
return 0;
diff --git a/xen/arch/arm/static-memory.c b/xen/arch/arm/static-memory.c
index d4585c5a06..e0f76afcd8 100644
--- a/xen/arch/arm/static-memory.c
+++ b/xen/arch/arm/static-memory.c
@@ -2,6 +2,7 @@
#include <xen/sched.h>
+#include <asm/setup.h>
#include <asm/static-memory.h>
static bool __init append_static_memory_to_bank(struct domain *d,
diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index e8d4ca3ba3..14ae48fb1e 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -6,6 +6,7 @@
#include <xen/sched.h>
#include <asm/domain_build.h>
+#include <asm/setup.h>
#include <asm/static-memory.h>
#include <asm/static-shmem.h>
diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
index 97fb99eaaa..81107cb48d 100644
--- a/xen/common/device-tree/dt-overlay.c
+++ b/xen/common/device-tree/dt-overlay.c
@@ -6,8 +6,8 @@
* Written by Vikram Garhwal <vikram.garhwal@amd.com>
*
*/
-#include <asm/domain_build.h>
#include <xen/dt-overlay.h>
+#include <xen/fdt-kernel.h>
#include <xen/guest_access.h>
#include <xen/iocap.h>
#include <xen/libfdt/libfdt.h>
diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
index 5655571a66..f095135caa 100644
--- a/xen/include/asm-generic/dom0less-build.h
+++ b/xen/include/asm-generic/dom0less-build.h
@@ -16,6 +16,34 @@ struct dt_device_node;
#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
extern bool need_xenstore;
+/*
+ * List of possible features for dom0less domUs
+ *
+ * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
+ * default features (excluding Xenstore) will be
+ * available. Note that an OS *must* not rely on the
+ * availability of Xen features if this is not set.
+ * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
+ * xenstore page allocation is done by Xen at
+ * domain creation. This feature can't be
+ * enabled without the DOM0LESS_ENHANCED_NO_XS.
+ * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
+ * xenstore page allocation will happen in
+ * init-dom0less. This feature can't be enabled
+ * without the DOM0LESS_ENHANCED_NO_XS.
+ * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
+ * default features (including Xenstore) will be
+ * available. Note that an OS *must* not rely on the
+ * availability of Xen features if this is not set.
+ * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
+
+ */
+#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
+#define DOM0LESS_XENSTORE BIT(1, U)
+#define DOM0LESS_XS_LEGACY BIT(2, U)
+#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
+#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
+
void create_domUs(void);
bool is_dom0less_mode(void);
void set_xs_domain(struct domain *d);
diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
new file mode 100644
index 0000000000..c81e759423
--- /dev/null
+++ b/xen/include/xen/fdt-kernel.h
@@ -0,0 +1,133 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * For Kernel image loading.
+ *
+ * Copyright (C) 2011 Citrix Systems, Inc.
+ */
+#ifndef __XEN_FDT_KERNEL_H__
+#define __XEN_FDT_KERNEL_H__
+
+#include <xen/bootfdt.h>
+#include <xen/device_tree.h>
+#include <xen/types.h>
+
+#if __has_include(<asm/kernel.h>)
+# include <asm/kernel.h>
+#endif
+
+struct kernel_info {
+ struct domain *d;
+
+ void *fdt; /* flat device tree */
+ paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
+ struct meminfo mem;
+#ifdef CONFIG_STATIC_SHM
+ struct shared_meminfo shm_mem;
+#endif
+
+ /* kernel entry point */
+ paddr_t entry;
+
+ /* grant table region */
+ paddr_t gnttab_start;
+ paddr_t gnttab_size;
+
+ /* boot blob load addresses */
+ const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
+ const char* cmdline;
+ paddr_t dtb_paddr;
+ paddr_t initrd_paddr;
+
+ /* Enable uart emulation */
+ bool vuart;
+
+ /* Enable/Disable PV drivers interfaces */
+ uint16_t dom0less_feature;
+
+ /* Interrupt controller phandle */
+ uint32_t phandle_intc;
+
+ /* loader to use for this kernel */
+ void (*load)(struct kernel_info *info);
+
+ /* loader specific state */
+ union {
+ struct {
+ paddr_t kernel_addr;
+ paddr_t len;
+#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
+ paddr_t text_offset; /* 64-bit Image only */
+#endif
+ paddr_t start; /* Must be 0 for 64-bit Image */
+ } zimage;
+ };
+
+#if __has_include(<asm/kernel.h>)
+ struct arch_kernel_info arch;
+#endif
+};
+
+static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
+{
+ return container_of(&kinfo->mem.common, struct membanks, common);
+}
+
+static inline const struct membanks *
+kernel_info_get_mem_const(const struct kernel_info *kinfo)
+{
+ return container_of(&kinfo->mem.common, const struct membanks, common);
+}
+
+#ifndef KERNEL_INFO_SHM_MEM_INIT
+
+#ifdef CONFIG_STATIC_SHM
+#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
+#else
+#define KERNEL_INFO_SHM_MEM_INIT
+#endif
+
+#endif /* KERNEL_INFO_SHM_MEM_INIT */
+
+#ifndef KERNEL_INFO_INIT
+
+#define KERNEL_INFO_INIT \
+{ \
+ .mem.common.max_banks = NR_MEM_BANKS, \
+ KERNEL_INFO_SHM_MEM_INIT \
+}
+
+#endif /* KERNEL_INFO_INIT */
+
+/*
+ * Probe the kernel to determine its type and select a loader.
+ *
+ * Sets in info:
+ * ->type
+ * ->load hook, and sets loader specific variables ->zimage
+ */
+int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
+
+/*
+ * Loads the kernel into guest RAM.
+ *
+ * Expects to be set in info when called:
+ * ->mem
+ * ->fdt
+ *
+ * Sets in info:
+ * ->entry
+ * ->dtb_paddr
+ * ->initrd_paddr
+ */
+void kernel_load(struct kernel_info *info);
+
+#endif /* __XEN_FDT_KERNEL_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
2.49.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code
2025-05-02 16:22 ` [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code Oleksii Kurochko
@ 2025-05-02 18:13 ` Stefano Stabellini
2025-05-05 11:10 ` Oleksii Kurochko
2025-05-05 9:08 ` Orzel, Michal
1 sibling, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 18:13 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Move the following parts to common with the following changes:
> - struct kernel_info:
> - Create arch_kernel_info for arch specific kernel information.
> At the moment, it contains domain_type for Arm.
> - Rename vpl011 to vuart to have more generic name suitable for other archs.
> - s/phandle_gic/phandle_intc to have more generic name suitable for other
> archs.
> - Make text_offset of zimage structure available for RISCV_64.
> - Wrap the definition of KERNEL_INFO_SHM_MEM_INIT in `#ifndef KERNEL_INFO_SHM_MEM_INIT`
> and the definition of KERNEL_INFO_INIT in `#ifndef KERNEL_INFO_INIT`, so that an
> arch can override either macro in case it doesn't want to use the generic one.
> - Move DOM0LESS_* macros to dom0less-build.h.
> - Move all other parts of Arm's kernel.h to xen/fdt-kernel.h.
>
> Because of the changes in struct kernel_info, the corresponding parts of Arm's
> code are updated.
>
> As part of this patch the following clean up happens:
> - Drop asm/setup.h from asm/kernel.h as nothing depends on it.
> Add inclusion of asm/setup.h for a code which uses device_tree_get_reg() to
> avoid compilation issues for CONFIG_STATIC_MEMORY and CONFIG_STATIC_SHM.
> - Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Everything looks good except for one question below. This patch looks
like a lot of work, thanks Oleksii!
> ---
> Changes in v3:
> - Only resolving of merge conflicts.
> ---
> Changes in v2:
> - Introduce xen/fdt-kernel.h.
> - Move DOM0LESS_* macros to dom0less-build.h.
> - Move the rest in asm-generic/kernel.h to xen/fdt-kernel.h.
> - Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
> - Wrap by #if __has_include(....) the member of kernel_info structure:
> struct arch_kernel_info arch.
> - Update the commit message.
> ---
> xen/arch/arm/acpi/domain_build.c | 2 +-
> xen/arch/arm/dom0less-build.c | 31 +++---
> xen/arch/arm/domain_build.c | 12 +-
> xen/arch/arm/include/asm/domain_build.h | 2 +-
> xen/arch/arm/include/asm/kernel.h | 126 +--------------------
> xen/arch/arm/include/asm/static-memory.h | 2 +-
> xen/arch/arm/include/asm/static-shmem.h | 2 +-
> xen/arch/arm/kernel.c | 12 +-
> xen/arch/arm/static-memory.c | 1 +
> xen/arch/arm/static-shmem.c | 1 +
> xen/common/device-tree/dt-overlay.c | 2 +-
> xen/include/asm-generic/dom0less-build.h | 28 +++++
> xen/include/xen/fdt-kernel.h | 133 +++++++++++++++++++++++
> 13 files changed, 199 insertions(+), 155 deletions(-)
> create mode 100644 xen/include/xen/fdt-kernel.h
>
> diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
> index 2ce75543d0..f9ca8b47e5 100644
> --- a/xen/arch/arm/acpi/domain_build.c
> +++ b/xen/arch/arm/acpi/domain_build.c
> @@ -10,6 +10,7 @@
> */
>
> #include <xen/compile.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/mm.h>
> #include <xen/sched.h>
> #include <xen/acpi.h>
> @@ -18,7 +19,6 @@
> #include <xen/device_tree.h>
> #include <xen/libfdt/libfdt.h>
> #include <acpi/actables.h>
> -#include <asm/kernel.h>
> #include <asm/domain_build.h>
>
> /* Override macros from asm/page.h to make them work with mfn_t */
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index ef49495d4f..c0634dd61e 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> #include <xen/device_tree.h>
> #include <xen/domain_page.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/err.h>
> #include <xen/event.h>
> #include <xen/grant_table.h>
> @@ -64,11 +65,11 @@ static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> if (res)
> return res;
>
> - res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
> + res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
> if (res)
> return res;
>
> - res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
> + res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
> if (res)
> return res;
>
> @@ -135,11 +136,11 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
> if (res)
> return res;
>
> - res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
> + res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
> if (res)
> return res;
>
> - res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
> + res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
> if (res)
> return res;
>
> @@ -200,7 +201,7 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
> return res;
>
> res = fdt_property_cell(fdt, "interrupt-parent",
> - kinfo->phandle_gic);
> + kinfo->phandle_intc);
> if ( res )
> return res;
>
> @@ -486,10 +487,10 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d,
> */
> if ( dt_node_cmp(name, "gic") == 0 )
> {
> - uint32_t phandle_gic = fdt_get_phandle(pfdt, node_next);
> + uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
>
> - if ( phandle_gic != 0 )
> - kinfo->phandle_gic = phandle_gic;
> + if ( phandle_intc != 0 )
> + kinfo->phandle_intc = phandle_intc;
> continue;
> }
>
> @@ -532,7 +533,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> int addrcells, sizecells;
> int ret, fdt_size = DOMU_DTB_SIZE;
>
> - kinfo->phandle_gic = GUEST_PHANDLE_GIC;
> + kinfo->phandle_intc = GUEST_PHANDLE_GIC;
> kinfo->gnttab_start = GUEST_GNTTAB_BASE;
> kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
>
> @@ -594,7 +595,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> /*
> * domain_handle_dtb_bootmodule has to be called before the rest of
> * the device tree is generated because it depends on the value of
> - * the field phandle_gic.
> + * the field phandle_intc.
> */
> if ( kinfo->dtb_bootmodule )
> {
> @@ -611,7 +612,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> if ( ret )
> goto err;
>
> - if ( kinfo->vpl011 )
> + if ( kinfo->vuart )
> {
> ret = -EINVAL;
> #ifdef CONFIG_SBSA_VUART_CONSOLE
> @@ -839,8 +840,8 @@ int __init construct_domU(struct domain *d,
> printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
> d->max_vcpus, mem);
>
> - kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
> - if ( kinfo.vpl011 && is_hardware_domain(d) )
> + kinfo.vuart = dt_property_read_bool(node, "vpl011");
> + if ( kinfo.vuart && is_hardware_domain(d) )
> panic("hardware domain cannot specify vpl011\n");
>
> rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
> @@ -872,7 +873,7 @@ int __init construct_domU(struct domain *d,
>
> #ifdef CONFIG_ARM_64
> /* type must be set before allocate memory */
> - d->arch.type = kinfo.type;
> + d->arch.type = kinfo.arch.type;
> #endif
> if ( is_hardware_domain(d) )
> {
> @@ -898,7 +899,7 @@ int __init construct_domU(struct domain *d,
> * tree node in prepare_dtb_domU, so initialization on related variables
> * shall be done first.
> */
> - if ( kinfo.vpl011 )
> + if ( kinfo.vuart )
> {
> rc = domain_vpl011_init(d, NULL);
> if ( rc < 0 )
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 270a6b97e4..8c7a054718 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> #include <xen/init.h>
> #include <xen/compile.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/lib.h>
> #include <xen/llc-coloring.h>
> #include <xen/mm.h>
> @@ -20,7 +21,6 @@
> #include <xen/vmap.h>
> #include <xen/warning.h>
> #include <asm/device.h>
> -#include <asm/kernel.h>
> #include <asm/setup.h>
> #include <asm/tee/tee.h>
> #include <asm/pci.h>
> @@ -747,7 +747,7 @@ static int __init fdt_property_interrupts(const struct kernel_info *kinfo,
> return res;
>
> res = fdt_property_cell(kinfo->fdt, "interrupt-parent",
> - kinfo->phandle_gic);
> + kinfo->phandle_intc);
>
> return res;
> }
> @@ -2026,7 +2026,7 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
>
> ASSERT(dt_host && (dt_host->sibling == NULL));
>
> - kinfo->phandle_gic = dt_interrupt_controller->phandle;
> + kinfo->phandle_intc = dt_interrupt_controller->phandle;
> fdt = device_tree_flattened;
>
> new_size = fdt_totalsize(fdt) + DOM0_FDT_EXTRA_SIZE;
> @@ -2196,13 +2196,13 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>
> #ifdef CONFIG_ARM_64
> /* if aarch32 mode is not supported at EL1 do not allow 32-bit domain */
> - if ( !(cpu_has_el1_32) && kinfo->type == DOMAIN_32BIT )
> + if ( !(cpu_has_el1_32) && kinfo->arch.type == DOMAIN_32BIT )
> {
> printk("Platform does not support 32-bit domain\n");
> return -EINVAL;
> }
>
> - if ( is_sve_domain(d) && (kinfo->type == DOMAIN_32BIT) )
> + if ( is_sve_domain(d) && (kinfo->arch.type == DOMAIN_32BIT) )
> {
> printk("SVE is not available for 32-bit domain\n");
> return -EINVAL;
> @@ -2318,7 +2318,7 @@ int __init construct_hwdom(struct kernel_info *kinfo,
>
> #ifdef CONFIG_ARM_64
> /* type must be set before allocate_memory */
> - d->arch.type = kinfo->type;
> + d->arch.type = kinfo->arch.type;
> #endif
> find_gnttab_region(d, kinfo);
> if ( is_domain_direct_mapped(d) )
> diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
> index 378c10cc98..df1c0fe301 100644
> --- a/xen/arch/arm/include/asm/domain_build.h
> +++ b/xen/arch/arm/include/asm/domain_build.h
> @@ -1,8 +1,8 @@
> #ifndef __ASM_DOMAIN_BUILD_H__
> #define __ASM_DOMAIN_BUILD_H__
>
> +#include <xen/fdt-kernel.h>
> #include <xen/sched.h>
> -#include <asm/kernel.h>
>
> typedef __be32 gic_interrupt_t[3];
> typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
> diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
> index bdc96f4c18..cfeab792c7 100644
> --- a/xen/arch/arm/include/asm/kernel.h
> +++ b/xen/arch/arm/include/asm/kernel.h
> @@ -6,137 +6,15 @@
> #ifndef __ARCH_ARM_KERNEL_H__
> #define __ARCH_ARM_KERNEL_H__
>
> -#include <xen/device_tree.h>
> #include <asm/domain.h>
> -#include <asm/setup.h>
>
> -/*
> - * List of possible features for dom0less domUs
> - *
> - * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
> - * default features (excluding Xenstore) will be
> - * available. Note that an OS *must* not rely on the
> - * availability of Xen features if this is not set.
> - * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
> - * xenstore page allocation is done by Xen at
> - * domain creation. This feature can't be
> - * enabled without the DOM0LESS_ENHANCED_NO_XS.
> - * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
> - * xenstore page allocation will happen in
> - * init-dom0less. This feature can't be enabled
> - * without the DOM0LESS_ENHANCED_NO_XS.
> - * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
> - * default features (including Xenstore) will be
> - * available. Note that an OS *must* not rely on the
> - * availability of Xen features if this is not set.
> - * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
> - */
> -#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
> -#define DOM0LESS_XENSTORE BIT(1, U)
> -#define DOM0LESS_XS_LEGACY BIT(2, U)
> -#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
> -#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
> -
> -struct kernel_info {
> +struct arch_kernel_info
> +{
> #ifdef CONFIG_ARM_64
> enum domain_type type;
> #endif
> -
> - struct domain *d;
> -
> - void *fdt; /* flat device tree */
> - paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
> - struct meminfo mem;
> -#ifdef CONFIG_STATIC_SHM
> - struct shared_meminfo shm_mem;
> -#endif
> -
> - /* kernel entry point */
> - paddr_t entry;
> -
> - /* grant table region */
> - paddr_t gnttab_start;
> - paddr_t gnttab_size;
> -
> - /* boot blob load addresses */
> - const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
> - const char* cmdline;
> - paddr_t dtb_paddr;
> - paddr_t initrd_paddr;
> -
> - /* Enable pl011 emulation */
> - bool vpl011;
> -
> - /* Enable/Disable PV drivers interfaces */
> - uint16_t dom0less_feature;
> -
> - /* GIC phandle */
> - uint32_t phandle_gic;
> -
> - /* loader to use for this kernel */
> - void (*load)(struct kernel_info *info);
> - /* loader specific state */
> - union {
> - struct {
> - paddr_t kernel_addr;
> - paddr_t len;
> -#ifdef CONFIG_ARM_64
> - paddr_t text_offset; /* 64-bit Image only */
> -#endif
> - paddr_t start; /* Must be 0 for 64-bit Image */
> - } zimage;
> - };
> };
>
> -static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
> -{
> - return container_of(&kinfo->mem.common, struct membanks, common);
> -}
> -
> -static inline const struct membanks *
> -kernel_info_get_mem_const(const struct kernel_info *kinfo)
> -{
> - return container_of(&kinfo->mem.common, const struct membanks, common);
> -}
> -
> -#ifdef CONFIG_STATIC_SHM
> -#define KERNEL_INFO_SHM_MEM_INIT \
> - .shm_mem.common.max_banks = NR_SHMEM_BANKS, \
> - .shm_mem.common.type = STATIC_SHARED_MEMORY,
This line, .shm_mem.common.type = STATIC_SHARED_MEMORY,
> -#else
> -#define KERNEL_INFO_SHM_MEM_INIT
> -#endif
> -
> -#define KERNEL_INFO_INIT \
> -{ \
> - .mem.common.max_banks = NR_MEM_BANKS, \
> - .mem.common.type = MEMORY, \
and also this line, .mem.common.type = MEMORY,
...
> - KERNEL_INFO_SHM_MEM_INIT \
> -}
> -
> -/*
> - * Probe the kernel to detemine its type and select a loader.
> - *
> - * Sets in info:
> - * ->type
> - * ->load hook, and sets loader specific variables ->zimage
> - */
> -int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
> -
> -/*
> - * Loads the kernel into guest RAM.
> - *
> - * Expects to be set in info when called:
> - * ->mem
> - * ->fdt
> - *
> - * Sets in info:
> - * ->entry
> - * ->dtb_paddr
> - * ->initrd_paddr
> - */
> -void kernel_load(struct kernel_info *info);
> -
> #endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
>
> /*
> diff --git a/xen/arch/arm/include/asm/static-memory.h b/xen/arch/arm/include/asm/static-memory.h
> index 804166e541..a32a3c6553 100644
> --- a/xen/arch/arm/include/asm/static-memory.h
> +++ b/xen/arch/arm/include/asm/static-memory.h
> @@ -3,8 +3,8 @@
> #ifndef __ASM_STATIC_MEMORY_H_
> #define __ASM_STATIC_MEMORY_H_
>
> +#include <xen/fdt-kernel.h>
> #include <xen/pfn.h>
> -#include <asm/kernel.h>
>
> #ifdef CONFIG_STATIC_MEMORY
>
> diff --git a/xen/arch/arm/include/asm/static-shmem.h b/xen/arch/arm/include/asm/static-shmem.h
> index 94eaa9d500..a4f853805a 100644
> --- a/xen/arch/arm/include/asm/static-shmem.h
> +++ b/xen/arch/arm/include/asm/static-shmem.h
> @@ -3,8 +3,8 @@
> #ifndef __ASM_STATIC_SHMEM_H_
> #define __ASM_STATIC_SHMEM_H_
>
> +#include <xen/fdt-kernel.h>
> #include <xen/types.h>
> -#include <asm/kernel.h>
> #include <asm/setup.h>
>
> #ifdef CONFIG_STATIC_SHM
> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
> index 2647812e8e..f00fc388db 100644
> --- a/xen/arch/arm/kernel.c
> +++ b/xen/arch/arm/kernel.c
> @@ -7,6 +7,7 @@
> #include <xen/byteorder.h>
> #include <xen/domain_page.h>
> #include <xen/errno.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/guest_access.h>
> #include <xen/gunzip.h>
> #include <xen/init.h>
> @@ -16,6 +17,7 @@
> #include <xen/sched.h>
> #include <xen/vmap.h>
>
> +#include <asm/domain_build.h>
> #include <asm/kernel.h>
> #include <asm/setup.h>
>
> @@ -101,7 +103,7 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
> paddr_t load_addr;
>
> #ifdef CONFIG_ARM_64
> - if ( (info->type == DOMAIN_64BIT) && (info->zimage.start == 0) )
> + if ( (info->arch.type == DOMAIN_64BIT) && (info->zimage.start == 0) )
> return mem->bank[0].start + info->zimage.text_offset;
> #endif
>
> @@ -371,10 +373,10 @@ static int __init kernel_uimage_probe(struct kernel_info *info,
> switch ( uimage.arch )
> {
> case IH_ARCH_ARM:
> - info->type = DOMAIN_32BIT;
> + info->arch.type = DOMAIN_32BIT;
> break;
> case IH_ARCH_ARM64:
> - info->type = DOMAIN_64BIT;
> + info->arch.type = DOMAIN_64BIT;
> break;
> default:
> printk(XENLOG_ERR "Unsupported uImage arch type %d\n", uimage.arch);
> @@ -444,7 +446,7 @@ static int __init kernel_zimage64_probe(struct kernel_info *info,
>
> info->load = kernel_zimage_load;
>
> - info->type = DOMAIN_64BIT;
> + info->arch.type = DOMAIN_64BIT;
>
> return 0;
> }
> @@ -496,7 +498,7 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
> info->load = kernel_zimage_load;
>
> #ifdef CONFIG_ARM_64
> - info->type = DOMAIN_32BIT;
> + info->arch.type = DOMAIN_32BIT;
> #endif
>
> return 0;
> diff --git a/xen/arch/arm/static-memory.c b/xen/arch/arm/static-memory.c
> index d4585c5a06..e0f76afcd8 100644
> --- a/xen/arch/arm/static-memory.c
> +++ b/xen/arch/arm/static-memory.c
> @@ -2,6 +2,7 @@
>
> #include <xen/sched.h>
>
> +#include <asm/setup.h>
> #include <asm/static-memory.h>
>
> static bool __init append_static_memory_to_bank(struct domain *d,
> diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
> index e8d4ca3ba3..14ae48fb1e 100644
> --- a/xen/arch/arm/static-shmem.c
> +++ b/xen/arch/arm/static-shmem.c
> @@ -6,6 +6,7 @@
> #include <xen/sched.h>
>
> #include <asm/domain_build.h>
> +#include <asm/setup.h>
> #include <asm/static-memory.h>
> #include <asm/static-shmem.h>
>
> diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
> index 97fb99eaaa..81107cb48d 100644
> --- a/xen/common/device-tree/dt-overlay.c
> +++ b/xen/common/device-tree/dt-overlay.c
> @@ -6,8 +6,8 @@
> * Written by Vikram Garhwal <vikram.garhwal@amd.com>
> *
> */
> -#include <asm/domain_build.h>
> #include <xen/dt-overlay.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/guest_access.h>
> #include <xen/iocap.h>
> #include <xen/libfdt/libfdt.h>
> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
> index 5655571a66..f095135caa 100644
> --- a/xen/include/asm-generic/dom0less-build.h
> +++ b/xen/include/asm-generic/dom0less-build.h
> @@ -16,6 +16,34 @@ struct dt_device_node;
> #define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
> extern bool need_xenstore;
>
> +/*
> + * List of possible features for dom0less domUs
> + *
> + * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
> + * default features (excluding Xenstore) will be
> + * available. Note that an OS *must* not rely on the
> + * availability of Xen features if this is not set.
> + * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
> + * xenstore page allocation is done by Xen at
> + * domain creation. This feature can't be
> + * enabled without the DOM0LESS_ENHANCED_NO_XS.
> + * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
> + * xenstore page allocation will happen in
> + * init-dom0less. This feature can't be enabled
> + * without the DOM0LESS_ENHANCED_NO_XS.
> + * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
> + * default features (including Xenstore) will be
> + * available. Note that an OS *must* not rely on the
> + * availability of Xen features if this is not set.
> + * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
> +
> + */
> +#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
> +#define DOM0LESS_XENSTORE BIT(1, U)
> +#define DOM0LESS_XS_LEGACY BIT(2, U)
> +#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
> +#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
> +
> void create_domUs(void);
> bool is_dom0less_mode(void);
> void set_xs_domain(struct domain *d);
> diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
> new file mode 100644
> index 0000000000..c81e759423
> --- /dev/null
> +++ b/xen/include/xen/fdt-kernel.h
> @@ -0,0 +1,133 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * For Kernel image loading.
> + *
> + * Copyright (C) 2011 Citrix Systems, Inc.
> + */
> +#ifndef __XEN_FDT_KERNEL_H__
> +#define __XEN_FDT_KERNEL_H__
> +
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/types.h>
> +
> +#if __has_include(<asm/kernel.h>)
> +# include <asm/kernel.h>
> +#endif
> +
> +struct kernel_info {
> + struct domain *d;
> +
> + void *fdt; /* flat device tree */
> + paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
> + struct meminfo mem;
> +#ifdef CONFIG_STATIC_SHM
> + struct shared_meminfo shm_mem;
> +#endif
> +
> + /* kernel entry point */
> + paddr_t entry;
> +
> + /* grant table region */
> + paddr_t gnttab_start;
> + paddr_t gnttab_size;
> +
> + /* boot blob load addresses */
> + const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
> + const char* cmdline;
> + paddr_t dtb_paddr;
> + paddr_t initrd_paddr;
> +
> + /* Enable uart emulation */
> + bool vuart;
> +
> + /* Enable/Disable PV drivers interfaces */
> + uint16_t dom0less_feature;
> +
> + /* Interrupt controller phandle */
> + uint32_t phandle_intc;
> +
> + /* loader to use for this kernel */
> + void (*load)(struct kernel_info *info);
> +
> + /* loader specific state */
> + union {
> + struct {
> + paddr_t kernel_addr;
> + paddr_t len;
> +#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
> + paddr_t text_offset; /* 64-bit Image only */
> +#endif
> + paddr_t start; /* Must be 0 for 64-bit Image */
> + } zimage;
> + };
> +
> +#if __has_include(<asm/kernel.h>)
> + struct arch_kernel_info arch;
> +#endif
> +};
> +
> +static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, struct membanks, common);
> +}
> +
> +static inline const struct membanks *
> +kernel_info_get_mem_const(const struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, const struct membanks, common);
> +}
> +
> +#ifndef KERNEL_INFO_SHM_MEM_INIT
> +
> +#ifdef CONFIG_STATIC_SHM
> +#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
they are missing here...
> +#else
> +#define KERNEL_INFO_SHM_MEM_INIT
> +#endif
> +
> +#endif /* KERNEL_INFO_SHM_MEM_INIT */
> +
> +#ifndef KERNEL_INFO_INIT
> +
> +#define KERNEL_INFO_INIT \
> +{ \
> + .mem.common.max_banks = NR_MEM_BANKS, \
and also here.
Why?
> + KERNEL_INFO_SHM_MEM_INIT \
> +}
> +
> +#endif /* KERNEL_INFO_INIT */
> +
> +/*
> + * Probe the kernel to detemine its type and select a loader.
> + *
> + * Sets in info:
> + * ->type
> + * ->load hook, and sets loader specific variables ->zimage
> + */
> +int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
> +
> +/*
> + * Loads the kernel into guest RAM.
> + *
> + * Expects to be set in info when called:
> + * ->mem
> + * ->fdt
> + *
> + * Sets in info:
> + * ->entry
> + * ->dtb_paddr
> + * ->initrd_paddr
> + */
> +void kernel_load(struct kernel_info *info);
> +
> +#endif /* __XEN_FDT_KERNEL_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.49.0
>
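For reference, the effect of dropping the two .type initializers can be checked in isolation. The sketch below is not Xen code: the demo_* types, the DEMO_* macros and the enumerator order are all assumptions made for illustration. It only shows the C rule in play: a member left out of a designated initializer is zero-initialized, so it ends up equal to whichever enumerator has the value 0.

```c
#include <assert.h>

/* Hypothetical stand-ins for the Xen types; names and values are illustrative. */
enum demo_bank_type { DEMO_MEMORY, DEMO_RESERVED_MEMORY, DEMO_STATIC_SHARED_MEMORY };

struct demo_banks {
    unsigned int max_banks;
    enum demo_bank_type type;
};

struct demo_kernel_info {
    struct demo_banks mem;
    struct demo_banks shm_mem;
};

#define DEMO_NR_MEM_BANKS   256
#define DEMO_NR_SHMEM_BANKS 32

/* Shape of the initializer before the move: .type is set explicitly. */
#define DEMO_INIT_WITH_TYPE                         \
{                                                   \
    .mem.max_banks = DEMO_NR_MEM_BANKS,             \
    .mem.type = DEMO_MEMORY,                        \
    .shm_mem.max_banks = DEMO_NR_SHMEM_BANKS,       \
    .shm_mem.type = DEMO_STATIC_SHARED_MEMORY,      \
}

/* Shape after the move: .type is omitted, so C zero-initializes it. */
#define DEMO_INIT_WITHOUT_TYPE                      \
{                                                   \
    .mem.max_banks = DEMO_NR_MEM_BANKS,             \
    .shm_mem.max_banks = DEMO_NR_SHMEM_BANKS,       \
}

static void demo_check_types(void)
{
    struct demo_kernel_info with = DEMO_INIT_WITH_TYPE;
    struct demo_kernel_info without = DEMO_INIT_WITHOUT_TYPE;

    assert(with.shm_mem.type == DEMO_STATIC_SHARED_MEMORY);
    /* Omitted members are zero: both banks report the first enumerator. */
    assert(without.mem.type == DEMO_MEMORY);
    assert(without.shm_mem.type == DEMO_MEMORY);
}
```

Under these assumptions the shm bank would no longer be tagged as static shared memory after the move. Whether that matters for the real code depends on where the type field is set and read, which this excerpt does not show.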
* Re: [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code
2025-05-02 18:13 ` Stefano Stabellini
@ 2025-05-05 11:10 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 11:10 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 8:13 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Move the following parts to common code, with the following changes:
>> - struct kernel_info:
>> - Create arch_kernel_info for arch-specific kernel information.
>> At the moment, it contains domain_type for Arm.
>> - Rename vpl011 to vuart to have a more generic name suitable for other archs.
>> - s/phandle_gic/phandle_intc/ to have a more generic name suitable for other
>> archs.
>> - Make text_offset of the zimage structure available for RISCV_64.
>> - Wrap the definition of KERNEL_INFO_SHM_MEM_INIT in
>> `#ifndef KERNEL_INFO_SHM_MEM_INIT` and the definition of KERNEL_INFO_INIT
>> in `#ifndef KERNEL_INFO_INIT`, so that an arch can override either macro
>> if it doesn't want to use the generic one.
>> - Move DOM0LESS_* macros to dom0less-build.h.
>> - Move all other parts of Arm's kernel.h to xen/fdt-kernel.h.
>>
>> Because of the changes in struct kernel_info, the corresponding parts of Arm's
>> code are updated.
>>
>> As part of this patch, the following cleanup happens:
>> - Drop asm/setup.h from asm/kernel.h as nothing depends on it.
>> Add inclusion of asm/setup.h for code which uses device_tree_get_reg() to
>> avoid compilation issues for CONFIG_STATIC_MEMORY and CONFIG_STATIC_SHM.
>> - Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
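The two mechanisms named in the commit message, an optional arch header picked up via __has_include and an initializer macro that an arch may pre-define, can be sketched in a few lines. This is only an illustration under assumptions: demo_arch_kernel.h, the demo_* struct and the DEMO_* macro are made-up names, and the header is assumed not to exist, so the generic path is the one taken here.

```c
#include <assert.h>

/*
 * Hypothetical arch header. It is assumed to be absent, so the struct
 * below is built without the arch-specific member.
 */
#ifdef __has_include
# if __has_include("demo_arch_kernel.h")
#  include "demo_arch_kernel.h"
#  define DEMO_HAS_ARCH_INFO 1
# endif
#endif

struct demo_kernel_info {
    unsigned int max_banks;
#ifdef DEMO_HAS_ARCH_INFO
    struct demo_arch_kernel_info arch; /* present only with the arch header */
#endif
};

/*
 * Generic default. An arch that wants its own initializer would define
 * DEMO_KERNEL_INFO_INIT before this point, and this block would be skipped.
 */
#ifndef DEMO_KERNEL_INFO_INIT
#define DEMO_KERNEL_INFO_INIT { .max_banks = 256 }
#endif

static void demo_check_default(void)
{
    struct demo_kernel_info info = DEMO_KERNEL_INFO_INIT;

    /* No override was defined above, so the generic value is used. */
    assert(info.max_banks == 256);
}
```

The #ifndef guard keeps the generic definition as a fallback only, which is what lets an arch replace it without touching the common header.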
> Everything looks good except for one question below. This patch looks
> like a lot of work, thanks Oleksii!
>
>
>> ---
>> Changes in v3:
>> - Only resolving of merge conflicts.
>> ---
>> Changes in v2:
>> - Introduce xen/fdt-kernel.h.
>> - Move DOM0LESS_* macros to dom0less-build.h.
>> - Move the rest in asm-generic/kernel.h to xen/fdt-kernel.h.
>> - Drop inclusion of asm/kernel.h everywhere except xen/fdt-kernel.h.
>> - Wrap the member of the kernel_info structure, struct arch_kernel_info arch,
>> in #if __has_include(...).
>> - Update the commit message.
>> ---
>> xen/arch/arm/acpi/domain_build.c | 2 +-
>> xen/arch/arm/dom0less-build.c | 31 +++---
>> xen/arch/arm/domain_build.c | 12 +-
>> xen/arch/arm/include/asm/domain_build.h | 2 +-
>> xen/arch/arm/include/asm/kernel.h | 126 +--------------------
>> xen/arch/arm/include/asm/static-memory.h | 2 +-
>> xen/arch/arm/include/asm/static-shmem.h | 2 +-
>> xen/arch/arm/kernel.c | 12 +-
>> xen/arch/arm/static-memory.c | 1 +
>> xen/arch/arm/static-shmem.c | 1 +
>> xen/common/device-tree/dt-overlay.c | 2 +-
>> xen/include/asm-generic/dom0less-build.h | 28 +++++
>> xen/include/xen/fdt-kernel.h | 133 +++++++++++++++++++++++
>> 13 files changed, 199 insertions(+), 155 deletions(-)
>> create mode 100644 xen/include/xen/fdt-kernel.h
>>
>> diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
>> index 2ce75543d0..f9ca8b47e5 100644
>> --- a/xen/arch/arm/acpi/domain_build.c
>> +++ b/xen/arch/arm/acpi/domain_build.c
>> @@ -10,6 +10,7 @@
>> */
>>
>> #include <xen/compile.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/mm.h>
>> #include <xen/sched.h>
>> #include <xen/acpi.h>
>> @@ -18,7 +19,6 @@
>> #include <xen/device_tree.h>
>> #include <xen/libfdt/libfdt.h>
>> #include <acpi/actables.h>
>> -#include <asm/kernel.h>
>> #include <asm/domain_build.h>
>>
>> /* Override macros from asm/page.h to make them work with mfn_t */
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index ef49495d4f..c0634dd61e 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> #include <xen/device_tree.h>
>> #include <xen/domain_page.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/err.h>
>> #include <xen/event.h>
>> #include <xen/grant_table.h>
>> @@ -64,11 +65,11 @@ static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>> if (res)
>> return res;
>>
>> - res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
>> + res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
>> if (res)
>> return res;
>>
>> - res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
>> + res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
>> if (res)
>> return res;
>>
>> @@ -135,11 +136,11 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
>> if (res)
>> return res;
>>
>> - res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
>> + res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_intc);
>> if (res)
>> return res;
>>
>> - res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
>> + res = fdt_property_cell(fdt, "phandle", kinfo->phandle_intc);
>> if (res)
>> return res;
>>
>> @@ -200,7 +201,7 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
>> return res;
>>
>> res = fdt_property_cell(fdt, "interrupt-parent",
>> - kinfo->phandle_gic);
>> + kinfo->phandle_intc);
>> if ( res )
>> return res;
>>
>> @@ -486,10 +487,10 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d,
>> */
>> if ( dt_node_cmp(name, "gic") == 0 )
>> {
>> - uint32_t phandle_gic = fdt_get_phandle(pfdt, node_next);
>> + uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
>>
>> - if ( phandle_gic != 0 )
>> - kinfo->phandle_gic = phandle_gic;
>> + if ( phandle_intc != 0 )
>> + kinfo->phandle_intc = phandle_intc;
>> continue;
>> }
>>
>> @@ -532,7 +533,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> int addrcells, sizecells;
>> int ret, fdt_size = DOMU_DTB_SIZE;
>>
>> - kinfo->phandle_gic = GUEST_PHANDLE_GIC;
>> + kinfo->phandle_intc = GUEST_PHANDLE_GIC;
>> kinfo->gnttab_start = GUEST_GNTTAB_BASE;
>> kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
>>
>> @@ -594,7 +595,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> /*
>> * domain_handle_dtb_bootmodule has to be called before the rest of
>> * the device tree is generated because it depends on the value of
>> - * the field phandle_gic.
>> + * the field phandle_intc.
>> */
>> if ( kinfo->dtb_bootmodule )
>> {
>> @@ -611,7 +612,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> if ( ret )
>> goto err;
>>
>> - if ( kinfo->vpl011 )
>> + if ( kinfo->vuart )
>> {
>> ret = -EINVAL;
>> #ifdef CONFIG_SBSA_VUART_CONSOLE
>> @@ -839,8 +840,8 @@ int __init construct_domU(struct domain *d,
>> printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
>> d->max_vcpus, mem);
>>
>> - kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
>> - if ( kinfo.vpl011 && is_hardware_domain(d) )
>> + kinfo.vuart = dt_property_read_bool(node, "vpl011");
>> + if ( kinfo.vuart && is_hardware_domain(d) )
>> panic("hardware domain cannot specify vpl011\n");
>>
>> rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
>> @@ -872,7 +873,7 @@ int __init construct_domU(struct domain *d,
>>
>> #ifdef CONFIG_ARM_64
>> /* type must be set before allocate memory */
>> - d->arch.type = kinfo.type;
>> + d->arch.type = kinfo.arch.type;
>> #endif
>> if ( is_hardware_domain(d) )
>> {
>> @@ -898,7 +899,7 @@ int __init construct_domU(struct domain *d,
>> * tree node in prepare_dtb_domU, so initialization on related variables
>> * shall be done first.
>> */
>> - if ( kinfo.vpl011 )
>> + if ( kinfo.vuart )
>> {
>> rc = domain_vpl011_init(d, NULL);
>> if ( rc < 0 )
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 270a6b97e4..8c7a054718 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> #include <xen/init.h>
>> #include <xen/compile.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/lib.h>
>> #include <xen/llc-coloring.h>
>> #include <xen/mm.h>
>> @@ -20,7 +21,6 @@
>> #include <xen/vmap.h>
>> #include <xen/warning.h>
>> #include <asm/device.h>
>> -#include <asm/kernel.h>
>> #include <asm/setup.h>
>> #include <asm/tee/tee.h>
>> #include <asm/pci.h>
>> @@ -747,7 +747,7 @@ static int __init fdt_property_interrupts(const struct kernel_info *kinfo,
>> return res;
>>
>> res = fdt_property_cell(kinfo->fdt, "interrupt-parent",
>> - kinfo->phandle_gic);
>> + kinfo->phandle_intc);
>>
>> return res;
>> }
>> @@ -2026,7 +2026,7 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
>>
>> ASSERT(dt_host && (dt_host->sibling == NULL));
>>
>> - kinfo->phandle_gic = dt_interrupt_controller->phandle;
>> + kinfo->phandle_intc = dt_interrupt_controller->phandle;
>> fdt = device_tree_flattened;
>>
>> new_size = fdt_totalsize(fdt) + DOM0_FDT_EXTRA_SIZE;
>> @@ -2196,13 +2196,13 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>>
>> #ifdef CONFIG_ARM_64
>> /* if aarch32 mode is not supported at EL1 do not allow 32-bit domain */
>> - if ( !(cpu_has_el1_32) && kinfo->type == DOMAIN_32BIT )
>> + if ( !(cpu_has_el1_32) && kinfo->arch.type == DOMAIN_32BIT )
>> {
>> printk("Platform does not support 32-bit domain\n");
>> return -EINVAL;
>> }
>>
>> - if ( is_sve_domain(d) && (kinfo->type == DOMAIN_32BIT) )
>> + if ( is_sve_domain(d) && (kinfo->arch.type == DOMAIN_32BIT) )
>> {
>> printk("SVE is not available for 32-bit domain\n");
>> return -EINVAL;
>> @@ -2318,7 +2318,7 @@ int __init construct_hwdom(struct kernel_info *kinfo,
>>
>> #ifdef CONFIG_ARM_64
>> /* type must be set before allocate_memory */
>> - d->arch.type = kinfo->type;
>> + d->arch.type = kinfo->arch.type;
>> #endif
>> find_gnttab_region(d, kinfo);
>> if ( is_domain_direct_mapped(d) )
>> diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
>> index 378c10cc98..df1c0fe301 100644
>> --- a/xen/arch/arm/include/asm/domain_build.h
>> +++ b/xen/arch/arm/include/asm/domain_build.h
>> @@ -1,8 +1,8 @@
>> #ifndef __ASM_DOMAIN_BUILD_H__
>> #define __ASM_DOMAIN_BUILD_H__
>>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/sched.h>
>> -#include <asm/kernel.h>
>>
>> typedef __be32 gic_interrupt_t[3];
>> typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
>> diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
>> index bdc96f4c18..cfeab792c7 100644
>> --- a/xen/arch/arm/include/asm/kernel.h
>> +++ b/xen/arch/arm/include/asm/kernel.h
>> @@ -6,137 +6,15 @@
>> #ifndef __ARCH_ARM_KERNEL_H__
>> #define __ARCH_ARM_KERNEL_H__
>>
>> -#include <xen/device_tree.h>
>> #include <asm/domain.h>
>> -#include <asm/setup.h>
>>
>> -/*
>> - * List of possible features for dom0less domUs
>> - *
>> - * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
>> - * default features (excluding Xenstore) will be
>> - * available. Note that an OS *must* not rely on the
>> - * availability of Xen features if this is not set.
>> - * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
>> - * xenstore page allocation is done by Xen at
>> - * domain creation. This feature can't be
>> - * enabled without the DOM0LESS_ENHANCED_NO_XS.
>> - * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
>> - * xenstore page allocation will happen in
>> - * init-dom0less. This feature can't be enabled
>> - * without the DOM0LESS_ENHANCED_NO_XS.
>> - * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
>> - * default features (including Xenstore) will be
>> - * available. Note that an OS *must* not rely on the
>> - * availability of Xen features if this is not set.
>> - * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
>> - */
>> -#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
>> -#define DOM0LESS_XENSTORE BIT(1, U)
>> -#define DOM0LESS_XS_LEGACY BIT(2, U)
>> -#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
>> -#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
>> -
>> -struct kernel_info {
>> +struct arch_kernel_info
>> +{
>> #ifdef CONFIG_ARM_64
>> enum domain_type type;
>> #endif
>> -
>> - struct domain *d;
>> -
>> - void *fdt; /* flat device tree */
>> - paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
>> - struct meminfo mem;
>> -#ifdef CONFIG_STATIC_SHM
>> - struct shared_meminfo shm_mem;
>> -#endif
>> -
>> - /* kernel entry point */
>> - paddr_t entry;
>> -
>> - /* grant table region */
>> - paddr_t gnttab_start;
>> - paddr_t gnttab_size;
>> -
>> - /* boot blob load addresses */
>> - const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
>> - const char* cmdline;
>> - paddr_t dtb_paddr;
>> - paddr_t initrd_paddr;
>> -
>> - /* Enable pl011 emulation */
>> - bool vpl011;
>> -
>> - /* Enable/Disable PV drivers interfaces */
>> - uint16_t dom0less_feature;
>> -
>> - /* GIC phandle */
>> - uint32_t phandle_gic;
>> -
>> - /* loader to use for this kernel */
>> - void (*load)(struct kernel_info *info);
>> - /* loader specific state */
>> - union {
>> - struct {
>> - paddr_t kernel_addr;
>> - paddr_t len;
>> -#ifdef CONFIG_ARM_64
>> - paddr_t text_offset; /* 64-bit Image only */
>> -#endif
>> - paddr_t start; /* Must be 0 for 64-bit Image */
>> - } zimage;
>> - };
>> };
>>
>> -static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
>> -{
>> - return container_of(&kinfo->mem.common, struct membanks, common);
>> -}
>> -
>> -static inline const struct membanks *
>> -kernel_info_get_mem_const(const struct kernel_info *kinfo)
>> -{
>> - return container_of(&kinfo->mem.common, const struct membanks, common);
>> -}
>> -
>> -#ifdef CONFIG_STATIC_SHM
>> -#define KERNEL_INFO_SHM_MEM_INIT \
>> - .shm_mem.common.max_banks = NR_SHMEM_BANKS, \
>> - .shm_mem.common.type = STATIC_SHARED_MEMORY,
> This line type = STATIC_SHARED_MEMORY,
>
>
>> -#else
>> -#define KERNEL_INFO_SHM_MEM_INIT
>> -#endif
>> -
>> -#define KERNEL_INFO_INIT \
>> -{ \
>> - .mem.common.max_banks = NR_MEM_BANKS, \
>> - .mem.common.type = MEMORY, \
> and also this line type = MEMORY,
> ...
>
>
>> - KERNEL_INFO_SHM_MEM_INIT \
>> -}
>> -
>> -/*
>> - * Probe the kernel to detemine its type and select a loader.
>> - *
>> - * Sets in info:
>> - * ->type
>> - * ->load hook, and sets loader specific variables ->zimage
>> - */
>> -int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
>> -
>> -/*
>> - * Loads the kernel into guest RAM.
>> - *
>> - * Expects to be set in info when called:
>> - * ->mem
>> - * ->fdt
>> - *
>> - * Sets in info:
>> - * ->entry
>> - * ->dtb_paddr
>> - * ->initrd_paddr
>> - */
>> -void kernel_load(struct kernel_info *info);
>> -
>> #endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
>>
>> /*
>> diff --git a/xen/arch/arm/include/asm/static-memory.h b/xen/arch/arm/include/asm/static-memory.h
>> index 804166e541..a32a3c6553 100644
>> --- a/xen/arch/arm/include/asm/static-memory.h
>> +++ b/xen/arch/arm/include/asm/static-memory.h
>> @@ -3,8 +3,8 @@
>> #ifndef __ASM_STATIC_MEMORY_H_
>> #define __ASM_STATIC_MEMORY_H_
>>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/pfn.h>
>> -#include <asm/kernel.h>
>>
>> #ifdef CONFIG_STATIC_MEMORY
>>
>> diff --git a/xen/arch/arm/include/asm/static-shmem.h b/xen/arch/arm/include/asm/static-shmem.h
>> index 94eaa9d500..a4f853805a 100644
>> --- a/xen/arch/arm/include/asm/static-shmem.h
>> +++ b/xen/arch/arm/include/asm/static-shmem.h
>> @@ -3,8 +3,8 @@
>> #ifndef __ASM_STATIC_SHMEM_H_
>> #define __ASM_STATIC_SHMEM_H_
>>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/types.h>
>> -#include <asm/kernel.h>
>> #include <asm/setup.h>
>>
>> #ifdef CONFIG_STATIC_SHM
>> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
>> index 2647812e8e..f00fc388db 100644
>> --- a/xen/arch/arm/kernel.c
>> +++ b/xen/arch/arm/kernel.c
>> @@ -7,6 +7,7 @@
>> #include <xen/byteorder.h>
>> #include <xen/domain_page.h>
>> #include <xen/errno.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/guest_access.h>
>> #include <xen/gunzip.h>
>> #include <xen/init.h>
>> @@ -16,6 +17,7 @@
>> #include <xen/sched.h>
>> #include <xen/vmap.h>
>>
>> +#include <asm/domain_build.h>
>> #include <asm/kernel.h>
>> #include <asm/setup.h>
>>
>> @@ -101,7 +103,7 @@ static paddr_t __init kernel_zimage_place(struct kernel_info *info)
>> paddr_t load_addr;
>>
>> #ifdef CONFIG_ARM_64
>> - if ( (info->type == DOMAIN_64BIT) && (info->zimage.start == 0) )
>> + if ( (info->arch.type == DOMAIN_64BIT) && (info->zimage.start == 0) )
>> return mem->bank[0].start + info->zimage.text_offset;
>> #endif
>>
>> @@ -371,10 +373,10 @@ static int __init kernel_uimage_probe(struct kernel_info *info,
>> switch ( uimage.arch )
>> {
>> case IH_ARCH_ARM:
>> - info->type = DOMAIN_32BIT;
>> + info->arch.type = DOMAIN_32BIT;
>> break;
>> case IH_ARCH_ARM64:
>> - info->type = DOMAIN_64BIT;
>> + info->arch.type = DOMAIN_64BIT;
>> break;
>> default:
>> printk(XENLOG_ERR "Unsupported uImage arch type %d\n", uimage.arch);
>> @@ -444,7 +446,7 @@ static int __init kernel_zimage64_probe(struct kernel_info *info,
>>
>> info->load = kernel_zimage_load;
>>
>> - info->type = DOMAIN_64BIT;
>> + info->arch.type = DOMAIN_64BIT;
>>
>> return 0;
>> }
>> @@ -496,7 +498,7 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
>> info->load = kernel_zimage_load;
>>
>> #ifdef CONFIG_ARM_64
>> - info->type = DOMAIN_32BIT;
>> + info->arch.type = DOMAIN_32BIT;
>> #endif
>>
>> return 0;
>> diff --git a/xen/arch/arm/static-memory.c b/xen/arch/arm/static-memory.c
>> index d4585c5a06..e0f76afcd8 100644
>> --- a/xen/arch/arm/static-memory.c
>> +++ b/xen/arch/arm/static-memory.c
>> @@ -2,6 +2,7 @@
>>
>> #include <xen/sched.h>
>>
>> +#include <asm/setup.h>
>> #include <asm/static-memory.h>
>>
>> static bool __init append_static_memory_to_bank(struct domain *d,
>> diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
>> index e8d4ca3ba3..14ae48fb1e 100644
>> --- a/xen/arch/arm/static-shmem.c
>> +++ b/xen/arch/arm/static-shmem.c
>> @@ -6,6 +6,7 @@
>> #include <xen/sched.h>
>>
>> #include <asm/domain_build.h>
>> +#include <asm/setup.h>
>> #include <asm/static-memory.h>
>> #include <asm/static-shmem.h>
>>
>> diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
>> index 97fb99eaaa..81107cb48d 100644
>> --- a/xen/common/device-tree/dt-overlay.c
>> +++ b/xen/common/device-tree/dt-overlay.c
>> @@ -6,8 +6,8 @@
>> * Written by Vikram Garhwal<vikram.garhwal@amd.com>
>> *
>> */
>> -#include <asm/domain_build.h>
>> #include <xen/dt-overlay.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/guest_access.h>
>> #include <xen/iocap.h>
>> #include <xen/libfdt/libfdt.h>
>> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
>> index 5655571a66..f095135caa 100644
>> --- a/xen/include/asm-generic/dom0less-build.h
>> +++ b/xen/include/asm-generic/dom0less-build.h
>> @@ -16,6 +16,34 @@ struct dt_device_node;
>> #define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
>> extern bool need_xenstore;
>>
>> +/*
>> + * List of possible features for dom0less domUs
>> + *
>> + * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the
>> + * default features (excluding Xenstore) will be
>> + * available. Note that an OS *must* not rely on the
>> + * availability of Xen features if this is not set.
>> + * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. The
>> + * xenstore page allocation is done by Xen at
>> + * domain creation. This feature can't be
>> + * enabled without the DOM0LESS_ENHANCED_NO_XS.
>> + * DOM0LESS_XS_LEGACY Xenstore will be enabled for the VM, the
>> + * xenstore page allocation will happen in
>> + * init-dom0less. This feature can't be enabled
>> + * without the DOM0LESS_ENHANCED_NO_XS.
>> + * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All the
>> + * default features (including Xenstore) will be
>> + * available. Note that an OS *must* not rely on the
>> + * availability of Xen features if this is not set.
>> + * DOM0LESS_ENHANCED_LEGACY: Same as before, but using DOM0LESS_XS_LEGACY.
>> +
>> + */
>> +#define DOM0LESS_ENHANCED_NO_XS BIT(0, U)
>> +#define DOM0LESS_XENSTORE BIT(1, U)
>> +#define DOM0LESS_XS_LEGACY BIT(2, U)
>> +#define DOM0LESS_ENHANCED_LEGACY (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XS_LEGACY)
>> +#define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE)
>> +
>> void create_domUs(void);
>> bool is_dom0less_mode(void);
>> void set_xs_domain(struct domain *d);
>> diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
>> new file mode 100644
>> index 0000000000..c81e759423
>> --- /dev/null
>> +++ b/xen/include/xen/fdt-kernel.h
>> @@ -0,0 +1,133 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * For Kernel image loading.
>> + *
>> + * Copyright (C) 2011 Citrix Systems, Inc.
>> + */
>> +#ifndef __XEN_FDT_KERNEL_H__
>> +#define __XEN_FDT_KERNEL_H__
>> +
>> +#include <xen/bootfdt.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/types.h>
>> +
>> +#if __has_include(<asm/kernel.h>)
>> +# include <asm/kernel.h>
>> +#endif
>> +
>> +struct kernel_info {
>> + struct domain *d;
>> +
>> + void *fdt; /* flat device tree */
>> + paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
>> + struct meminfo mem;
>> +#ifdef CONFIG_STATIC_SHM
>> + struct shared_meminfo shm_mem;
>> +#endif
>> +
>> + /* kernel entry point */
>> + paddr_t entry;
>> +
>> + /* grant table region */
>> + paddr_t gnttab_start;
>> + paddr_t gnttab_size;
>> +
>> + /* boot blob load addresses */
>> + const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
>> + const char* cmdline;
>> + paddr_t dtb_paddr;
>> + paddr_t initrd_paddr;
>> +
>> + /* Enable uart emulation */
>> + bool vuart;
>> +
>> + /* Enable/Disable PV drivers interfaces */
>> + uint16_t dom0less_feature;
>> +
>> + /* Interrupt controller phandle */
>> + uint32_t phandle_intc;
>> +
>> + /* loader to use for this kernel */
>> + void (*load)(struct kernel_info *info);
>> +
>> + /* loader specific state */
>> + union {
>> + struct {
>> + paddr_t kernel_addr;
>> + paddr_t len;
>> +#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
>> + paddr_t text_offset; /* 64-bit Image only */
>> +#endif
>> + paddr_t start; /* Must be 0 for 64-bit Image */
>> + } zimage;
>> + };
>> +
>> +#if __has_include(<asm/kernel.h>)
>> + struct arch_kernel_info arch;
>> +#endif
>> +};
>> +
>> +static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
>> +{
>> + return container_of(&kinfo->mem.common, struct membanks, common);
>> +}
>> +
>> +static inline const struct membanks *
>> +kernel_info_get_mem_const(const struct kernel_info *kinfo)
>> +{
>> + return container_of(&kinfo->mem.common, const struct membanks, common);
>> +}
>> +
>> +#ifndef KERNEL_INFO_SHM_MEM_INIT
>> +
>> +#ifdef CONFIG_STATIC_SHM
>> +#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
> they are missing here...
>
>
>> +#else
>> +#define KERNEL_INFO_SHM_MEM_INIT
>> +#endif
>> +
>> +#endif /* KERNEL_INFO_SHM_MEM_INIT */
>> +
>> +#ifndef KERNEL_INFO_INIT
>> +
>> +#define KERNEL_INFO_INIT \
>> +{ \
>> + .mem.common.max_banks = NR_MEM_BANKS, \
> and also here.
>
> Why?
I just overlooked these changes (they’re relatively "new"). Thanks for pointing that out.
I’ll restore those types.
~ Oleksii
>> + KERNEL_INFO_SHM_MEM_INIT \
>> +}
>> +
>> +#endif /* KERNEL_INFO_INIT */
>> +
>> +/*
>> + * Probe the kernel to detemine its type and select a loader.
>> + *
>> + * Sets in info:
>> + * ->type
>> + * ->load hook, and sets loader specific variables ->zimage
>> + */
>> +int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
>> +
>> +/*
>> + * Loads the kernel into guest RAM.
>> + *
>> + * Expects to be set in info when called:
>> + * ->mem
>> + * ->fdt
>> + *
>> + * Sets in info:
>> + * ->entry
>> + * ->dtb_paddr
>> + * ->initrd_paddr
>> + */
>> +void kernel_load(struct kernel_info *info);
>> +
>> +#endif /* __XEN_FDT_KERNEL_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> --
>> 2.49.0
>>
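To make the question above concrete: with designated initializers, any member that is not named is zero-initialized, so dropping `.type` silently changes the value the structure starts with. The sketch below is self-contained and uses simplified stand-ins. `struct membanks_hdr`, `struct kernel_info_sketch`, the `KINFO_INIT_*` macros and the bank counts are illustrative, not the real Xen definitions, and it assumes `MEMORY` is the first enumerator (value 0).

```c
/* Simplified stand-ins for Xen's types; not the real definitions. */
enum region_type { MEMORY, RESERVED_MEMORY, STATIC_SHARED_MEMORY };

#define NR_MEM_BANKS   256 /* illustrative value */
#define NR_SHMEM_BANKS 32  /* illustrative value */

struct membanks_hdr {
    unsigned int max_banks;
    enum region_type type;
};

struct kernel_info_sketch {
    struct { struct membanks_hdr common; } mem;
    struct { struct membanks_hdr common; } shm_mem;
};

/* Shape of the v3 initializer: .type is left out, so it becomes 0. */
#define KINFO_INIT_V3                                   \
{                                                       \
    .mem.common.max_banks = NR_MEM_BANKS,               \
    .shm_mem.common.max_banks = NR_SHMEM_BANKS,         \
}

/* Shape of the initializer with the types restored. */
#define KINFO_INIT_RESTORED                             \
{                                                       \
    .mem.common.max_banks = NR_MEM_BANKS,               \
    .mem.common.type = MEMORY,                          \
    .shm_mem.common.max_banks = NR_SHMEM_BANKS,         \
    .shm_mem.common.type = STATIC_SHARED_MEMORY,        \
}

static struct kernel_info_sketch kinfo_v3(void)
{
    struct kernel_info_sketch k = KINFO_INIT_V3;
    return k;
}

static struct kernel_info_sketch kinfo_restored(void)
{
    struct kernel_info_sketch k = KINFO_INIT_RESTORED;
    return k;
}
```

In this sketch the v3 form leaves `shm_mem.common.type` equal to `MEMORY` instead of `STATIC_SHARED_MEMORY`, and the build gives no diagnostic for the difference.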
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code
2025-05-02 16:22 ` [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code Oleksii Kurochko
2025-05-02 18:13 ` Stefano Stabellini
@ 2025-05-05 9:08 ` Orzel, Michal
2025-05-05 11:56 ` Oleksii Kurochko
1 sibling, 1 reply; 30+ messages in thread
From: Orzel, Michal @ 2025-05-05 9:08 UTC (permalink / raw)
To: Oleksii Kurochko, xen-devel
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 02/05/2025 18:22, Oleksii Kurochko wrote:
> Move the following parts to common with the following changes:
> - struct kernel_info:
> - Create arch_kernel_info for arch specific kernel information.
> At the moment, it contains domain_type for Arm.
> - Rename vpl011 to vuart to have more generic name suitable for other archs.
Why do you want to make it common? At the moment it refers to vpl011, which is
Arm specific, so it would be better to move it to arch specific struct. Also,
there can be more than one emulated UART (especially if you want to make the
parsing of vuart common), in which case enum would be the best fit.
Also, one remark...
[...]
> diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
> new file mode 100644
> index 0000000000..c81e759423
> --- /dev/null
> +++ b/xen/include/xen/fdt-kernel.h
> @@ -0,0 +1,133 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * For Kernel image loading.
> + *
> + * Copyright (C) 2011 Citrix Systems, Inc.
> + */
> +#ifndef __XEN_FDT_KERNEL_H__
> +#define __XEN_FDT_KERNEL_H__
> +
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/types.h>
> +
> +#if __has_include(<asm/kernel.h>)
> +# include <asm/kernel.h>
> +#endif
> +
> +struct kernel_info {
> + struct domain *d;
> +
> + void *fdt; /* flat device tree */
> + paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
> + struct meminfo mem;
> +#ifdef CONFIG_STATIC_SHM
> + struct shared_meminfo shm_mem;
> +#endif
> +
> + /* kernel entry point */
> + paddr_t entry;
> +
> + /* grant table region */
> + paddr_t gnttab_start;
> + paddr_t gnttab_size;
> +
> + /* boot blob load addresses */
> + const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
> + const char* cmdline;
> + paddr_t dtb_paddr;
> + paddr_t initrd_paddr;
> +
> + /* Enable uart emulation */
> + bool vuart;
> +
> + /* Enable/Disable PV drivers interfaces */
> + uint16_t dom0less_feature;
> +
> + /* Interrupt controller phandle */
> + uint32_t phandle_intc;
> +
> + /* loader to use for this kernel */
> + void (*load)(struct kernel_info *info);
> +
> + /* loader specific state */
> + union {
> + struct {
> + paddr_t kernel_addr;
> + paddr_t len;
> +#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
> + paddr_t text_offset; /* 64-bit Image only */
> +#endif
> + paddr_t start; /* Must be 0 for 64-bit Image */
> + } zimage;
> + };
> +
> +#if __has_include(<asm/kernel.h>)
> + struct arch_kernel_info arch;
> +#endif
> +};
> +
> +static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, struct membanks, common);
> +}
> +
> +static inline const struct membanks *
> +kernel_info_get_mem_const(const struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, const struct membanks, common);
> +}
> +
> +#ifndef KERNEL_INFO_SHM_MEM_INIT
> +
> +#ifdef CONFIG_STATIC_SHM
> +#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
> +#else
> +#define KERNEL_INFO_SHM_MEM_INIT
> +#endif
> +
> +#endif /* KERNEL_INFO_SHM_MEM_INIT */
> +
> +#ifndef KERNEL_INFO_INIT
> +
> +#define KERNEL_INFO_INIT \
> +{ \
> + .mem.common.max_banks = NR_MEM_BANKS, \
> + KERNEL_INFO_SHM_MEM_INIT \
> +}
> +
> +#endif /* KERNEL_INFO_INIT */
> +
> +/*
> + * Probe the kernel to detemine its type and select a loader.
> + *
> + * Sets in info:
> + * ->type
Arm specific information in generic comment.
~Michal
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code
2025-05-05 9:08 ` Orzel, Michal
@ 2025-05-05 11:56 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 11:56 UTC (permalink / raw)
To: Orzel, Michal, xen-devel
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/5/25 11:08 AM, Orzel, Michal wrote:
> On 02/05/2025 18:22, Oleksii Kurochko wrote:
>> Move the following parts to common with the following changes:
>> - struct kernel_info:
>> - Create arch_kernel_info for arch specific kernel information.
>> At the moment, it contains domain_type for Arm.
>> - Rename vpl011 to vuart to have more generic name suitable for other archs.
> Why do you want to make it common? At the moment it refers to vpl011, which is
> Arm specific, so it would be better to move it to arch specific struct. Also,
> there can be more than one emulated UART (especially if you want to make the
> parsing of vuart common), in which case enum would be the best fit.
Good point. Actually, vuart/vpl011 could be moved to the arch-specific struct, as
it isn't used in common code anyway.
Thanks!
~ Oleksii
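A minimal sketch of what the direction discussed above could look like, assuming the emulated UART selection moves into the arch-specific structure and becomes an enum as suggested. The names `enum vuart_type`, `VUART_NONE`, `VUART_PL011` and `kinfo_has_vuart()` are hypothetical and are not taken from the series; the `kernel_info` shown is trimmed to the single member that matters here.

```c
#include <stdbool.h>

/* Hypothetical: which UART, if any, is emulated for the domain. */
enum vuart_type {
    VUART_NONE = 0,   /* no emulated UART (zero-initialized default) */
    VUART_PL011,      /* Arm PL011 emulation */
};

/* Arch-specific part; on Arm it would sit next to the domain type. */
struct arch_kernel_info {
    enum vuart_type vuart;
};

/* Trimmed-down common structure: only the arch member is shown. */
struct kernel_info {
    struct arch_kernel_info arch;
};

/* Common code can ask whether a UART is emulated without knowing which. */
static bool kinfo_has_vuart(const struct kernel_info *kinfo)
{
    return kinfo->arch.vuart != VUART_NONE;
}
```

An enum leaves room for further emulated UART types later, whereas a `bool` can only express the presence of a single one.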
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
` (2 preceding siblings ...)
2025-05-02 16:22 ` [PATCH v3 3/8] asm-generic: move parts of Arm's asm/kernel.h to common code Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 19:13 ` Stefano Stabellini
2025-05-02 16:22 ` [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common Oleksii Kurochko
` (3 subsequent siblings)
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Nothing in asm/static-shmem.h depends on asm/setup.h, so the inclusion of
asm/setup.h is dropped.
After this drop, compilation errors about implicit declarations of
device_tree_get_reg, map_device_irqs_to_domain and device_tree_get_u32 occur
when building dom0less-build.c (they are declared in asm/setup.h), so include
<asm/setup.h> there.
Also include <asm/setup.h> in dt-overlay.c, as it uses handle_device(),
which is declared in <asm/setup.h>.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V2-3:
- Nothing changed. Only rebase.
---
xen/arch/arm/dom0less-build.c | 1 +
xen/common/device-tree/dt-overlay.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index c0634dd61e..7eecd06d44 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -20,6 +20,7 @@
#include <asm/dom0less-build.h>
#include <asm/domain_build.h>
#include <asm/grant_table.h>
+#include <asm/setup.h>
#include <asm/static-memory.h>
#include <asm/static-shmem.h>
diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
index 81107cb48d..d184186c01 100644
--- a/xen/common/device-tree/dt-overlay.c
+++ b/xen/common/device-tree/dt-overlay.c
@@ -13,6 +13,8 @@
#include <xen/libfdt/libfdt.h>
#include <xen/xmalloc.h>
+#include <asm/setup.h>
+
#define DT_OVERLAY_MAX_SIZE KB(500)
static LIST_HEAD(overlay_tracker);
--
2.49.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h
2025-05-02 16:22 ` [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h Oleksii Kurochko
@ 2025-05-02 19:13 ` Stefano Stabellini
2025-05-05 8:19 ` Oleksii Kurochko
0 siblings, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 19:13 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Nothing in asm/static-shmem.h depends on asm/setup.h, so the inclusion of
> asm/setup.h is dropped.
Actually, this patch is not currently dropping any inclusions.
> After this drop, compilation errors about implicit declarations of
> device_tree_get_reg, map_device_irqs_to_domain and device_tree_get_u32 occur
> when building dom0less-build.c (they are declared in asm/setup.h), so include
> <asm/setup.h> there.
>
> Also include <asm/setup.h> in dt-overlay.c, as it uses handle_device(),
> which is declared in <asm/setup.h>.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V2-3:
> - Nothing changed. Only rebase.
> ---
> xen/arch/arm/dom0less-build.c | 1 +
> xen/common/device-tree/dt-overlay.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index c0634dd61e..7eecd06d44 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -20,6 +20,7 @@
> #include <asm/dom0less-build.h>
> #include <asm/domain_build.h>
> #include <asm/grant_table.h>
> +#include <asm/setup.h>
> #include <asm/static-memory.h>
> #include <asm/static-shmem.h>
>
> diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
> index 81107cb48d..d184186c01 100644
> --- a/xen/common/device-tree/dt-overlay.c
> +++ b/xen/common/device-tree/dt-overlay.c
> @@ -13,6 +13,8 @@
> #include <xen/libfdt/libfdt.h>
> #include <xen/xmalloc.h>
>
> +#include <asm/setup.h>
> +
> #define DT_OVERLAY_MAX_SIZE KB(500)
>
> static LIST_HEAD(overlay_tracker);
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h
2025-05-02 19:13 ` Stefano Stabellini
@ 2025-05-05 8:19 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 8:19 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk
On 5/2/25 9:13 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Nothing in asm/static-shmem.h depends on asm/setup.h, so the inclusion of
>> asm/setup.h is dropped.
> Actually, this patch is not currently dropping any inclusions.
The drop was lost during one of the rebases. I'll restore it.
Thanks.
~ Oleksii
>
>
>> After this drop, compilation errors about implicit declarations of
>> device_tree_get_reg, map_device_irqs_to_domain and device_tree_get_u32 occur
>> when building dom0less-build.c (they are declared in asm/setup.h), so include
>> <asm/setup.h> there.
>>
>> Also include <asm/setup.h> in dt-overlay.c, as it uses handle_device(),
>> which is declared in <asm/setup.h>.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in V2-3:
>> - Nothing changed. Only rebase.
>> ---
>> xen/arch/arm/dom0less-build.c | 1 +
>> xen/common/device-tree/dt-overlay.c | 2 ++
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index c0634dd61e..7eecd06d44 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -20,6 +20,7 @@
>> #include <asm/dom0less-build.h>
>> #include <asm/domain_build.h>
>> #include <asm/grant_table.h>
>> +#include <asm/setup.h>
>> #include <asm/static-memory.h>
>> #include <asm/static-shmem.h>
>>
>> diff --git a/xen/common/device-tree/dt-overlay.c b/xen/common/device-tree/dt-overlay.c
>> index 81107cb48d..d184186c01 100644
>> --- a/xen/common/device-tree/dt-overlay.c
>> +++ b/xen/common/device-tree/dt-overlay.c
>> @@ -13,6 +13,8 @@
>> #include <xen/libfdt/libfdt.h>
>> #include <xen/xmalloc.h>
>>
>> +#include <asm/setup.h>
>> +
>> #define DT_OVERLAY_MAX_SIZE KB(500)
>>
>> static LIST_HEAD(overlay_tracker);
>> --
>> 2.49.0
>>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
` (3 preceding siblings ...)
2025-05-02 16:22 ` [PATCH v3 4/8] arm/static-shmem.h: drop inclusion of asm/setup.h Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 20:55 ` Stefano Stabellini
2025-05-02 16:22 ` [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c Oleksii Kurochko
` (2 subsequent siblings)
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
No functional change. Only some function declarations are moved to xen/include/
headers, as they are expected to be used by the common domain-building or
dom0less code.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
- Drop inclusion of <asm/domain_build.h> from xen/fdt-domain-build.h.
- Add empty line after license tag in xen/fdt-domain-build.h.
---
Changes in v2:
- Add missed declaration of construct_hwdom().
- Drop unnecessary blank line.
- Introduce xen/fdt-domain-build.h and move parts of Arm's domain_build.h to
it.
- Update the commit message.
---
xen/arch/arm/acpi/domain_build.c | 1 +
xen/arch/arm/dom0less-build.c | 1 +
xen/arch/arm/domain_build.c | 1 +
xen/arch/arm/include/asm/domain_build.h | 21 ++----------
xen/arch/arm/kernel.c | 1 +
xen/arch/arm/static-shmem.c | 1 +
xen/include/xen/fdt-domain-build.h | 43 +++++++++++++++++++++++++
7 files changed, 51 insertions(+), 18 deletions(-)
create mode 100644 xen/include/xen/fdt-domain-build.h
diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
index f9ca8b47e5..1c3555d814 100644
--- a/xen/arch/arm/acpi/domain_build.c
+++ b/xen/arch/arm/acpi/domain_build.c
@@ -10,6 +10,7 @@
*/
#include <xen/compile.h>
+#include <xen/fdt-domain-build.h>
#include <xen/fdt-kernel.h>
#include <xen/mm.h>
#include <xen/sched.h>
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 7eecd06d44..0310579863 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/device_tree.h>
#include <xen/domain_page.h>
+#include <xen/fdt-domain-build.h>
#include <xen/fdt-kernel.h>
#include <xen/err.h>
#include <xen/event.h>
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 8c7a054718..9d649b06b3 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/init.h>
#include <xen/compile.h>
+#include <xen/fdt-domain-build.h>
#include <xen/fdt-kernel.h>
#include <xen/lib.h>
#include <xen/llc-coloring.h>
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index df1c0fe301..397e408a1f 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -5,28 +5,13 @@
#include <xen/sched.h>
typedef __be32 gic_interrupt_t[3];
-typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
- unsigned int order, void *extra);
-bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
- alloc_domheap_mem_cb cb, void *extra);
-bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
- paddr_t tot_size);
-void allocate_memory(struct domain *d, struct kernel_info *kinfo);
-int construct_domain(struct domain *d, struct kernel_info *kinfo);
-int construct_hwdom(struct kernel_info *kinfo,
- const struct dt_device_node *node);
+
int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
-int make_chosen_node(const struct kernel_info *kinfo);
-int make_cpus_node(const struct domain *d, void *fdt);
-int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
- int addrcells, int sizecells);
-int make_memory_node(const struct kernel_info *kinfo, int addrcells,
- int sizecells, const struct membanks *mem);
int make_psci_node(void *fdt);
-int make_timer_node(const struct kernel_info *kinfo);
void evtchn_allocate(struct domain *d);
-unsigned int get_allocation_size(paddr_t size);
+int construct_hwdom(struct kernel_info *kinfo,
+ const struct dt_device_node *node);
/*
* Helper to write an interrupts with the GIC format
diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
index f00fc388db..5759a3470a 100644
--- a/xen/arch/arm/kernel.c
+++ b/xen/arch/arm/kernel.c
@@ -7,6 +7,7 @@
#include <xen/byteorder.h>
#include <xen/domain_page.h>
#include <xen/errno.h>
+#include <xen/fdt-domain-build.h>
#include <xen/fdt-kernel.h>
#include <xen/guest_access.h>
#include <xen/gunzip.h>
diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index 14ae48fb1e..1f8441d920 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/device_tree.h>
+#include <xen/fdt-domain-build.h>
#include <xen/libfdt/libfdt.h>
#include <xen/rangeset.h>
#include <xen/sched.h>
diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
new file mode 100644
index 0000000000..b79e9fabfe
--- /dev/null
+++ b/xen/include/xen/fdt-domain-build.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __XEN_FDT_DOMAIN_BUILD_H__
+#define __XEN_FDT_DOMAIN_BUILD_H__
+
+#include <xen/bootfdt.h>
+#include <xen/device_tree.h>
+#include <xen/fdt-kernel.h>
+#include <xen/types.h>
+
+struct domain;
+struct page_info;
+struct membanks;
+
+typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
+ unsigned int order, void *extra);
+bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
+ alloc_domheap_mem_cb cb, void *extra);
+
+bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
+ paddr_t tot_size);
+void allocate_memory(struct domain *d, struct kernel_info *kinfo);
+int construct_domain(struct domain *d, struct kernel_info *kinfo);
+int make_chosen_node(const struct kernel_info *kinfo);
+int make_cpus_node(const struct domain *d, void *fdt);
+int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
+ int addrcells, int sizecells);
+int make_memory_node(const struct kernel_info *kinfo, int addrcells,
+ int sizecells, const struct membanks *mem);
+int make_timer_node(const struct kernel_info *kinfo);
+
+unsigned int get_allocation_size(paddr_t size);
+
+#endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
2.49.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common
2025-05-02 16:22 ` [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common Oleksii Kurochko
@ 2025-05-02 20:55 ` Stefano Stabellini
2025-05-05 11:08 ` Oleksii Kurochko
0 siblings, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 20:55 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> No functional change. Only some function declarations are moved to xen/include/
> headers, as they are expected to be used by the common domain-building or
> dom0less code.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v3:
> - Drop inclusion of <asm/domain_build.h> from xen/fdt-domain-build.h.
> - Add empty line after license tag in xen/fdt-domain-build.h.
> ---
> Changes in v2:
> - Add missed declaration of construct_hwdom().
> - Drop unnecessary blank line.
> - Introduce xen/fdt-domain-build.h and move parts of Arm's domain_build.h to
> it.
> - Update the commit message.
> ---
> xen/arch/arm/acpi/domain_build.c | 1 +
> xen/arch/arm/dom0less-build.c | 1 +
> xen/arch/arm/domain_build.c | 1 +
> xen/arch/arm/include/asm/domain_build.h | 21 ++----------
> xen/arch/arm/kernel.c | 1 +
> xen/arch/arm/static-shmem.c | 1 +
> xen/include/xen/fdt-domain-build.h | 43 +++++++++++++++++++++++++
> 7 files changed, 51 insertions(+), 18 deletions(-)
> create mode 100644 xen/include/xen/fdt-domain-build.h
>
> diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
> index f9ca8b47e5..1c3555d814 100644
> --- a/xen/arch/arm/acpi/domain_build.c
> +++ b/xen/arch/arm/acpi/domain_build.c
> @@ -10,6 +10,7 @@
> */
>
> #include <xen/compile.h>
> +#include <xen/fdt-domain-build.h>
> #include <xen/fdt-kernel.h>
> #include <xen/mm.h>
> #include <xen/sched.h>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index 7eecd06d44..0310579863 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> #include <xen/device_tree.h>
> #include <xen/domain_page.h>
> +#include <xen/fdt-domain-build.h>
> #include <xen/fdt-kernel.h>
> #include <xen/err.h>
> #include <xen/event.h>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 8c7a054718..9d649b06b3 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> #include <xen/init.h>
> #include <xen/compile.h>
> +#include <xen/fdt-domain-build.h>
> #include <xen/fdt-kernel.h>
> #include <xen/lib.h>
> #include <xen/llc-coloring.h>
> diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
> index df1c0fe301..397e408a1f 100644
> --- a/xen/arch/arm/include/asm/domain_build.h
> +++ b/xen/arch/arm/include/asm/domain_build.h
> @@ -5,28 +5,13 @@
> #include <xen/sched.h>
>
> typedef __be32 gic_interrupt_t[3];
> -typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
> - unsigned int order, void *extra);
> -bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> - alloc_domheap_mem_cb cb, void *extra);
> -bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> - paddr_t tot_size);
> -void allocate_memory(struct domain *d, struct kernel_info *kinfo);
> -int construct_domain(struct domain *d, struct kernel_info *kinfo);
> -int construct_hwdom(struct kernel_info *kinfo,
> - const struct dt_device_node *node);
> +
> int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
> -int make_chosen_node(const struct kernel_info *kinfo);
> -int make_cpus_node(const struct domain *d, void *fdt);
> -int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
> - int addrcells, int sizecells);
> -int make_memory_node(const struct kernel_info *kinfo, int addrcells,
> - int sizecells, const struct membanks *mem);
> int make_psci_node(void *fdt);
> -int make_timer_node(const struct kernel_info *kinfo);
> void evtchn_allocate(struct domain *d);
>
> -unsigned int get_allocation_size(paddr_t size);
> +int construct_hwdom(struct kernel_info *kinfo,
> + const struct dt_device_node *node);
At the end of the series construct_hwdom is only called from within
xen/arch/arm/domain_build.c, so it could be made static and removed from
here. However, one of my review comments was that I think we should
still call construct_hwdom from xen/common/device-tree/dom0less-build.c.
So I think we should keep it.
> /*
> * Helper to write an interrupts with the GIC format
> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
> index f00fc388db..5759a3470a 100644
> --- a/xen/arch/arm/kernel.c
> +++ b/xen/arch/arm/kernel.c
> @@ -7,6 +7,7 @@
> #include <xen/byteorder.h>
> #include <xen/domain_page.h>
> #include <xen/errno.h>
> +#include <xen/fdt-domain-build.h>
> #include <xen/fdt-kernel.h>
> #include <xen/guest_access.h>
> #include <xen/gunzip.h>
> diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
> index 14ae48fb1e..1f8441d920 100644
> --- a/xen/arch/arm/static-shmem.c
> +++ b/xen/arch/arm/static-shmem.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
>
> #include <xen/device_tree.h>
> +#include <xen/fdt-domain-build.h>
> #include <xen/libfdt/libfdt.h>
> #include <xen/rangeset.h>
> #include <xen/sched.h>
> diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
> new file mode 100644
> index 0000000000..b79e9fabfe
> --- /dev/null
> +++ b/xen/include/xen/fdt-domain-build.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __XEN_FDT_DOMAIN_BUILD_H__
> +#define __XEN_FDT_DOMAIN_BUILD_H__
> +
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/fdt-kernel.h>
> +#include <xen/types.h>
> +
> +struct domain;
> +struct page_info;
> +struct membanks;
> +
> +typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
> + unsigned int order, void *extra);
> +bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> + alloc_domheap_mem_cb cb, void *extra);
> +
> +bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> + paddr_t tot_size);
> +void allocate_memory(struct domain *d, struct kernel_info *kinfo);
> +int construct_domain(struct domain *d, struct kernel_info *kinfo);
> +int make_chosen_node(const struct kernel_info *kinfo);
> +int make_cpus_node(const struct domain *d, void *fdt);
> +int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
> + int addrcells, int sizecells);
> +int make_memory_node(const struct kernel_info *kinfo, int addrcells,
> + int sizecells, const struct membanks *mem);
> +int make_timer_node(const struct kernel_info *kinfo);
> +
> +unsigned int get_allocation_size(paddr_t size);
Many of these functions are not actually moved until later patches. It
would be best to move the declaration at the time the function is also
moved. But if that is difficult for any reason, this is also OK.
> +#endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common
2025-05-02 20:55 ` Stefano Stabellini
@ 2025-05-05 11:08 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 11:08 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 10:55 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Nothing changed. Only some function declarations are moved to xen/include/
>> headers as they are expected to be used by common code of domain building
>> or dom0less.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in v3:
>> - Drop inclusion of <asm/domain_build.h> from xen/fdt-domain-build.h.
>> - Add empty line after license tag in xen/fdt-domain-build.h.
>> ---
>> Changes in v2:
>> - Add missed declaration of construct_hwdom().
>> - Drop unnecessary blank line.
>> - Introduce xen/fdt-domain-build.h and move parts of Arm's domain_build.h to
>> it.
>> - Update the commit message.
>> ---
>> xen/arch/arm/acpi/domain_build.c | 1 +
>> xen/arch/arm/dom0less-build.c | 1 +
>> xen/arch/arm/domain_build.c | 1 +
>> xen/arch/arm/include/asm/domain_build.h | 21 ++----------
>> xen/arch/arm/kernel.c | 1 +
>> xen/arch/arm/static-shmem.c | 1 +
>> xen/include/xen/fdt-domain-build.h | 43 +++++++++++++++++++++++++
>> 7 files changed, 51 insertions(+), 18 deletions(-)
>> create mode 100644 xen/include/xen/fdt-domain-build.h
>>
>> diff --git a/xen/arch/arm/acpi/domain_build.c b/xen/arch/arm/acpi/domain_build.c
>> index f9ca8b47e5..1c3555d814 100644
>> --- a/xen/arch/arm/acpi/domain_build.c
>> +++ b/xen/arch/arm/acpi/domain_build.c
>> @@ -10,6 +10,7 @@
>> */
>>
>> #include <xen/compile.h>
>> +#include <xen/fdt-domain-build.h>
>> #include <xen/fdt-kernel.h>
>> #include <xen/mm.h>
>> #include <xen/sched.h>
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index 7eecd06d44..0310579863 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> #include <xen/device_tree.h>
>> #include <xen/domain_page.h>
>> +#include <xen/fdt-domain-build.h>
>> #include <xen/fdt-kernel.h>
>> #include <xen/err.h>
>> #include <xen/event.h>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 8c7a054718..9d649b06b3 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> #include <xen/init.h>
>> #include <xen/compile.h>
>> +#include <xen/fdt-domain-build.h>
>> #include <xen/fdt-kernel.h>
>> #include <xen/lib.h>
>> #include <xen/llc-coloring.h>
>> diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
>> index df1c0fe301..397e408a1f 100644
>> --- a/xen/arch/arm/include/asm/domain_build.h
>> +++ b/xen/arch/arm/include/asm/domain_build.h
>> @@ -5,28 +5,13 @@
>> #include <xen/sched.h>
>>
>> typedef __be32 gic_interrupt_t[3];
>> -typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
>> - unsigned int order, void *extra);
>> -bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>> - alloc_domheap_mem_cb cb, void *extra);
>> -bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>> - paddr_t tot_size);
>> -void allocate_memory(struct domain *d, struct kernel_info *kinfo);
>> -int construct_domain(struct domain *d, struct kernel_info *kinfo);
>> -int construct_hwdom(struct kernel_info *kinfo,
>> - const struct dt_device_node *node);
>> +
>> int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
>> -int make_chosen_node(const struct kernel_info *kinfo);
>> -int make_cpus_node(const struct domain *d, void *fdt);
>> -int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
>> - int addrcells, int sizecells);
>> -int make_memory_node(const struct kernel_info *kinfo, int addrcells,
>> - int sizecells, const struct membanks *mem);
>> int make_psci_node(void *fdt);
>> -int make_timer_node(const struct kernel_info *kinfo);
>> void evtchn_allocate(struct domain *d);
>>
>> -unsigned int get_allocation_size(paddr_t size);
>> +int construct_hwdom(struct kernel_info *kinfo,
>> + const struct dt_device_node *node);
> At the end of the series construct_hwdom is only called from within
> xen/arch/arm/domain_build.c, so it could be made static and removed from
> here. However, one of my review comments was that I think we should
> still call construct_hwdom from xen/common/device-tree/dom0less-build.c.
> So I think we should keep it.
I will move this change to the patch where this function is actually used.
>
>
>> /*
>> * Helper to write an interrupts with the GIC format
>> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
>> index f00fc388db..5759a3470a 100644
>> --- a/xen/arch/arm/kernel.c
>> +++ b/xen/arch/arm/kernel.c
>> @@ -7,6 +7,7 @@
>> #include <xen/byteorder.h>
>> #include <xen/domain_page.h>
>> #include <xen/errno.h>
>> +#include <xen/fdt-domain-build.h>
>> #include <xen/fdt-kernel.h>
>> #include <xen/guest_access.h>
>> #include <xen/gunzip.h>
>> diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
>> index 14ae48fb1e..1f8441d920 100644
>> --- a/xen/arch/arm/static-shmem.c
>> +++ b/xen/arch/arm/static-shmem.c
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>>
>> #include <xen/device_tree.h>
>> +#include <xen/fdt-domain-build.h>
>> #include <xen/libfdt/libfdt.h>
>> #include <xen/rangeset.h>
>> #include <xen/sched.h>
>> diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
>> new file mode 100644
>> index 0000000000..b79e9fabfe
>> --- /dev/null
>> +++ b/xen/include/xen/fdt-domain-build.h
>> @@ -0,0 +1,43 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#ifndef __XEN_FDT_DOMAIN_BUILD_H__
>> +#define __XEN_FDT_DOMAIN_BUILD_H__
>> +
>> +#include <xen/bootfdt.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/fdt-kernel.h>
>> +#include <xen/types.h>
>> +
>> +struct domain;
>> +struct page_info;
>> +struct membanks;
>> +
>> +typedef bool (*alloc_domheap_mem_cb)(struct domain *d, struct page_info *pg,
>> + unsigned int order, void *extra);
>> +bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>> + alloc_domheap_mem_cb cb, void *extra);
>> +
>> +bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>> + paddr_t tot_size);
>> +void allocate_memory(struct domain *d, struct kernel_info *kinfo);
>> +int construct_domain(struct domain *d, struct kernel_info *kinfo);
>> +int make_chosen_node(const struct kernel_info *kinfo);
>> +int make_cpus_node(const struct domain *d, void *fdt);
>> +int make_hypervisor_node(struct domain *d, const struct kernel_info *kinfo,
>> + int addrcells, int sizecells);
>> +int make_memory_node(const struct kernel_info *kinfo, int addrcells,
>> + int sizecells, const struct membanks *mem);
>> +int make_timer_node(const struct kernel_info *kinfo);
>> +
>> +unsigned int get_allocation_size(paddr_t size);
> Many of these functions are not actually moved until later patches. It
> would be best to move the declaration at the time the function is also
> moved. But if that is difficult for any reason, this is also OK.
Then I'll move the allocate_*() declarations to the patch where the
definitions of these functions are introduced.
~ Oleksii
>> +#endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> --
>> 2.49.0
>>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
` (4 preceding siblings ...)
2025-05-02 16:22 ` [PATCH v3 5/8] asm-generic: move some parts of Arm's domain_build.h to common Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 19:36 ` Stefano Stabellini
2025-05-02 16:22 ` [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c Oleksii Kurochko
2025-05-02 16:22 ` [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c Oleksii Kurochko
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
The following functions don't contain anything arch-specific, so they are
moved to common code:
- kernel_probe()
- kernel_load()
- output_length()
Only the functions necessary for dom0less are moved.
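For readers unfamiliar with output_length(): it reads the last four bytes of
the gzip stream, which hold the uncompressed size (the ISIZE field, stored
least significant byte first, modulo 2^32). Below is a minimal standalone
sketch of that lookup, not part of the patch. The name sketch_gzip_isize() and
the "return 0 when too short" convention are assumptions made for
illustration; the function in the patch dereferences a uint32_t pointer
directly and has no length check.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): read the gzip ISIZE trailer,
 * i.e. the last 4 bytes of the stream, least significant byte first.
 */
static uint32_t sketch_gzip_isize(const unsigned char *image, size_t image_len)
{
    const unsigned char *p;

    if ( image_len < 4 )
        return 0;   /* too short to carry a trailer; sketch-only convention */

    p = image + image_len - 4;

    /* Assemble byte by byte so host endianness and alignment do not matter. */
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Reading byte by byte is only one possible design; it gives the same result as
the pointer cast on a little-endian host.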
The following changes are done:
- Swap __init and return type of kernel_decompress() function to be
consistent with definitions of functions in other files. The same
for output_length().
- Wrap the call of kernel_uimage_probe() in kernel_probe() in
"#ifdef CONFIG_ARM": uImage isn't really used nowadays, so the call is
kept only for compatibility with Arm code.
- Introduce kernel_zimage_probe() to cover the case where an arch has a
different zimage header.
- Add ASSERT() to kernel_load() to check that its argument isn't NULL.
- Make kernel_uimage_probe() non-static in Arm's code as it is used in
common/kernel.c.
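The resulting probe order can be sketched as follows. This is a reduced,
standalone illustration and not the code from the patch: every sketch_* name
and both structs are hypothetical stand-ins, the runtime flag arch_has_uimage
stands in for "#ifdef CONFIG_ARM", and assert() stands in for Xen's ASSERT().
The real kernel_uimage_probe() also decompresses a gzip'ed uImage itself,
which is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for bootmodule / kernel_info. */
struct sketch_module {
    int has_uimage_header;   /* pretend a uImage header was found */
    int is_gzip;             /* pretend the payload is gzip-compressed */
    int decompressed;        /* set once the fake decompressor ran */
};

struct sketch_info {
    void (*load)(struct sketch_info *info);
    int loaded;
};

static void sketch_zimage_load(struct sketch_info *info)
{
    info->loaded = 1;
}

/* -ENOENT means "no uImage header found, keep probing". */
static int sketch_uimage_probe(struct sketch_info *info,
                               const struct sketch_module *mod)
{
    if ( !mod->has_uimage_header )
        return -ENOENT;

    info->load = sketch_zimage_load;
    return 0;
}

/* -EINVAL means "not gzip", which the caller treats as non-fatal. */
static int sketch_decompress(struct sketch_module *mod)
{
    if ( !mod->is_gzip )
        return -EINVAL;

    mod->decompressed = 1;
    return 0;
}

/* Per-arch hook: each architecture recognises its own image header. */
static int sketch_zimage_probe(struct sketch_info *info,
                               const struct sketch_module *mod)
{
    (void)mod;
    info->load = sketch_zimage_load;
    return 0;
}

static int sketch_kernel_probe(struct sketch_info *info,
                               struct sketch_module *mod, int arch_has_uimage)
{
    int rc;

    if ( arch_has_uimage )   /* stands in for #ifdef CONFIG_ARM */
    {
        rc = sketch_uimage_probe(info, mod);
        if ( rc != -ENOENT )
            return rc;
    }

    rc = sketch_decompress(mod);
    if ( rc && rc != -EINVAL )
        return rc;

    return sketch_zimage_probe(info, mod);
}

static void sketch_kernel_load(struct sketch_info *info)
{
    assert(info && info->load);   /* mirrors the check added via ASSERT() */
    info->load(info);
}
```

With this split, an architecture without uImage support only has to provide
its own zimage probe hook; the common code keeps the decompress step and the
fall-through logic.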
Introduce CONFIG_DOMAIN_BUILD_HELPERS to avoid providing stubs for archs
which don't provide enough functionality to enable it.
Select CONFIG_DOMAIN_BUILD_HELPERS for CONFIG_ARM as only Arm supports
it at the moment.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Change in v3:
- Empty line after license tag for newly introduced files.
---
Change in v2:
- Drop inclusion of asm/kernel.h in kernel.c as everything necessary has
been moved to xen/fdt-kernel.h.
---
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/kernel.c | 221 +---------------------------
xen/common/Kconfig | 9 +-
xen/common/device-tree/Makefile | 1 +
xen/common/device-tree/kernel.c | 242 +++++++++++++++++++++++++++++++
xen/include/asm-generic/kernel.h | 141 ++++++++++++++++++
xen/include/xen/fdt-kernel.h | 13 ++
7 files changed, 412 insertions(+), 216 deletions(-)
create mode 100644 xen/common/device-tree/kernel.c
create mode 100644 xen/include/asm-generic/kernel.h
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index d0e0a7753c..3321d89068 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -11,6 +11,7 @@ config ARM_64
config ARM
def_bool y
+ select DOMAIN_BUILD_HELPERS
select FUNCTION_ALIGNMENT_4B
select GENERIC_UART_INIT
select HAS_ALTERNATIVE if HAS_VMAP
diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
index 5759a3470a..1168c21e97 100644
--- a/xen/arch/arm/kernel.c
+++ b/xen/arch/arm/kernel.c
@@ -163,105 +163,6 @@ static void __init kernel_zimage_load(struct kernel_info *info)
iounmap(kernel);
}
-static __init uint32_t output_length(char *image, unsigned long image_len)
-{
- return *(uint32_t *)&image[image_len - 4];
-}
-
-static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
-{
- char *output, *input;
- char magic[2];
- int rc;
- unsigned int kernel_order_out;
- paddr_t output_size;
- struct page_info *pages;
- mfn_t mfn;
- int i;
- paddr_t addr = mod->start;
- paddr_t size = mod->size;
-
- if ( size < offset )
- return -EINVAL;
-
- /*
- * It might be that gzip header does not appear at the start address
- * (e.g. in case of compressed uImage) so take into account offset to
- * gzip header.
- */
- addr += offset;
- size -= offset;
-
- if ( size < 2 )
- return -EINVAL;
-
- copy_from_paddr(magic, addr, sizeof(magic));
-
- /* only gzip is supported */
- if ( !gzip_check(magic, size) )
- return -EINVAL;
-
- input = ioremap_cache(addr, size);
- if ( input == NULL )
- return -EFAULT;
-
- output_size = output_length(input, size);
- kernel_order_out = get_order_from_bytes(output_size);
- pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
- if ( pages == NULL )
- {
- iounmap(input);
- return -ENOMEM;
- }
- mfn = page_to_mfn(pages);
- output = vmap_contig(mfn, 1 << kernel_order_out);
-
- rc = perform_gunzip(output, input, size);
- clean_dcache_va_range(output, output_size);
- iounmap(input);
- vunmap(output);
-
- if ( rc )
- {
- free_domheap_pages(pages, kernel_order_out);
- return rc;
- }
-
- mod->start = page_to_maddr(pages);
- mod->size = output_size;
-
- /*
- * Need to free pages after output_size here because they won't be
- * freed by discard_initial_modules
- */
- i = PFN_UP(output_size);
- for ( ; i < (1 << kernel_order_out); i++ )
- free_domheap_page(pages + i);
-
- /*
- * When using static heap feature, don't give bootmodules memory back to
- * the heap allocator
- */
- if ( using_static_heap )
- return 0;
-
- /*
- * When freeing the kernel, we need to pass the module start address and
- * size as they were before taking an offset to gzip header into account,
- * so that the entire region will be freed.
- */
- addr -= offset;
- size += offset;
-
- /*
- * Free the original kernel, update the pointers to the
- * decompressed kernel
- */
- fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
-
- return 0;
-}
-
/*
* Uimage CPU Architecture Codes
*/
@@ -274,8 +175,8 @@ static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
/*
* Check if the image is a uImage and setup kernel_info
*/
-static int __init kernel_uimage_probe(struct kernel_info *info,
- struct bootmodule *mod)
+int __init kernel_uimage_probe(struct kernel_info *info,
+ struct bootmodule *mod)
{
struct {
__be32 magic; /* Image Header Magic Number */
@@ -505,130 +406,20 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
return 0;
}
-int __init kernel_probe(struct kernel_info *info,
- const struct dt_device_node *domain)
+int __init kernel_zimage_probe(struct kernel_info *info, paddr_t addr,
+ paddr_t size)
{
- struct bootmodule *mod = NULL;
- struct bootcmdline *cmd = NULL;
- struct dt_device_node *node;
- u64 kernel_addr, initrd_addr, dtb_addr, size;
int rc;
- /*
- * We need to initialize start to 0. This field may be populated during
- * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
- * uimage.ep).
- * We will use this to determine if the image has a fixed entry point or
- * the load address should be used as the start address.
- */
- info->entry = 0;
-
- /* domain is NULL only for the hardware domain */
- if ( domain == NULL )
- {
- ASSERT(is_hardware_domain(info->d));
-
- mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
-
- info->kernel_bootmodule = mod;
- info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
-
- cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
- if ( cmd )
- info->cmdline = &cmd->cmdline[0];
- }
- else
- {
- const char *name = NULL;
-
- dt_for_each_child_node(domain, node)
- {
- if ( dt_device_is_compatible(node, "multiboot,kernel") )
- {
- u32 len;
- const __be32 *val;
-
- val = dt_get_property(node, "reg", &len);
- dt_get_range(&val, node, &kernel_addr, &size);
- mod = boot_module_find_by_addr_and_kind(
- BOOTMOD_KERNEL, kernel_addr);
- info->kernel_bootmodule = mod;
- }
- else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
- {
- u32 len;
- const __be32 *val;
-
- val = dt_get_property(node, "reg", &len);
- dt_get_range(&val, node, &initrd_addr, &size);
- info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
- BOOTMOD_RAMDISK, initrd_addr);
- }
- else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
- {
- uint32_t len;
- const __be32 *val;
-
- val = dt_get_property(node, "reg", &len);
- if ( val == NULL )
- continue;
- dt_get_range(&val, node, &dtb_addr, &size);
- info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
- BOOTMOD_GUEST_DTB, dtb_addr);
- }
- else
- continue;
- }
- name = dt_node_name(domain);
- cmd = boot_cmdline_find_by_name(name);
- if ( cmd )
- info->cmdline = &cmd->cmdline[0];
- }
- if ( !mod || !mod->size )
- {
- printk(XENLOG_ERR "Missing kernel boot module?\n");
- return -ENOENT;
- }
-
- printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
- info->d, info->kernel_bootmodule->start);
- if ( info->initrd_bootmodule )
- printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
- info->initrd_bootmodule->start);
-
- /*
- * uImage header always appears at the top of the image (even compressed),
- * so it needs to be probed first. Note that in case of compressed uImage,
- * kernel_decompress is called from kernel_uimage_probe making the function
- * self-containing (i.e. fall through only in case of a header not found).
- */
- rc = kernel_uimage_probe(info, mod);
- if ( rc != -ENOENT )
- return rc;
-
- /*
- * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
- * At this point, gzip header appears (if at all) at the top of the image,
- * so pass 0 as an offset.
- */
- rc = kernel_decompress(mod, 0);
- if ( rc && rc != -EINVAL )
- return rc;
-
#ifdef CONFIG_ARM_64
- rc = kernel_zimage64_probe(info, mod->start, mod->size);
+ rc = kernel_zimage64_probe(info, addr, size);
if (rc < 0)
#endif
- rc = kernel_zimage32_probe(info, mod->start, mod->size);
+ rc = kernel_zimage32_probe(info, addr, size);
return rc;
}
-void __init kernel_load(struct kernel_info *info)
-{
- info->load(info);
-}
-
/*
* Local variables:
* mode: C
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index be38abf9e1..38981f1d11 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -14,13 +14,20 @@ config CORE_PARKING
config DOM0LESS_BOOT
bool "Dom0less boot support" if EXPERT
- depends on HAS_DOM0LESS
+ depends on HAS_DOM0LESS && DOMAIN_BUILD_HELPERS
default y
help
Dom0less boot support enables Xen to create and start domU guests during
Xen boot without the need of a control domain (Dom0), which could be
present anyway.
+config DOMAIN_BUILD_HELPERS
+ bool
+ help
+ Introduce functions necessary for working with domain creation, kernel,
+ etc. As an example, these types of functions are going to be used by
+ CONFIG_DOM0LESS_BOOT.
+
config GRANT_TABLE
bool "Grant table support" if EXPERT
default y
diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
index f3dafc9b81..e88a4d5799 100644
--- a/xen/common/device-tree/Makefile
+++ b/xen/common/device-tree/Makefile
@@ -4,3 +4,4 @@ obj-y += device-tree.o
obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
obj-y += intc.o
+obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += kernel.o
diff --git a/xen/common/device-tree/kernel.c b/xen/common/device-tree/kernel.c
new file mode 100644
index 0000000000..1bf3bbf64e
--- /dev/null
+++ b/xen/common/device-tree/kernel.c
@@ -0,0 +1,242 @@
+#include <xen/bootfdt.h>
+#include <xen/device_tree.h>
+#include <xen/fdt-kernel.h>
+#include <xen/errno.h>
+#include <xen/gunzip.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/pfn.h>
+#include <xen/sched.h>
+#include <xen/types.h>
+#include <xen/vmap.h>
+
+#include <asm/page.h>
+#include <asm/setup.h>
+
+static uint32_t __init output_length(char *image, unsigned long image_len)
+{
+ return *(uint32_t *)&image[image_len - 4];
+}
+
+int __init kernel_decompress(struct bootmodule *mod, uint32_t offset)
+{
+ char *output, *input;
+ char magic[2];
+ int rc;
+ unsigned int kernel_order_out;
+ paddr_t output_size;
+ struct page_info *pages;
+ mfn_t mfn;
+ int i;
+ paddr_t addr = mod->start;
+ paddr_t size = mod->size;
+
+ if ( size < offset )
+ return -EINVAL;
+
+ /*
+ * It might be that gzip header does not appear at the start address
+ * (e.g. in case of compressed uImage) so take into account offset to
+ * gzip header.
+ */
+ addr += offset;
+ size -= offset;
+
+ if ( size < 2 )
+ return -EINVAL;
+
+ copy_from_paddr(magic, addr, sizeof(magic));
+
+ /* only gzip is supported */
+ if ( !gzip_check(magic, size) )
+ return -EINVAL;
+
+ input = ioremap_cache(addr, size);
+ if ( input == NULL )
+ return -EFAULT;
+
+ output_size = output_length(input, size);
+ kernel_order_out = get_order_from_bytes(output_size);
+ pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
+ if ( pages == NULL )
+ {
+ iounmap(input);
+ return -ENOMEM;
+ }
+ mfn = page_to_mfn(pages);
+ output = vmap_contig(mfn, 1 << kernel_order_out);
+
+ rc = perform_gunzip(output, input, size);
+ clean_dcache_va_range(output, output_size);
+ iounmap(input);
+ vunmap(output);
+
+ if ( rc )
+ {
+ free_domheap_pages(pages, kernel_order_out);
+ return rc;
+ }
+
+ mod->start = page_to_maddr(pages);
+ mod->size = output_size;
+
+ /*
+ * Need to free pages after output_size here because they won't be
+ * freed by discard_initial_modules
+ */
+ i = PFN_UP(output_size);
+ for ( ; i < (1 << kernel_order_out); i++ )
+ free_domheap_page(pages + i);
+
+ /*
+ * When using static heap feature, don't give bootmodules memory back to
+ * the heap allocator
+ */
+ if ( using_static_heap )
+ return 0;
+
+ /*
+ * When freeing the kernel, we need to pass the module start address and
+ * size as they were before taking an offset to gzip header into account,
+ * so that the entire region will be freed.
+ */
+ addr -= offset;
+ size += offset;
+
+ /*
+ * Free the original kernel, update the pointers to the
+ * decompressed kernel
+ */
+ fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
+
+ return 0;
+}
+
+int __init kernel_probe(struct kernel_info *info,
+ const struct dt_device_node *domain)
+{
+ struct bootmodule *mod = NULL;
+ struct bootcmdline *cmd = NULL;
+ struct dt_device_node *node;
+ u64 kernel_addr, initrd_addr, dtb_addr, size;
+ int rc;
+
+ /*
+ * We need to initialize start to 0. This field may be populated during
+ * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
+ * uimage.ep).
+ * We will use this to determine if the image has a fixed entry point or
+ * the load address should be used as the start address.
+ */
+ info->entry = 0;
+
+ /* domain is NULL only for the hardware domain */
+ if ( domain == NULL )
+ {
+ ASSERT(is_hardware_domain(info->d));
+
+ mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
+
+ info->kernel_bootmodule = mod;
+ info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
+
+ cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
+ if ( cmd )
+ info->cmdline = &cmd->cmdline[0];
+ }
+ else
+ {
+ const char *name = NULL;
+
+ dt_for_each_child_node(domain, node)
+ {
+ if ( dt_device_is_compatible(node, "multiboot,kernel") )
+ {
+ u32 len;
+ const __be32 *val;
+
+ val = dt_get_property(node, "reg", &len);
+ dt_get_range(&val, node, &kernel_addr, &size);
+ mod = boot_module_find_by_addr_and_kind(
+ BOOTMOD_KERNEL, kernel_addr);
+ info->kernel_bootmodule = mod;
+ }
+ else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
+ {
+ u32 len;
+ const __be32 *val;
+
+ val = dt_get_property(node, "reg", &len);
+ dt_get_range(&val, node, &initrd_addr, &size);
+ info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
+ BOOTMOD_RAMDISK, initrd_addr);
+ }
+ else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
+ {
+ uint32_t len;
+ const __be32 *val;
+
+ val = dt_get_property(node, "reg", &len);
+ if ( val == NULL )
+ continue;
+ dt_get_range(&val, node, &dtb_addr, &size);
+ info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
+ BOOTMOD_GUEST_DTB, dtb_addr);
+ }
+ else
+ continue;
+ }
+ name = dt_node_name(domain);
+ cmd = boot_cmdline_find_by_name(name);
+ if ( cmd )
+ info->cmdline = &cmd->cmdline[0];
+ }
+ if ( !mod || !mod->size )
+ {
+ printk(XENLOG_ERR "Missing kernel boot module?\n");
+ return -ENOENT;
+ }
+
+ printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
+ info->d, info->kernel_bootmodule->start);
+ if ( info->initrd_bootmodule )
+ printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
+ info->initrd_bootmodule->start);
+
+ /*
+ * uImage isn't really used nowadays, so the kernel_uimage_probe() call is
+ * kept here just for compatibility with Arm code.
+ */
+#ifdef CONFIG_ARM
+ /*
+ * uImage header always appears at the top of the image (even compressed),
+ * so it needs to be probed first. Note that in case of compressed uImage,
+ * kernel_decompress is called from kernel_uimage_probe making the function
+ * self-containing (i.e. fall through only in case of a header not found).
+ */
+ rc = kernel_uimage_probe(info, mod);
+ if ( rc != -ENOENT )
+ return rc;
+#endif
+
+ /*
+ * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
+ * At this point, gzip header appears (if at all) at the top of the image,
+ * so pass 0 as an offset.
+ */
+ rc = kernel_decompress(mod, 0);
+ if ( rc && rc != -EINVAL )
+ return rc;
+
+ rc = kernel_zimage_probe(info, mod->start, mod->size);
+
+ return rc;
+}
+
+void __init kernel_load(struct kernel_info *info)
+{
+ ASSERT(info && info->load);
+
+ info->load(info);
+}
diff --git a/xen/include/asm-generic/kernel.h b/xen/include/asm-generic/kernel.h
new file mode 100644
index 0000000000..6857fabb34
--- /dev/null
+++ b/xen/include/asm-generic/kernel.h
@@ -0,0 +1,141 @@
+/*
+ * Kernel image loading.
+ *
+ * Copyright (C) 2011 Citrix Systems, Inc.
+ */
+
+#ifndef __ASM_GENERIC_KERNEL_H__
+#define __ASM_GENERIC_KERNEL_H__
+
+#include <xen/bootfdt.h>
+#include <xen/device_tree.h>
+#include <xen/sched.h>
+#include <xen/types.h>
+
+struct kernel_info {
+ struct domain *d;
+
+ void *fdt; /* flat device tree */
+ paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
+ struct meminfo mem;
+#ifdef CONFIG_STATIC_SHM
+ struct shared_meminfo shm_mem;
+#endif
+
+ /* kernel entry point */
+ paddr_t entry;
+
+ /* grant table region */
+ paddr_t gnttab_start;
+ paddr_t gnttab_size;
+
+ /* boot blob load addresses */
+ const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
+ const char* cmdline;
+ paddr_t dtb_paddr;
+ paddr_t initrd_paddr;
+
+ /* Enable uart emulation */
+ bool vuart;
+
+ /* Enable/Disable PV drivers interfaces */
+ uint16_t dom0less_feature;
+
+ /* Interrupt controller phandle */
+ uint32_t phandle_intc;
+
+ /* loader to use for this kernel */
+ void (*load)(struct kernel_info *info);
+
+ /* loader specific state */
+ union {
+ struct {
+ paddr_t kernel_addr;
+ paddr_t len;
+#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
+ paddr_t text_offset; /* 64-bit Image only */
+#endif
+ paddr_t start; /* Must be 0 for 64-bit Image */
+ } zimage;
+ };
+
+ struct arch_kernel_info arch;
+};
+
+static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
+{
+ return container_of(&kinfo->mem.common, struct membanks, common);
+}
+
+static inline const struct membanks *
+kernel_info_get_mem_const(const struct kernel_info *kinfo)
+{
+ return container_of(&kinfo->mem.common, const struct membanks, common);
+}
+
+#ifndef KERNEL_INFO_SHM_MEM_INIT
+
+#ifdef CONFIG_STATIC_SHM
+#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
+#else
+#define KERNEL_INFO_SHM_MEM_INIT
+#endif
+
+#endif /* KERNEL_INFO_SHM_MEM_INIT */
+
+#ifndef KERNEL_INFO_INIT
+
+#define KERNEL_INFO_INIT \
+{ \
+ .mem.common.max_banks = NR_MEM_BANKS, \
+ KERNEL_INFO_SHM_MEM_INIT \
+}
+
+#endif /* KERNEL_INFO_INIT */
+
+/*
+ * Probe the kernel to determine its type and select a loader.
+ *
+ * Sets in info:
+ * ->type
+ * ->load hook, and sets loader specific variables ->zimage
+ */
+int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
+
+/*
+ * Loads the kernel into guest RAM.
+ *
+ * Expects to be set in info when called:
+ * ->mem
+ * ->fdt
+ *
+ * Sets in info:
+ * ->entry
+ * ->dtb_paddr
+ * ->initrd_paddr
+ */
+void kernel_load(struct kernel_info *info);
+
+int kernel_decompress(struct bootmodule *mod, uint32_t offset);
+
+int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
+
+/*
+ * uImage isn't really used nowadays, so kernel_uimage_probe() is kept
+ * here just for compatibility with Arm code.
+ */
+#ifdef CONFIG_ARM
+struct bootmodule;
+int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
+#endif
+
+#endif /*__ASM_GENERIC_KERNEL_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
index c81e759423..d85324c867 100644
--- a/xen/include/xen/fdt-kernel.h
+++ b/xen/include/xen/fdt-kernel.h
@@ -121,6 +121,19 @@ int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
*/
void kernel_load(struct kernel_info *info);
+int kernel_decompress(struct bootmodule *mod, uint32_t offset);
+
+int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
+
+/*
+ * uImage isn't really used nowadays, so kernel_uimage_probe() is kept
+ * here just for compatibility with Arm code.
+ */
+#ifdef CONFIG_ARM
+struct bootmodule;
+int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
+#endif
+
#endif /* __XEN_FDT_KERNEL_H__ */
/*
--
2.49.0
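
For readers following the series, the fall-through order used by the common
kernel_probe() can be sketched outside Xen as below. This is only a minimal
standalone sketch under stated assumptions: probe_chain(), step_fn and the
step_* stubs are hypothetical names that do not exist in the patch, and a
NULL uImage step stands in for a build without CONFIG_ARM.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one probe/decompress step; not a Xen type. */
typedef int (*step_fn)(void *image);

/* Toy steps with fixed results, used only to exercise the chain. */
static int step_ok(void *image)     { (void)image; return 0; }
static int step_enoent(void *image) { (void)image; return -ENOENT; }
static int step_einval(void *image) { (void)image; return -EINVAL; }
static int step_enomem(void *image) { (void)image; return -ENOMEM; }

/*
 * Same ordering as kernel_probe() in the patch:
 *  1. uImage probe: any result other than -ENOENT is final.
 *  2. gzip decompress: 0 (decompressed) and -EINVAL (not gzip) both
 *     continue; any other error is final.
 *  3. zImage probe: its result is returned unchanged.
 */
static int probe_chain(void *image, step_fn uimage_probe,
                       step_fn decompress, step_fn zimage_probe)
{
    int rc;

    if ( uimage_probe != NULL )
    {
        rc = uimage_probe(image);
        if ( rc != -ENOENT )
            return rc;
    }

    rc = decompress(image);
    if ( rc && rc != -EINVAL )
        return rc;

    return zimage_probe(image);
}
```

With these stubs, an image with no uImage header and no gzip magic reaches
the zImage probe, while a decompression failure such as -ENOMEM ends the
chain early.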
* Re: [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c
2025-05-02 16:22 ` [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c Oleksii Kurochko
@ 2025-05-02 19:36 ` Stefano Stabellini
2025-05-05 9:23 ` Oleksii Kurochko
0 siblings, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 19:36 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> The following functions don't have anything arch-specific, so they are
> moved to common:
> - kernel_probe()
> - kernel_load()
> - output_length()
>
> Only the functions necessary for dom0less are moved.
>
> The following changes are done:
> - Swap __init and return type of kernel_decompress() function to be
> consistent with definitions of functions in other files. The same
> for output_length().
> - Wrap by "ifdef CONFIG_ARM" the call of kernel_uimage_probe() in
> kernel_probe(), as uImage isn't really used nowadays; the
> kernel_uimage_probe() call is kept just for compatibility with Arm code.
> - Introduce kernel_zimage_probe() to cover the case that arch can have
> different zimage header.
> - Add ASSERT() for kernel_load() to check that its argument isn't NULL.
> - Make kernel_uimage_probe() non-static in Arm's code as it is used in
> common/kernel.c.
>
> Introduce CONFIG_DOMAIN_BUILD_HELPERS to not provide stubs for archs
> which don't provide enough functionality to enable it.
> Select CONFIG_DOMAIN_BUILD_HELPERS for CONFIG_ARM as only Arm supports
> it, at the moment.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Change in v3:
> - Empty line after license tag for newly introduced files.
> ---
> Change in v2:
> - Drop inclusion of asm/kernel.h in kernel.c as everything necessary has
> been moved to xen/fdt-kernel.h.
> ---
> xen/arch/arm/Kconfig | 1 +
> xen/arch/arm/kernel.c | 221 +---------------------------
> xen/common/Kconfig | 9 +-
> xen/common/device-tree/Makefile | 1 +
> xen/common/device-tree/kernel.c | 242 +++++++++++++++++++++++++++++++
> xen/include/asm-generic/kernel.h | 141 ++++++++++++++++++
> xen/include/xen/fdt-kernel.h | 13 ++
> 7 files changed, 412 insertions(+), 216 deletions(-)
> create mode 100644 xen/common/device-tree/kernel.c
> create mode 100644 xen/include/asm-generic/kernel.h
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index d0e0a7753c..3321d89068 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -11,6 +11,7 @@ config ARM_64
>
> config ARM
> def_bool y
> + select DOMAIN_BUILD_HELPERS
> select FUNCTION_ALIGNMENT_4B
> select GENERIC_UART_INIT
> select HAS_ALTERNATIVE if HAS_VMAP
> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
> index 5759a3470a..1168c21e97 100644
> --- a/xen/arch/arm/kernel.c
> +++ b/xen/arch/arm/kernel.c
> @@ -163,105 +163,6 @@ static void __init kernel_zimage_load(struct kernel_info *info)
> iounmap(kernel);
> }
>
> -static __init uint32_t output_length(char *image, unsigned long image_len)
> -{
> - return *(uint32_t *)&image[image_len - 4];
> -}
> -
> -static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
> -{
> - char *output, *input;
> - char magic[2];
> - int rc;
> - unsigned int kernel_order_out;
> - paddr_t output_size;
> - struct page_info *pages;
> - mfn_t mfn;
> - int i;
> - paddr_t addr = mod->start;
> - paddr_t size = mod->size;
> -
> - if ( size < offset )
> - return -EINVAL;
> -
> - /*
> - * It might be that gzip header does not appear at the start address
> - * (e.g. in case of compressed uImage) so take into account offset to
> - * gzip header.
> - */
> - addr += offset;
> - size -= offset;
> -
> - if ( size < 2 )
> - return -EINVAL;
> -
> - copy_from_paddr(magic, addr, sizeof(magic));
> -
> - /* only gzip is supported */
> - if ( !gzip_check(magic, size) )
> - return -EINVAL;
> -
> - input = ioremap_cache(addr, size);
> - if ( input == NULL )
> - return -EFAULT;
> -
> - output_size = output_length(input, size);
> - kernel_order_out = get_order_from_bytes(output_size);
> - pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
> - if ( pages == NULL )
> - {
> - iounmap(input);
> - return -ENOMEM;
> - }
> - mfn = page_to_mfn(pages);
> - output = vmap_contig(mfn, 1 << kernel_order_out);
> -
> - rc = perform_gunzip(output, input, size);
> - clean_dcache_va_range(output, output_size);
> - iounmap(input);
> - vunmap(output);
> -
> - if ( rc )
> - {
> - free_domheap_pages(pages, kernel_order_out);
> - return rc;
> - }
> -
> - mod->start = page_to_maddr(pages);
> - mod->size = output_size;
> -
> - /*
> - * Need to free pages after output_size here because they won't be
> - * freed by discard_initial_modules
> - */
> - i = PFN_UP(output_size);
> - for ( ; i < (1 << kernel_order_out); i++ )
> - free_domheap_page(pages + i);
> -
> - /*
> - * When using static heap feature, don't give bootmodules memory back to
> - * the heap allocator
> - */
> - if ( using_static_heap )
> - return 0;
> -
> - /*
> - * When freeing the kernel, we need to pass the module start address and
> - * size as they were before taking an offset to gzip header into account,
> - * so that the entire region will be freed.
> - */
> - addr -= offset;
> - size += offset;
> -
> - /*
> - * Free the original kernel, update the pointers to the
> - * decompressed kernel
> - */
> - fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
> -
> - return 0;
> -}
> -
> /*
> * Uimage CPU Architecture Codes
> */
> @@ -274,8 +175,8 @@ static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
> /*
> * Check if the image is a uImage and setup kernel_info
> */
> -static int __init kernel_uimage_probe(struct kernel_info *info,
> - struct bootmodule *mod)
> +int __init kernel_uimage_probe(struct kernel_info *info,
> + struct bootmodule *mod)
> {
> struct {
> __be32 magic; /* Image Header Magic Number */
> @@ -505,130 +406,20 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
> return 0;
> }
>
> -int __init kernel_probe(struct kernel_info *info,
> - const struct dt_device_node *domain)
> +int __init kernel_zimage_probe(struct kernel_info *info, paddr_t addr,
> + paddr_t size)
> {
> - struct bootmodule *mod = NULL;
> - struct bootcmdline *cmd = NULL;
> - struct dt_device_node *node;
> - u64 kernel_addr, initrd_addr, dtb_addr, size;
> int rc;
>
> - /*
> - * We need to initialize start to 0. This field may be populated during
> - * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
> - * uimage.ep).
> - * We will use this to determine if the image has a fixed entry point or
> - * the load address should be used as the start address.
> - */
> - info->entry = 0;
> -
> - /* domain is NULL only for the hardware domain */
> - if ( domain == NULL )
> - {
> - ASSERT(is_hardware_domain(info->d));
> -
> - mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
> -
> - info->kernel_bootmodule = mod;
> - info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
> -
> - cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
> - if ( cmd )
> - info->cmdline = &cmd->cmdline[0];
> - }
> - else
> - {
> - const char *name = NULL;
> -
> - dt_for_each_child_node(domain, node)
> - {
> - if ( dt_device_is_compatible(node, "multiboot,kernel") )
> - {
> - u32 len;
> - const __be32 *val;
> -
> - val = dt_get_property(node, "reg", &len);
> - dt_get_range(&val, node, &kernel_addr, &size);
> - mod = boot_module_find_by_addr_and_kind(
> - BOOTMOD_KERNEL, kernel_addr);
> - info->kernel_bootmodule = mod;
> - }
> - else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
> - {
> - u32 len;
> - const __be32 *val;
> -
> - val = dt_get_property(node, "reg", &len);
> - dt_get_range(&val, node, &initrd_addr, &size);
> - info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
> - BOOTMOD_RAMDISK, initrd_addr);
> - }
> - else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
> - {
> - uint32_t len;
> - const __be32 *val;
> -
> - val = dt_get_property(node, "reg", &len);
> - if ( val == NULL )
> - continue;
> - dt_get_range(&val, node, &dtb_addr, &size);
> - info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
> - BOOTMOD_GUEST_DTB, dtb_addr);
> - }
> - else
> - continue;
> - }
> - name = dt_node_name(domain);
> - cmd = boot_cmdline_find_by_name(name);
> - if ( cmd )
> - info->cmdline = &cmd->cmdline[0];
> - }
> - if ( !mod || !mod->size )
> - {
> - printk(XENLOG_ERR "Missing kernel boot module?\n");
> - return -ENOENT;
> - }
> -
> - printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
> - info->d, info->kernel_bootmodule->start);
> - if ( info->initrd_bootmodule )
> - printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
> - info->initrd_bootmodule->start);
> -
> - /*
> - * uImage header always appears at the top of the image (even compressed),
> - * so it needs to be probed first. Note that in case of compressed uImage,
> - * kernel_decompress is called from kernel_uimage_probe making the function
> - * self-containing (i.e. fall through only in case of a header not found).
> - */
> - rc = kernel_uimage_probe(info, mod);
> - if ( rc != -ENOENT )
> - return rc;
> -
> - /*
> - * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
> - * At this point, gzip header appears (if at all) at the top of the image,
> - * so pass 0 as an offset.
> - */
> - rc = kernel_decompress(mod, 0);
> - if ( rc && rc != -EINVAL )
> - return rc;
> -
> #ifdef CONFIG_ARM_64
> - rc = kernel_zimage64_probe(info, mod->start, mod->size);
> + rc = kernel_zimage64_probe(info, addr, size);
> if (rc < 0)
> #endif
> - rc = kernel_zimage32_probe(info, mod->start, mod->size);
> + rc = kernel_zimage32_probe(info, addr, size);
>
> return rc;
> }
>
> -void __init kernel_load(struct kernel_info *info)
> -{
> - info->load(info);
> -}
> -
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index be38abf9e1..38981f1d11 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -14,13 +14,20 @@ config CORE_PARKING
>
> config DOM0LESS_BOOT
> bool "Dom0less boot support" if EXPERT
> - depends on HAS_DOM0LESS
> + depends on HAS_DOM0LESS && DOMAIN_BUILD_HELPERS
> default y
> help
> Dom0less boot support enables Xen to create and start domU guests during
> Xen boot without the need of a control domain (Dom0), which could be
> present anyway.
>
> +config DOMAIN_BUILD_HELPERS
> + bool
> + help
> + Introduce functions necessary for working with domain creation, kernel,
> + etc. For example, these functions are going to be used by
> + CONFIG_DOM0LESS_BOOT.
NIT: If possible, I would make this option a silent option that cannot
be manually enabled/disabled. As a choice to the user, I think
DOM0LESS_BOOT is sufficient.
> config GRANT_TABLE
> bool "Grant table support" if EXPERT
> default y
> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
> index f3dafc9b81..e88a4d5799 100644
> --- a/xen/common/device-tree/Makefile
> +++ b/xen/common/device-tree/Makefile
> @@ -4,3 +4,4 @@ obj-y += device-tree.o
> obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
> obj-y += intc.o
> +obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += kernel.o
> diff --git a/xen/common/device-tree/kernel.c b/xen/common/device-tree/kernel.c
> new file mode 100644
> index 0000000000..1bf3bbf64e
> --- /dev/null
> +++ b/xen/common/device-tree/kernel.c
> @@ -0,0 +1,242 @@
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/fdt-kernel.h>
> +#include <xen/errno.h>
> +#include <xen/gunzip.h>
> +#include <xen/init.h>
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/pfn.h>
> +#include <xen/sched.h>
> +#include <xen/types.h>
> +#include <xen/vmap.h>
> +
> +#include <asm/page.h>
> +#include <asm/setup.h>
> +
> +static uint32_t __init output_length(char *image, unsigned long image_len)
> +{
> + return *(uint32_t *)&image[image_len - 4];
> +}
> +
> +int __init kernel_decompress(struct bootmodule *mod, uint32_t offset)
> +{
> + char *output, *input;
> + char magic[2];
> + int rc;
> + unsigned int kernel_order_out;
> + paddr_t output_size;
> + struct page_info *pages;
> + mfn_t mfn;
> + int i;
> + paddr_t addr = mod->start;
> + paddr_t size = mod->size;
> +
> + if ( size < offset )
> + return -EINVAL;
> +
> + /*
> + * It might be that gzip header does not appear at the start address
> + * (e.g. in case of compressed uImage) so take into account offset to
> + * gzip header.
> + */
> + addr += offset;
> + size -= offset;
> +
> + if ( size < 2 )
> + return -EINVAL;
> +
> + copy_from_paddr(magic, addr, sizeof(magic));
> +
> + /* only gzip is supported */
> + if ( !gzip_check(magic, size) )
> + return -EINVAL;
> +
> + input = ioremap_cache(addr, size);
> + if ( input == NULL )
> + return -EFAULT;
> +
> + output_size = output_length(input, size);
> + kernel_order_out = get_order_from_bytes(output_size);
> + pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
> + if ( pages == NULL )
> + {
> + iounmap(input);
> + return -ENOMEM;
> + }
> + mfn = page_to_mfn(pages);
> + output = vmap_contig(mfn, 1 << kernel_order_out);
> +
> + rc = perform_gunzip(output, input, size);
> + clean_dcache_va_range(output, output_size);
> + iounmap(input);
> + vunmap(output);
> +
> + if ( rc )
> + {
> + free_domheap_pages(pages, kernel_order_out);
> + return rc;
> + }
> +
> + mod->start = page_to_maddr(pages);
> + mod->size = output_size;
> +
> + /*
> + * Need to free pages after output_size here because they won't be
> + * freed by discard_initial_modules
> + */
> + i = PFN_UP(output_size);
> + for ( ; i < (1 << kernel_order_out); i++ )
> + free_domheap_page(pages + i);
> +
> + /*
> + * When using static heap feature, don't give bootmodules memory back to
> + * the heap allocator
> + */
> + if ( using_static_heap )
> + return 0;
> +
> + /*
> + * When freeing the kernel, we need to pass the module start address and
> + * size as they were before taking an offset to gzip header into account,
> + * so that the entire region will be freed.
> + */
> + addr -= offset;
> + size += offset;
> +
> + /*
> + * Free the original kernel, update the pointers to the
> + * decompressed kernel
> + */
> + fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
> +
> + return 0;
> +}
> +
> +int __init kernel_probe(struct kernel_info *info,
> + const struct dt_device_node *domain)
> +{
> + struct bootmodule *mod = NULL;
> + struct bootcmdline *cmd = NULL;
> + struct dt_device_node *node;
> + u64 kernel_addr, initrd_addr, dtb_addr, size;
> + int rc;
> +
> + /*
> + * We need to initialize start to 0. This field may be populated during
> + * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
> + * uimage.ep).
> + * We will use this to determine if the image has a fixed entry point or
> + * the load address should be used as the start address.
> + */
> + info->entry = 0;
> +
> + /* domain is NULL only for the hardware domain */
> + if ( domain == NULL )
> + {
> + ASSERT(is_hardware_domain(info->d));
> +
> + mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
> +
> + info->kernel_bootmodule = mod;
> + info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
> +
> + cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
> + if ( cmd )
> + info->cmdline = &cmd->cmdline[0];
> + }
> + else
> + {
> + const char *name = NULL;
> +
> + dt_for_each_child_node(domain, node)
> + {
> + if ( dt_device_is_compatible(node, "multiboot,kernel") )
> + {
> + u32 len;
> + const __be32 *val;
> +
> + val = dt_get_property(node, "reg", &len);
> + dt_get_range(&val, node, &kernel_addr, &size);
> + mod = boot_module_find_by_addr_and_kind(
> + BOOTMOD_KERNEL, kernel_addr);
> + info->kernel_bootmodule = mod;
> + }
> + else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
> + {
> + u32 len;
> + const __be32 *val;
> +
> + val = dt_get_property(node, "reg", &len);
> + dt_get_range(&val, node, &initrd_addr, &size);
> + info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
> + BOOTMOD_RAMDISK, initrd_addr);
> + }
> + else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
> + {
> + uint32_t len;
> + const __be32 *val;
> +
> + val = dt_get_property(node, "reg", &len);
> + if ( val == NULL )
> + continue;
> + dt_get_range(&val, node, &dtb_addr, &size);
> + info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
> + BOOTMOD_GUEST_DTB, dtb_addr);
> + }
> + else
> + continue;
> + }
> + name = dt_node_name(domain);
> + cmd = boot_cmdline_find_by_name(name);
> + if ( cmd )
> + info->cmdline = &cmd->cmdline[0];
> + }
> + if ( !mod || !mod->size )
> + {
> + printk(XENLOG_ERR "Missing kernel boot module?\n");
> + return -ENOENT;
> + }
> +
> + printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
> + info->d, info->kernel_bootmodule->start);
> + if ( info->initrd_bootmodule )
> + printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
> + info->initrd_bootmodule->start);
> +
> + /*
> + * uImage isn't really used nowadays, so the kernel_uimage_probe() call is
> + * kept here just for compatibility with Arm code.
> + */
> +#ifdef CONFIG_ARM
> + /*
> + * uImage header always appears at the top of the image (even compressed),
> + * so it needs to be probed first. Note that in case of compressed uImage,
> + * kernel_decompress is called from kernel_uimage_probe making the function
> + * self-containing (i.e. fall through only in case of a header not found).
> + */
> + rc = kernel_uimage_probe(info, mod);
> + if ( rc != -ENOENT )
> + return rc;
> +#endif
> +
> + /*
> + * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
> + * At this point, gzip header appears (if at all) at the top of the image,
> + * so pass 0 as an offset.
> + */
> + rc = kernel_decompress(mod, 0);
> + if ( rc && rc != -EINVAL )
> + return rc;
> +
> + rc = kernel_zimage_probe(info, mod->start, mod->size);
> +
> + return rc;
> +}
> +
> +void __init kernel_load(struct kernel_info *info)
> +{
> + ASSERT(info && info->load);
> +
> + info->load(info);
> +}
> diff --git a/xen/include/asm-generic/kernel.h b/xen/include/asm-generic/kernel.h
> new file mode 100644
> index 0000000000..6857fabb34
> --- /dev/null
> +++ b/xen/include/asm-generic/kernel.h
This file seems to be a duplicate of the previously introduced
xen/include/xen/fdt-kernel.h?
Other than that, I checked the rest of the patch, including all the code
movement, and it looks correct to me.
> @@ -0,0 +1,141 @@
> +/*
> + * Kernel image loading.
> + *
> + * Copyright (C) 2011 Citrix Systems, Inc.
> + */
> +
> +#ifndef __ASM_GENERIC_KERNEL_H__
> +#define __ASM_GENERIC_KERNEL_H__
> +
> +#include <xen/bootfdt.h>
> +#include <xen/device_tree.h>
> +#include <xen/sched.h>
> +#include <xen/types.h>
> +
> +struct kernel_info {
> + struct domain *d;
> +
> + void *fdt; /* flat device tree */
> + paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
> + struct meminfo mem;
> +#ifdef CONFIG_STATIC_SHM
> + struct shared_meminfo shm_mem;
> +#endif
> +
> + /* kernel entry point */
> + paddr_t entry;
> +
> + /* grant table region */
> + paddr_t gnttab_start;
> + paddr_t gnttab_size;
> +
> + /* boot blob load addresses */
> + const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
> + const char* cmdline;
> + paddr_t dtb_paddr;
> + paddr_t initrd_paddr;
> +
> + /* Enable uart emulation */
> + bool vuart;
> +
> + /* Enable/Disable PV drivers interfaces */
> + uint16_t dom0less_feature;
> +
> + /* Interrupt controller phandle */
> + uint32_t phandle_intc;
> +
> + /* loader to use for this kernel */
> + void (*load)(struct kernel_info *info);
> +
> + /* loader specific state */
> + union {
> + struct {
> + paddr_t kernel_addr;
> + paddr_t len;
> +#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
> + paddr_t text_offset; /* 64-bit Image only */
> +#endif
> + paddr_t start; /* Must be 0 for 64-bit Image */
> + } zimage;
> + };
> +
> + struct arch_kernel_info arch;
> +};
> +
> +static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, struct membanks, common);
> +}
> +
> +static inline const struct membanks *
> +kernel_info_get_mem_const(const struct kernel_info *kinfo)
> +{
> + return container_of(&kinfo->mem.common, const struct membanks, common);
> +}
> +
> +#ifndef KERNEL_INFO_SHM_MEM_INIT
> +
> +#ifdef CONFIG_STATIC_SHM
> +#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
> +#else
> +#define KERNEL_INFO_SHM_MEM_INIT
> +#endif
> +
> +#endif /* KERNEL_INFO_SHM_MEM_INIT */
> +
> +#ifndef KERNEL_INFO_INIT
> +
> +#define KERNEL_INFO_INIT \
> +{ \
> + .mem.common.max_banks = NR_MEM_BANKS, \
> + KERNEL_INFO_SHM_MEM_INIT \
> +}
> +
> +#endif /* KERNEL_INFO_INIT */
> +
> +/*
> + * Probe the kernel to determine its type and select a loader.
> + *
> + * Sets in info:
> + * ->type
> + * ->load hook, and sets loader specific variables ->zimage
> + */
> +int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
> +
> +/*
> + * Loads the kernel into guest RAM.
> + *
> + * Expects to be set in info when called:
> + * ->mem
> + * ->fdt
> + *
> + * Sets in info:
> + * ->entry
> + * ->dtb_paddr
> + * ->initrd_paddr
> + */
> +void kernel_load(struct kernel_info *info);
> +
> +int kernel_decompress(struct bootmodule *mod, uint32_t offset);
> +
> +int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
> +
> +/*
> + * uImage isn't really used nowadays, so kernel_uimage_probe() is kept
> + * here just for compatibility with Arm code.
> + */
> +#ifdef CONFIG_ARM
> +struct bootmodule;
> +int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
> +#endif
> +
> +#endif /*__ASM_GENERIC_KERNEL_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
> index c81e759423..d85324c867 100644
> --- a/xen/include/xen/fdt-kernel.h
> +++ b/xen/include/xen/fdt-kernel.h
> @@ -121,6 +121,19 @@ int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
> */
> void kernel_load(struct kernel_info *info);
>
> +int kernel_decompress(struct bootmodule *mod, uint32_t offset);
> +
> +int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
> +
> +/*
> + * uImage isn't really used nowadays, so kernel_uimage_probe() is kept
> + * here just for compatibility with Arm code.
> + */
> +#ifdef CONFIG_ARM
> +struct bootmodule;
> +int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
> +#endif
> +
> #endif /* __XEN_FDT_KERNEL_H__ */
>
> /*
> --
> 2.49.0
>
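
As a side note on the moved output_length() helper: it reads the last four
bytes of the gzip stream, which RFC 1952 defines as ISIZE, the uncompressed
length modulo 2^32 stored little-endian. The sketch below shows the same read
as a standalone function; gzip_isize() is a hypothetical name that is not part
of the patch, and unlike the pointer cast in the patch it assembles the value
byte by byte so that it does not depend on host endianness or alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone helper (not part of the patch): return the ISIZE
 * trailer of a gzip stream, i.e. the uncompressed length modulo 2^32.
 */
static uint32_t gzip_isize(const unsigned char *image, size_t image_len)
{
    const unsigned char *p;

    /* The trailer needs at least four bytes to be present. */
    assert(image != NULL && image_len >= 4);

    p = &image[image_len - 4];

    /* Little-endian decode, one byte at a time. */
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

Because ISIZE wraps at 4 GiB, a caller sizing an output buffer from it is
assuming the decompressed image is smaller than that.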
* Re: [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c
2025-05-02 19:36 ` Stefano Stabellini
@ 2025-05-05 9:23 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 9:23 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 9:36 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> The following functions don't have anything arch-specific, so they are
>> moved to common:
>> - kernel_probe()
>> - kernel_load()
>> - output_length()
>>
>> Only the functions necessary for dom0less are moved.
>>
>> The following changes are done:
>> - Swap __init and return type of kernel_decompress() function to be
>> consistent with definitions of functions in other files. The same
>> for output_length().
>> - Wrap by "ifdef CONFIG_ARM" the call of kernel_uimage_probe() in
>> kernel_probe(), as uImage isn't really used nowadays; the
>> kernel_uimage_probe() call is kept just for compatibility with Arm code.
>> - Introduce kernel_zimage_probe() to cover the case that arch can have
>> different zimage header.
>> - Add ASSERT() for kernel_load() to check that its argument isn't NULL.
>> - Make kernel_uimage_probe() non-static in Arm's code as it is used in
>> common/kernel.c.
>>
>> Introduce CONFIG_DOMAIN_BUILD_HELPERS to not provide stubs for archs
>> which don't provide enough functionality to enable it.
>> Select CONFIG_DOMAIN_BUILD_HELPERS for CONFIG_ARM as only Arm supports
>> it, at the moment.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Change in v3:
>> - Empty line after license tag for newly introduced files.
>> ---
>> Change in v2:
>> - Drop inclusion of asm/kernel.h in kernel.c as everything necessary has
>> been moved to xen/fdt-kernel.h.
>> ---
>> xen/arch/arm/Kconfig | 1 +
>> xen/arch/arm/kernel.c | 221 +---------------------------
>> xen/common/Kconfig | 9 +-
>> xen/common/device-tree/Makefile | 1 +
>> xen/common/device-tree/kernel.c | 242 +++++++++++++++++++++++++++++++
>> xen/include/asm-generic/kernel.h | 141 ++++++++++++++++++
>> xen/include/xen/fdt-kernel.h | 13 ++
>> 7 files changed, 412 insertions(+), 216 deletions(-)
>> create mode 100644 xen/common/device-tree/kernel.c
>> create mode 100644 xen/include/asm-generic/kernel.h
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index d0e0a7753c..3321d89068 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -11,6 +11,7 @@ config ARM_64
>>
>> config ARM
>> def_bool y
>> + select DOMAIN_BUILD_HELPERS
>> select FUNCTION_ALIGNMENT_4B
>> select GENERIC_UART_INIT
>> select HAS_ALTERNATIVE if HAS_VMAP
>> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
>> index 5759a3470a..1168c21e97 100644
>> --- a/xen/arch/arm/kernel.c
>> +++ b/xen/arch/arm/kernel.c
>> @@ -163,105 +163,6 @@ static void __init kernel_zimage_load(struct kernel_info *info)
>> iounmap(kernel);
>> }
>>
>> -static __init uint32_t output_length(char *image, unsigned long image_len)
>> -{
>> - return *(uint32_t *)&image[image_len - 4];
>> -}
>> -
>> -static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
>> -{
>> - char *output, *input;
>> - char magic[2];
>> - int rc;
>> - unsigned int kernel_order_out;
>> - paddr_t output_size;
>> - struct page_info *pages;
>> - mfn_t mfn;
>> - int i;
>> - paddr_t addr = mod->start;
>> - paddr_t size = mod->size;
>> -
>> - if ( size < offset )
>> - return -EINVAL;
>> -
>> - /*
>> - * It might be that gzip header does not appear at the start address
>> - * (e.g. in case of compressed uImage) so take into account offset to
>> - * gzip header.
>> - */
>> - addr += offset;
>> - size -= offset;
>> -
>> - if ( size < 2 )
>> - return -EINVAL;
>> -
>> - copy_from_paddr(magic, addr, sizeof(magic));
>> -
>> - /* only gzip is supported */
>> - if ( !gzip_check(magic, size) )
>> - return -EINVAL;
>> -
>> - input = ioremap_cache(addr, size);
>> - if ( input == NULL )
>> - return -EFAULT;
>> -
>> - output_size = output_length(input, size);
>> - kernel_order_out = get_order_from_bytes(output_size);
>> - pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
>> - if ( pages == NULL )
>> - {
>> - iounmap(input);
>> - return -ENOMEM;
>> - }
>> - mfn = page_to_mfn(pages);
>> - output = vmap_contig(mfn, 1 << kernel_order_out);
>> -
>> - rc = perform_gunzip(output, input, size);
>> - clean_dcache_va_range(output, output_size);
>> - iounmap(input);
>> - vunmap(output);
>> -
>> - if ( rc )
>> - {
>> - free_domheap_pages(pages, kernel_order_out);
>> - return rc;
>> - }
>> -
>> - mod->start = page_to_maddr(pages);
>> - mod->size = output_size;
>> -
>> - /*
>> - * Need to free pages after output_size here because they won't be
>> - * freed by discard_initial_modules
>> - */
>> - i = PFN_UP(output_size);
>> - for ( ; i < (1 << kernel_order_out); i++ )
>> - free_domheap_page(pages + i);
>> -
>> - /*
>> - * When using static heap feature, don't give bootmodules memory back to
>> - * the heap allocator
>> - */
>> - if ( using_static_heap )
>> - return 0;
>> -
>> - /*
>> - * When freeing the kernel, we need to pass the module start address and
>> - * size as they were before taking an offset to gzip header into account,
>> - * so that the entire region will be freed.
>> - */
>> - addr -= offset;
>> - size += offset;
>> -
>> - /*
>> - * Free the original kernel, update the pointers to the
>> - * decompressed kernel
>> - */
>> - fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
>> -
>> - return 0;
>> -}
>> -
>> /*
>> * Uimage CPU Architecture Codes
>> */
>> @@ -274,8 +175,8 @@ static __init int kernel_decompress(struct bootmodule *mod, uint32_t offset)
>> /*
>> * Check if the image is a uImage and setup kernel_info
>> */
>> -static int __init kernel_uimage_probe(struct kernel_info *info,
>> - struct bootmodule *mod)
>> +int __init kernel_uimage_probe(struct kernel_info *info,
>> + struct bootmodule *mod)
>> {
>> struct {
>> __be32 magic; /* Image Header Magic Number */
>> @@ -505,130 +406,20 @@ static int __init kernel_zimage32_probe(struct kernel_info *info,
>> return 0;
>> }
>>
>> -int __init kernel_probe(struct kernel_info *info,
>> - const struct dt_device_node *domain)
>> +int __init kernel_zimage_probe(struct kernel_info *info, paddr_t addr,
>> + paddr_t size)
>> {
>> - struct bootmodule *mod = NULL;
>> - struct bootcmdline *cmd = NULL;
>> - struct dt_device_node *node;
>> - u64 kernel_addr, initrd_addr, dtb_addr, size;
>> int rc;
>>
>> - /*
>> - * We need to initialize start to 0. This field may be populated during
>> - * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
>> - * uimage.ep).
>> - * We will use this to determine if the image has a fixed entry point or
>> - * the load address should be used as the start address.
>> - */
>> - info->entry = 0;
>> -
>> - /* domain is NULL only for the hardware domain */
>> - if ( domain == NULL )
>> - {
>> - ASSERT(is_hardware_domain(info->d));
>> -
>> - mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
>> -
>> - info->kernel_bootmodule = mod;
>> - info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
>> -
>> - cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
>> - if ( cmd )
>> - info->cmdline = &cmd->cmdline[0];
>> - }
>> - else
>> - {
>> - const char *name = NULL;
>> -
>> - dt_for_each_child_node(domain, node)
>> - {
>> - if ( dt_device_is_compatible(node, "multiboot,kernel") )
>> - {
>> - u32 len;
>> - const __be32 *val;
>> -
>> - val = dt_get_property(node, "reg", &len);
>> - dt_get_range(&val, node, &kernel_addr, &size);
>> - mod = boot_module_find_by_addr_and_kind(
>> - BOOTMOD_KERNEL, kernel_addr);
>> - info->kernel_bootmodule = mod;
>> - }
>> - else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
>> - {
>> - u32 len;
>> - const __be32 *val;
>> -
>> - val = dt_get_property(node, "reg", &len);
>> - dt_get_range(&val, node, &initrd_addr, &size);
>> - info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
>> - BOOTMOD_RAMDISK, initrd_addr);
>> - }
>> - else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
>> - {
>> - uint32_t len;
>> - const __be32 *val;
>> -
>> - val = dt_get_property(node, "reg", &len);
>> - if ( val == NULL )
>> - continue;
>> - dt_get_range(&val, node, &dtb_addr, &size);
>> - info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
>> - BOOTMOD_GUEST_DTB, dtb_addr);
>> - }
>> - else
>> - continue;
>> - }
>> - name = dt_node_name(domain);
>> - cmd = boot_cmdline_find_by_name(name);
>> - if ( cmd )
>> - info->cmdline = &cmd->cmdline[0];
>> - }
>> - if ( !mod || !mod->size )
>> - {
>> - printk(XENLOG_ERR "Missing kernel boot module?\n");
>> - return -ENOENT;
>> - }
>> -
>> - printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
>> - info->d, info->kernel_bootmodule->start);
>> - if ( info->initrd_bootmodule )
>> - printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
>> - info->initrd_bootmodule->start);
>> -
>> - /*
>> - * uImage header always appears at the top of the image (even compressed),
>> - * so it needs to be probed first. Note that in case of compressed uImage,
>> - * kernel_decompress is called from kernel_uimage_probe making the function
>> - * self-containing (i.e. fall through only in case of a header not found).
>> - */
>> - rc = kernel_uimage_probe(info, mod);
>> - if ( rc != -ENOENT )
>> - return rc;
>> -
>> - /*
>> - * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
>> - * At this point, gzip header appears (if at all) at the top of the image,
>> - * so pass 0 as an offset.
>> - */
>> - rc = kernel_decompress(mod, 0);
>> - if ( rc && rc != -EINVAL )
>> - return rc;
>> -
>> #ifdef CONFIG_ARM_64
>> - rc = kernel_zimage64_probe(info, mod->start, mod->size);
>> + rc = kernel_zimage64_probe(info, addr, size);
>> if (rc < 0)
>> #endif
>> - rc = kernel_zimage32_probe(info, mod->start, mod->size);
>> + rc = kernel_zimage32_probe(info, addr, size);
>>
>> return rc;
>> }
>>
>> -void __init kernel_load(struct kernel_info *info)
>> -{
>> - info->load(info);
>> -}
>> -
>> /*
>> * Local variables:
>> * mode: C
>> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
>> index be38abf9e1..38981f1d11 100644
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -14,13 +14,20 @@ config CORE_PARKING
>>
>> config DOM0LESS_BOOT
>> bool "Dom0less boot support" if EXPERT
>> - depends on HAS_DOM0LESS
>> + depends on HAS_DOM0LESS && DOMAIN_BUILD_HELPERS
>> default y
>> help
>> Dom0less boot support enables Xen to create and start domU guests during
>> Xen boot without the need of a control domain (Dom0), which could be
>> present anyway.
>>
>> +config DOMAIN_BUILD_HELPERS
>> + bool
>> + help
>> + Introduce functions necessary for working with domain creation, kernel,
>> + etc. As an examples, these type of functions are going to be used by
>> + CONFIG_DOM0LESS_BOOT.
> NIT: If possible, I would make this option a silent option that cannot
> be manually enabled/disabled. As a choice to the user, I think
> DOM0LESS_BOOT is sufficient.
Sure, I'll drop 'help' then and add 'select DOMAIN_BUILD_HELPERS' in DOM0LESS_BOOT
config.
>
>
>> config GRANT_TABLE
>> bool "Grant table support" if EXPERT
>> default y
>> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
>> index f3dafc9b81..e88a4d5799 100644
>> --- a/xen/common/device-tree/Makefile
>> +++ b/xen/common/device-tree/Makefile
>> @@ -4,3 +4,4 @@ obj-y += device-tree.o
>> obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
>> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
>> obj-y += intc.o
>> +obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += kernel.o
>> diff --git a/xen/common/device-tree/kernel.c b/xen/common/device-tree/kernel.c
>> new file mode 100644
>> index 0000000000..1bf3bbf64e
>> --- /dev/null
>> +++ b/xen/common/device-tree/kernel.c
>> @@ -0,0 +1,242 @@
>> +#include <xen/bootfdt.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/fdt-kernel.h>
>> +#include <xen/errno.h>
>> +#include <xen/gunzip.h>
>> +#include <xen/init.h>
>> +#include <xen/lib.h>
>> +#include <xen/mm.h>
>> +#include <xen/pfn.h>
>> +#include <xen/sched.h>
>> +#include <xen/types.h>
>> +#include <xen/vmap.h>
>> +
>> +#include <asm/page.h>
>> +#include <asm/setup.h>
>> +
>> +static uint32_t __init output_length(char *image, unsigned long image_len)
>> +{
>> + return *(uint32_t *)&image[image_len - 4];
>> +}
>> +
>> +int __init kernel_decompress(struct bootmodule *mod, uint32_t offset)
>> +{
>> + char *output, *input;
>> + char magic[2];
>> + int rc;
>> + unsigned int kernel_order_out;
>> + paddr_t output_size;
>> + struct page_info *pages;
>> + mfn_t mfn;
>> + int i;
>> + paddr_t addr = mod->start;
>> + paddr_t size = mod->size;
>> +
>> + if ( size < offset )
>> + return -EINVAL;
>> +
>> + /*
>> + * It might be that gzip header does not appear at the start address
>> + * (e.g. in case of compressed uImage) so take into account offset to
>> + * gzip header.
>> + */
>> + addr += offset;
>> + size -= offset;
>> +
>> + if ( size < 2 )
>> + return -EINVAL;
>> +
>> + copy_from_paddr(magic, addr, sizeof(magic));
>> +
>> + /* only gzip is supported */
>> + if ( !gzip_check(magic, size) )
>> + return -EINVAL;
>> +
>> + input = ioremap_cache(addr, size);
>> + if ( input == NULL )
>> + return -EFAULT;
>> +
>> + output_size = output_length(input, size);
>> + kernel_order_out = get_order_from_bytes(output_size);
>> + pages = alloc_domheap_pages(NULL, kernel_order_out, 0);
>> + if ( pages == NULL )
>> + {
>> + iounmap(input);
>> + return -ENOMEM;
>> + }
>> + mfn = page_to_mfn(pages);
>> + output = vmap_contig(mfn, 1 << kernel_order_out);
>> +
>> + rc = perform_gunzip(output, input, size);
>> + clean_dcache_va_range(output, output_size);
>> + iounmap(input);
>> + vunmap(output);
>> +
>> + if ( rc )
>> + {
>> + free_domheap_pages(pages, kernel_order_out);
>> + return rc;
>> + }
>> +
>> + mod->start = page_to_maddr(pages);
>> + mod->size = output_size;
>> +
>> + /*
>> + * Need to free pages after output_size here because they won't be
>> + * freed by discard_initial_modules
>> + */
>> + i = PFN_UP(output_size);
>> + for ( ; i < (1 << kernel_order_out); i++ )
>> + free_domheap_page(pages + i);
>> +
>> + /*
>> + * When using static heap feature, don't give bootmodules memory back to
>> + * the heap allocator
>> + */
>> + if ( using_static_heap )
>> + return 0;
>> +
>> + /*
>> + * When freeing the kernel, we need to pass the module start address and
>> + * size as they were before taking an offset to gzip header into account,
>> + * so that the entire region will be freed.
>> + */
>> + addr -= offset;
>> + size += offset;
>> +
>> + /*
>> + * Free the original kernel, update the pointers to the
>> + * decompressed kernel
>> + */
>> + fw_unreserved_regions(addr, addr + size, init_domheap_pages, 0);
>> +
>> + return 0;
>> +}
>> +
>> +int __init kernel_probe(struct kernel_info *info,
>> + const struct dt_device_node *domain)
>> +{
>> + struct bootmodule *mod = NULL;
>> + struct bootcmdline *cmd = NULL;
>> + struct dt_device_node *node;
>> + u64 kernel_addr, initrd_addr, dtb_addr, size;
>> + int rc;
>> +
>> + /*
>> + * We need to initialize start to 0. This field may be populated during
>> + * kernel_xxx_probe() if the image has a fixed entry point (for e.g.
>> + * uimage.ep).
>> + * We will use this to determine if the image has a fixed entry point or
>> + * the load address should be used as the start address.
>> + */
>> + info->entry = 0;
>> +
>> + /* domain is NULL only for the hardware domain */
>> + if ( domain == NULL )
>> + {
>> + ASSERT(is_hardware_domain(info->d));
>> +
>> + mod = boot_module_find_by_kind(BOOTMOD_KERNEL);
>> +
>> + info->kernel_bootmodule = mod;
>> + info->initrd_bootmodule = boot_module_find_by_kind(BOOTMOD_RAMDISK);
>> +
>> + cmd = boot_cmdline_find_by_kind(BOOTMOD_KERNEL);
>> + if ( cmd )
>> + info->cmdline = &cmd->cmdline[0];
>> + }
>> + else
>> + {
>> + const char *name = NULL;
>> +
>> + dt_for_each_child_node(domain, node)
>> + {
>> + if ( dt_device_is_compatible(node, "multiboot,kernel") )
>> + {
>> + u32 len;
>> + const __be32 *val;
>> +
>> + val = dt_get_property(node, "reg", &len);
>> + dt_get_range(&val, node, &kernel_addr, &size);
>> + mod = boot_module_find_by_addr_and_kind(
>> + BOOTMOD_KERNEL, kernel_addr);
>> + info->kernel_bootmodule = mod;
>> + }
>> + else if ( dt_device_is_compatible(node, "multiboot,ramdisk") )
>> + {
>> + u32 len;
>> + const __be32 *val;
>> +
>> + val = dt_get_property(node, "reg", &len);
>> + dt_get_range(&val, node, &initrd_addr, &size);
>> + info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
>> + BOOTMOD_RAMDISK, initrd_addr);
>> + }
>> + else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
>> + {
>> + uint32_t len;
>> + const __be32 *val;
>> +
>> + val = dt_get_property(node, "reg", &len);
>> + if ( val == NULL )
>> + continue;
>> + dt_get_range(&val, node, &dtb_addr, &size);
>> + info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
>> + BOOTMOD_GUEST_DTB, dtb_addr);
>> + }
>> + else
>> + continue;
>> + }
>> + name = dt_node_name(domain);
>> + cmd = boot_cmdline_find_by_name(name);
>> + if ( cmd )
>> + info->cmdline = &cmd->cmdline[0];
>> + }
>> + if ( !mod || !mod->size )
>> + {
>> + printk(XENLOG_ERR "Missing kernel boot module?\n");
>> + return -ENOENT;
>> + }
>> +
>> + printk("Loading %pd kernel from boot module @ %"PRIpaddr"\n",
>> + info->d, info->kernel_bootmodule->start);
>> + if ( info->initrd_bootmodule )
>> + printk("Loading ramdisk from boot module @ %"PRIpaddr"\n",
>> + info->initrd_bootmodule->start);
>> +
>> + /*
>> + * uImage isn't really used nowadays thereby leave kernel_uimage_probe()
>> + * call here just for compatability with Arm code.
>> + */
>> +#ifdef CONFIG_ARM
>> + /*
>> + * uImage header always appears at the top of the image (even compressed),
>> + * so it needs to be probed first. Note that in case of compressed uImage,
>> + * kernel_decompress is called from kernel_uimage_probe making the function
>> + * self-containing (i.e. fall through only in case of a header not found).
>> + */
>> + rc = kernel_uimage_probe(info, mod);
>> + if ( rc != -ENOENT )
>> + return rc;
>> +#endif
>> +
>> + /*
>> + * If it is a gzip'ed image, 32bit or 64bit, uncompress it.
>> + * At this point, gzip header appears (if at all) at the top of the image,
>> + * so pass 0 as an offset.
>> + */
>> + rc = kernel_decompress(mod, 0);
>> + if ( rc && rc != -EINVAL )
>> + return rc;
>> +
>> + rc = kernel_zimage_probe(info, mod->start, mod->size);
>> +
>> + return rc;
>> +}
>> +
>> +void __init kernel_load(struct kernel_info *info)
>> +{
>> + ASSERT(info && info->load);
>> +
>> + info->load(info);
>> +}
>> diff --git a/xen/include/asm-generic/kernel.h b/xen/include/asm-generic/kernel.h
>> new file mode 100644
>> index 0000000000..6857fabb34
>> --- /dev/null
>> +++ b/xen/include/asm-generic/kernel.h
> This file seems to be a duplicate of the previously introduced
> xen/include/xen/fdt-kernel.h ?
>
> Other than that, I checked the rest of the patch, including all the code
> movement, and it looks correct to me.
I will drop asm-generic/kernel.h. It was created in the initial version of this
patch series, before it was suggested to move the header to xen/, and I forgot to remove it.
Thanks.
~ Oleksii
>
>
>
>> @@ -0,0 +1,141 @@
>> +/*
>> + * Kernel image loading.
>> + *
>> + * Copyright (C) 2011 Citrix Systems, Inc.
>> + */
>> +
>> +#ifndef __ASM_GENERIC_KERNEL_H__
>> +#define __ASM_GENERIC_KERNEL_H__
>> +
>> +#include <xen/bootfdt.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/sched.h>
>> +#include <xen/types.h>
>> +
>> +struct kernel_info {
>> + struct domain *d;
>> +
>> + void *fdt; /* flat device tree */
>> + paddr_t unassigned_mem; /* RAM not (yet) assigned to a bank */
>> + struct meminfo mem;
>> +#ifdef CONFIG_STATIC_SHM
>> + struct shared_meminfo shm_mem;
>> +#endif
>> +
>> + /* kernel entry point */
>> + paddr_t entry;
>> +
>> + /* grant table region */
>> + paddr_t gnttab_start;
>> + paddr_t gnttab_size;
>> +
>> + /* boot blob load addresses */
>> + const struct bootmodule *kernel_bootmodule, *initrd_bootmodule, *dtb_bootmodule;
>> + const char* cmdline;
>> + paddr_t dtb_paddr;
>> + paddr_t initrd_paddr;
>> +
>> + /* Enable uart emulation */
>> + bool vuart;
>> +
>> + /* Enable/Disable PV drivers interfaces */
>> + uint16_t dom0less_feature;
>> +
>> + /* Interrupt controller phandle */
>> + uint32_t phandle_intc;
>> +
>> + /* loader to use for this kernel */
>> + void (*load)(struct kernel_info *info);
>> +
>> + /* loader specific state */
>> + union {
>> + struct {
>> + paddr_t kernel_addr;
>> + paddr_t len;
>> +#if defined(CONFIG_ARM_64) || defined(CONFIG_RISCV_64)
>> + paddr_t text_offset; /* 64-bit Image only */
>> +#endif
>> + paddr_t start; /* Must be 0 for 64-bit Image */
>> + } zimage;
>> + };
>> +
>> + struct arch_kernel_info arch;
>> +};
>> +
>> +static inline struct membanks *kernel_info_get_mem(struct kernel_info *kinfo)
>> +{
>> + return container_of(&kinfo->mem.common, struct membanks, common);
>> +}
>> +
>> +static inline const struct membanks *
>> +kernel_info_get_mem_const(const struct kernel_info *kinfo)
>> +{
>> + return container_of(&kinfo->mem.common, const struct membanks, common);
>> +}
>> +
>> +#ifndef KERNEL_INFO_SHM_MEM_INIT
>> +
>> +#ifdef CONFIG_STATIC_SHM
>> +#define KERNEL_INFO_SHM_MEM_INIT .shm_mem.common.max_banks = NR_SHMEM_BANKS,
>> +#else
>> +#define KERNEL_INFO_SHM_MEM_INIT
>> +#endif
>> +
>> +#endif /* KERNEL_INFO_SHM_MEM_INIT */
>> +
>> +#ifndef KERNEL_INFO_INIT
>> +
>> +#define KERNEL_INFO_INIT \
>> +{ \
>> + .mem.common.max_banks = NR_MEM_BANKS, \
>> + KERNEL_INFO_SHM_MEM_INIT \
>> +}
>> +
>> +#endif /* KERNEL_INFO_INIT */
>> +
>> +/*
>> + * Probe the kernel to detemine its type and select a loader.
>> + *
>> + * Sets in info:
>> + * ->type
>> + * ->load hook, and sets loader specific variables ->zimage
>> + */
>> +int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
>> +
>> +/*
>> + * Loads the kernel into guest RAM.
>> + *
>> + * Expects to be set in info when called:
>> + * ->mem
>> + * ->fdt
>> + *
>> + * Sets in info:
>> + * ->entry
>> + * ->dtb_paddr
>> + * ->initrd_paddr
>> + */
>> +void kernel_load(struct kernel_info *info);
>> +
>> +int kernel_decompress(struct bootmodule *mod, uint32_t offset);
>> +
>> +int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
>> +
>> +/*
>> + * uImage isn't really used nowadays thereby leave kernel_uimage_probe()
>> + * call here just for compatability with Arm code.
>> + */
>> +#ifdef CONFIG_ARM
>> +struct bootmodule;
>> +int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
>> +#endif
>> +
>> +#endif /*__ASM_GENERIC_KERNEL_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/include/xen/fdt-kernel.h b/xen/include/xen/fdt-kernel.h
>> index c81e759423..d85324c867 100644
>> --- a/xen/include/xen/fdt-kernel.h
>> +++ b/xen/include/xen/fdt-kernel.h
>> @@ -121,6 +121,19 @@ int kernel_probe(struct kernel_info *info, const struct dt_device_node *domain);
>> */
>> void kernel_load(struct kernel_info *info);
>>
>> +int kernel_decompress(struct bootmodule *mod, uint32_t offset);
>> +
>> +int kernel_zimage_probe(struct kernel_info *info, paddr_t addr, paddr_t size);
>> +
>> +/*
>> + * uImage isn't really used nowadays thereby leave kernel_uimage_probe()
>> + * call here just for compatability with Arm code.
>> + */
>> +#ifdef CONFIG_ARM
>> +struct bootmodule;
>> +int kernel_uimage_probe(struct kernel_info *info, struct bootmodule *mod);
>> +#endif
>> +
>> #endif /* __XEN_FDT_KERNEL_H__ */
>>
>> /*
>> --
>> 2.49.0
>>
* [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
` (5 preceding siblings ...)
2025-05-02 16:22 ` [PATCH v3 6/8] xen/common: dom0less: introduce common kernel.c Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 20:02 ` Stefano Stabellini
2025-05-02 16:22 ` [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c Oleksii Kurochko
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Some functions of Arm's domain_build.c could be reused by dom0less or other
features connected to domain construction/build.
The following functions are moved to common:
- get_allocation_size().
- allocate_domheap_memory().
- guest_map_pages().
- allocate_bank_memory().
- add_hwdom_free_regions().
- find_unallocated_memory().
- allocate_memory().
- dtb_load().
- initrd_load().
The prototypes of dtb_load() and initrd_load() are updated to receive a pointer
to a copy_to_guest_phys()-style function, as some archs require
copy_to_guest_phys_flush_dcache().
Update arm/include/asm/Makefile to generate domain-build.h for Arm as it is
used by domain-build.c.
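As a rough illustration of the callback-based prototypes described above, here
is a minimal, self-contained sketch. It is not the code from this patch: every
demo_* name, the toy domain structure and the "return the number of bytes left
uncopied" convention are assumptions made for the example. The idea is only that
the common loader no longer hard-codes the copy routine and the arch passes in
the one it needs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a domain; only what this sketch needs. */
struct demo_domain {
    unsigned char ram[64];   /* pretend guest physical memory */
    unsigned int flushes;    /* counts simulated dcache flushes */
};

/*
 * Hypothetical callback type: copy len bytes from buf to guest address gpa
 * and return the number of bytes NOT copied (0 means success).
 */
typedef unsigned long (*demo_copy_cb)(struct demo_domain *d,
                                      unsigned long gpa,
                                      const void *buf, unsigned long len);

/* Plain copy, for an arch that needs no cache maintenance. */
unsigned long demo_copy(struct demo_domain *d, unsigned long gpa,
                        const void *buf, unsigned long len)
{
    /* Reject anything that would run past the end of the toy RAM. */
    if ( gpa > sizeof(d->ram) || len > sizeof(d->ram) - gpa )
        return len;

    memcpy(&d->ram[gpa], buf, len);

    return 0;
}

/* Copy followed by a (simulated) dcache flush of the written range. */
unsigned long demo_copy_flush(struct demo_domain *d, unsigned long gpa,
                              const void *buf, unsigned long len)
{
    unsigned long left = demo_copy(d, gpa, buf, len);

    if ( !left )
        d->flushes++;

    return left;
}

/* "Common" loader: the copy routine is chosen by the caller. */
int demo_blob_load(struct demo_domain *d, unsigned long gpa,
                   const void *blob, unsigned long len, demo_copy_cb cb)
{
    return cb(d, gpa, blob, len) ? -1 : 0;
}
```

An arch that needs cache maintenance would call
demo_blob_load(d, gpa, blob, len, demo_copy_flush), while one that does not
could pass demo_copy, leaving the common helper unchanged.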
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Change in v3:
- Nothing changed. Only rebase.
---
Change in v2:
- Use xen/fdt-domain-build.h instead of asm/domain_build.h.
---
xen/arch/arm/domain_build.c | 397 +------------------------
xen/common/device-tree/Makefile | 1 +
xen/common/device-tree/domain-build.c | 404 ++++++++++++++++++++++++++
xen/include/xen/fdt-domain-build.h | 33 ++-
4 files changed, 439 insertions(+), 396 deletions(-)
create mode 100644 xen/common/device-tree/domain-build.c
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9d649b06b3..df29619c40 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -120,18 +120,6 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
return vcpu_create(dom0, 0);
}
-unsigned int __init get_allocation_size(paddr_t size)
-{
- /*
- * get_order_from_bytes returns the order greater than or equal to
- * the given size, but we need less than or equal. Adding one to
- * the size pushes an evenly aligned size into the next order, so
- * we can then unconditionally subtract 1 from the order which is
- * returned.
- */
- return get_order_from_bytes(size + 1) - 1;
-}
-
/*
* Insert the given pages into a memory bank, banks are ordered by address.
*
@@ -418,98 +406,6 @@ static void __init allocate_memory_11(struct domain *d,
}
}
-bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
- alloc_domheap_mem_cb cb, void *extra)
-{
- unsigned int max_order = UINT_MAX;
-
- while ( tot_size > 0 )
- {
- unsigned int order = get_allocation_size(tot_size);
- struct page_info *pg;
-
- order = min(max_order, order);
-
- pg = alloc_domheap_pages(d, order, 0);
- if ( !pg )
- {
- /*
- * If we can't allocate one page, then it is unlikely to
- * succeed in the next iteration. So bail out.
- */
- if ( !order )
- return false;
-
- /*
- * If we can't allocate memory with order, then it is
- * unlikely to succeed in the next iteration.
- * Record the order - 1 to avoid re-trying.
- */
- max_order = order - 1;
- continue;
- }
-
- if ( !cb(d, pg, order, extra) )
- return false;
-
- tot_size -= (1ULL << (PAGE_SHIFT + order));
- }
-
- return true;
-}
-
-static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
- unsigned int order, void *extra)
-{
- gfn_t *sgfn = (gfn_t *)extra;
- int res;
-
- BUG_ON(!sgfn);
- res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
- if ( res )
- {
- dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
- return false;
- }
-
- *sgfn = gfn_add(*sgfn, 1UL << order);
-
- return true;
-}
-
-bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
- paddr_t tot_size)
-{
- struct membanks *mem = kernel_info_get_mem(kinfo);
- struct domain *d = kinfo->d;
- struct membank *bank;
-
- /*
- * allocate_bank_memory can be called with a tot_size of zero for
- * the second memory bank. It is not an error and we can safely
- * avoid creating a zero-size memory bank.
- */
- if ( tot_size == 0 )
- return true;
-
- bank = &mem->bank[mem->nr_banks];
- bank->start = gfn_to_gaddr(sgfn);
- bank->size = tot_size;
-
- /*
- * Allocate pages from the heap until tot_size is zero and map them to the
- * guest using guest_map_pages, passing the starting gfn as extra parameter
- * for the map operation.
- */
- if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
- return false;
-
- mem->nr_banks++;
- kinfo->unassigned_mem -= bank->size;
-
- return true;
-}
-
/*
* When PCI passthrough is available we want to keep the
* "linux,pci-domain" in sync for every host bridge.
@@ -900,226 +796,6 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
return 0;
}
-static int __init add_hwdom_free_regions(unsigned long s_gfn,
- unsigned long e_gfn, void *data)
-{
- struct membanks *free_regions = data;
- paddr_t start, size;
- paddr_t s = pfn_to_paddr(s_gfn);
- paddr_t e = pfn_to_paddr(e_gfn);
- unsigned int i, j;
-
- if ( free_regions->nr_banks >= free_regions->max_banks )
- return 0;
-
- /*
- * Both start and size of the free region should be 2MB aligned to
- * potentially allow superpage mapping.
- */
- start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
- if ( start > e )
- return 0;
-
- /*
- * e is actually "end-1" because it is called by rangeset functions
- * which are inclusive of the last address.
- */
- e += 1;
- size = (e - start) & ~(SZ_2M - 1);
-
- /* Find the insert position (descending order). */
- for ( i = 0; i < free_regions->nr_banks ; i++ )
- if ( size > free_regions->bank[i].size )
- break;
-
- /* Move the other banks to make space. */
- for ( j = free_regions->nr_banks; j > i ; j-- )
- {
- free_regions->bank[j].start = free_regions->bank[j - 1].start;
- free_regions->bank[j].size = free_regions->bank[j - 1].size;
- }
-
- free_regions->bank[i].start = start;
- free_regions->bank[i].size = size;
- free_regions->nr_banks++;
-
- return 0;
-}
-
-/*
- * Find unused regions of Host address space which can be exposed to domain
- * using the host memory layout. In order to calculate regions we exclude every
- * region passed in mem_banks from the Host RAM.
- */
-static int __init find_unallocated_memory(const struct kernel_info *kinfo,
- const struct membanks *mem_banks[],
- unsigned int nr_mem_banks,
- struct membanks *free_regions,
- int (*cb)(unsigned long s_gfn,
- unsigned long e_gfn,
- void *data))
-{
- const struct membanks *mem = bootinfo_get_mem();
- struct rangeset *unalloc_mem;
- paddr_t start, end;
- unsigned int i, j;
- int res;
-
- ASSERT(domain_use_host_layout(kinfo->d));
-
- unalloc_mem = rangeset_new(NULL, NULL, 0);
- if ( !unalloc_mem )
- return -ENOMEM;
-
- /* Start with all available RAM */
- for ( i = 0; i < mem->nr_banks; i++ )
- {
- start = mem->bank[i].start;
- end = mem->bank[i].start + mem->bank[i].size;
- res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
- PFN_DOWN(end - 1));
- if ( res )
- {
- printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
- start, end);
- goto out;
- }
- }
-
- /* Remove all regions listed in mem_banks */
- for ( i = 0; i < nr_mem_banks; i++ )
- for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
- {
- start = mem_banks[i]->bank[j].start;
-
- /* Shared memory banks can contain INVALID_PADDR as start */
- if ( INVALID_PADDR == start )
- continue;
-
- end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
- res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
- PFN_DOWN(end - 1));
- if ( res )
- {
- printk(XENLOG_ERR
- "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
- start, end, res);
- goto out;
- }
- }
-
- start = 0;
- end = (1ULL << p2m_ipa_bits) - 1;
- res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
- cb, free_regions);
- if ( res )
- free_regions->nr_banks = 0;
- else if ( !free_regions->nr_banks )
- res = -ENOENT;
-
-out:
- rangeset_destroy(unalloc_mem);
-
- return res;
-}
-
-void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
-{
- struct membanks *mem = kernel_info_get_mem(kinfo);
- unsigned int i, nr_banks = GUEST_RAM_BANKS;
- struct membanks *hwdom_free_mem = NULL;
-
- printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(kinfo->unassigned_mem >> 20), d);
-
- mem->nr_banks = 0;
- /*
- * Use host memory layout for hwdom. Only case for this is when LLC coloring
- * is enabled.
- */
- if ( is_hardware_domain(d) )
- {
- struct membanks *gnttab = membanks_xzalloc(1, MEMORY);
- /*
- * Exclude the following regions:
- * 1) Remove reserved memory
- * 2) Grant table assigned to hwdom
- */
- const struct membanks *mem_banks[] = {
- bootinfo_get_reserved_mem(),
- gnttab,
- };
-
- if ( !gnttab )
- goto fail;
-
- gnttab->nr_banks = 1;
- gnttab->bank[0].start = kinfo->gnttab_start;
- gnttab->bank[0].size = kinfo->gnttab_size;
-
- hwdom_free_mem = membanks_xzalloc(NR_MEM_BANKS, MEMORY);
- if ( !hwdom_free_mem )
- goto fail;
-
- if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
- hwdom_free_mem, add_hwdom_free_regions) )
- goto fail;
-
- nr_banks = hwdom_free_mem->nr_banks;
- xfree(gnttab);
- }
-
- for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
- {
- paddr_t bank_start, bank_size;
-
- if ( is_hardware_domain(d) )
- {
- bank_start = hwdom_free_mem->bank[i].start;
- bank_size = hwdom_free_mem->bank[i].size;
- }
- else
- {
- const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
- const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
-
- if ( i >= GUEST_RAM_BANKS )
- goto fail;
-
- bank_start = bankbase[i];
- bank_size = banksize[i];
- }
-
- bank_size = MIN(bank_size, kinfo->unassigned_mem);
- if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
- goto fail;
- }
-
- if ( kinfo->unassigned_mem )
- goto fail;
-
- for( i = 0; i < mem->nr_banks; i++ )
- {
- printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
- d,
- i,
- mem->bank[i].start,
- mem->bank[i].start + mem->bank[i].size,
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(mem->bank[i].size >> 20));
- }
-
- xfree(hwdom_free_mem);
- return;
-
- fail:
- panic("Failed to allocate requested domain memory."
- /* Don't want format this as PRIpaddr (16 digit hex) */
- " %ldKB unallocated. Fix the VMs configurations.\n",
- (unsigned long)kinfo->unassigned_mem >> 10);
-}
-
static int __init handle_pci_range(const struct dt_device_node *dev,
uint64_t addr, uint64_t len, void *data)
{
@@ -2059,75 +1735,6 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
return -EINVAL;
}
-static void __init dtb_load(struct kernel_info *kinfo)
-{
- unsigned long left;
-
- printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
- kinfo->d, kinfo->dtb_paddr,
- kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
-
- left = copy_to_guest_phys_flush_dcache(kinfo->d, kinfo->dtb_paddr,
- kinfo->fdt,
- fdt_totalsize(kinfo->fdt));
-
- if ( left != 0 )
- panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
- kinfo->d, left);
- xfree(kinfo->fdt);
-}
-
-static void __init initrd_load(struct kernel_info *kinfo)
-{
- const struct bootmodule *mod = kinfo->initrd_bootmodule;
- paddr_t load_addr = kinfo->initrd_paddr;
- paddr_t paddr, len;
- int node;
- int res;
- __be32 val[2];
- __be32 *cellp;
- void __iomem *initrd;
-
- if ( !mod || !mod->size )
- return;
-
- paddr = mod->start;
- len = mod->size;
-
- printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
- kinfo->d, paddr, load_addr, load_addr + len);
-
- /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
- node = fdt_path_offset(kinfo->fdt, "/chosen");
- if ( node < 0 )
- panic("Cannot find the /chosen node\n");
-
- cellp = (__be32 *)val;
- dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
- res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
- val, sizeof(val));
- if ( res )
- panic("Cannot fix up \"linux,initrd-start\" property\n");
-
- cellp = (__be32 *)val;
- dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
- res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
- val, sizeof(val));
- if ( res )
- panic("Cannot fix up \"linux,initrd-end\" property\n");
-
- initrd = ioremap_wc(paddr, len);
- if ( !initrd )
- panic("Unable to map the %pd initrd\n", kinfo->d);
-
- res = copy_to_guest_phys_flush_dcache(kinfo->d, load_addr,
- initrd, len);
- if ( res != 0 )
- panic("Unable to copy the initrd in the %pd memory\n", kinfo->d);
-
- iounmap(initrd);
-}
-
/*
* Allocate the event channel PPIs and setup the HVM_PARAM_CALLBACK_IRQ.
* The allocated IRQ will be found in d->arch.evtchn_irq.
@@ -2220,8 +1827,8 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
*/
kernel_load(kinfo);
/* initrd_load will fix up the fdt, so call it before dtb_load */
- initrd_load(kinfo);
- dtb_load(kinfo);
+ initrd_load(kinfo, copy_to_guest_phys_flush_dcache);
+ dtb_load(kinfo, copy_to_guest_phys_flush_dcache);
memset(regs, 0, sizeof(*regs));
diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
index e88a4d5799..831b91399b 100644
--- a/xen/common/device-tree/Makefile
+++ b/xen/common/device-tree/Makefile
@@ -1,6 +1,7 @@
obj-y += bootfdt.init.o
obj-y += bootinfo.init.o
obj-y += device-tree.o
+obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += domain-build.o
obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
obj-y += intc.o
diff --git a/xen/common/device-tree/domain-build.c b/xen/common/device-tree/domain-build.c
new file mode 100644
index 0000000000..69257a15ba
--- /dev/null
+++ b/xen/common/device-tree/domain-build.c
@@ -0,0 +1,404 @@
+#include <xen/bootfdt.h>
+#include <xen/fdt-domain-build.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
+#include <xen/types.h>
+#include <xen/vmap.h>
+
+#include <asm/p2m.h>
+
+bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
+ alloc_domheap_mem_cb cb, void *extra)
+{
+ unsigned int max_order = UINT_MAX;
+
+ while ( tot_size > 0 )
+ {
+ unsigned int order = get_allocation_size(tot_size);
+ struct page_info *pg;
+
+ order = min(max_order, order);
+
+ pg = alloc_domheap_pages(d, order, 0);
+ if ( !pg )
+ {
+ /*
+ * If we can't allocate one page, then it is unlikely to
+ * succeed in the next iteration. So bail out.
+ */
+ if ( !order )
+ return false;
+
+ /*
+ * If we can't allocate memory with order, then it is
+ * unlikely to succeed in the next iteration.
+ * Record the order - 1 to avoid re-trying.
+ */
+ max_order = order - 1;
+ continue;
+ }
+
+ if ( !cb(d, pg, order, extra) )
+ return false;
+
+ tot_size -= (1ULL << (PAGE_SHIFT + order));
+ }
+
+ return true;
+}
+
+static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
+ unsigned int order, void *extra)
+{
+ gfn_t *sgfn = (gfn_t *)extra;
+ int res;
+
+ BUG_ON(!sgfn);
+ res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
+ if ( res )
+ {
+ dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
+ return false;
+ }
+
+ *sgfn = gfn_add(*sgfn, 1UL << order);
+
+ return true;
+}
+
+bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
+ paddr_t tot_size)
+{
+ struct membanks *mem = kernel_info_get_mem(kinfo);
+ struct domain *d = kinfo->d;
+ struct membank *bank;
+
+ /*
+ * allocate_bank_memory can be called with a tot_size of zero for
+ * the second memory bank. It is not an error and we can safely
+ * avoid creating a zero-size memory bank.
+ */
+ if ( tot_size == 0 )
+ return true;
+
+ bank = &mem->bank[mem->nr_banks];
+ bank->start = gfn_to_gaddr(sgfn);
+ bank->size = tot_size;
+
+ /*
+ * Allocate pages from the heap until tot_size is zero and map them to the
+ * guest using guest_map_pages, passing the starting gfn as extra parameter
+ * for the map operation.
+ */
+ if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
+ return false;
+
+ mem->nr_banks++;
+ kinfo->unassigned_mem -= bank->size;
+
+ return true;
+}
+
+static int __init add_hwdom_free_regions(unsigned long s_gfn,
+ unsigned long e_gfn, void *data)
+{
+ struct membanks *free_regions = data;
+ paddr_t start, size;
+ paddr_t s = pfn_to_paddr(s_gfn);
+ paddr_t e = pfn_to_paddr(e_gfn);
+ unsigned int i, j;
+
+ if ( free_regions->nr_banks >= free_regions->max_banks )
+ return 0;
+
+ /*
+ * Both start and size of the free region should be 2MB aligned to
+ * potentially allow superpage mapping.
+ */
+ start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
+ if ( start > e )
+ return 0;
+
+ /*
+ * e is actually "end-1" because it is called by rangeset functions
+ * which are inclusive of the last address.
+ */
+ e += 1;
+ size = (e - start) & ~(SZ_2M - 1);
+
+ /* Find the insert position (descending order). */
+ for ( i = 0; i < free_regions->nr_banks ; i++ )
+ if ( size > free_regions->bank[i].size )
+ break;
+
+ /* Move the other banks to make space. */
+ for ( j = free_regions->nr_banks; j > i ; j-- )
+ {
+ free_regions->bank[j].start = free_regions->bank[j - 1].start;
+ free_regions->bank[j].size = free_regions->bank[j - 1].size;
+ }
+
+ free_regions->bank[i].start = start;
+ free_regions->bank[i].size = size;
+ free_regions->nr_banks++;
+
+ return 0;
+}
+
+/*
+ * Find unused regions of Host address space which can be exposed to domain
+ * using the host memory layout. In order to calculate regions we exclude every
+ * region passed in mem_banks from the Host RAM.
+ */
+int __init find_unallocated_memory(const struct kernel_info *kinfo,
+ const struct membanks *mem_banks[],
+ unsigned int nr_mem_banks,
+ struct membanks *free_regions,
+ int (*cb)(unsigned long s_gfn,
+ unsigned long e_gfn,
+ void *data))
+{
+ const struct membanks *mem = bootinfo_get_mem();
+ struct rangeset *unalloc_mem;
+ paddr_t start, end;
+ unsigned int i, j;
+ int res;
+
+ ASSERT(domain_use_host_layout(kinfo->d));
+
+ unalloc_mem = rangeset_new(NULL, NULL, 0);
+ if ( !unalloc_mem )
+ return -ENOMEM;
+
+ /* Start with all available RAM */
+ for ( i = 0; i < mem->nr_banks; i++ )
+ {
+ start = mem->bank[i].start;
+ end = mem->bank[i].start + mem->bank[i].size;
+ res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
+ PFN_DOWN(end - 1));
+ if ( res )
+ {
+ printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
+ start, end);
+ goto out;
+ }
+ }
+
+ /* Remove all regions listed in mem_banks */
+ for ( i = 0; i < nr_mem_banks; i++ )
+ for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
+ {
+ start = mem_banks[i]->bank[j].start;
+
+ /* Shared memory banks can contain INVALID_PADDR as start */
+ if ( INVALID_PADDR == start )
+ continue;
+
+ end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
+ res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
+ PFN_DOWN(end - 1));
+ if ( res )
+ {
+ printk(XENLOG_ERR
+ "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
+ start, end, res);
+ goto out;
+ }
+ }
+
+ start = 0;
+ end = (1ULL << p2m_ipa_bits) - 1;
+ res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
+ cb, free_regions);
+ if ( res )
+ free_regions->nr_banks = 0;
+ else if ( !free_regions->nr_banks )
+ res = -ENOENT;
+
+out:
+ rangeset_destroy(unalloc_mem);
+
+ return res;
+}
+
+void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
+{
+ struct membanks *mem = kernel_info_get_mem(kinfo);
+ unsigned int i, nr_banks = GUEST_RAM_BANKS;
+ struct membanks *hwdom_free_mem = NULL;
+
+ printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(kinfo->unassigned_mem >> 20), d);
+
+ mem->nr_banks = 0;
+ /*
+ * Use host memory layout for hwdom. Only case for this is when LLC coloring
+ * is enabled.
+ */
+ if ( is_hardware_domain(d) )
+ {
+ struct membanks *gnttab = xzalloc_flex_struct(struct membanks, bank, 1);
+ /*
+ * Exclude the following regions:
+ * 1) Remove reserved memory
+ * 2) Grant table assigned to hwdom
+ */
+ const struct membanks *mem_banks[] = {
+ bootinfo_get_reserved_mem(),
+ gnttab,
+ };
+
+ if ( !gnttab )
+ goto fail;
+
+ gnttab->nr_banks = 1;
+ gnttab->bank[0].start = kinfo->gnttab_start;
+ gnttab->bank[0].size = kinfo->gnttab_size;
+
+ hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
+ NR_MEM_BANKS);
+ if ( !hwdom_free_mem )
+ goto fail;
+
+ hwdom_free_mem->max_banks = NR_MEM_BANKS;
+
+ if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
+ hwdom_free_mem, add_hwdom_free_regions) )
+ goto fail;
+
+ nr_banks = hwdom_free_mem->nr_banks;
+ xfree(gnttab);
+ }
+
+ for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
+ {
+ paddr_t bank_start, bank_size;
+
+ if ( is_hardware_domain(d) )
+ {
+ bank_start = hwdom_free_mem->bank[i].start;
+ bank_size = hwdom_free_mem->bank[i].size;
+ }
+ else
+ {
+ const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
+ const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
+
+ if ( i >= GUEST_RAM_BANKS )
+ goto fail;
+
+ bank_start = bankbase[i];
+ bank_size = banksize[i];
+ }
+
+ bank_size = MIN(bank_size, kinfo->unassigned_mem);
+ if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
+ goto fail;
+ }
+
+ if ( kinfo->unassigned_mem )
+ goto fail;
+
+ for( i = 0; i < mem->nr_banks; i++ )
+ {
+ printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
+ d,
+ i,
+ mem->bank[i].start,
+ mem->bank[i].start + mem->bank[i].size,
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(mem->bank[i].size >> 20));
+ }
+
+ xfree(hwdom_free_mem);
+ return;
+
+ fail:
+ panic("Failed to allocate requested domain memory."
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ " %ldKB unallocated. Fix the VMs configurations.\n",
+ (unsigned long)kinfo->unassigned_mem >> 10);
+}
+
+/* Copy data to guest physical address, then clean the region. */
+typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
+ paddr_t gpa,
+ void *buf,
+ unsigned int len);
+
+void __init dtb_load(struct kernel_info *kinfo,
+ copy_to_guest_phys_cb copy_to_guest)
+{
+ unsigned long left;
+
+ printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+ kinfo->d, kinfo->dtb_paddr,
+ kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
+
+ left = copy_to_guest(kinfo->d, kinfo->dtb_paddr,
+ kinfo->fdt,
+ fdt_totalsize(kinfo->fdt));
+
+ if ( left != 0 )
+ panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
+ kinfo->d, left);
+ xfree(kinfo->fdt);
+}
+
+void __init initrd_load(struct kernel_info *kinfo,
+ copy_to_guest_phys_cb copy_to_guest)
+{
+ const struct bootmodule *mod = kinfo->initrd_bootmodule;
+ paddr_t load_addr = kinfo->initrd_paddr;
+ paddr_t paddr, len;
+ int node;
+ int res;
+ __be32 val[2];
+ __be32 *cellp;
+ void __iomem *initrd;
+
+ if ( !mod || !mod->size )
+ return;
+
+ paddr = mod->start;
+ len = mod->size;
+
+ printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+ kinfo->d, paddr, load_addr, load_addr + len);
+
+ /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
+ node = fdt_path_offset(kinfo->fdt, "/chosen");
+ if ( node < 0 )
+ panic("Cannot find the /chosen node\n");
+
+ cellp = (__be32 *)val;
+ dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
+ res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
+ val, sizeof(val));
+ if ( res )
+ panic("Cannot fix up \"linux,initrd-start\" property\n");
+
+ cellp = (__be32 *)val;
+ dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
+ res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
+ val, sizeof(val));
+ if ( res )
+ panic("Cannot fix up \"linux,initrd-end\" property\n");
+
+ initrd = ioremap_wc(paddr, len);
+ if ( !initrd )
+ panic("Unable to map the hwdom initrd\n");
+
+ res = copy_to_guest(kinfo->d, load_addr,
+ initrd, len);
+ if ( res != 0 )
+ panic("Unable to copy the initrd in the hwdom memory\n");
+
+ iounmap(initrd);
+}
diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
index b79e9fabfe..4a0052b2e8 100644
--- a/xen/include/xen/fdt-domain-build.h
+++ b/xen/include/xen/fdt-domain-build.h
@@ -6,6 +6,7 @@
#include <xen/bootfdt.h>
#include <xen/device_tree.h>
#include <xen/fdt-kernel.h>
+#include <xen/mm.h>
#include <xen/types.h>
struct domain;
@@ -29,7 +30,37 @@ int make_memory_node(const struct kernel_info *kinfo, int addrcells,
int sizecells, const struct membanks *mem);
int make_timer_node(const struct kernel_info *kinfo);
-unsigned int get_allocation_size(paddr_t size);
+
+static inline int get_allocation_size(paddr_t size)
+{
+ /*
+ * get_order_from_bytes returns the order greater than or equal to
+ * the given size, but we need less than or equal. Adding one to
+ * the size pushes an evenly aligned size into the next order, so
+ * we can then unconditionally subtract 1 from the order which is
+ * returned.
+ */
+ return get_order_from_bytes(size + 1) - 1;
+}
+
+typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
+ paddr_t gpa,
+ void *buf,
+ unsigned int len);
+
+void initrd_load(struct kernel_info *kinfo,
+ copy_to_guest_phys_cb copy_to_guest);
+
+void dtb_load(struct kernel_info *kinfo,
+ copy_to_guest_phys_cb copy_to_guest);
+
+int find_unallocated_memory(const struct kernel_info *kinfo,
+ const struct membanks *mem_banks[],
+ unsigned int nr_mem_banks,
+ struct membanks *free_regions,
+ int (*cb)(unsigned long s_gfn,
+ unsigned long e_gfn,
+ void *data));
#endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
--
2.49.0
* Re: [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c
2025-05-02 16:22 ` [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c Oleksii Kurochko
@ 2025-05-02 20:02 ` Stefano Stabellini
2025-05-05 10:06 ` Oleksii Kurochko
0 siblings, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 20:02 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Some functions of Arm's domain_build.c could be reused by dom0less or other
> features connected to domain construction/build.
>
> The following functions are moved to common:
> - get_allocation_size().
> - allocate_domheap_memory().
> - guest_map_pages().
> - allocate_bank_memory().
> - add_hwdom_free_regions().
> - find_unallocated_memory().
> - allocate_memory().
> - dtb_load().
> - initrd_load().
The declarations of allocate_domheap_memory(), allocate_bank_memory() and
allocate_memory() were moved in patch #5. Maybe that move should be part of
this patch instead?
>
> Prototypes of dtb_load() and initrd_load() are updated to receive a pointer
> to copy_to_guest_phys() as some archs require
> copy_to_guest_phys_flush_dcache().
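For readers following the thread, the callback approach described in the quoted paragraph can be sketched in isolation. This is a minimal standalone model, not the Xen code: copy_cb, copy_plain(), copy_and_flush(), load_blob() and flush_count are hypothetical names, plain buffers stand in for guest memory, and the "flush" is only a counter. Like the callback in the patch, the copy function is assumed to return the number of bytes left uncopied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_guest_phys_cb: returns bytes NOT copied. */
typedef unsigned long (*copy_cb)(void *dst, const void *src, unsigned int len);

/* Variant for an architecture that needs no cache maintenance. */
static unsigned long copy_plain(void *dst, const void *src, unsigned int len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Counts how often the "flushing" variant ran (placeholder for a dcache clean). */
static unsigned int flush_count;

/* Variant for an architecture that must clean the data cache after copying. */
static unsigned long copy_and_flush(void *dst, const void *src,
                                    unsigned int len)
{
    memcpy(dst, src, len);
    flush_count++;
    return 0;
}

/* Common loader: the caller picks the copy routine; 0 on success, -1 on error. */
static int load_blob(void *dst, const void *src, unsigned int len, copy_cb copy)
{
    return copy(dst, src, len) ? -1 : 0;
}
```

The common code only depends on the function-pointer type, so an architecture chooses its copy routine at the call site, much as construct_domain() passes copy_to_guest_phys_flush_dcache in the hunk further down.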
>
> Update arm/include/asm/Makefile to generate domain-build.h for Arm as it is
> used by domain-build.c.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Change in v3:
> - Nothing changed. Only rebase.
> ---
> Change in v2:
> - Use xen/fdt-domain-build.h instead of asm/domain_build.h.
> ---
> xen/arch/arm/domain_build.c | 397 +------------------------
> xen/common/device-tree/Makefile | 1 +
> xen/common/device-tree/domain-build.c | 404 ++++++++++++++++++++++++++
> xen/include/xen/fdt-domain-build.h | 33 ++-
> 4 files changed, 439 insertions(+), 396 deletions(-)
> create mode 100644 xen/common/device-tree/domain-build.c
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 9d649b06b3..df29619c40 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -120,18 +120,6 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
> return vcpu_create(dom0, 0);
> }
>
> -unsigned int __init get_allocation_size(paddr_t size)
> -{
> - /*
> - * get_order_from_bytes returns the order greater than or equal to
> - * the given size, but we need less than or equal. Adding one to
> - * the size pushes an evenly aligned size into the next order, so
> - * we can then unconditionally subtract 1 from the order which is
> - * returned.
> - */
> - return get_order_from_bytes(size + 1) - 1;
> -}
> -
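The "+ 1, then - 1" trick in the comment above can be checked with a small standalone model. This is a sketch under assumptions, not the Xen helper: it assumes 4 KiB pages, order_ceil() is a hypothetical model of a round-up order function, and order_floor() mirrors the arithmetic of get_allocation_size(). It is only meaningful for sizes of at least one page.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Round-up model: smallest order such that (page size << order) >= size. */
static unsigned int order_ceil(uint64_t size)
{
    unsigned int order = 0;

    size = (size - 1) >> SKETCH_PAGE_SHIFT;
    for ( ; size; order++ )
        size >>= 1;

    return order;
}

/*
 * Round-down variant: largest order such that (page size << order) <= size.
 * Adding one pushes an exact power-of-two size into the next order, so
 * subtracting one afterwards is always correct. Requires size >= one page.
 */
static unsigned int order_floor(uint64_t size)
{
    return order_ceil(size + 1) - 1;
}
```

With these definitions a 12 KiB request rounds up to order 2 (16 KiB) but rounds down to order 1 (8 KiB), while an exact 8 KiB request gives order 1 in both directions.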
> /*
> * Insert the given pages into a memory bank, banks are ordered by address.
> *
> @@ -418,98 +406,6 @@ static void __init allocate_memory_11(struct domain *d,
> }
> }
>
> -bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> - alloc_domheap_mem_cb cb, void *extra)
> -{
> - unsigned int max_order = UINT_MAX;
> -
> - while ( tot_size > 0 )
> - {
> - unsigned int order = get_allocation_size(tot_size);
> - struct page_info *pg;
> -
> - order = min(max_order, order);
> -
> - pg = alloc_domheap_pages(d, order, 0);
> - if ( !pg )
> - {
> - /*
> - * If we can't allocate one page, then it is unlikely to
> - * succeed in the next iteration. So bail out.
> - */
> - if ( !order )
> - return false;
> -
> - /*
> - * If we can't allocate memory with order, then it is
> - * unlikely to succeed in the next iteration.
> - * Record the order - 1 to avoid re-trying.
> - */
> - max_order = order - 1;
> - continue;
> - }
> -
> - if ( !cb(d, pg, order, extra) )
> - return false;
> -
> - tot_size -= (1ULL << (PAGE_SHIFT + order));
> - }
> -
> - return true;
> -}
> -
> -static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
> - unsigned int order, void *extra)
> -{
> - gfn_t *sgfn = (gfn_t *)extra;
> - int res;
> -
> - BUG_ON(!sgfn);
> - res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
> - if ( res )
> - {
> - dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
> - return false;
> - }
> -
> - *sgfn = gfn_add(*sgfn, 1UL << order);
> -
> - return true;
> -}
> -
> -bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> - paddr_t tot_size)
> -{
> - struct membanks *mem = kernel_info_get_mem(kinfo);
> - struct domain *d = kinfo->d;
> - struct membank *bank;
> -
> - /*
> - * allocate_bank_memory can be called with a tot_size of zero for
> - * the second memory bank. It is not an error and we can safely
> - * avoid creating a zero-size memory bank.
> - */
> - if ( tot_size == 0 )
> - return true;
> -
> - bank = &mem->bank[mem->nr_banks];
> - bank->start = gfn_to_gaddr(sgfn);
> - bank->size = tot_size;
> -
> - /*
> - * Allocate pages from the heap until tot_size is zero and map them to the
> - * guest using guest_map_pages, passing the starting gfn as extra parameter
> - * for the map operation.
> - */
> - if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
> - return false;
> -
> - mem->nr_banks++;
> - kinfo->unassigned_mem -= bank->size;
> -
> - return true;
> -}
> -
> /*
> * When PCI passthrough is available we want to keep the
> * "linux,pci-domain" in sync for every host bridge.
> @@ -900,226 +796,6 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
> return 0;
> }
>
> -static int __init add_hwdom_free_regions(unsigned long s_gfn,
> - unsigned long e_gfn, void *data)
> -{
> - struct membanks *free_regions = data;
> - paddr_t start, size;
> - paddr_t s = pfn_to_paddr(s_gfn);
> - paddr_t e = pfn_to_paddr(e_gfn);
> - unsigned int i, j;
> -
> - if ( free_regions->nr_banks >= free_regions->max_banks )
> - return 0;
> -
> - /*
> - * Both start and size of the free region should be 2MB aligned to
> - * potentially allow superpage mapping.
> - */
> - start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
> - if ( start > e )
> - return 0;
> -
> - /*
> - * e is actually "end-1" because it is called by rangeset functions
> - * which are inclusive of the last address.
> - */
> - e += 1;
> - size = (e - start) & ~(SZ_2M - 1);
> -
> - /* Find the insert position (descending order). */
> - for ( i = 0; i < free_regions->nr_banks ; i++ )
> - if ( size > free_regions->bank[i].size )
> - break;
> -
> - /* Move the other banks to make space. */
> - for ( j = free_regions->nr_banks; j > i ; j-- )
> - {
> - free_regions->bank[j].start = free_regions->bank[j - 1].start;
> - free_regions->bank[j].size = free_regions->bank[j - 1].size;
> - }
> -
> - free_regions->bank[i].start = start;
> - free_regions->bank[i].size = size;
> - free_regions->nr_banks++;
> -
> - return 0;
> -}
> -
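The 2MB alignment step in add_hwdom_free_regions() can be illustrated with a standalone sketch. The names struct region, align_2m() and SKETCH_SZ_2M are hypothetical, and the sketch takes an inclusive last byte address rather than frame numbers, so it models the rounding only, not the exact Xen arithmetic.

```c
#include <stdint.h>

#define SKETCH_SZ_2M (UINT64_C(2) << 20)

struct region {
    uint64_t start;
    uint64_t size;
};

/*
 * s is the first byte of the range, e the last byte (inclusive).
 * The start is rounded up and the size rounded down to 2MB, so the
 * result may be empty (size 0) when no aligned 2MB block fits.
 */
static struct region align_2m(uint64_t s, uint64_t e)
{
    struct region r = { 0, 0 };
    uint64_t start = (s + SKETCH_SZ_2M - 1) & ~(SKETCH_SZ_2M - 1);

    if ( start > e )
        return r;

    r.start = start;
    r.size = (e + 1 - start) & ~(SKETCH_SZ_2M - 1);

    return r;
}
```

For example, the range 0x100000-0x5fffff shrinks to a 4MB region starting at 0x200000, while 0x100000-0x1fffff contains no aligned 2MB block at all.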
> -/*
> - * Find unused regions of Host address space which can be exposed to domain
> - * using the host memory layout. In order to calculate regions we exclude every
> - * region passed in mem_banks from the Host RAM.
> - */
> -static int __init find_unallocated_memory(const struct kernel_info *kinfo,
> - const struct membanks *mem_banks[],
> - unsigned int nr_mem_banks,
> - struct membanks *free_regions,
> - int (*cb)(unsigned long s_gfn,
> - unsigned long e_gfn,
> - void *data))
> -{
> - const struct membanks *mem = bootinfo_get_mem();
> - struct rangeset *unalloc_mem;
> - paddr_t start, end;
> - unsigned int i, j;
> - int res;
> -
> - ASSERT(domain_use_host_layout(kinfo->d));
> -
> - unalloc_mem = rangeset_new(NULL, NULL, 0);
> - if ( !unalloc_mem )
> - return -ENOMEM;
> -
> - /* Start with all available RAM */
> - for ( i = 0; i < mem->nr_banks; i++ )
> - {
> - start = mem->bank[i].start;
> - end = mem->bank[i].start + mem->bank[i].size;
> - res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
> - PFN_DOWN(end - 1));
> - if ( res )
> - {
> - printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
> - start, end);
> - goto out;
> - }
> - }
> -
> - /* Remove all regions listed in mem_banks */
> - for ( i = 0; i < nr_mem_banks; i++ )
> - for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
> - {
> - start = mem_banks[i]->bank[j].start;
> -
> - /* Shared memory banks can contain INVALID_PADDR as start */
> - if ( INVALID_PADDR == start )
> - continue;
> -
> - end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
> - res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
> - PFN_DOWN(end - 1));
> - if ( res )
> - {
> - printk(XENLOG_ERR
> - "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
> - start, end, res);
> - goto out;
> - }
> - }
> -
> - start = 0;
> - end = (1ULL << p2m_ipa_bits) - 1;
> - res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
> - cb, free_regions);
> - if ( res )
> - free_regions->nr_banks = 0;
> - else if ( !free_regions->nr_banks )
> - res = -ENOENT;
> -
> -out:
> - rangeset_destroy(unalloc_mem);
> -
> - return res;
> -}
> -
> -void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> -{
> - struct membanks *mem = kernel_info_get_mem(kinfo);
> - unsigned int i, nr_banks = GUEST_RAM_BANKS;
> - struct membanks *hwdom_free_mem = NULL;
> -
> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> -
> - mem->nr_banks = 0;
> - /*
> - * Use host memory layout for hwdom. Only case for this is when LLC coloring
> - * is enabled.
> - */
> - if ( is_hardware_domain(d) )
> - {
> - struct membanks *gnttab = membanks_xzalloc(1, MEMORY);
> - /*
> - * Exclude the following regions:
> - * 1) Remove reserved memory
> - * 2) Grant table assigned to hwdom
> - */
> - const struct membanks *mem_banks[] = {
> - bootinfo_get_reserved_mem(),
> - gnttab,
> - };
> -
> - if ( !gnttab )
> - goto fail;
> -
> - gnttab->nr_banks = 1;
> - gnttab->bank[0].start = kinfo->gnttab_start;
> - gnttab->bank[0].size = kinfo->gnttab_size;
> -
> - hwdom_free_mem = membanks_xzalloc(NR_MEM_BANKS, MEMORY);
> - if ( !hwdom_free_mem )
> - goto fail;
> -
> - if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
> - hwdom_free_mem, add_hwdom_free_regions) )
> - goto fail;
> -
> - nr_banks = hwdom_free_mem->nr_banks;
> - xfree(gnttab);
> - }
> -
> - for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
> - {
> - paddr_t bank_start, bank_size;
> -
> - if ( is_hardware_domain(d) )
> - {
> - bank_start = hwdom_free_mem->bank[i].start;
> - bank_size = hwdom_free_mem->bank[i].size;
> - }
> - else
> - {
> - const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> - const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> -
> - if ( i >= GUEST_RAM_BANKS )
> - goto fail;
> -
> - bank_start = bankbase[i];
> - bank_size = banksize[i];
> - }
> -
> - bank_size = MIN(bank_size, kinfo->unassigned_mem);
> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
> - goto fail;
> - }
> -
> - if ( kinfo->unassigned_mem )
> - goto fail;
> -
> - for( i = 0; i < mem->nr_banks; i++ )
> - {
> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> - d,
> - i,
> - mem->bank[i].start,
> - mem->bank[i].start + mem->bank[i].size,
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - (unsigned long)(mem->bank[i].size >> 20));
> - }
> -
> - xfree(hwdom_free_mem);
> - return;
> -
> - fail:
> - panic("Failed to allocate requested domain memory."
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - " %ldKB unallocated. Fix the VMs configurations.\n",
> - (unsigned long)kinfo->unassigned_mem >> 10);
> -}
> -
> static int __init handle_pci_range(const struct dt_device_node *dev,
> uint64_t addr, uint64_t len, void *data)
> {
> @@ -2059,75 +1735,6 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
> return -EINVAL;
> }
>
> -static void __init dtb_load(struct kernel_info *kinfo)
> -{
> - unsigned long left;
> -
> - printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> - kinfo->d, kinfo->dtb_paddr,
> - kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
> -
> - left = copy_to_guest_phys_flush_dcache(kinfo->d, kinfo->dtb_paddr,
> - kinfo->fdt,
> - fdt_totalsize(kinfo->fdt));
> -
> - if ( left != 0 )
> - panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
> - kinfo->d, left);
> - xfree(kinfo->fdt);
> -}
> -
> -static void __init initrd_load(struct kernel_info *kinfo)
> -{
> - const struct bootmodule *mod = kinfo->initrd_bootmodule;
> - paddr_t load_addr = kinfo->initrd_paddr;
> - paddr_t paddr, len;
> - int node;
> - int res;
> - __be32 val[2];
> - __be32 *cellp;
> - void __iomem *initrd;
> -
> - if ( !mod || !mod->size )
> - return;
> -
> - paddr = mod->start;
> - len = mod->size;
> -
> - printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> - kinfo->d, paddr, load_addr, load_addr + len);
> -
> - /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
> - node = fdt_path_offset(kinfo->fdt, "/chosen");
> - if ( node < 0 )
> - panic("Cannot find the /chosen node\n");
> -
> - cellp = (__be32 *)val;
> - dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
> - res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
> - val, sizeof(val));
> - if ( res )
> - panic("Cannot fix up \"linux,initrd-start\" property\n");
> -
> - cellp = (__be32 *)val;
> - dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
> - res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
> - val, sizeof(val));
> - if ( res )
> - panic("Cannot fix up \"linux,initrd-end\" property\n");
> -
> - initrd = ioremap_wc(paddr, len);
> - if ( !initrd )
> - panic("Unable to map the %pd initrd\n", kinfo->d);
> -
> - res = copy_to_guest_phys_flush_dcache(kinfo->d, load_addr,
> - initrd, len);
> - if ( res != 0 )
> - panic("Unable to copy the initrd in the %pd memory\n", kinfo->d);
> -
> - iounmap(initrd);
> -}
> -
> /*
> * Allocate the event channel PPIs and setup the HVM_PARAM_CALLBACK_IRQ.
> * The allocated IRQ will be found in d->arch.evtchn_irq.
> @@ -2220,8 +1827,8 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
> */
> kernel_load(kinfo);
> /* initrd_load will fix up the fdt, so call it before dtb_load */
> - initrd_load(kinfo);
> - dtb_load(kinfo);
> + initrd_load(kinfo, copy_to_guest_phys_flush_dcache);
> + dtb_load(kinfo, copy_to_guest_phys_flush_dcache);
>
> memset(regs, 0, sizeof(*regs));
>
> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
> index e88a4d5799..831b91399b 100644
> --- a/xen/common/device-tree/Makefile
> +++ b/xen/common/device-tree/Makefile
> @@ -1,6 +1,7 @@
> obj-y += bootfdt.init.o
> obj-y += bootinfo.init.o
> obj-y += device-tree.o
> +obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += domain-build.o
> obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
> obj-y += intc.o
> diff --git a/xen/common/device-tree/domain-build.c b/xen/common/device-tree/domain-build.c
> new file mode 100644
> index 0000000000..69257a15ba
> --- /dev/null
> +++ b/xen/common/device-tree/domain-build.c
> @@ -0,0 +1,404 @@
> +#include <xen/bootfdt.h>
> +#include <xen/fdt-domain-build.h>
> +#include <xen/init.h>
> +#include <xen/lib.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <xen/types.h>
> +#include <xen/vmap.h>
> +
> +#include <asm/p2m.h>
> +
> +bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> + alloc_domheap_mem_cb cb, void *extra)
> +{
> + unsigned int max_order = UINT_MAX;
> +
> + while ( tot_size > 0 )
> + {
> + unsigned int order = get_allocation_size(tot_size);
> + struct page_info *pg;
> +
> + order = min(max_order, order);
> +
> + pg = alloc_domheap_pages(d, order, 0);
> + if ( !pg )
> + {
> + /*
> + * If we can't allocate one page, then it is unlikely to
> + * succeed in the next iteration. So bail out.
> + */
> + if ( !order )
> + return false;
> +
> + /*
> + * If we can't allocate memory with order, then it is
> + * unlikely to succeed in the next iteration.
> + * Record the order - 1 to avoid re-trying.
> + */
> + max_order = order - 1;
> + continue;
> + }
> +
> + if ( !cb(d, pg, order, extra) )
> + return false;
> +
> + tot_size -= (1ULL << (PAGE_SHIFT + order));
> + }
> +
> + return true;
> +}
> +
> +static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
> + unsigned int order, void *extra)
> +{
> + gfn_t *sgfn = (gfn_t *)extra;
> + int res;
> +
> + BUG_ON(!sgfn);
> + res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
> + if ( res )
> + {
> + dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
> + return false;
> + }
> +
> + *sgfn = gfn_add(*sgfn, 1UL << order);
> +
> + return true;
> +}
> +
> +bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> + paddr_t tot_size)
> +{
> + struct membanks *mem = kernel_info_get_mem(kinfo);
> + struct domain *d = kinfo->d;
> + struct membank *bank;
> +
> + /*
> + * allocate_bank_memory can be called with a tot_size of zero for
> + * the second memory bank. It is not an error and we can safely
> + * avoid creating a zero-size memory bank.
> + */
> + if ( tot_size == 0 )
> + return true;
> +
> + bank = &mem->bank[mem->nr_banks];
> + bank->start = gfn_to_gaddr(sgfn);
> + bank->size = tot_size;
> +
> + /*
> + * Allocate pages from the heap until tot_size is zero and map them to the
> + * guest using guest_map_pages, passing the starting gfn as extra parameter
> + * for the map operation.
> + */
> + if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
> + return false;
> +
> + mem->nr_banks++;
> + kinfo->unassigned_mem -= bank->size;
> +
> + return true;
> +}
> +
> +static int __init add_hwdom_free_regions(unsigned long s_gfn,
> + unsigned long e_gfn, void *data)
> +{
> + struct membanks *free_regions = data;
> + paddr_t start, size;
> + paddr_t s = pfn_to_paddr(s_gfn);
> + paddr_t e = pfn_to_paddr(e_gfn);
> + unsigned int i, j;
> +
> + if ( free_regions->nr_banks >= free_regions->max_banks )
> + return 0;
> +
> + /*
> + * Both start and size of the free region should be 2MB aligned to
> + * potentially allow superpage mapping.
> + */
> + start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
> + if ( start > e )
> + return 0;
> +
> + /*
> + * e is actually "end-1" because it is called by rangeset functions
> + * which are inclusive of the last address.
> + */
> + e += 1;
> + size = (e - start) & ~(SZ_2M - 1);
> +
> + /* Find the insert position (descending order). */
> + for ( i = 0; i < free_regions->nr_banks ; i++ )
> + if ( size > free_regions->bank[i].size )
> + break;
> +
> + /* Move the other banks to make space. */
> + for ( j = free_regions->nr_banks; j > i ; j-- )
> + {
> + free_regions->bank[j].start = free_regions->bank[j - 1].start;
> + free_regions->bank[j].size = free_regions->bank[j - 1].size;
> + }
> +
> + free_regions->bank[i].start = start;
> + free_regions->bank[i].size = size;
> + free_regions->nr_banks++;
> +
> + return 0;
> +}
> +
> +/*
> + * Find unused regions of Host address space which can be exposed to domain
> + * using the host memory layout. In order to calculate regions we exclude every
> + * region passed in mem_banks from the Host RAM.
> + */
> +int __init find_unallocated_memory(const struct kernel_info *kinfo,
> + const struct membanks *mem_banks[],
> + unsigned int nr_mem_banks,
> + struct membanks *free_regions,
> + int (*cb)(unsigned long s_gfn,
> + unsigned long e_gfn,
> + void *data))
> +{
> + const struct membanks *mem = bootinfo_get_mem();
> + struct rangeset *unalloc_mem;
> + paddr_t start, end;
> + unsigned int i, j;
> + int res;
> +
> + ASSERT(domain_use_host_layout(kinfo->d));
> +
> + unalloc_mem = rangeset_new(NULL, NULL, 0);
> + if ( !unalloc_mem )
> + return -ENOMEM;
> +
> + /* Start with all available RAM */
> + for ( i = 0; i < mem->nr_banks; i++ )
> + {
> + start = mem->bank[i].start;
> + end = mem->bank[i].start + mem->bank[i].size;
> + res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
> + PFN_DOWN(end - 1));
> + if ( res )
> + {
> + printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
> + start, end);
> + goto out;
> + }
> + }
> +
> + /* Remove all regions listed in mem_banks */
> + for ( i = 0; i < nr_mem_banks; i++ )
> + for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
> + {
> + start = mem_banks[i]->bank[j].start;
> +
> + /* Shared memory banks can contain INVALID_PADDR as start */
> + if ( INVALID_PADDR == start )
> + continue;
> +
> + end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
> + res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
> + PFN_DOWN(end - 1));
> + if ( res )
> + {
> + printk(XENLOG_ERR
> + "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
> + start, end, res);
> + goto out;
> + }
> + }
> +
> + start = 0;
> + end = (1ULL << p2m_ipa_bits) - 1;
> + res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
> + cb, free_regions);
> + if ( res )
> + free_regions->nr_banks = 0;
> + else if ( !free_regions->nr_banks )
> + res = -ENOENT;
> +
> +out:
> + rangeset_destroy(unalloc_mem);
> +
> + return res;
> +}
> +
> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> +{
> + struct membanks *mem = kernel_info_get_mem(kinfo);
> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> + struct membanks *hwdom_free_mem = NULL;
> +
> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> + /* Don't want format this as PRIpaddr (16 digit hex) */
> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> +
> + mem->nr_banks = 0;
> + /*
> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> + * is enabled.
> + */
> + if ( is_hardware_domain(d) )
> + {
> + struct membanks *gnttab = xzalloc_flex_struct(struct membanks, bank, 1);
Shouldn't we set gnttab->max_banks and gnttab->type here?
> + /*
> + * Exclude the following regions:
> + * 1) Remove reserved memory
> + * 2) Grant table assigned to hwdom
> + */
> + const struct membanks *mem_banks[] = {
> + bootinfo_get_reserved_mem(),
> + gnttab,
> + };
> +
> + if ( !gnttab )
> + goto fail;
> +
> + gnttab->nr_banks = 1;
> + gnttab->bank[0].start = kinfo->gnttab_start;
> + gnttab->bank[0].size = kinfo->gnttab_size;
> +
> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> + NR_MEM_BANKS);
> + if ( !hwdom_free_mem )
> + goto fail;
> +
> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
Aren't we missing setting hwdom_free_mem->type here?
> +
> + if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
> + hwdom_free_mem, add_hwdom_free_regions) )
> + goto fail;
> +
> + nr_banks = hwdom_free_mem->nr_banks;
> + xfree(gnttab);
> + }
> +
> + for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
> + {
> + paddr_t bank_start, bank_size;
> +
> + if ( is_hardware_domain(d) )
> + {
> + bank_start = hwdom_free_mem->bank[i].start;
> + bank_size = hwdom_free_mem->bank[i].size;
> + }
> + else
> + {
> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> +
> + if ( i >= GUEST_RAM_BANKS )
> + goto fail;
> +
> + bank_start = bankbase[i];
> + bank_size = banksize[i];
> + }
> +
> + bank_size = MIN(bank_size, kinfo->unassigned_mem);
> + if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
> + goto fail;
> + }
> +
> + if ( kinfo->unassigned_mem )
> + goto fail;
> +
> + for( i = 0; i < mem->nr_banks; i++ )
> + {
> + printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> + d,
> + i,
> + mem->bank[i].start,
> + mem->bank[i].start + mem->bank[i].size,
> + /* Don't want format this as PRIpaddr (16 digit hex) */
> + (unsigned long)(mem->bank[i].size >> 20));
> + }
> +
> + xfree(hwdom_free_mem);
> + return;
> +
> + fail:
> + panic("Failed to allocate requested domain memory."
> + /* Don't want format this as PRIpaddr (16 digit hex) */
> + " %ldKB unallocated. Fix the VMs configurations.\n",
> + (unsigned long)kinfo->unassigned_mem >> 10);
> +}
> +
> +/* Copy data to guest physical address, then clean the region. */
> +typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
> + paddr_t gpa,
> + void *buf,
> + unsigned int len);
This shouldn't be needed because copy_to_guest_phys_cb is already
declared in xen/include/xen/fdt-domain-build.h
> +void __init dtb_load(struct kernel_info *kinfo,
> + copy_to_guest_phys_cb copy_to_guest)
> +{
> + unsigned long left;
> +
> + printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> + kinfo->d, kinfo->dtb_paddr,
> + kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
> +
> + left = copy_to_guest(kinfo->d, kinfo->dtb_paddr,
> + kinfo->fdt,
> + fdt_totalsize(kinfo->fdt));
> +
> + if ( left != 0 )
> + panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
> + kinfo->d, left);
> + xfree(kinfo->fdt);
> +}
> +
> +void __init initrd_load(struct kernel_info *kinfo,
> + copy_to_guest_phys_cb copy_to_guest)
> +{
> + const struct bootmodule *mod = kinfo->initrd_bootmodule;
> + paddr_t load_addr = kinfo->initrd_paddr;
> + paddr_t paddr, len;
> + int node;
> + int res;
> + __be32 val[2];
> + __be32 *cellp;
> + void __iomem *initrd;
> +
> + if ( !mod || !mod->size )
> + return;
> +
> + paddr = mod->start;
> + len = mod->size;
> +
> + printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> + kinfo->d, paddr, load_addr, load_addr + len);
> +
> + /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
> + node = fdt_path_offset(kinfo->fdt, "/chosen");
> + if ( node < 0 )
> + panic("Cannot find the /chosen node\n");
> +
> + cellp = (__be32 *)val;
> + dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
> + res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
> + val, sizeof(val));
> + if ( res )
> + panic("Cannot fix up \"linux,initrd-start\" property\n");
> +
> + cellp = (__be32 *)val;
> + dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
> + res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
> + val, sizeof(val));
> + if ( res )
> + panic("Cannot fix up \"linux,initrd-end\" property\n");
> +
> + initrd = ioremap_wc(paddr, len);
> + if ( !initrd )
> + panic("Unable to map the hwdom initrd\n");
The original message was:
panic("Unable to map the %pd initrd\n", kinfo->d);
Why change it? This function can also be called for domUs.
> + res = copy_to_guest(kinfo->d, load_addr,
> + initrd, len);
> + if ( res != 0 )
> + panic("Unable to copy the initrd in the hwdom memory\n");
Same here, the original message was:
panic("Unable to copy the initrd in the %pd memory\n", kinfo->d);
> + iounmap(initrd);
> +}
> diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
> index b79e9fabfe..4a0052b2e8 100644
> --- a/xen/include/xen/fdt-domain-build.h
> +++ b/xen/include/xen/fdt-domain-build.h
> @@ -6,6 +6,7 @@
> #include <xen/bootfdt.h>
> #include <xen/device_tree.h>
> #include <xen/fdt-kernel.h>
> +#include <xen/mm.h>
> #include <xen/types.h>
>
> struct domain;
> @@ -29,7 +30,37 @@ int make_memory_node(const struct kernel_info *kinfo, int addrcells,
> int sizecells, const struct membanks *mem);
> int make_timer_node(const struct kernel_info *kinfo);
>
> -unsigned int get_allocation_size(paddr_t size);
> +
> +static inline int get_allocation_size(paddr_t size)
> +{
> + /*
> + * get_order_from_bytes returns the order greater than or equal to
> + * the given size, but we need less than or equal. Adding one to
> + * the size pushes an evenly aligned size into the next order, so
> + * we can then unconditionally subtract 1 from the order which is
> + * returned.
> + */
> + return get_order_from_bytes(size + 1) - 1;
> +}
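
As a worked example of the "add one, subtract one" trick described in the comment above, here is a small standalone sketch. It assumes 4KB pages, and ceil_order() is only a stand-in written to match the behaviour the comment attributes to get_order_from_bytes(); both function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4KB pages */

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int ceil_order(uint64_t size)
{
    unsigned int order;

    size = (size - 1) >> PAGE_SHIFT;
    for ( order = 0; size; order++ )
        size >>= 1;

    return order;
}

/*
 * Largest order such that (1 << order) pages fit within size bytes.
 * Only meaningful when size is at least one page.
 */
static unsigned int floor_order(uint64_t size)
{
    /* size + 1 pushes an exact power of two into the next order. */
    return ceil_order(size + 1) - 1;
}
```

With three pages (12288 bytes), ceil_order() gives 2 (four pages cover it) while floor_order() gives 1 (two pages fit inside it). With exactly two pages (8192 bytes) both give 1.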
> +
> +typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
> + paddr_t gpa,
> + void *buf,
> + unsigned int len);
> +
> +void initrd_load(struct kernel_info *kinfo,
> + copy_to_guest_phys_cb copy_to_guest);
> +
> +void dtb_load(struct kernel_info *kinfo,
> + copy_to_guest_phys_cb copy_to_guest);
> +
> +int find_unallocated_memory(const struct kernel_info *kinfo,
> + const struct membanks *mem_banks[],
> + unsigned int nr_mem_banks,
> + struct membanks *free_regions,
> + int (*cb)(unsigned long s_gfn,
> + unsigned long e_gfn,
> + void *data));
>
> #endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
>
> --
> 2.49.0
>
* Re: [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c
2025-05-02 20:02 ` Stefano Stabellini
@ 2025-05-05 10:06 ` Oleksii Kurochko
0 siblings, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 10:06 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 10:02 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Some functions of Arm's domain_build.c could be reused by dom0less or other
>> features connected to domain construction/build.
>>
>> The following functions are moved to common:
>> - get_allocation_size().
>> - allocate_domheap_memory().
>> - guest_map_pages().
>> - allocate_bank_memory().
>> - add_hwdom_free_regions().
>> - find_unallocated_memory().
>> - allocate_memory().
>> - dtb_load().
>> - initrd_load().
> The declarations of allocate_domheap_memory, allocate_bank_memory and
> allocate_memory were moved in patch #5. Maybe their movement should be
> in this patch?
Sure, it makes sense.
>
>> The prototypes of dtb_load() and initrd_load() are updated to receive a pointer
>> to copy_to_guest_phys() as some archs require
>> copy_to_guest_phys_flush_dcache().
>>
>> Update arm/include/asm/Makefile to generate domain-build.h for Arm as it is
>> used by domain-build.c.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Change in v3:
>> - Nothing changed. Only rebase.
>> ---
>> Change in v2:
>> - Use xen/fdt-domain-build.h instead of asm/domain_build.h.
>> ---
>> xen/arch/arm/domain_build.c | 397 +------------------------
>> xen/common/device-tree/Makefile | 1 +
>> xen/common/device-tree/domain-build.c | 404 ++++++++++++++++++++++++++
>> xen/include/xen/fdt-domain-build.h | 33 ++-
>> 4 files changed, 439 insertions(+), 396 deletions(-)
>> create mode 100644 xen/common/device-tree/domain-build.c
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 9d649b06b3..df29619c40 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -120,18 +120,6 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
>> return vcpu_create(dom0, 0);
>> }
>>
>> -unsigned int __init get_allocation_size(paddr_t size)
>> -{
>> - /*
>> - * get_order_from_bytes returns the order greater than or equal to
>> - * the given size, but we need less than or equal. Adding one to
>> - * the size pushes an evenly aligned size into the next order, so
>> - * we can then unconditionally subtract 1 from the order which is
>> - * returned.
>> - */
>> - return get_order_from_bytes(size + 1) - 1;
>> -}
>> -
>> /*
>> * Insert the given pages into a memory bank, banks are ordered by address.
>> *
>> @@ -418,98 +406,6 @@ static void __init allocate_memory_11(struct domain *d,
>> }
>> }
>>
>> -bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>> - alloc_domheap_mem_cb cb, void *extra)
>> -{
>> - unsigned int max_order = UINT_MAX;
>> -
>> - while ( tot_size > 0 )
>> - {
>> - unsigned int order = get_allocation_size(tot_size);
>> - struct page_info *pg;
>> -
>> - order = min(max_order, order);
>> -
>> - pg = alloc_domheap_pages(d, order, 0);
>> - if ( !pg )
>> - {
>> - /*
>> - * If we can't allocate one page, then it is unlikely to
>> - * succeed in the next iteration. So bail out.
>> - */
>> - if ( !order )
>> - return false;
>> -
>> - /*
>> - * If we can't allocate memory with order, then it is
>> - * unlikely to succeed in the next iteration.
>> - * Record the order - 1 to avoid re-trying.
>> - */
>> - max_order = order - 1;
>> - continue;
>> - }
>> -
>> - if ( !cb(d, pg, order, extra) )
>> - return false;
>> -
>> - tot_size -= (1ULL << (PAGE_SHIFT + order));
>> - }
>> -
>> - return true;
>> -}
>> -
>> -static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
>> - unsigned int order, void *extra)
>> -{
>> - gfn_t *sgfn = (gfn_t *)extra;
>> - int res;
>> -
>> - BUG_ON(!sgfn);
>> - res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
>> - if ( res )
>> - {
>> - dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
>> - return false;
>> - }
>> -
>> - *sgfn = gfn_add(*sgfn, 1UL << order);
>> -
>> - return true;
>> -}
>> -
>> -bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>> - paddr_t tot_size)
>> -{
>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>> - struct domain *d = kinfo->d;
>> - struct membank *bank;
>> -
>> - /*
>> - * allocate_bank_memory can be called with a tot_size of zero for
>> - * the second memory bank. It is not an error and we can safely
>> - * avoid creating a zero-size memory bank.
>> - */
>> - if ( tot_size == 0 )
>> - return true;
>> -
>> - bank = &mem->bank[mem->nr_banks];
>> - bank->start = gfn_to_gaddr(sgfn);
>> - bank->size = tot_size;
>> -
>> - /*
>> - * Allocate pages from the heap until tot_size is zero and map them to the
>> - * guest using guest_map_pages, passing the starting gfn as extra parameter
>> - * for the map operation.
>> - */
>> - if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
>> - return false;
>> -
>> - mem->nr_banks++;
>> - kinfo->unassigned_mem -= bank->size;
>> -
>> - return true;
>> -}
>> -
>> /*
>> * When PCI passthrough is available we want to keep the
>> * "linux,pci-domain" in sync for every host bridge.
>> @@ -900,226 +796,6 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
>> return 0;
>> }
>>
>> -static int __init add_hwdom_free_regions(unsigned long s_gfn,
>> - unsigned long e_gfn, void *data)
>> -{
>> - struct membanks *free_regions = data;
>> - paddr_t start, size;
>> - paddr_t s = pfn_to_paddr(s_gfn);
>> - paddr_t e = pfn_to_paddr(e_gfn);
>> - unsigned int i, j;
>> -
>> - if ( free_regions->nr_banks >= free_regions->max_banks )
>> - return 0;
>> -
>> - /*
>> - * Both start and size of the free region should be 2MB aligned to
>> - * potentially allow superpage mapping.
>> - */
>> - start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
>> - if ( start > e )
>> - return 0;
>> -
>> - /*
>> - * e is actually "end-1" because it is called by rangeset functions
>> - * which are inclusive of the last address.
>> - */
>> - e += 1;
>> - size = (e - start) & ~(SZ_2M - 1);
>> -
>> - /* Find the insert position (descending order). */
>> - for ( i = 0; i < free_regions->nr_banks ; i++ )
>> - if ( size > free_regions->bank[i].size )
>> - break;
>> -
>> - /* Move the other banks to make space. */
>> - for ( j = free_regions->nr_banks; j > i ; j-- )
>> - {
>> - free_regions->bank[j].start = free_regions->bank[j - 1].start;
>> - free_regions->bank[j].size = free_regions->bank[j - 1].size;
>> - }
>> -
>> - free_regions->bank[i].start = start;
>> - free_regions->bank[i].size = size;
>> - free_regions->nr_banks++;
>> -
>> - return 0;
>> -}
>> -
>> -/*
>> - * Find unused regions of Host address space which can be exposed to domain
>> - * using the host memory layout. In order to calculate regions we exclude every
>> - * region passed in mem_banks from the Host RAM.
>> - */
>> -static int __init find_unallocated_memory(const struct kernel_info *kinfo,
>> - const struct membanks *mem_banks[],
>> - unsigned int nr_mem_banks,
>> - struct membanks *free_regions,
>> - int (*cb)(unsigned long s_gfn,
>> - unsigned long e_gfn,
>> - void *data))
>> -{
>> - const struct membanks *mem = bootinfo_get_mem();
>> - struct rangeset *unalloc_mem;
>> - paddr_t start, end;
>> - unsigned int i, j;
>> - int res;
>> -
>> - ASSERT(domain_use_host_layout(kinfo->d));
>> -
>> - unalloc_mem = rangeset_new(NULL, NULL, 0);
>> - if ( !unalloc_mem )
>> - return -ENOMEM;
>> -
>> - /* Start with all available RAM */
>> - for ( i = 0; i < mem->nr_banks; i++ )
>> - {
>> - start = mem->bank[i].start;
>> - end = mem->bank[i].start + mem->bank[i].size;
>> - res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
>> - PFN_DOWN(end - 1));
>> - if ( res )
>> - {
>> - printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
>> - start, end);
>> - goto out;
>> - }
>> - }
>> -
>> - /* Remove all regions listed in mem_banks */
>> - for ( i = 0; i < nr_mem_banks; i++ )
>> - for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
>> - {
>> - start = mem_banks[i]->bank[j].start;
>> -
>> - /* Shared memory banks can contain INVALID_PADDR as start */
>> - if ( INVALID_PADDR == start )
>> - continue;
>> -
>> - end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
>> - res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
>> - PFN_DOWN(end - 1));
>> - if ( res )
>> - {
>> - printk(XENLOG_ERR
>> - "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
>> - start, end, res);
>> - goto out;
>> - }
>> - }
>> -
>> - start = 0;
>> - end = (1ULL << p2m_ipa_bits) - 1;
>> - res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
>> - cb, free_regions);
>> - if ( res )
>> - free_regions->nr_banks = 0;
>> - else if ( !free_regions->nr_banks )
>> - res = -ENOENT;
>> -
>> -out:
>> - rangeset_destroy(unalloc_mem);
>> -
>> - return res;
>> -}
>> -
>> -void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>> -{
>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>> - unsigned int i, nr_banks = GUEST_RAM_BANKS;
>> - struct membanks *hwdom_free_mem = NULL;
>> -
>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>> -
>> - mem->nr_banks = 0;
>> - /*
>> - * Use host memory layout for hwdom. Only case for this is when LLC coloring
>> - * is enabled.
>> - */
>> - if ( is_hardware_domain(d) )
>> - {
>> - struct membanks *gnttab = membanks_xzalloc(1, MEMORY);
>> - /*
>> - * Exclude the following regions:
>> - * 1) Remove reserved memory
>> - * 2) Grant table assigned to hwdom
>> - */
>> - const struct membanks *mem_banks[] = {
>> - bootinfo_get_reserved_mem(),
>> - gnttab,
>> - };
>> -
>> - if ( !gnttab )
>> - goto fail;
>> -
>> - gnttab->nr_banks = 1;
>> - gnttab->bank[0].start = kinfo->gnttab_start;
>> - gnttab->bank[0].size = kinfo->gnttab_size;
>> -
>> - hwdom_free_mem = membanks_xzalloc(NR_MEM_BANKS, MEMORY);
>> - if ( !hwdom_free_mem )
>> - goto fail;
>> -
>> - if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
>> - hwdom_free_mem, add_hwdom_free_regions) )
>> - goto fail;
>> -
>> - nr_banks = hwdom_free_mem->nr_banks;
>> - xfree(gnttab);
>> - }
>> -
>> - for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
>> - {
>> - paddr_t bank_start, bank_size;
>> -
>> - if ( is_hardware_domain(d) )
>> - {
>> - bank_start = hwdom_free_mem->bank[i].start;
>> - bank_size = hwdom_free_mem->bank[i].size;
>> - }
>> - else
>> - {
>> - const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>> - const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>> -
>> - if ( i >= GUEST_RAM_BANKS )
>> - goto fail;
>> -
>> - bank_start = bankbase[i];
>> - bank_size = banksize[i];
>> - }
>> -
>> - bank_size = MIN(bank_size, kinfo->unassigned_mem);
>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
>> - goto fail;
>> - }
>> -
>> - if ( kinfo->unassigned_mem )
>> - goto fail;
>> -
>> - for( i = 0; i < mem->nr_banks; i++ )
>> - {
>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>> - d,
>> - i,
>> - mem->bank[i].start,
>> - mem->bank[i].start + mem->bank[i].size,
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - (unsigned long)(mem->bank[i].size >> 20));
>> - }
>> -
>> - xfree(hwdom_free_mem);
>> - return;
>> -
>> - fail:
>> - panic("Failed to allocate requested domain memory."
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>> - (unsigned long)kinfo->unassigned_mem >> 10);
>> -}
>> -
>> static int __init handle_pci_range(const struct dt_device_node *dev,
>> uint64_t addr, uint64_t len, void *data)
>> {
>> @@ -2059,75 +1735,6 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo)
>> return -EINVAL;
>> }
>>
>> -static void __init dtb_load(struct kernel_info *kinfo)
>> -{
>> - unsigned long left;
>> -
>> - printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
>> - kinfo->d, kinfo->dtb_paddr,
>> - kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
>> -
>> - left = copy_to_guest_phys_flush_dcache(kinfo->d, kinfo->dtb_paddr,
>> - kinfo->fdt,
>> - fdt_totalsize(kinfo->fdt));
>> -
>> - if ( left != 0 )
>> - panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
>> - kinfo->d, left);
>> - xfree(kinfo->fdt);
>> -}
>> -
>> -static void __init initrd_load(struct kernel_info *kinfo)
>> -{
>> - const struct bootmodule *mod = kinfo->initrd_bootmodule;
>> - paddr_t load_addr = kinfo->initrd_paddr;
>> - paddr_t paddr, len;
>> - int node;
>> - int res;
>> - __be32 val[2];
>> - __be32 *cellp;
>> - void __iomem *initrd;
>> -
>> - if ( !mod || !mod->size )
>> - return;
>> -
>> - paddr = mod->start;
>> - len = mod->size;
>> -
>> - printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
>> - kinfo->d, paddr, load_addr, load_addr + len);
>> -
>> - /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
>> - node = fdt_path_offset(kinfo->fdt, "/chosen");
>> - if ( node < 0 )
>> - panic("Cannot find the /chosen node\n");
>> -
>> - cellp = (__be32 *)val;
>> - dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
>> - res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
>> - val, sizeof(val));
>> - if ( res )
>> - panic("Cannot fix up \"linux,initrd-start\" property\n");
>> -
>> - cellp = (__be32 *)val;
>> - dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
>> - res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
>> - val, sizeof(val));
>> - if ( res )
>> - panic("Cannot fix up \"linux,initrd-end\" property\n");
>> -
>> - initrd = ioremap_wc(paddr, len);
>> - if ( !initrd )
>> - panic("Unable to map the %pd initrd\n", kinfo->d);
>> -
>> - res = copy_to_guest_phys_flush_dcache(kinfo->d, load_addr,
>> - initrd, len);
>> - if ( res != 0 )
>> - panic("Unable to copy the initrd in the %pd memory\n", kinfo->d);
>> -
>> - iounmap(initrd);
>> -}
>> -
>> /*
>> * Allocate the event channel PPIs and setup the HVM_PARAM_CALLBACK_IRQ.
>> * The allocated IRQ will be found in d->arch.evtchn_irq.
>> @@ -2220,8 +1827,8 @@ int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>> */
>> kernel_load(kinfo);
>> /* initrd_load will fix up the fdt, so call it before dtb_load */
>> - initrd_load(kinfo);
>> - dtb_load(kinfo);
>> + initrd_load(kinfo, copy_to_guest_phys_flush_dcache);
>> + dtb_load(kinfo, copy_to_guest_phys_flush_dcache);
>>
>> memset(regs, 0, sizeof(*regs));
>>
>> diff --git a/xen/common/device-tree/Makefile b/xen/common/device-tree/Makefile
>> index e88a4d5799..831b91399b 100644
>> --- a/xen/common/device-tree/Makefile
>> +++ b/xen/common/device-tree/Makefile
>> @@ -1,6 +1,7 @@
>> obj-y += bootfdt.init.o
>> obj-y += bootinfo.init.o
>> obj-y += device-tree.o
>> +obj-$(CONFIG_DOMAIN_BUILD_HELPERS) += domain-build.o
>> obj-$(CONFIG_DOM0LESS_BOOT) += dom0less-build.o
>> obj-$(CONFIG_OVERLAY_DTB) += dt-overlay.o
>> obj-y += intc.o
>> diff --git a/xen/common/device-tree/domain-build.c b/xen/common/device-tree/domain-build.c
>> new file mode 100644
>> index 0000000000..69257a15ba
>> --- /dev/null
>> +++ b/xen/common/device-tree/domain-build.c
>> @@ -0,0 +1,404 @@
>> +#include <xen/bootfdt.h>
>> +#include <xen/fdt-domain-build.h>
>> +#include <xen/init.h>
>> +#include <xen/lib.h>
>> +#include <xen/libfdt/libfdt.h>
>> +#include <xen/mm.h>
>> +#include <xen/sched.h>
>> +#include <xen/sizes.h>
>> +#include <xen/types.h>
>> +#include <xen/vmap.h>
>> +
>> +#include <asm/p2m.h>
>> +
>> +bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>> + alloc_domheap_mem_cb cb, void *extra)
>> +{
>> + unsigned int max_order = UINT_MAX;
>> +
>> + while ( tot_size > 0 )
>> + {
>> + unsigned int order = get_allocation_size(tot_size);
>> + struct page_info *pg;
>> +
>> + order = min(max_order, order);
>> +
>> + pg = alloc_domheap_pages(d, order, 0);
>> + if ( !pg )
>> + {
>> + /*
>> + * If we can't allocate one page, then it is unlikely to
>> + * succeed in the next iteration. So bail out.
>> + */
>> + if ( !order )
>> + return false;
>> +
>> + /*
>> + * If we can't allocate memory with order, then it is
>> + * unlikely to succeed in the next iteration.
>> + * Record the order - 1 to avoid re-trying.
>> + */
>> + max_order = order - 1;
>> + continue;
>> + }
>> +
>> + if ( !cb(d, pg, order, extra) )
>> + return false;
>> +
>> + tot_size -= (1ULL << (PAGE_SHIFT + order));
>> + }
>> +
>> + return true;
>> +}
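
To make the order fallback in allocate_domheap_memory() easier to follow, here is a standalone simulation of the loop. Nothing here is Xen API: try_alloc() is a hypothetical allocator that refuses any chunk above largest_free_order, largest_order_fitting() is a stand-in for get_allocation_size(), and the total size is assumed to be a multiple of a 4KB page.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4KB pages */

/* Hypothetical allocator state: chunks above this order are unavailable. */
static unsigned int largest_free_order = 2;
static unsigned int nr_failed_attempts;

static bool try_alloc(unsigned int order)
{
    if ( order > largest_free_order )
    {
        nr_failed_attempts++;
        return false;
    }

    return true;
}

/* Largest order whose chunk fits in size bytes (size >= one page). */
static unsigned int largest_order_fitting(uint64_t size)
{
    unsigned int order = 0;

    while ( (size >> (PAGE_SHIFT + order + 1)) != 0 )
        order++;

    return order;
}

/* Returns the number of chunks allocated, or 0 if allocation failed. */
static unsigned int allocate_total(uint64_t tot_size)
{
    unsigned int max_order = UINT_MAX, chunks = 0;

    while ( tot_size > 0 )
    {
        unsigned int order = largest_order_fitting(tot_size);

        if ( order > max_order )
            order = max_order;

        if ( !try_alloc(order) )
        {
            /* Even a single page failed: give up. */
            if ( !order )
                return 0;

            /* Remember the cap so this order is never retried. */
            max_order = order - 1;
            continue;
        }

        chunks++;
        tot_size -= UINT64_C(1) << (PAGE_SHIFT + order);
    }

    return chunks;
}
```

For a 64KB request with largest_free_order set to 2, the loop fails once at order 4 and once at order 3, then satisfies the request with four order-2 chunks. Because max_order is remembered, the failed orders are not retried on later iterations.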
>> +
>> +static bool __init guest_map_pages(struct domain *d, struct page_info *pg,
>> + unsigned int order, void *extra)
>> +{
>> + gfn_t *sgfn = (gfn_t *)extra;
>> + int res;
>> +
>> + BUG_ON(!sgfn);
>> + res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
>> + if ( res )
>> + {
>> + dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
>> + return false;
>> + }
>> +
>> + *sgfn = gfn_add(*sgfn, 1UL << order);
>> +
>> + return true;
>> +}
>> +
>> +bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>> + paddr_t tot_size)
>> +{
>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>> + struct domain *d = kinfo->d;
>> + struct membank *bank;
>> +
>> + /*
>> + * allocate_bank_memory can be called with a tot_size of zero for
>> + * the second memory bank. It is not an error and we can safely
>> + * avoid creating a zero-size memory bank.
>> + */
>> + if ( tot_size == 0 )
>> + return true;
>> +
>> + bank = &mem->bank[mem->nr_banks];
>> + bank->start = gfn_to_gaddr(sgfn);
>> + bank->size = tot_size;
>> +
>> + /*
>> + * Allocate pages from the heap until tot_size is zero and map them to the
>> + * guest using guest_map_pages, passing the starting gfn as extra parameter
>> + * for the map operation.
>> + */
>> + if ( !allocate_domheap_memory(d, tot_size, guest_map_pages, &sgfn) )
>> + return false;
>> +
>> + mem->nr_banks++;
>> + kinfo->unassigned_mem -= bank->size;
>> +
>> + return true;
>> +}
>> +
>> +static int __init add_hwdom_free_regions(unsigned long s_gfn,
>> + unsigned long e_gfn, void *data)
>> +{
>> + struct membanks *free_regions = data;
>> + paddr_t start, size;
>> + paddr_t s = pfn_to_paddr(s_gfn);
>> + paddr_t e = pfn_to_paddr(e_gfn);
>> + unsigned int i, j;
>> +
>> + if ( free_regions->nr_banks >= free_regions->max_banks )
>> + return 0;
>> +
>> + /*
>> + * Both start and size of the free region should be 2MB aligned to
>> + * potentially allow superpage mapping.
>> + */
>> + start = (s + SZ_2M - 1) & ~(SZ_2M - 1);
>> + if ( start > e )
>> + return 0;
>> +
>> + /*
>> + * e is actually "end-1" because it is called by rangeset functions
>> + * which are inclusive of the last address.
>> + */
>> + e += 1;
>> + size = (e - start) & ~(SZ_2M - 1);
>> +
>> + /* Find the insert position (descending order). */
>> + for ( i = 0; i < free_regions->nr_banks ; i++ )
>> + if ( size > free_regions->bank[i].size )
>> + break;
>> +
>> + /* Move the other banks to make space. */
>> + for ( j = free_regions->nr_banks; j > i ; j-- )
>> + {
>> + free_regions->bank[j].start = free_regions->bank[j - 1].start;
>> + free_regions->bank[j].size = free_regions->bank[j - 1].size;
>> + }
>> +
>> + free_regions->bank[i].start = start;
>> + free_regions->bank[i].size = size;
>> + free_regions->nr_banks++;
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> + * Find unused regions of Host address space which can be exposed to domain
>> + * using the host memory layout. In order to calculate regions we exclude every
>> + * region passed in mem_banks from the Host RAM.
>> + */
>> +int __init find_unallocated_memory(const struct kernel_info *kinfo,
>> + const struct membanks *mem_banks[],
>> + unsigned int nr_mem_banks,
>> + struct membanks *free_regions,
>> + int (*cb)(unsigned long s_gfn,
>> + unsigned long e_gfn,
>> + void *data))
>> +{
>> + const struct membanks *mem = bootinfo_get_mem();
>> + struct rangeset *unalloc_mem;
>> + paddr_t start, end;
>> + unsigned int i, j;
>> + int res;
>> +
>> + ASSERT(domain_use_host_layout(kinfo->d));
>> +
>> + unalloc_mem = rangeset_new(NULL, NULL, 0);
>> + if ( !unalloc_mem )
>> + return -ENOMEM;
>> +
>> + /* Start with all available RAM */
>> + for ( i = 0; i < mem->nr_banks; i++ )
>> + {
>> + start = mem->bank[i].start;
>> + end = mem->bank[i].start + mem->bank[i].size;
>> + res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
>> + PFN_DOWN(end - 1));
>> + if ( res )
>> + {
>> + printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
>> + start, end);
>> + goto out;
>> + }
>> + }
>> +
>> + /* Remove all regions listed in mem_banks */
>> + for ( i = 0; i < nr_mem_banks; i++ )
>> + for ( j = 0; j < mem_banks[i]->nr_banks; j++ )
>> + {
>> + start = mem_banks[i]->bank[j].start;
>> +
>> + /* Shared memory banks can contain INVALID_PADDR as start */
>> + if ( INVALID_PADDR == start )
>> + continue;
>> +
>> + end = mem_banks[i]->bank[j].start + mem_banks[i]->bank[j].size;
>> + res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
>> + PFN_DOWN(end - 1));
>> + if ( res )
>> + {
>> + printk(XENLOG_ERR
>> + "Failed to add: %#"PRIpaddr"->%#"PRIpaddr", error %d\n",
>> + start, end, res);
>> + goto out;
>> + }
>> + }
>> +
>> + start = 0;
>> + end = (1ULL << p2m_ipa_bits) - 1;
>> + res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
>> + cb, free_regions);
>> + if ( res )
>> + free_regions->nr_banks = 0;
>> + else if ( !free_regions->nr_banks )
>> + res = -ENOENT;
>> +
>> +out:
>> + rangeset_destroy(unalloc_mem);
>> +
>> + return res;
>> +}
>> +
>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>> +{
>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>> + struct membanks *hwdom_free_mem = NULL;
>> +
>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>> +
>> + mem->nr_banks = 0;
>> + /*
>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>> + * is enabled.
>> + */
>> + if ( is_hardware_domain(d) )
>> + {
>> + struct membanks *gnttab = xzalloc_flex_struct(struct membanks, bank, 1);
> shouldn't we set gnttab->max_banks and gnttab->type here?
Here and ...
>
>
>> + /*
>> + * Exclude the following regions:
>> + * 1) Remove reserved memory
>> + * 2) Grant table assigned to hwdom
>> + */
>> + const struct membanks *mem_banks[] = {
>> + bootinfo_get_reserved_mem(),
>> + gnttab,
>> + };
>> +
>> + if ( !gnttab )
>> + goto fail;
>> +
>> + gnttab->nr_banks = 1;
>> + gnttab->bank[0].start = kinfo->gnttab_start;
>> + gnttab->bank[0].size = kinfo->gnttab_size;
>> +
>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>> + NR_MEM_BANKS);
>> + if ( !hwdom_free_mem )
>> + goto fail;
>> +
>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> here we are missing setting hwdom_free_mem->type ?
... here, membanks_xzalloc() should really be used:
membanks_xzalloc(1, MEMORY) for the first case and
membanks_xzalloc(NR_MEM_BANKS, MEMORY) for the second.
Good catch.
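For reference, a minimal user-space sketch of what such an allocation helper does, assuming a container with the nr_banks/max_banks/type fields discussed above. The demo_* names and calloc() are stand-ins for illustration only, not Xen's actual definitions:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for Xen's types; field names follow the review. */
enum demo_region_type { DEMO_MEMORY, DEMO_RESERVED_MEMORY };

struct demo_membank {
    unsigned long long start;
    unsigned long long size;
};

struct demo_membanks {
    unsigned int nr_banks;       /* banks currently in use */
    unsigned int max_banks;      /* capacity of bank[] */
    enum demo_region_type type;
    struct demo_membank bank[];  /* flexible array member */
};

/*
 * Allocate a zeroed container with room for nr banks and fill in the
 * bookkeeping fields in one place, so that callers cannot forget them.
 * Returns NULL on allocation failure.
 */
static struct demo_membanks *demo_membanks_zalloc(unsigned int nr,
                                                  enum demo_region_type type)
{
    struct demo_membanks *banks =
        calloc(1, sizeof(*banks) + nr * sizeof(banks->bank[0]));

    if ( !banks )
        return NULL;

    banks->max_banks = nr;
    banks->type = type;

    return banks;
}
```

Setting max_banks and type inside the helper is what removes the two omissions pointed out above: an open-coded flex-struct allocation leaves both fields zeroed unless every caller remembers to set them.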
>
>> +
>> + if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
>> + hwdom_free_mem, add_hwdom_free_regions) )
>> + goto fail;
>> +
>> + nr_banks = hwdom_free_mem->nr_banks;
>> + xfree(gnttab);
>> + }
>> +
>> + for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
>> + {
>> + paddr_t bank_start, bank_size;
>> +
>> + if ( is_hardware_domain(d) )
>> + {
>> + bank_start = hwdom_free_mem->bank[i].start;
>> + bank_size = hwdom_free_mem->bank[i].size;
>> + }
>> + else
>> + {
>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>> +
>> + if ( i >= GUEST_RAM_BANKS )
>> + goto fail;
>> +
>> + bank_start = bankbase[i];
>> + bank_size = banksize[i];
>> + }
>> +
>> + bank_size = MIN(bank_size, kinfo->unassigned_mem);
>> + if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
>> + goto fail;
>> + }
>> +
>> + if ( kinfo->unassigned_mem )
>> + goto fail;
>> +
>> + for( i = 0; i < mem->nr_banks; i++ )
>> + {
>> + printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>> + d,
>> + i,
>> + mem->bank[i].start,
>> + mem->bank[i].start + mem->bank[i].size,
>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>> + (unsigned long)(mem->bank[i].size >> 20));
>> + }
>> +
>> + xfree(hwdom_free_mem);
>> + return;
>> +
>> + fail:
>> + panic("Failed to allocate requested domain memory."
>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>> + " %ldKB unallocated. Fix the VMs configurations.\n",
>> + (unsigned long)kinfo->unassigned_mem >> 10);
>> +}
>> +
>> +/* Copy data to guest physical address, then clean the region. */
>> +typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
>> + paddr_t gpa,
>> + void *buf,
>> + unsigned int len);
> This shouldn't be needed because copy_to_guest_phys_cb is already
> declared in xen/include/xen/fdt-domain-build.h
>
>
>> +void __init dtb_load(struct kernel_info *kinfo,
>> + copy_to_guest_phys_cb copy_to_guest)
>> +{
>> + unsigned long left;
>> +
>> + printk("Loading %pd DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
>> + kinfo->d, kinfo->dtb_paddr,
>> + kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
>> +
>> + left = copy_to_guest(kinfo->d, kinfo->dtb_paddr,
>> + kinfo->fdt,
>> + fdt_totalsize(kinfo->fdt));
>> +
>> + if ( left != 0 )
>> + panic("Unable to copy the DTB to %pd memory (left = %lu bytes)\n",
>> + kinfo->d, left);
>> + xfree(kinfo->fdt);
>> +}
>> +
>> +void __init initrd_load(struct kernel_info *kinfo,
>> + copy_to_guest_phys_cb copy_to_guest)
>> +{
>> + const struct bootmodule *mod = kinfo->initrd_bootmodule;
>> + paddr_t load_addr = kinfo->initrd_paddr;
>> + paddr_t paddr, len;
>> + int node;
>> + int res;
>> + __be32 val[2];
>> + __be32 *cellp;
>> + void __iomem *initrd;
>> +
>> + if ( !mod || !mod->size )
>> + return;
>> +
>> + paddr = mod->start;
>> + len = mod->size;
>> +
>> + printk("Loading %pd initrd from %"PRIpaddr" to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
>> + kinfo->d, paddr, load_addr, load_addr + len);
>> +
>> + /* Fix up linux,initrd-start and linux,initrd-end in /chosen */
>> + node = fdt_path_offset(kinfo->fdt, "/chosen");
>> + if ( node < 0 )
>> + panic("Cannot find the /chosen node\n");
>> +
>> + cellp = (__be32 *)val;
>> + dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr);
>> + res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-start",
>> + val, sizeof(val));
>> + if ( res )
>> + panic("Cannot fix up \"linux,initrd-start\" property\n");
>> +
>> + cellp = (__be32 *)val;
>> + dt_set_cell(&cellp, ARRAY_SIZE(val), load_addr + len);
>> + res = fdt_setprop_inplace(kinfo->fdt, node, "linux,initrd-end",
>> + val, sizeof(val));
>> + if ( res )
>> + panic("Cannot fix up \"linux,initrd-end\" property\n");
>> +
>> + initrd = ioremap_wc(paddr, len);
>> + if ( !initrd )
>> + panic("Unable to map the hwdom initrd\n");
> The original message was:
>
> panic("Unable to map the %pd initrd\n", kinfo->d);
>
> why change it? It can be called for domUs.
>
>
>> + res = copy_to_guest(kinfo->d, load_addr,
>> + initrd, len);
>> + if ( res != 0 )
>> + panic("Unable to copy the initrd in the hwdom memory\n");
> Same here, the original message was:
>
> panic("Unable to copy the initrd in the %pd memory\n", kinfo->d);
These messages are "new" (introduced a month ago), so I overlooked them while rebasing. I'll update
the messages.
Thanks for the review.
~ Oleksii
>
>
>> + iounmap(initrd);
>> +}
>> diff --git a/xen/include/xen/fdt-domain-build.h b/xen/include/xen/fdt-domain-build.h
>> index b79e9fabfe..4a0052b2e8 100644
>> --- a/xen/include/xen/fdt-domain-build.h
>> +++ b/xen/include/xen/fdt-domain-build.h
>> @@ -6,6 +6,7 @@
>> #include <xen/bootfdt.h>
>> #include <xen/device_tree.h>
>> #include <xen/fdt-kernel.h>
>> +#include <xen/mm.h>
>> #include <xen/types.h>
>>
>> struct domain;
>> @@ -29,7 +30,37 @@ int make_memory_node(const struct kernel_info *kinfo, int addrcells,
>> int sizecells, const struct membanks *mem);
>> int make_timer_node(const struct kernel_info *kinfo);
>>
>> -unsigned int get_allocation_size(paddr_t size);
>> +
>> +static inline int get_allocation_size(paddr_t size)
>> +{
>> + /*
>> + * get_order_from_bytes returns the order greater than or equal to
>> + * the given size, but we need less than or equal. Adding one to
>> + * the size pushes an evenly aligned size into the next order, so
>> + * we can then unconditionally subtract 1 from the order which is
>> + * returned.
>> + */
>> + return get_order_from_bytes(size + 1) - 1;
>> +}
>> +
>> +typedef unsigned long (*copy_to_guest_phys_cb)(struct domain *d,
>> + paddr_t gpa,
>> + void *buf,
>> + unsigned int len);
>> +
>> +void initrd_load(struct kernel_info *kinfo,
>> + copy_to_guest_phys_cb copy_to_guest);
>> +
>> +void dtb_load(struct kernel_info *kinfo,
>> + copy_to_guest_phys_cb copy_to_guest);
>> +
>> +int find_unallocated_memory(const struct kernel_info *kinfo,
>> + const struct membanks *mem_banks[],
>> + unsigned int nr_mem_banks,
>> + struct membanks *free_regions,
>> + int (*cb)(unsigned long s_gfn,
>> + unsigned long e_gfn,
>> + void *data));
>>
>> #endif /* __XEN_FDT_DOMAIN_BUILD_H__ */
>>
>> --
>> 2.49.0
>>
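As a side note on the get_allocation_size() comment quoted above, the "add one, then subtract one from the order" trick can be checked with a small stand-alone sketch. It assumes 4KB pages and a helper with the semantics the comment describes (smallest order whose size is greater than or equal to the request). The demo_* names are hypothetical and this is not the Xen implementation:

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4KB pages */

/* Smallest order such that (page size << order) >= size; size must be > 0. */
static int demo_order_ceil(uint64_t size)
{
    int order = 0;

    size = (size - 1) >> DEMO_PAGE_SHIFT;
    for ( ; size; order++ )
        size >>= 1;

    return order;
}

/*
 * Largest order such that (page size << order) <= size.
 * Adding one pushes an exact power-of-two size into the next order, so
 * subtracting one afterwards is always correct. Yields -1 below one page.
 */
static int demo_order_floor(uint64_t size)
{
    return demo_order_ceil(size + 1) - 1;
}
```

With these definitions a 12KB request rounds up to order 2 (16KB) but rounds down to order 1 (8KB), while an exact 8KB request stays at order 1 in both directions. A request smaller than one page gives -1 from the rounding-down variant, which is one reason a signed return type is convenient in the sketch.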
* [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c
2025-05-02 16:22 [PATCH v3 0/8] Move parts of Arm's Dom0less to common code Oleksii Kurochko
` (6 preceding siblings ...)
2025-05-02 16:22 ` [PATCH v3 7/8] xen/common: dom0less: introduce common domain-build.c Oleksii Kurochko
@ 2025-05-02 16:22 ` Oleksii Kurochko
2025-05-02 20:53 ` Stefano Stabellini
7 siblings, 1 reply; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-02 16:22 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné
Parts of Arm's dom0less-build.c can be shared by architectures that use
device tree files to create guest domains. Therefore, move those parts of
Arm's dom0less-build.c to common code with minor changes.
As part of this move, the following changes are introduced:
- Introduce make_arch_nodes() to cover arch-specific nodes. For example, on
  Arm these are the PSCI and vpl011 nodes.
- Introduce set_domain_type() to abstract how the domain type is set.
  For example, RISC-V won't have this member in its arch_domain structure,
  as vCPUs will always have the same bitness as the hypervisor. In the case
  of Arm, Arm64 can create both 32-bit and 64-bit domains.
- Introduce init_vuart() to cover details of virtual UART initialization.
- Introduce init_intc_phandle() to cover some details of interrupt controller
  phandle initialization. As an example, RISC-V could have a different name
  for the interrupt controller node (APLIC, PLIC, IMSIC, etc.), but the code
  in domain_handle_dtb_bootmodule() could handle only one interrupt
  controller node name.
- s/make_gic_domU_node/make_intc_domU_node as GIC is Arm specific naming and
add prototype of make_intc_domU_node() to dom0less-build.h
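
To illustrate the init_intc_phandle() contract from the list above, here is a minimal stand-alone sketch. It assumes the return convention used by the Arm implementation in this patch (0 when the node was recognised as the interrupt controller, 1 otherwise). The demo_* names, the trimmed-down structure, the plain strcmp() and the caller loop are hypothetical simplifications; the real hook takes the partial FDT and a node offset rather than a ready-made phandle:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct kernel_info. */
struct demo_kinfo {
    uint32_t phandle_intc;
};

/*
 * Arch hook: returns 0 if the node was recognised as the interrupt
 * controller and consumed, 1 if the caller should keep matching the
 * node against other names.
 */
static int demo_init_intc_phandle(struct demo_kinfo *kinfo, const char *name,
                                  uint32_t phandle)
{
    if ( strcmp(name, "gic") == 0 )
    {
        /* A zero phandle means "not set": keep the default value. */
        if ( phandle != 0 )
            kinfo->phandle_intc = phandle;

        return 0;
    }

    return 1;
}

/*
 * Common-code side (assumed shape): let the arch hook claim its node
 * first, and count the nodes left for generic handling.
 */
static unsigned int demo_scan_nodes(struct demo_kinfo *kinfo,
                                    const char *const names[],
                                    const uint32_t phandles[],
                                    unsigned int nr)
{
    unsigned int i, unclaimed = 0;

    for ( i = 0; i < nr; i++ )
    {
        if ( demo_init_intc_phandle(kinfo, names[i], phandles[i]) == 0 )
            continue;

        unclaimed++;
    }

    return unclaimed;
}
```

The point of the split is that the common loop never mentions "gic": an architecture with a differently named interrupt controller node only has to provide its own hook.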
The following functions are moved to xen/common/device-tree:
- Functions which are moved as is:
  - domain_p2m_pages().
  - handle_passthrough_prop().
  - handle_prop_pfdt().
  - scan_pfdt_node().
  - check_partial_fdt().
- Functions which are moved with some minor changes:
  - alloc_xenstore_evtchn():
    - Guard accesses to hvm.params with #ifdef CONFIG_HVM.
  - prepare_dtb_domU():
    - Guard access to gnttab_{start,size} with #ifdef CONFIG_GRANT_TABLE.
    - s/make_gic_domU_node/make_intc_domU_node.
    - Add call of make_arch_nodes().
  - domain_handle_dtb_bootmodule():
    - Hide details of interrupt controller phandle initialization by calling
      init_intc_phandle().
    - Update the comment above init_intc_phandle(): s/gic/interrupt controller.
  - construct_domU():
    - Guard accesses to hvm.params with #ifdef CONFIG_HVM.
    - Call init_vuart() to hide Arm's vpl011_init() details there.
    - Add call of set_domain_type() instead of setting kinfo->arch.type explicitly.
Some parts of dom0less-build.c are wrapped in #ifdef CONFIG_STATIC_{SHM,MEMORY},
as not all archs support these configs.
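
The guarding pattern can be sketched as follows. DEMO_CONFIG_STATIC_MEMORY is a hypothetical switch standing in for the real Kconfig option, and the fallback chosen when it is disabled is an assumption made for this illustration, not a statement about what the common code does:

```c
/*
 * Hypothetical config switch standing in for CONFIG_STATIC_MEMORY.
 * It is left undefined here, as on an arch without static memory support.
 */
/* #define DEMO_CONFIG_STATIC_MEMORY */

enum demo_alloc_path { DEMO_ALLOC_DYNAMIC, DEMO_ALLOC_STATIC };

/*
 * Pick the allocation path for a domain. The static branch only exists
 * when the feature is compiled in, so arches without it never reference
 * the static-memory helpers or headers.
 */
static enum demo_alloc_path demo_pick_alloc_path(int has_static_mem_prop)
{
#ifdef DEMO_CONFIG_STATIC_MEMORY
    if ( has_static_mem_prop )
        return DEMO_ALLOC_STATIC;
#else
    (void)has_static_mem_prop; /* property is ignored in this sketch */
#endif

    return DEMO_ALLOC_DYNAMIC;
}
```

Because the header inclusion and the code using it are guarded by the same option, an architecture that lacks the option builds without needing stub headers.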
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Change in v3:
- Align construct_domU() with the current staging.
- Align alloc_xenstore_params() with the current staging.
- Move the definition of XENSTORE_PFN_LATE_ALLOC and the
declaration of need_xenstore to common code.
---
Change in v2:
- Wrap by #ifdef CONFIG_STATIC_* inclusions of <asm/static-memory.h> and
<asm/static-shmem.h>. Wrap also the code which uses something from the
mentioned headers.
- Add handling of legacy case in construct_domU().
- Use xen/fdt-kernel.h and xen/fdt-domain-build.h instead of asm/*.
- Update the commit message.
---
xen/arch/arm/dom0less-build.c | 714 ++---------------------
xen/common/device-tree/dom0less-build.c | 699 ++++++++++++++++++++++
xen/include/asm-generic/dom0less-build.h | 18 +-
3 files changed, 751 insertions(+), 680 deletions(-)
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 0310579863..627c212b3b 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -25,8 +25,6 @@
#include <asm/static-memory.h>
#include <asm/static-shmem.h>
-bool __initdata need_xenstore;
-
#ifdef CONFIG_VGICV2
static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
{
@@ -152,7 +150,7 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
}
#endif
-static int __init make_gic_domU_node(struct kernel_info *kinfo)
+int __init make_intc_domU_node(struct kernel_info *kinfo)
{
switch ( kinfo->d->arch.vgic.version )
{
@@ -218,708 +216,60 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
}
#endif
-/*
- * Scan device tree properties for passthrough specific information.
- * Returns < 0 on error
- * 0 on success
- */
-static int __init handle_passthrough_prop(struct kernel_info *kinfo,
- const struct fdt_property *xen_reg,
- const struct fdt_property *xen_path,
- bool xen_force,
- uint32_t address_cells,
- uint32_t size_cells)
-{
- const __be32 *cell;
- unsigned int i, len;
- struct dt_device_node *node;
- int res;
- paddr_t mstart, size, gstart;
-
- /* xen,reg specifies where to map the MMIO region */
- cell = (const __be32 *)xen_reg->data;
- len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
- sizeof(uint32_t));
-
- for ( i = 0; i < len; i++ )
- {
- device_tree_get_reg(&cell, address_cells, size_cells,
- &mstart, &size);
- gstart = dt_next_cell(address_cells, &cell);
-
- if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
- {
- printk(XENLOG_ERR
- "DomU passthrough config has not page aligned addresses/sizes\n");
- return -EINVAL;
- }
-
- res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
- paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
- if ( res )
- {
- printk(XENLOG_ERR "Unable to permit to dom%d access to"
- " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
- kinfo->d->domain_id,
- mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
- return res;
- }
-
- res = map_regions_p2mt(kinfo->d,
- gaddr_to_gfn(gstart),
- PFN_DOWN(size),
- maddr_to_mfn(mstart),
- p2m_mmio_direct_dev);
- if ( res < 0 )
- {
- printk(XENLOG_ERR
- "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
- mstart, gstart);
- return -EFAULT;
- }
- }
-
- /*
- * If xen_force, we let the user assign a MMIO region with no
- * associated path.
- */
- if ( xen_path == NULL )
- return xen_force ? 0 : -EINVAL;
-
- /*
- * xen,path specifies the corresponding node in the host DT.
- * Both interrupt mappings and IOMMU settings are based on it,
- * as they are done based on the corresponding host DT node.
- */
- node = dt_find_node_by_path(xen_path->data);
- if ( node == NULL )
- {
- printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
- xen_path->data);
- return -EINVAL;
- }
-
- res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
- if ( res < 0 )
- return res;
-
- res = iommu_add_dt_device(node);
- if ( res < 0 )
- return res;
-
- /* If xen_force, we allow assignment of devices without IOMMU protection. */
- if ( xen_force && !dt_device_is_protected(node) )
- return 0;
-
- return iommu_assign_dt_device(kinfo->d, node);
-}
-
-static int __init handle_prop_pfdt(struct kernel_info *kinfo,
- const void *pfdt, int nodeoff,
- uint32_t address_cells, uint32_t size_cells,
- bool scan_passthrough_prop)
-{
- void *fdt = kinfo->fdt;
- int propoff, nameoff, res;
- const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
- const char *name;
- bool found, xen_force = false;
-
- for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
- propoff >= 0;
- propoff = fdt_next_property_offset(pfdt, propoff) )
- {
- if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
- return -FDT_ERR_INTERNAL;
-
- found = false;
- nameoff = fdt32_to_cpu(prop->nameoff);
- name = fdt_string(pfdt, nameoff);
-
- if ( scan_passthrough_prop )
- {
- if ( dt_prop_cmp("xen,reg", name) == 0 )
- {
- xen_reg = prop;
- found = true;
- }
- else if ( dt_prop_cmp("xen,path", name) == 0 )
- {
- xen_path = prop;
- found = true;
- }
- else if ( dt_prop_cmp("xen,force-assign-without-iommu",
- name) == 0 )
- {
- xen_force = true;
- found = true;
- }
- }
-
- /*
- * Copy properties other than the ones above: xen,reg, xen,path,
- * and xen,force-assign-without-iommu.
- */
- if ( !found )
- {
- res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
- if ( res )
- return res;
- }
- }
-
- /*
- * Only handle passthrough properties if both xen,reg and xen,path
- * are present, or if xen,force-assign-without-iommu is specified.
- */
- if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
- {
- res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
- address_cells, size_cells);
- if ( res < 0 )
- {
- printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
- return res;
- }
- }
- else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
- {
- printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
- kinfo->d);
- return -EINVAL;
- }
-
- /* FDT_ERR_NOTFOUND => There is no more properties for this node */
- return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
-}
-
-static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
- int nodeoff,
- uint32_t address_cells, uint32_t size_cells,
- bool scan_passthrough_prop)
-{
- int rc = 0;
- void *fdt = kinfo->fdt;
- int node_next;
-
- rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
- if ( rc )
- return rc;
-
- rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
- scan_passthrough_prop);
- if ( rc )
- return rc;
-
- address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
- DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
- size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
- DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
-
- node_next = fdt_first_subnode(pfdt, nodeoff);
- while ( node_next > 0 )
- {
- rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
- scan_passthrough_prop);
- if ( rc )
- return rc;
-
- node_next = fdt_next_subnode(pfdt, node_next);
- }
-
- return fdt_end_node(fdt);
-}
-
-static int __init check_partial_fdt(void *pfdt, size_t size)
+int __init make_arch_nodes(struct kernel_info *kinfo)
{
- int res;
-
- if ( fdt_magic(pfdt) != FDT_MAGIC )
- {
- dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
- return -EINVAL;
- }
-
- res = fdt_check_header(pfdt);
- if ( res )
- {
- dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
- return -EINVAL;
- }
-
- if ( fdt_totalsize(pfdt) > size )
- {
- dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int __init domain_handle_dtb_bootmodule(struct domain *d,
- struct kernel_info *kinfo)
-{
- void *pfdt;
- int res, node_next;
-
- pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
- kinfo->dtb_bootmodule->size);
- if ( pfdt == NULL )
- return -EFAULT;
-
- res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
- if ( res < 0 )
- goto out;
-
- for ( node_next = fdt_first_subnode(pfdt, 0);
- node_next > 0;
- node_next = fdt_next_subnode(pfdt, node_next) )
- {
- const char *name = fdt_get_name(pfdt, node_next, NULL);
-
- if ( name == NULL )
- continue;
-
- /*
- * Only scan /gic /aliases /passthrough, ignore the rest.
- * They don't have to be parsed in order.
- *
- * Take the GIC phandle value from the special /gic node in the
- * DTB fragment.
- */
- if ( dt_node_cmp(name, "gic") == 0 )
- {
- uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
-
- if ( phandle_intc != 0 )
- kinfo->phandle_intc = phandle_intc;
- continue;
- }
-
- if ( dt_node_cmp(name, "aliases") == 0 )
- {
- res = scan_pfdt_node(kinfo, pfdt, node_next,
- DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
- DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
- false);
- if ( res )
- goto out;
- continue;
- }
- if ( dt_node_cmp(name, "passthrough") == 0 )
- {
- res = scan_pfdt_node(kinfo, pfdt, node_next,
- DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
- DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
- true);
- if ( res )
- goto out;
- continue;
- }
- }
-
- out:
- iounmap(pfdt);
-
- return res;
-}
-
-/*
- * The max size for DT is 2MB. However, the generated DT is small (not including
- * domU passthrough DT nodes whose size we account separately), 4KB are enough
- * for now, but we might have to increase it in the future.
- */
-#define DOMU_DTB_SIZE 4096
-static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
-{
- int addrcells, sizecells;
- int ret, fdt_size = DOMU_DTB_SIZE;
-
- kinfo->phandle_intc = GUEST_PHANDLE_GIC;
- kinfo->gnttab_start = GUEST_GNTTAB_BASE;
- kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
-
- addrcells = GUEST_ROOT_ADDRESS_CELLS;
- sizecells = GUEST_ROOT_SIZE_CELLS;
-
- /* Account for domU passthrough DT size */
- if ( kinfo->dtb_bootmodule )
- fdt_size += kinfo->dtb_bootmodule->size;
-
- /* Cap to max DT size if needed */
- fdt_size = min(fdt_size, SZ_2M);
-
- kinfo->fdt = xmalloc_bytes(fdt_size);
- if ( kinfo->fdt == NULL )
- return -ENOMEM;
-
- ret = fdt_create(kinfo->fdt, fdt_size);
- if ( ret < 0 )
- goto err;
-
- ret = fdt_finish_reservemap(kinfo->fdt);
- if ( ret < 0 )
- goto err;
-
- ret = fdt_begin_node(kinfo->fdt, "");
- if ( ret < 0 )
- goto err;
-
- ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
- if ( ret )
- goto err;
-
- ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
- if ( ret )
- goto err;
-
- ret = make_chosen_node(kinfo);
- if ( ret )
- goto err;
+ int ret;
ret = make_psci_node(kinfo->fdt);
if ( ret )
- goto err;
-
- ret = make_cpus_node(d, kinfo->fdt);
- if ( ret )
- goto err;
-
- ret = make_memory_node(kinfo, addrcells, sizecells,
- kernel_info_get_mem(kinfo));
- if ( ret )
- goto err;
-
- ret = make_resv_memory_node(kinfo, addrcells, sizecells);
- if ( ret )
- goto err;
-
- /*
- * domain_handle_dtb_bootmodule has to be called before the rest of
- * the device tree is generated because it depends on the value of
- * the field phandle_intc.
- */
- if ( kinfo->dtb_bootmodule )
- {
- ret = domain_handle_dtb_bootmodule(d, kinfo);
- if ( ret )
- goto err;
- }
-
- ret = make_gic_domU_node(kinfo);
- if ( ret )
- goto err;
-
- ret = make_timer_node(kinfo);
- if ( ret )
- goto err;
+ return -EINVAL;
if ( kinfo->vuart )
{
- ret = -EINVAL;
#ifdef CONFIG_SBSA_VUART_CONSOLE
ret = make_vpl011_uart_node(kinfo);
#endif
if ( ret )
- goto err;
- }
-
- if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
- {
- ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
- if ( ret )
- goto err;
+ return -EINVAL;
}
- ret = fdt_end_node(kinfo->fdt);
- if ( ret < 0 )
- goto err;
-
- ret = fdt_finish(kinfo->fdt);
- if ( ret < 0 )
- goto err;
-
return 0;
-
- err:
- printk("Device tree generation failed (%d).\n", ret);
- xfree(kinfo->fdt);
-
- return -EINVAL;
}
-#define XENSTORE_PFN_OFFSET 1
-static int __init alloc_xenstore_page(struct domain *d)
+/* TODO: make arch.type generic ? */
+#ifdef CONFIG_ARM_64
+void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
{
- struct page_info *xenstore_pg;
- struct xenstore_domain_interface *interface;
- mfn_t mfn;
- gfn_t gfn;
- int rc;
-
- if ( (UINT_MAX - d->max_pages) < 1 )
- {
- printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
- d);
- return -EINVAL;
- }
-
- d->max_pages += 1;
- xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
- if ( xenstore_pg == NULL && is_64bit_domain(d) )
- xenstore_pg = alloc_domheap_page(d, 0);
- if ( xenstore_pg == NULL )
- return -ENOMEM;
-
- mfn = page_to_mfn(xenstore_pg);
- if ( !mfn_x(mfn) )
- return -ENOMEM;
-
- if ( !is_domain_direct_mapped(d) )
- gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
- (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
- else
- gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
-
- rc = guest_physmap_add_page(d, gfn, mfn, 0);
- if ( rc )
- {
- free_domheap_page(xenstore_pg);
- return rc;
- }
-
- d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
- interface = map_domain_page(mfn);
- interface->connection = XENSTORE_RECONNECT;
- unmap_domain_page(interface);
-
- return 0;
+ /* type must be set before allocate memory */
+ d->arch.type = kinfo->arch.type;
}
-
-static int __init alloc_xenstore_params(struct kernel_info *kinfo)
+#else
+void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
{
- struct domain *d = kinfo->d;
- int rc = 0;
-
- if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
- == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
- d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
- else if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
- {
- rc = alloc_xenstore_page(d);
- if ( rc < 0 )
- return rc;
- }
-
- return rc;
+ /* Nothing to do */
}
+#endif
-static void __init domain_vcpu_affinity(struct domain *d,
- const struct dt_device_node *node)
+int __init init_vuart(struct domain *d, struct kernel_info *kinfo,
+ const struct dt_device_node *node)
{
- struct dt_device_node *np;
-
- dt_for_each_child_node(node, np)
- {
- const char *hard_affinity_str = NULL;
- uint32_t val;
- int rc;
- struct vcpu *v;
- cpumask_t affinity;
-
- if ( !dt_device_is_compatible(np, "xen,vcpu") )
- continue;
-
- if ( !dt_property_read_u32(np, "id", &val) )
- panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
-
- if ( val >= d->max_vcpus )
- panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
- dt_node_name(node), d->max_vcpus);
-
- v = d->vcpu[val];
- rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
- if ( rc < 0 )
- continue;
-
- cpumask_clear(&affinity);
- while ( *hard_affinity_str != '\0' )
- {
- unsigned int start, end;
-
- start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
-
- if ( *hard_affinity_str == '-' ) /* Range */
- {
- hard_affinity_str++;
- end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
- }
- else /* Single value */
- end = start;
-
- if ( end >= nr_cpu_ids )
- panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
-
- for ( ; start <= end; start++ )
- cpumask_set_cpu(start, &affinity);
-
- if ( *hard_affinity_str == ',' )
- hard_affinity_str++;
- else if ( *hard_affinity_str != '\0' )
- break;
- }
+ int rc = 0;
- rc = vcpu_set_hard_affinity(v, &affinity);
- if ( rc )
- panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
- v->vcpu_id, rc, dt_node_name(node));
- }
-}
+ kinfo->vuart = dt_property_read_bool(node, "vpl011");
-#ifdef CONFIG_ARCH_PAGING_MEMPOOL
-static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
- unsigned int smp_cpus)
-{
/*
- * Keep in sync with libxl__get_required_paging_memory().
- * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
- * plus 128 pages to cover extended regions.
+ * Base address and irq number are needed when creating vpl011 device
+ * tree node in prepare_dtb_domU, so initialization on related variables
+ * shall be done first.
*/
- unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
-
- BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
-
- return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
-}
-
-static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
- const struct dt_device_node *node)
-{
- unsigned long p2m_pages;
- uint32_t p2m_mem_mb;
- int rc;
-
- rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
- /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
- p2m_pages = rc ?
- p2m_mem_mb << (20 - PAGE_SHIFT) :
- domain_p2m_pages(mem, d->max_vcpus);
-
- spin_lock(&d->arch.paging.lock);
- rc = p2m_set_allocation(d, p2m_pages, NULL);
- spin_unlock(&d->arch.paging.lock);
-
- return rc;
-}
-#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
-static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
- const struct dt_device_node *node)
-{
- return 0;
-}
-#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
-
-int __init construct_domU(struct domain *d,
- const struct dt_device_node *node)
-{
- struct kernel_info kinfo = KERNEL_INFO_INIT;
- const char *dom0less_enhanced;
- int rc;
- u64 mem;
-
- rc = dt_property_read_u64(node, "memory", &mem);
- if ( !rc )
- {
- printk("Error building DomU: cannot read \"memory\" property\n");
- return -EINVAL;
- }
- kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
-
- rc = domain_p2m_set_allocation(d, mem, node);
- if ( rc != 0 )
- return rc;
-
- printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
- d->max_vcpus, mem);
-
- kinfo.vuart = dt_property_read_bool(node, "vpl011");
- if ( kinfo.vuart && is_hardware_domain(d) )
- panic("hardware domain cannot specify vpl011\n");
-
- rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
- if ( rc == -EILSEQ ||
- rc == -ENODATA ||
- (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
- {
- need_xenstore = true;
- kinfo.dom0less_feature = DOM0LESS_ENHANCED;
- }
- else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
- {
- need_xenstore = true;
- kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
- }
- else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
- kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
-
- if ( vcpu_create(d, 0) == NULL )
- return -ENOMEM;
-
- d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
-
- kinfo.d = d;
-
- rc = kernel_probe(&kinfo, node);
- if ( rc < 0 )
- return rc;
-
-#ifdef CONFIG_ARM_64
- /* type must be set before allocate memory */
- d->arch.type = kinfo.arch.type;
-#endif
- if ( is_hardware_domain(d) )
- {
- rc = construct_hwdom(&kinfo, node);
- if ( rc < 0 )
- return rc;
- }
- else
+ if ( kinfo->vuart )
{
- if ( !dt_find_property(node, "xen,static-mem", NULL) )
- allocate_memory(d, &kinfo);
- else if ( !is_domain_direct_mapped(d) )
- allocate_static_memory(d, &kinfo, node);
- else
- assign_static_memory_11(d, &kinfo, node);
-
- rc = process_shm(d, &kinfo, node);
- if ( rc < 0 )
- return rc;
-
- /*
- * Base address and irq number are needed when creating vpl011 device
- * tree node in prepare_dtb_domU, so initialization on related variables
- * shall be done first.
- */
- if ( kinfo.vuart )
- {
- rc = domain_vpl011_init(d, NULL);
- if ( rc < 0 )
- return rc;
- }
-
- rc = prepare_dtb_domU(d, &kinfo);
- if ( rc < 0 )
- return rc;
-
- rc = construct_domain(d, &kinfo);
+ rc = domain_vpl011_init(d, NULL);
if ( rc < 0 )
return rc;
}
- domain_vcpu_affinity(d, node);
-
- return alloc_xenstore_params(&kinfo);
+ return rc;
}
void __init arch_create_domUs(struct dt_device_node *node,
@@ -995,6 +345,22 @@ void __init arch_create_domUs(struct dt_device_node *node,
}
}
+int __init init_intc_phandle(struct kernel_info *kinfo, const char *name,
+ const int node_next, const void *pfdt)
+{
+ if ( dt_node_cmp(name, "gic") == 0 )
+ {
+ uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
+
+ if ( phandle_intc != 0 )
+ kinfo->phandle_intc = phandle_intc;
+
+ return 0;
+ }
+
+ return 1;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
index a01a8b6b1a..c3face5b90 100644
--- a/xen/common/device-tree/dom0less-build.c
+++ b/xen/common/device-tree/dom0less-build.c
@@ -3,24 +3,43 @@
#include <xen/bootfdt.h>
#include <xen/device_tree.h>
#include <xen/domain.h>
+#include <xen/domain_page.h>
#include <xen/err.h>
#include <xen/event.h>
+#include <xen/fdt-domain-build.h>
+#include <xen/fdt-kernel.h>
#include <xen/grant_table.h>
#include <xen/init.h>
+#include <xen/iocap.h>
#include <xen/iommu.h>
+#include <xen/libfdt/libfdt.h>
#include <xen/llc-coloring.h>
+#include <xen/sizes.h>
#include <xen/sched.h>
#include <xen/stdbool.h>
#include <xen/types.h>
+#include <xen/vmap.h>
#include <public/bootfdt.h>
#include <public/domctl.h>
#include <public/event_channel.h>
+#include <public/io/xs_wire.h>
#include <asm/dom0less-build.h>
#include <asm/setup.h>
+#ifdef CONFIG_STATIC_MEMORY
+#include <asm/static-memory.h>
+#endif
+
+#ifdef CONFIG_STATIC_SHM
+#include <asm/static-shmem.h>
+#endif
+
+#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
+
static domid_t __initdata xs_domid = DOMID_INVALID;
+static bool __initdata need_xenstore;
void __init set_xs_domain(struct domain *d)
{
@@ -109,6 +128,686 @@ static void __init initialize_domU_xenstore(void)
}
}
+/*
+ * Scan device tree properties for passthrough specific information.
+ * Returns < 0 on error
+ * 0 on success
+ */
+static int __init handle_passthrough_prop(struct kernel_info *kinfo,
+ const struct fdt_property *xen_reg,
+ const struct fdt_property *xen_path,
+ bool xen_force,
+ uint32_t address_cells,
+ uint32_t size_cells)
+{
+ const __be32 *cell;
+ unsigned int i, len;
+ struct dt_device_node *node;
+ int res;
+ paddr_t mstart, size, gstart;
+
+ /* xen,reg specifies where to map the MMIO region */
+ cell = (const __be32 *)xen_reg->data;
+ len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
+ sizeof(uint32_t));
+
+ for ( i = 0; i < len; i++ )
+ {
+ device_tree_get_reg(&cell, address_cells, size_cells,
+ &mstart, &size);
+ gstart = dt_next_cell(address_cells, &cell);
+
+ if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
+ {
+ printk(XENLOG_ERR
+ "DomU passthrough config has not page aligned addresses/sizes\n");
+ return -EINVAL;
+ }
+
+ res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
+ paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
+ if ( res )
+ {
+ printk(XENLOG_ERR "Unable to permit to dom%d access to"
+ " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
+ kinfo->d->domain_id,
+ mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
+ return res;
+ }
+
+ res = map_regions_p2mt(kinfo->d,
+ gaddr_to_gfn(gstart),
+ PFN_DOWN(size),
+ maddr_to_mfn(mstart),
+ p2m_mmio_direct_dev);
+ if ( res < 0 )
+ {
+ printk(XENLOG_ERR
+ "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
+ mstart, gstart);
+ return -EFAULT;
+ }
+ }
+
+ /*
+ * If xen_force, we let the user assign a MMIO region with no
+ * associated path.
+ */
+ if ( xen_path == NULL )
+ return xen_force ? 0 : -EINVAL;
+
+ /*
+ * xen,path specifies the corresponding node in the host DT.
+ * Both interrupt mappings and IOMMU settings are based on it,
+ * as they are done based on the corresponding host DT node.
+ */
+ node = dt_find_node_by_path(xen_path->data);
+ if ( node == NULL )
+ {
+ printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
+ xen_path->data);
+ return -EINVAL;
+ }
+
+ res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
+ if ( res < 0 )
+ return res;
+
+ res = iommu_add_dt_device(node);
+ if ( res < 0 )
+ return res;
+
+ /* If xen_force, we allow assignment of devices without IOMMU protection. */
+ if ( xen_force && !dt_device_is_protected(node) )
+ return 0;
+
+ return iommu_assign_dt_device(kinfo->d, node);
+}
+
+static int __init handle_prop_pfdt(struct kernel_info *kinfo,
+ const void *pfdt, int nodeoff,
+ uint32_t address_cells, uint32_t size_cells,
+ bool scan_passthrough_prop)
+{
+ void *fdt = kinfo->fdt;
+ int propoff, nameoff, res;
+ const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
+ const char *name;
+ bool found, xen_force = false;
+
+ for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
+ propoff >= 0;
+ propoff = fdt_next_property_offset(pfdt, propoff) )
+ {
+ if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
+ return -FDT_ERR_INTERNAL;
+
+ found = false;
+ nameoff = fdt32_to_cpu(prop->nameoff);
+ name = fdt_string(pfdt, nameoff);
+
+ if ( scan_passthrough_prop )
+ {
+ if ( dt_prop_cmp("xen,reg", name) == 0 )
+ {
+ xen_reg = prop;
+ found = true;
+ }
+ else if ( dt_prop_cmp("xen,path", name) == 0 )
+ {
+ xen_path = prop;
+ found = true;
+ }
+ else if ( dt_prop_cmp("xen,force-assign-without-iommu",
+ name) == 0 )
+ {
+ xen_force = true;
+ found = true;
+ }
+ }
+
+ /*
+ * Copy properties other than the ones above: xen,reg, xen,path,
+ * and xen,force-assign-without-iommu.
+ */
+ if ( !found )
+ {
+ res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
+ if ( res )
+ return res;
+ }
+ }
+
+ /*
+ * Only handle passthrough properties if both xen,reg and xen,path
+ * are present, or if xen,force-assign-without-iommu is specified.
+ */
+ if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
+ {
+ res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
+ address_cells, size_cells);
+ if ( res < 0 )
+ {
+ printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
+ return res;
+ }
+ }
+ else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
+ {
+ printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
+ kinfo->d);
+ return -EINVAL;
+ }
+
+ /* FDT_ERR_NOTFOUND => There is no more properties for this node */
+ return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
+}
+
+static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
+ int nodeoff,
+ uint32_t address_cells, uint32_t size_cells,
+ bool scan_passthrough_prop)
+{
+ int rc = 0;
+ void *fdt = kinfo->fdt;
+ int node_next;
+
+ rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
+ if ( rc )
+ return rc;
+
+ rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
+ scan_passthrough_prop);
+ if ( rc )
+ return rc;
+
+ address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
+ DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
+ size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
+ DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
+
+ node_next = fdt_first_subnode(pfdt, nodeoff);
+ while ( node_next > 0 )
+ {
+ rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
+ scan_passthrough_prop);
+ if ( rc )
+ return rc;
+
+ node_next = fdt_next_subnode(pfdt, node_next);
+ }
+
+ return fdt_end_node(fdt);
+}
+
+static int __init check_partial_fdt(void *pfdt, size_t size)
+{
+ int res;
+
+ if ( fdt_magic(pfdt) != FDT_MAGIC )
+ {
+ dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
+ return -EINVAL;
+ }
+
+ res = fdt_check_header(pfdt);
+ if ( res )
+ {
+ dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
+ return -EINVAL;
+ }
+
+ if ( fdt_totalsize(pfdt) > size )
+ {
+ dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int __init domain_handle_dtb_bootmodule(struct domain *d,
+ struct kernel_info *kinfo)
+{
+ void *pfdt;
+ int res, node_next;
+
+ pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
+ kinfo->dtb_bootmodule->size);
+ if ( pfdt == NULL )
+ return -EFAULT;
+
+ res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
+ if ( res < 0 )
+ goto out;
+
+ for ( node_next = fdt_first_subnode(pfdt, 0);
+ node_next > 0;
+ node_next = fdt_next_subnode(pfdt, node_next) )
+ {
+ const char *name = fdt_get_name(pfdt, node_next, NULL);
+
+ if ( name == NULL )
+ continue;
+
+ /*
+ * Only scan /$(interrupt_controller) /aliases /passthrough,
+ * ignore the rest.
+ * They don't have to be parsed in order.
+ *
+ * Take the interrupt controller phandle value from the special
+ * interrupt controller node in the DTB fragment.
+ */
+ if ( init_intc_phandle(kinfo, name, node_next, pfdt) == 0 )
+ continue;
+
+ if ( dt_node_cmp(name, "aliases") == 0 )
+ {
+ res = scan_pfdt_node(kinfo, pfdt, node_next,
+ DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
+ DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
+ false);
+ if ( res )
+ goto out;
+ continue;
+ }
+ if ( dt_node_cmp(name, "passthrough") == 0 )
+ {
+ res = scan_pfdt_node(kinfo, pfdt, node_next,
+ DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
+ DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
+ true);
+ if ( res )
+ goto out;
+ continue;
+ }
+ }
+
+ out:
+ iounmap(pfdt);
+
+ return res;
+}
+
+/*
+ * The max size for DT is 2MB. However, the generated DT is small (not including
+ * domU passthrough DT nodes whose size we account separately), 4KB are enough
+ * for now, but we might have to increase it in the future.
+ */
+#define DOMU_DTB_SIZE 4096
+static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
+{
+ int addrcells, sizecells;
+ int ret, fdt_size = DOMU_DTB_SIZE;
+
+ kinfo->phandle_intc = GUEST_PHANDLE_GIC;
+
+#ifdef CONFIG_GRANT_TABLE
+ kinfo->gnttab_start = GUEST_GNTTAB_BASE;
+ kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
+#endif
+
+ addrcells = GUEST_ROOT_ADDRESS_CELLS;
+ sizecells = GUEST_ROOT_SIZE_CELLS;
+
+ /* Account for domU passthrough DT size */
+ if ( kinfo->dtb_bootmodule )
+ fdt_size += kinfo->dtb_bootmodule->size;
+
+ /* Cap to max DT size if needed */
+ fdt_size = min(fdt_size, SZ_2M);
+
+ kinfo->fdt = xmalloc_bytes(fdt_size);
+ if ( kinfo->fdt == NULL )
+ return -ENOMEM;
+
+ ret = fdt_create(kinfo->fdt, fdt_size);
+ if ( ret < 0 )
+ goto err;
+
+ ret = fdt_finish_reservemap(kinfo->fdt);
+ if ( ret < 0 )
+ goto err;
+
+ ret = fdt_begin_node(kinfo->fdt, "");
+ if ( ret < 0 )
+ goto err;
+
+ ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
+ if ( ret )
+ goto err;
+
+ ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
+ if ( ret )
+ goto err;
+
+ ret = make_chosen_node(kinfo);
+ if ( ret )
+ goto err;
+
+ ret = make_cpus_node(d, kinfo->fdt);
+ if ( ret )
+ goto err;
+
+ ret = make_memory_node(kinfo, addrcells, sizecells,
+ kernel_info_get_mem(kinfo));
+ if ( ret )
+ goto err;
+
+#ifdef CONFIG_STATIC_SHM
+ ret = make_resv_memory_node(kinfo, addrcells, sizecells);
+ if ( ret )
+ goto err;
+#endif
+
+ /*
+ * domain_handle_dtb_bootmodule has to be called before the rest of
+ * the device tree is generated because it depends on the value of
+ * the field phandle_intc.
+ */
+ if ( kinfo->dtb_bootmodule )
+ {
+ ret = domain_handle_dtb_bootmodule(d, kinfo);
+ if ( ret )
+ goto err;
+ }
+
+ ret = make_intc_domU_node(kinfo);
+ if ( ret )
+ goto err;
+
+ ret = make_timer_node(kinfo);
+ if ( ret )
+ goto err;
+
+ if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
+ {
+ ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
+ if ( ret )
+ goto err;
+ }
+
+ ret = make_arch_nodes(kinfo);
+ if ( ret )
+ goto err;
+
+ ret = fdt_end_node(kinfo->fdt);
+ if ( ret < 0 )
+ goto err;
+
+ ret = fdt_finish(kinfo->fdt);
+ if ( ret < 0 )
+ goto err;
+
+ return 0;
+
+ err:
+ printk("Device tree generation failed (%d).\n", ret);
+ xfree(kinfo->fdt);
+
+ return -EINVAL;
+}
+
+#define XENSTORE_PFN_OFFSET 1
+static int __init alloc_xenstore_page(struct domain *d)
+{
+ struct page_info *xenstore_pg;
+ struct xenstore_domain_interface *interface;
+ mfn_t mfn;
+ gfn_t gfn;
+ int rc;
+
+ if ( (UINT_MAX - d->max_pages) < 1 )
+ {
+ printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
+ d);
+ return -EINVAL;
+ }
+
+ d->max_pages += 1;
+ xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
+ if ( xenstore_pg == NULL && is_64bit_domain(d) )
+ xenstore_pg = alloc_domheap_page(d, 0);
+ if ( xenstore_pg == NULL )
+ return -ENOMEM;
+
+ mfn = page_to_mfn(xenstore_pg);
+ if ( !mfn_x(mfn) )
+ return -ENOMEM;
+
+ if ( !is_domain_direct_mapped(d) )
+ gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
+ (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
+ else
+ gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
+
+ rc = guest_physmap_add_page(d, gfn, mfn, 0);
+ if ( rc )
+ {
+ free_domheap_page(xenstore_pg);
+ return rc;
+ }
+
+#ifdef CONFIG_HVM
+ d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
+#endif
+ interface = map_domain_page(mfn);
+ interface->connection = XENSTORE_RECONNECT;
+ unmap_domain_page(interface);
+
+ return 0;
+}
+
+static int __init alloc_xenstore_params(struct kernel_info *kinfo)
+{
+ struct domain *d = kinfo->d;
+ int rc = 0;
+
+#ifdef CONFIG_HVM
+ if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
+ == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
+ d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
+ else
+#endif
+ if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
+ {
+ rc = alloc_xenstore_page(d);
+ if ( rc < 0 )
+ return rc;
+ }
+
+ return rc;
+}
+
+static void __init domain_vcpu_affinity(struct domain *d,
+ const struct dt_device_node *node)
+{
+ struct dt_device_node *np;
+
+ dt_for_each_child_node(node, np)
+ {
+ const char *hard_affinity_str = NULL;
+ uint32_t val;
+ int rc;
+ struct vcpu *v;
+ cpumask_t affinity;
+
+ if ( !dt_device_is_compatible(np, "xen,vcpu") )
+ continue;
+
+ if ( !dt_property_read_u32(np, "id", &val) )
+ panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
+
+ if ( val >= d->max_vcpus )
+ panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
+ dt_node_name(node), d->max_vcpus);
+
+ v = d->vcpu[val];
+ rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
+ if ( rc < 0 )
+ continue;
+
+ cpumask_clear(&affinity);
+ while ( *hard_affinity_str != '\0' )
+ {
+ unsigned int start, end;
+
+ start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
+
+ if ( *hard_affinity_str == '-' ) /* Range */
+ {
+ hard_affinity_str++;
+ end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
+ }
+ else /* Single value */
+ end = start;
+
+ if ( end >= nr_cpu_ids )
+ panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
+
+ for ( ; start <= end; start++ )
+ cpumask_set_cpu(start, &affinity);
+
+ if ( *hard_affinity_str == ',' )
+ hard_affinity_str++;
+ else if ( *hard_affinity_str != '\0' )
+ break;
+ }
+
+ rc = vcpu_set_hard_affinity(v, &affinity);
+ if ( rc )
+ panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
+ v->vcpu_id, rc, dt_node_name(node));
+ }
+}
+
+#ifdef CONFIG_ARCH_PAGING_MEMPOOL
+static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
+ unsigned int smp_cpus)
+{
+ /*
+ * Keep in sync with libxl__get_required_paging_memory().
+ * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
+ * plus 128 pages to cover extended regions.
+ */
+ unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
+
+ BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
+
+ return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
+}
+
+static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
+ const struct dt_device_node *node)
+{
+ unsigned long p2m_pages;
+ uint32_t p2m_mem_mb;
+ int rc;
+
+ rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
+ /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
+ p2m_pages = rc ?
+ p2m_mem_mb << (20 - PAGE_SHIFT) :
+ domain_p2m_pages(mem, d->max_vcpus);
+
+ spin_lock(&d->arch.paging.lock);
+ rc = p2m_set_allocation(d, p2m_pages, NULL);
+ spin_unlock(&d->arch.paging.lock);
+
+ return rc;
+}
+#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
+static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
+ const struct dt_device_node *node)
+{
+ return 0;
+}
+#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
+
+static int __init construct_domU(struct domain *d,
+ const struct dt_device_node *node)
+{
+ struct kernel_info kinfo = KERNEL_INFO_INIT;
+ const char *dom0less_enhanced;
+ int rc;
+ u64 mem;
+
+ rc = dt_property_read_u64(node, "memory", &mem);
+ if ( !rc )
+ {
+ printk("Error building DomU: cannot read \"memory\" property\n");
+ return -EINVAL;
+ }
+ kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
+
+ rc = domain_p2m_set_allocation(d, mem, node);
+ if ( rc != 0 )
+ return rc;
+
+ printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
+ d->max_vcpus, mem);
+
+ rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
+ if ( rc == -EILSEQ ||
+ rc == -ENODATA ||
+ (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
+ {
+ need_xenstore = true;
+ kinfo.dom0less_feature = DOM0LESS_ENHANCED;
+ }
+ else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
+ {
+ need_xenstore = true;
+ kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
+ }
+ else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
+ kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
+
+ if ( vcpu_create(d, 0) == NULL )
+ return -ENOMEM;
+
+ d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
+
+ kinfo.d = d;
+
+ rc = kernel_probe(&kinfo, node);
+ if ( rc < 0 )
+ return rc;
+
+ set_domain_type(d, &kinfo);
+
+ if ( !dt_find_property(node, "xen,static-mem", NULL) )
+ allocate_memory(d, &kinfo);
+#ifdef CONFIG_STATIC_MEMORY
+ else if ( !is_domain_direct_mapped(d) )
+ allocate_static_memory(d, &kinfo, node);
+ else
+ assign_static_memory_11(d, &kinfo, node);
+#endif
+
+#ifdef CONFIG_STATIC_SHM
+ rc = process_shm(d, &kinfo, node);
+ if ( rc < 0 )
+ return rc;
+#endif
+
+ rc = init_vuart(d, &kinfo, node);
+ if ( rc < 0 )
+ return rc;
+
+ rc = prepare_dtb_domU(d, &kinfo);
+ if ( rc < 0 )
+ return rc;
+
+ rc = construct_domain(d, &kinfo);
+ if ( rc < 0 )
+ return rc;
+
+ domain_vcpu_affinity(d, node);
+
+ return alloc_xenstore_params(&kinfo);
+}
+
void __init create_domUs(void)
{
struct dt_device_node *node;
diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
index f095135caa..c00bb853d6 100644
--- a/xen/include/asm-generic/dom0less-build.h
+++ b/xen/include/asm-generic/dom0less-build.h
@@ -11,10 +11,7 @@
struct domain;
struct dt_device_node;
-
-/* TODO: remove both when construct_domU() will be moved to common. */
-#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
-extern bool need_xenstore;
+struct kernel_info;
/*
* List of possible features for dom0less domUs
@@ -48,12 +45,21 @@ void create_domUs(void);
bool is_dom0less_mode(void);
void set_xs_domain(struct domain *d);
-int construct_domU(struct domain *d, const struct dt_device_node *node);
-
void arch_create_domUs(struct dt_device_node *node,
struct xen_domctl_createdomain *d_cfg,
unsigned int flags);
+int init_vuart(struct domain *d, struct kernel_info *kinfo,
+ const struct dt_device_node *node);
+
+int make_intc_domU_node(struct kernel_info *kinfo);
+int make_arch_nodes(struct kernel_info *kinfo);
+
+void set_domain_type(struct domain *d, struct kernel_info *kinfo);
+
+int init_intc_phandle(struct kernel_info *kinfo, const char *name,
+ const int node_next, const void *pfdt);
+
#else /* !CONFIG_DOM0LESS_BOOT */
static inline void create_domUs(void) {}
--
2.49.0
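
Note on the "hard-affinity" parsing in domain_vcpu_affinity() above: the
property is a comma-separated list of pCPU numbers or ranges, e.g. "0-3,5".
A minimal standalone sketch of the same loop follows. It is not part of the
patch; parse_affinity() is a hypothetical name, and it assumes strtoul() and
a 64-bit mask in place of Xen's simple_strtoul() and cpumask_t.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not Xen code: parse a "hard-affinity" style string
 * into a bitmask. Returns 0 on success, -1 if a CPU number is out of range.
 * The uint64_t mask limits this sketch to nr_cpus <= 64.
 */
int parse_affinity(const char *s, unsigned int nr_cpus, uint64_t *mask)
{
    if ( nr_cpus > 64 )
        return -1;

    *mask = 0;

    while ( *s != '\0' )
    {
        char *next;
        unsigned long start, end;

        start = strtoul(s, &next, 0);
        s = next;

        if ( *s == '-' ) /* Range */
        {
            s++;
            end = strtoul(s, &next, 0);
            s = next;
        }
        else /* Single value */
            end = start;

        /* The patch panics here; the sketch reports an error instead. */
        if ( end >= nr_cpus )
            return -1;

        for ( ; start <= end; start++ )
            *mask |= UINT64_C(1) << start;

        if ( *s == ',' )
            s++;
        else if ( *s != '\0' )
            break; /* Unexpected character: stop parsing, keep what we have. */
    }

    return 0;
}
```

With nr_cpus = 8, the string "0-3,5" sets bits 0, 1, 2, 3 and 5, and "9" is
rejected because the pCPU number is out of range.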
* Re: [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c
2025-05-02 16:22 ` [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c Oleksii Kurochko
@ 2025-05-02 20:53 ` Stefano Stabellini
2025-05-05 10:46 ` Oleksii Kurochko
0 siblings, 1 reply; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-02 20:53 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
On Fri, 2 May 2025, Oleksii Kurochko wrote:
> Part of Arm's dom0less-build.c could be common between architectures which are
> using device tree files to create guest domains. Thereby move some parts of
> Arm's dom0less-build.c to common code with minor changes.
>
> As a part of these changes the following changes are introduced:
> - Introduce make_arch_nodes() to cover arch-specific nodes. For example, in
> case of Arm, it is PSCI and vpl011 nodes.
> - Introduce set_domain_type() to abstract how the domain type is set. For
> example, RISC-V won't have this member of the arch_domain structure, as
> vCPUs will always have the same bitness as the hypervisor. On Arm, an
> Arm64 hypervisor can create both 32-bit and 64-bit domains.
> - Introduce init_vuart() to cover details of virtual uart initialization.
> - Introduce init_intc_phandle() to cover some details of interrupt controller
> phandle initialization. As an example, RISC-V could have different name for
> interrupt controller node ( APLIC, PLIC, IMSIC, etc ) but the code in
> domain_handle_dtb_bootmodule() could handle only one interrupt controller
> node name.
> - s/make_gic_domU_node/make_intc_domU_node as GIC is Arm specific naming and
> add prototype of make_intc_domU_node() to dom0less-build.h
>
> The following functions are moved to xen/common/device-tree:
> - Functions which are moved as is:
> - domain_p2m_pages().
> - handle_passthrough_prop().
> - handle_prop_pfdt().
> - scan_pfdt_node().
> - check_partial_fdt().
> - Functions which are moved with some minor changes:
> - alloc_xenstore_evtchn():
> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
> - prepare_dtb_domU():
> - ifdef-ing access to gnttab_{start,size} by CONFIG_GRANT_TABLE.
> - s/make_gic_domU_node/make_intc_domU_node.
> - Add call of make_arch_nodes().
> - domain_handle_dtb_bootmodule():
> - hide details of interrupt controller phandle initialization by calling
> init_intc_phandle().
> - Update the comment above init_intc_phandle(): s/gic/interrupt controller.
> - construct_domU():
> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
> - Call init_vuart() to hide Arm's vpl011_init() details there.
> - Add call of set_domain_type() instead of setting kinfo->arch.type explicitly.
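
For readers following the list above: domain_p2m_pages() is moved as is, and
its sizing rule can be checked in isolation. The sketch below is not part of
the patch; p2m_pool_pages() is a hypothetical name, and PAGE_SHIFT = 12
(4K pages) is assumed, which is what the BUILD_BUG_ON() in the original
enforces.

```c
/* Assumed values for a standalone build; Xen provides its own definitions. */
#define PAGE_SHIFT 12
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical standalone copy of the formula in domain_p2m_pages():
 * 256 pages per vCPU, plus 1 page per MiB of RAM, plus 128 pages for
 * extended regions. The page count is converted to KiB (x4), rounded
 * up to whole MiB, and converted back to 4K pages.
 */
unsigned long p2m_pool_pages(unsigned long maxmem_kb, unsigned int vcpus)
{
    unsigned long memkb = 4 * (256UL * vcpus + (maxmem_kb / 1024) + 128);

    return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
}
```

For a 1 GiB domain (maxmem_kb = 1048576) with 2 vCPUs this gives
4 * (512 + 1024 + 128) = 6656 KiB, rounded up to 7 MiB, i.e. 1792 pages.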
>
> Some parts of dom0less-build.c are wrapped by #ifdef CONFIG_STATIC_{SHM,MEMORY}
> as not all archs support these configs.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
FYI for a possible follow-up patch (doesn't have to be done in this
patch), the following functions could now be static:
alloc_dom0_vcpu0
dom0_max_vcpus
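
On the moved passthrough handling: the condition at the end of
handle_prop_pfdt() decides whether a partial-DT node is assigned, rejected,
or just copied. A minimal sketch of that decision follows. It is not part of
the patch; the enum and function names are hypothetical, and the booleans
stand for "xen,reg present", "xen,path present" and
"xen,force-assign-without-iommu present".

```c
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
enum pt_action {
    PT_IGNORE, /* no passthrough handling, properties are only copied */
    PT_ASSIGN, /* handle_passthrough_prop() would be called */
    PT_ERROR   /* -EINVAL: xen,reg or xen,path missing */
};

enum pt_action passthrough_action(bool has_reg, bool has_path, bool force)
{
    /* xen,reg plus either xen,path or the force flag: assign the device. */
    if ( has_reg && (has_path || force) )
        return PT_ASSIGN;

    /* One of xen,reg / xen,path without the other (and no force): error. */
    if ( (has_path && !has_reg) || (has_reg && !has_path && !force) )
        return PT_ERROR;

    return PT_IGNORE;
}
```

Note that xen,path without xen,reg is an error even when the force flag is
set, while the force flag alone is silently ignored.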
> ---
> Change in v3:
> - Align construct_domU() with the current staging.
> - Align alloc_xenstore_params() with the current staging.
> - Move definition of XENSTORE_PFN_LATE_ALLOC to common and
> declaration of need_xenstore to common.
> ---
> Change in v2:
> - Wrap by #ifdef CONFIG_STATIC_* inclusions of <asm/static-memory.h> and
> <asm/static-shmem.h>. Wrap also the code which uses something from the
> mentioned headers.
> - Add handling of legacy case in construct_domU().
> - Use xen/fdt-kernel.h and xen/fdt-domain-build.h instead of asm/*.
> - Update the commit message.
> ---
> xen/arch/arm/dom0less-build.c | 714 ++---------------------
> xen/common/device-tree/dom0less-build.c | 699 ++++++++++++++++++++++
> xen/include/asm-generic/dom0less-build.h | 18 +-
> 3 files changed, 751 insertions(+), 680 deletions(-)
>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index 0310579863..627c212b3b 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -25,8 +25,6 @@
> #include <asm/static-memory.h>
> #include <asm/static-shmem.h>
>
> -bool __initdata need_xenstore;
> -
> #ifdef CONFIG_VGICV2
> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> {
> @@ -152,7 +150,7 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
> }
> #endif
>
> -static int __init make_gic_domU_node(struct kernel_info *kinfo)
> +int __init make_intc_domU_node(struct kernel_info *kinfo)
> {
> switch ( kinfo->d->arch.vgic.version )
> {
> @@ -218,708 +216,60 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
> }
> #endif
>
> -/*
> - * Scan device tree properties for passthrough specific information.
> - * Returns < 0 on error
> - * 0 on success
> - */
> -static int __init handle_passthrough_prop(struct kernel_info *kinfo,
> - const struct fdt_property *xen_reg,
> - const struct fdt_property *xen_path,
> - bool xen_force,
> - uint32_t address_cells,
> - uint32_t size_cells)
> -{
> - const __be32 *cell;
> - unsigned int i, len;
> - struct dt_device_node *node;
> - int res;
> - paddr_t mstart, size, gstart;
> -
> - /* xen,reg specifies where to map the MMIO region */
> - cell = (const __be32 *)xen_reg->data;
> - len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
> - sizeof(uint32_t));
> -
> - for ( i = 0; i < len; i++ )
> - {
> - device_tree_get_reg(&cell, address_cells, size_cells,
> - &mstart, &size);
> - gstart = dt_next_cell(address_cells, &cell);
> -
> - if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
> - {
> - printk(XENLOG_ERR
> - "DomU passthrough config has not page aligned addresses/sizes\n");
> - return -EINVAL;
> - }
> -
> - res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
> - paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
> - if ( res )
> - {
> - printk(XENLOG_ERR "Unable to permit to dom%d access to"
> - " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
> - kinfo->d->domain_id,
> - mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
> - return res;
> - }
> -
> - res = map_regions_p2mt(kinfo->d,
> - gaddr_to_gfn(gstart),
> - PFN_DOWN(size),
> - maddr_to_mfn(mstart),
> - p2m_mmio_direct_dev);
> - if ( res < 0 )
> - {
> - printk(XENLOG_ERR
> - "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
> - mstart, gstart);
> - return -EFAULT;
> - }
> - }
> -
> - /*
> - * If xen_force, we let the user assign a MMIO region with no
> - * associated path.
> - */
> - if ( xen_path == NULL )
> - return xen_force ? 0 : -EINVAL;
> -
> - /*
> - * xen,path specifies the corresponding node in the host DT.
> - * Both interrupt mappings and IOMMU settings are based on it,
> - * as they are done based on the corresponding host DT node.
> - */
> - node = dt_find_node_by_path(xen_path->data);
> - if ( node == NULL )
> - {
> - printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
> - xen_path->data);
> - return -EINVAL;
> - }
> -
> - res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
> - if ( res < 0 )
> - return res;
> -
> - res = iommu_add_dt_device(node);
> - if ( res < 0 )
> - return res;
> -
> - /* If xen_force, we allow assignment of devices without IOMMU protection. */
> - if ( xen_force && !dt_device_is_protected(node) )
> - return 0;
> -
> - return iommu_assign_dt_device(kinfo->d, node);
> -}
> -
> -static int __init handle_prop_pfdt(struct kernel_info *kinfo,
> - const void *pfdt, int nodeoff,
> - uint32_t address_cells, uint32_t size_cells,
> - bool scan_passthrough_prop)
> -{
> - void *fdt = kinfo->fdt;
> - int propoff, nameoff, res;
> - const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
> - const char *name;
> - bool found, xen_force = false;
> -
> - for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
> - propoff >= 0;
> - propoff = fdt_next_property_offset(pfdt, propoff) )
> - {
> - if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
> - return -FDT_ERR_INTERNAL;
> -
> - found = false;
> - nameoff = fdt32_to_cpu(prop->nameoff);
> - name = fdt_string(pfdt, nameoff);
> -
> - if ( scan_passthrough_prop )
> - {
> - if ( dt_prop_cmp("xen,reg", name) == 0 )
> - {
> - xen_reg = prop;
> - found = true;
> - }
> - else if ( dt_prop_cmp("xen,path", name) == 0 )
> - {
> - xen_path = prop;
> - found = true;
> - }
> - else if ( dt_prop_cmp("xen,force-assign-without-iommu",
> - name) == 0 )
> - {
> - xen_force = true;
> - found = true;
> - }
> - }
> -
> - /*
> - * Copy properties other than the ones above: xen,reg, xen,path,
> - * and xen,force-assign-without-iommu.
> - */
> - if ( !found )
> - {
> - res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
> - if ( res )
> - return res;
> - }
> - }
> -
> - /*
> - * Only handle passthrough properties if both xen,reg and xen,path
> - * are present, or if xen,force-assign-without-iommu is specified.
> - */
> - if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
> - {
> - res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
> - address_cells, size_cells);
> - if ( res < 0 )
> - {
> - printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
> - return res;
> - }
> - }
> - else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
> - {
> - printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
> - kinfo->d);
> - return -EINVAL;
> - }
> -
> - /* FDT_ERR_NOTFOUND => There is no more properties for this node */
> - return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
> -}
> -
> -static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
> - int nodeoff,
> - uint32_t address_cells, uint32_t size_cells,
> - bool scan_passthrough_prop)
> -{
> - int rc = 0;
> - void *fdt = kinfo->fdt;
> - int node_next;
> -
> - rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
> - if ( rc )
> - return rc;
> -
> - rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
> - scan_passthrough_prop);
> - if ( rc )
> - return rc;
> -
> - address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
> - size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
> -
> - node_next = fdt_first_subnode(pfdt, nodeoff);
> - while ( node_next > 0 )
> - {
> - rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
> - scan_passthrough_prop);
> - if ( rc )
> - return rc;
> -
> - node_next = fdt_next_subnode(pfdt, node_next);
> - }
> -
> - return fdt_end_node(fdt);
> -}
> -
> -static int __init check_partial_fdt(void *pfdt, size_t size)
> +int __init make_arch_nodes(struct kernel_info *kinfo)
> {
> - int res;
> -
> - if ( fdt_magic(pfdt) != FDT_MAGIC )
> - {
> - dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
> - return -EINVAL;
> - }
> -
> - res = fdt_check_header(pfdt);
> - if ( res )
> - {
> - dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
> - return -EINVAL;
> - }
> -
> - if ( fdt_totalsize(pfdt) > size )
> - {
> - dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
> - return -EINVAL;
> - }
> -
> - return 0;
> -}
> -
> -static int __init domain_handle_dtb_bootmodule(struct domain *d,
> - struct kernel_info *kinfo)
> -{
> - void *pfdt;
> - int res, node_next;
> -
> - pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
> - kinfo->dtb_bootmodule->size);
> - if ( pfdt == NULL )
> - return -EFAULT;
> -
> - res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
> - if ( res < 0 )
> - goto out;
> -
> - for ( node_next = fdt_first_subnode(pfdt, 0);
> - node_next > 0;
> - node_next = fdt_next_subnode(pfdt, node_next) )
> - {
> - const char *name = fdt_get_name(pfdt, node_next, NULL);
> -
> - if ( name == NULL )
> - continue;
> -
> - /*
> - * Only scan /gic /aliases /passthrough, ignore the rest.
> - * They don't have to be parsed in order.
> - *
> - * Take the GIC phandle value from the special /gic node in the
> - * DTB fragment.
> - */
> - if ( dt_node_cmp(name, "gic") == 0 )
> - {
> - uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
> -
> - if ( phandle_intc != 0 )
> - kinfo->phandle_intc = phandle_intc;
> - continue;
> - }
> -
> - if ( dt_node_cmp(name, "aliases") == 0 )
> - {
> - res = scan_pfdt_node(kinfo, pfdt, node_next,
> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
> - false);
> - if ( res )
> - goto out;
> - continue;
> - }
> - if ( dt_node_cmp(name, "passthrough") == 0 )
> - {
> - res = scan_pfdt_node(kinfo, pfdt, node_next,
> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
> - true);
> - if ( res )
> - goto out;
> - continue;
> - }
> - }
> -
> - out:
> - iounmap(pfdt);
> -
> - return res;
> -}
> -
> -/*
> - * The max size for DT is 2MB. However, the generated DT is small (not including
> - * domU passthrough DT nodes whose size we account separately), 4KB are enough
> - * for now, but we might have to increase it in the future.
> - */
> -#define DOMU_DTB_SIZE 4096
> -static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> -{
> - int addrcells, sizecells;
> - int ret, fdt_size = DOMU_DTB_SIZE;
> -
> - kinfo->phandle_intc = GUEST_PHANDLE_GIC;
> - kinfo->gnttab_start = GUEST_GNTTAB_BASE;
> - kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
> -
> - addrcells = GUEST_ROOT_ADDRESS_CELLS;
> - sizecells = GUEST_ROOT_SIZE_CELLS;
> -
> - /* Account for domU passthrough DT size */
> - if ( kinfo->dtb_bootmodule )
> - fdt_size += kinfo->dtb_bootmodule->size;
> -
> - /* Cap to max DT size if needed */
> - fdt_size = min(fdt_size, SZ_2M);
> -
> - kinfo->fdt = xmalloc_bytes(fdt_size);
> - if ( kinfo->fdt == NULL )
> - return -ENOMEM;
> -
> - ret = fdt_create(kinfo->fdt, fdt_size);
> - if ( ret < 0 )
> - goto err;
> -
> - ret = fdt_finish_reservemap(kinfo->fdt);
> - if ( ret < 0 )
> - goto err;
> -
> - ret = fdt_begin_node(kinfo->fdt, "");
> - if ( ret < 0 )
> - goto err;
> -
> - ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
> - if ( ret )
> - goto err;
> -
> - ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
> - if ( ret )
> - goto err;
> -
> - ret = make_chosen_node(kinfo);
> - if ( ret )
> - goto err;
> + int ret;
>
> ret = make_psci_node(kinfo->fdt);
> if ( ret )
> - goto err;
> -
> - ret = make_cpus_node(d, kinfo->fdt);
> - if ( ret )
> - goto err;
> -
> - ret = make_memory_node(kinfo, addrcells, sizecells,
> - kernel_info_get_mem(kinfo));
> - if ( ret )
> - goto err;
> -
> - ret = make_resv_memory_node(kinfo, addrcells, sizecells);
> - if ( ret )
> - goto err;
> -
> - /*
> - * domain_handle_dtb_bootmodule has to be called before the rest of
> - * the device tree is generated because it depends on the value of
> - * the field phandle_intc.
> - */
> - if ( kinfo->dtb_bootmodule )
> - {
> - ret = domain_handle_dtb_bootmodule(d, kinfo);
> - if ( ret )
> - goto err;
> - }
> -
> - ret = make_gic_domU_node(kinfo);
> - if ( ret )
> - goto err;
> -
> - ret = make_timer_node(kinfo);
> - if ( ret )
> - goto err;
> + return -EINVAL;
>
> if ( kinfo->vuart )
> {
> - ret = -EINVAL;
> #ifdef CONFIG_SBSA_VUART_CONSOLE
> ret = make_vpl011_uart_node(kinfo);
> #endif
> if ( ret )
> - goto err;
> - }
> -
> - if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
> - {
> - ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
> - if ( ret )
> - goto err;
> + return -EINVAL;
> }
>
> - ret = fdt_end_node(kinfo->fdt);
> - if ( ret < 0 )
> - goto err;
> -
> - ret = fdt_finish(kinfo->fdt);
> - if ( ret < 0 )
> - goto err;
> -
> return 0;
> -
> - err:
> - printk("Device tree generation failed (%d).\n", ret);
> - xfree(kinfo->fdt);
> -
> - return -EINVAL;
> }
>
> -#define XENSTORE_PFN_OFFSET 1
> -static int __init alloc_xenstore_page(struct domain *d)
> +/* TODO: make arch.type generic ? */
> +#ifdef CONFIG_ARM_64
> +void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
> {
> - struct page_info *xenstore_pg;
> - struct xenstore_domain_interface *interface;
> - mfn_t mfn;
> - gfn_t gfn;
> - int rc;
> -
> - if ( (UINT_MAX - d->max_pages) < 1 )
> - {
> - printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
> - d);
> - return -EINVAL;
> - }
> -
> - d->max_pages += 1;
> - xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
> - if ( xenstore_pg == NULL && is_64bit_domain(d) )
> - xenstore_pg = alloc_domheap_page(d, 0);
> - if ( xenstore_pg == NULL )
> - return -ENOMEM;
> -
> - mfn = page_to_mfn(xenstore_pg);
> - if ( !mfn_x(mfn) )
> - return -ENOMEM;
> -
> - if ( !is_domain_direct_mapped(d) )
> - gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
> - (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
> - else
> - gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
> -
> - rc = guest_physmap_add_page(d, gfn, mfn, 0);
> - if ( rc )
> - {
> - free_domheap_page(xenstore_pg);
> - return rc;
> - }
> -
> - d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
> - interface = map_domain_page(mfn);
> - interface->connection = XENSTORE_RECONNECT;
> - unmap_domain_page(interface);
> -
> - return 0;
> + /* type must be set before allocate memory */
> + d->arch.type = kinfo->arch.type;
> }
> -
> -static int __init alloc_xenstore_params(struct kernel_info *kinfo)
> +#else
> +void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
> {
> - struct domain *d = kinfo->d;
> - int rc = 0;
> -
> - if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
> - == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
> - d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
> - else if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
> - {
> - rc = alloc_xenstore_page(d);
> - if ( rc < 0 )
> - return rc;
> - }
> -
> - return rc;
> + /* Nothing to do */
> }
> +#endif
>
> -static void __init domain_vcpu_affinity(struct domain *d,
> - const struct dt_device_node *node)
> +int __init init_vuart(struct domain *d, struct kernel_info *kinfo,
> + const struct dt_device_node *node)
> {
> - struct dt_device_node *np;
> -
> - dt_for_each_child_node(node, np)
> - {
> - const char *hard_affinity_str = NULL;
> - uint32_t val;
> - int rc;
> - struct vcpu *v;
> - cpumask_t affinity;
> -
> - if ( !dt_device_is_compatible(np, "xen,vcpu") )
> - continue;
> -
> - if ( !dt_property_read_u32(np, "id", &val) )
> - panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
> -
> - if ( val >= d->max_vcpus )
> - panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
> - dt_node_name(node), d->max_vcpus);
> -
> - v = d->vcpu[val];
> - rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
> - if ( rc < 0 )
> - continue;
> -
> - cpumask_clear(&affinity);
> - while ( *hard_affinity_str != '\0' )
> - {
> - unsigned int start, end;
> -
> - start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
> -
> - if ( *hard_affinity_str == '-' ) /* Range */
> - {
> - hard_affinity_str++;
> - end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
> - }
> - else /* Single value */
> - end = start;
> -
> - if ( end >= nr_cpu_ids )
> - panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
> -
> - for ( ; start <= end; start++ )
> - cpumask_set_cpu(start, &affinity);
> -
> - if ( *hard_affinity_str == ',' )
> - hard_affinity_str++;
> - else if ( *hard_affinity_str != '\0' )
> - break;
> - }
> + int rc = 0;
>
> - rc = vcpu_set_hard_affinity(v, &affinity);
> - if ( rc )
> - panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
> - v->vcpu_id, rc, dt_node_name(node));
> - }
> -}
> + kinfo->vuart = dt_property_read_bool(node, "vpl011");
>
> -#ifdef CONFIG_ARCH_PAGING_MEMPOOL
> -static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
> - unsigned int smp_cpus)
> -{
> /*
> - * Keep in sync with libxl__get_required_paging_memory().
> - * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
> - * plus 128 pages to cover extended regions.
> + * Base address and irq number are needed when creating vpl011 device
> + * tree node in prepare_dtb_domU, so initialization on related variables
> + * shall be done first.
> */
> - unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
> -
> - BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
> -
> - return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
> -}
> -
> -static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
> - const struct dt_device_node *node)
> -{
> - unsigned long p2m_pages;
> - uint32_t p2m_mem_mb;
> - int rc;
> -
> - rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
> - /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
> - p2m_pages = rc ?
> - p2m_mem_mb << (20 - PAGE_SHIFT) :
> - domain_p2m_pages(mem, d->max_vcpus);
> -
> - spin_lock(&d->arch.paging.lock);
> - rc = p2m_set_allocation(d, p2m_pages, NULL);
> - spin_unlock(&d->arch.paging.lock);
> -
> - return rc;
> -}
> -#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
> -static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
> - const struct dt_device_node *node)
> -{
> - return 0;
> -}
> -#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
> -
> -int __init construct_domU(struct domain *d,
> - const struct dt_device_node *node)
> -{
> - struct kernel_info kinfo = KERNEL_INFO_INIT;
> - const char *dom0less_enhanced;
> - int rc;
> - u64 mem;
> -
> - rc = dt_property_read_u64(node, "memory", &mem);
> - if ( !rc )
> - {
> - printk("Error building DomU: cannot read \"memory\" property\n");
> - return -EINVAL;
> - }
> - kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
> -
> - rc = domain_p2m_set_allocation(d, mem, node);
> - if ( rc != 0 )
> - return rc;
> -
> - printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
> - d->max_vcpus, mem);
> -
> - kinfo.vuart = dt_property_read_bool(node, "vpl011");
> - if ( kinfo.vuart && is_hardware_domain(d) )
> - panic("hardware domain cannot specify vpl011\n");
> -
> - rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
> - if ( rc == -EILSEQ ||
> - rc == -ENODATA ||
> - (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
> - {
> - need_xenstore = true;
> - kinfo.dom0less_feature = DOM0LESS_ENHANCED;
> - }
> - else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
> - {
> - need_xenstore = true;
> - kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
> - }
> - else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
> - kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
> -
> - if ( vcpu_create(d, 0) == NULL )
> - return -ENOMEM;
> -
> - d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
> -
> - kinfo.d = d;
> -
> - rc = kernel_probe(&kinfo, node);
> - if ( rc < 0 )
> - return rc;
> -
> -#ifdef CONFIG_ARM_64
> - /* type must be set before allocate memory */
> - d->arch.type = kinfo.arch.type;
> -#endif
> - if ( is_hardware_domain(d) )
> - {
> - rc = construct_hwdom(&kinfo, node);
> - if ( rc < 0 )
> - return rc;
> - }
I think we should retain this chunk in the code movement. It is OK if it
is behind a #ifdef CONFIG_ARM.
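
As a rough sketch of what retaining the chunk could look like (stand-in types and helpers only, so it compiles on its own; build_regular_domu() is a hypothetical name for the existing domU path, construct_hwdom() is simplified to one argument, and CONFIG_ARM is defined by hand here just to exercise the branch):

```c
#include <stdbool.h>

/* Pretend this is an Arm build; in Xen this would come from Kconfig. */
#define CONFIG_ARM 1

/* Stand-ins for the real Xen structures, reduced to what the sketch needs. */
struct domain { bool is_hwdom; };
struct kernel_info { struct domain *d; };

/* Counters so the taken branch can be observed. */
static int hwdom_calls, domu_calls;

static bool is_hardware_domain(const struct domain *d) { return d->is_hwdom; }

/* Simplified: the real construct_hwdom() also takes the DT node. */
static int construct_hwdom(struct kernel_info *kinfo)
{
    (void)kinfo;
    hwdom_calls++;
    return 0;
}

/* Hypothetical wrapper for the allocate/prepare/construct domU sequence. */
static int build_regular_domu(struct kernel_info *kinfo)
{
    (void)kinfo;
    domu_calls++;
    return 0;
}

static int construct_domU_sketch(struct domain *d)
{
    struct kernel_info kinfo = { .d = d };
    int rc;

#ifdef CONFIG_ARM
    if ( is_hardware_domain(d) )
    {
        rc = construct_hwdom(&kinfo);
        if ( rc < 0 )
            return rc;
    }
    else
#endif
    {
        /* Without CONFIG_ARM this block runs unconditionally. */
        rc = build_regular_domu(&kinfo);
        if ( rc < 0 )
            return rc;
    }

    return rc;
}
```

The "else" followed by "#endif" is the same shape the patch already uses in alloc_xenstore_params() for CONFIG_HVM.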
> - else
> + if ( kinfo->vuart )
> {
> - if ( !dt_find_property(node, "xen,static-mem", NULL) )
> - allocate_memory(d, &kinfo);
> - else if ( !is_domain_direct_mapped(d) )
> - allocate_static_memory(d, &kinfo, node);
> - else
> - assign_static_memory_11(d, &kinfo, node);
> -
> - rc = process_shm(d, &kinfo, node);
> - if ( rc < 0 )
> - return rc;
> -
> - /*
> - * Base address and irq number are needed when creating vpl011 device
> - * tree node in prepare_dtb_domU, so initialization on related variables
> - * shall be done first.
> - */
> - if ( kinfo.vuart )
> - {
> - rc = domain_vpl011_init(d, NULL);
> - if ( rc < 0 )
> - return rc;
> - }
> -
> - rc = prepare_dtb_domU(d, &kinfo);
> - if ( rc < 0 )
> - return rc;
> -
> - rc = construct_domain(d, &kinfo);
> + rc = domain_vpl011_init(d, NULL);
> if ( rc < 0 )
> return rc;
> }
>
> - domain_vcpu_affinity(d, node);
> -
> - return alloc_xenstore_params(&kinfo);
> + return rc;
> }
>
> void __init arch_create_domUs(struct dt_device_node *node,
> @@ -995,6 +345,22 @@ void __init arch_create_domUs(struct dt_device_node *node,
> }
> }
>
> +int __init init_intc_phandle(struct kernel_info *kinfo, const char *name,
> + const int node_next, const void *pfdt)
> +{
> + if ( dt_node_cmp(name, "gic") == 0 )
> + {
> + uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
> +
> + if ( phandle_intc != 0 )
> + kinfo->phandle_intc = phandle_intc;
> +
> + return 0;
> + }
> +
> + return 1;
> +}
> +
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
> index a01a8b6b1a..c3face5b90 100644
> --- a/xen/common/device-tree/dom0less-build.c
> +++ b/xen/common/device-tree/dom0less-build.c
> @@ -3,24 +3,43 @@
> #include <xen/bootfdt.h>
> #include <xen/device_tree.h>
> #include <xen/domain.h>
> +#include <xen/domain_page.h>
> #include <xen/err.h>
> #include <xen/event.h>
> +#include <xen/fdt-domain-build.h>
> +#include <xen/fdt-kernel.h>
> #include <xen/grant_table.h>
> #include <xen/init.h>
> +#include <xen/iocap.h>
> #include <xen/iommu.h>
> +#include <xen/libfdt/libfdt.h>
> #include <xen/llc-coloring.h>
> +#include <xen/sizes.h>
> #include <xen/sched.h>
> #include <xen/stdbool.h>
> #include <xen/types.h>
> +#include <xen/vmap.h>
>
> #include <public/bootfdt.h>
> #include <public/domctl.h>
> #include <public/event_channel.h>
> +#include <public/io/xs_wire.h>
>
> #include <asm/dom0less-build.h>
> #include <asm/setup.h>
>
> +#ifdef CONFIG_STATIC_MEMORY
> +#include <asm/static-memory.h>
> +#endif
#if __has_include ?
> +#ifdef CONFIG_STATIC_SHM
> +#include <asm/static-shmem.h>
> +#endif
Same here?
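
For reference, a minimal sketch of the __has_include form, assuming the compiler supports it (GCC 5+, clang, C23); whether Xen's toolchain baseline allows it would need checking. HAVE_ASM_STATIC_MEMORY_H is a hypothetical helper macro, not something in the tree:

```c
/*
 * Include the header only if the compiler can find it. The outer
 * defined() check keeps older compilers, which do not know
 * __has_include, from choking on the inner #if.
 */
#if defined(__has_include)
# if __has_include(<asm/static-memory.h>)
#  include <asm/static-memory.h>
#  define HAVE_ASM_STATIC_MEMORY_H 1
# endif
#endif

/* Hypothetical fallback so later code can test a single macro. */
#ifndef HAVE_ASM_STATIC_MEMORY_H
# define HAVE_ASM_STATIC_MEMORY_H 0
#endif

/* Tiny helper so the result can be inspected at run time. */
static int have_static_memory_header(void)
{
    return HAVE_ASM_STATIC_MEMORY_H;
}
```

The same pattern would apply to <asm/static-shmem.h>. The difference from #ifdef CONFIG_STATIC_MEMORY is that the include is keyed on the header existing for the architecture, not on the Kconfig option being enabled.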
> +#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
> +
> static domid_t __initdata xs_domid = DOMID_INVALID;
> +static bool __initdata need_xenstore;
>
> void __init set_xs_domain(struct domain *d)
> {
> @@ -109,6 +128,686 @@ static void __init initialize_domU_xenstore(void)
> }
> }
>
> +/*
> + * Scan device tree properties for passthrough specific information.
> + * Returns < 0 on error
> + * 0 on success
> + */
> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
> + const struct fdt_property *xen_reg,
> + const struct fdt_property *xen_path,
> + bool xen_force,
> + uint32_t address_cells,
> + uint32_t size_cells)
> +{
> + const __be32 *cell;
> + unsigned int i, len;
> + struct dt_device_node *node;
> + int res;
> + paddr_t mstart, size, gstart;
> +
> + /* xen,reg specifies where to map the MMIO region */
> + cell = (const __be32 *)xen_reg->data;
> + len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
> + sizeof(uint32_t));
> +
> + for ( i = 0; i < len; i++ )
> + {
> + device_tree_get_reg(&cell, address_cells, size_cells,
> + &mstart, &size);
> + gstart = dt_next_cell(address_cells, &cell);
> +
> + if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
> + {
> + printk(XENLOG_ERR
> + "DomU passthrough config has not page aligned addresses/sizes\n");
> + return -EINVAL;
> + }
> +
> + res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
> + paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
> + if ( res )
> + {
> + printk(XENLOG_ERR "Unable to permit to dom%d access to"
> + " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
> + kinfo->d->domain_id,
> + mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
> + return res;
> + }
> +
> + res = map_regions_p2mt(kinfo->d,
> + gaddr_to_gfn(gstart),
> + PFN_DOWN(size),
> + maddr_to_mfn(mstart),
> + p2m_mmio_direct_dev);
> + if ( res < 0 )
> + {
> + printk(XENLOG_ERR
> + "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
> + mstart, gstart);
> + return -EFAULT;
> + }
> + }
> +
> + /*
> + * If xen_force, we let the user assign a MMIO region with no
> + * associated path.
> + */
> + if ( xen_path == NULL )
> + return xen_force ? 0 : -EINVAL;
> +
> + /*
> + * xen,path specifies the corresponding node in the host DT.
> + * Both interrupt mappings and IOMMU settings are based on it,
> + * as they are done based on the corresponding host DT node.
> + */
> + node = dt_find_node_by_path(xen_path->data);
> + if ( node == NULL )
> + {
> + printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
> + xen_path->data);
> + return -EINVAL;
> + }
> +
> + res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
> + if ( res < 0 )
> + return res;
> +
> + res = iommu_add_dt_device(node);
> + if ( res < 0 )
> + return res;
> +
> + /* If xen_force, we allow assignment of devices without IOMMU protection. */
> + if ( xen_force && !dt_device_is_protected(node) )
> + return 0;
> +
> + return iommu_assign_dt_device(kinfo->d, node);
> +}
> +
> +static int __init handle_prop_pfdt(struct kernel_info *kinfo,
> + const void *pfdt, int nodeoff,
> + uint32_t address_cells, uint32_t size_cells,
> + bool scan_passthrough_prop)
> +{
> + void *fdt = kinfo->fdt;
> + int propoff, nameoff, res;
> + const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
> + const char *name;
> + bool found, xen_force = false;
> +
> + for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
> + propoff >= 0;
> + propoff = fdt_next_property_offset(pfdt, propoff) )
> + {
> + if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
> + return -FDT_ERR_INTERNAL;
> +
> + found = false;
> + nameoff = fdt32_to_cpu(prop->nameoff);
> + name = fdt_string(pfdt, nameoff);
> +
> + if ( scan_passthrough_prop )
> + {
> + if ( dt_prop_cmp("xen,reg", name) == 0 )
> + {
> + xen_reg = prop;
> + found = true;
> + }
> + else if ( dt_prop_cmp("xen,path", name) == 0 )
> + {
> + xen_path = prop;
> + found = true;
> + }
> + else if ( dt_prop_cmp("xen,force-assign-without-iommu",
> + name) == 0 )
> + {
> + xen_force = true;
> + found = true;
> + }
> + }
> +
> + /*
> + * Copy properties other than the ones above: xen,reg, xen,path,
> + * and xen,force-assign-without-iommu.
> + */
> + if ( !found )
> + {
> + res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
> + if ( res )
> + return res;
> + }
> + }
> +
> + /*
> + * Only handle passthrough properties if both xen,reg and xen,path
> + * are present, or if xen,force-assign-without-iommu is specified.
> + */
> + if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
> + {
> + res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
> + address_cells, size_cells);
> + if ( res < 0 )
> + {
> + printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
> + return res;
> + }
> + }
> + else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
> + {
> + printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
> + kinfo->d);
> + return -EINVAL;
> + }
> +
> + /* FDT_ERR_NOTFOUND => There is no more properties for this node */
> + return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
> +}
> +
> +static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
> + int nodeoff,
> + uint32_t address_cells, uint32_t size_cells,
> + bool scan_passthrough_prop)
> +{
> + int rc = 0;
> + void *fdt = kinfo->fdt;
> + int node_next;
> +
> + rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
> + if ( rc )
> + return rc;
> +
> + rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
> + scan_passthrough_prop);
> + if ( rc )
> + return rc;
> +
> + address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
> + size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
> +
> + node_next = fdt_first_subnode(pfdt, nodeoff);
> + while ( node_next > 0 )
> + {
> + rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
> + scan_passthrough_prop);
> + if ( rc )
> + return rc;
> +
> + node_next = fdt_next_subnode(pfdt, node_next);
> + }
> +
> + return fdt_end_node(fdt);
> +}
> +
> +static int __init check_partial_fdt(void *pfdt, size_t size)
> +{
> + int res;
> +
> + if ( fdt_magic(pfdt) != FDT_MAGIC )
> + {
> + dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
> + return -EINVAL;
> + }
> +
> + res = fdt_check_header(pfdt);
> + if ( res )
> + {
> + dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
> + return -EINVAL;
> + }
> +
> + if ( fdt_totalsize(pfdt) > size )
> + {
> + dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int __init domain_handle_dtb_bootmodule(struct domain *d,
> + struct kernel_info *kinfo)
> +{
> + void *pfdt;
> + int res, node_next;
> +
> + pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
> + kinfo->dtb_bootmodule->size);
> + if ( pfdt == NULL )
> + return -EFAULT;
> +
> + res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
> + if ( res < 0 )
> + goto out;
> +
> + for ( node_next = fdt_first_subnode(pfdt, 0);
> + node_next > 0;
> + node_next = fdt_next_subnode(pfdt, node_next) )
> + {
> + const char *name = fdt_get_name(pfdt, node_next, NULL);
> +
> + if ( name == NULL )
> + continue;
> +
> + /*
> + * Only scan /$(interrupt_controller) /aliases /passthrough,
> + * ignore the rest.
> + * They don't have to be parsed in order.
> + *
> + * Take the interrupt controller phandle value from the special
> + * interrupt controller node in the DTB fragment.
> + */
> + if ( init_intc_phandle(kinfo, name, node_next, pfdt) == 0 )
> + continue;
> +
> + if ( dt_node_cmp(name, "aliases") == 0 )
> + {
> + res = scan_pfdt_node(kinfo, pfdt, node_next,
> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
> + false);
> + if ( res )
> + goto out;
> + continue;
> + }
> + if ( dt_node_cmp(name, "passthrough") == 0 )
> + {
> + res = scan_pfdt_node(kinfo, pfdt, node_next,
> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
> + true);
> + if ( res )
> + goto out;
> + continue;
> + }
> + }
> +
> + out:
> + iounmap(pfdt);
> +
> + return res;
> +}
> +
> +/*
> + * The max size for DT is 2MB. However, the generated DT is small (not including
> + * domU passthrough DT nodes whose size we account separately), 4KB are enough
> + * for now, but we might have to increase it in the future.
> + */
> +#define DOMU_DTB_SIZE 4096
> +static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
> +{
> + int addrcells, sizecells;
> + int ret, fdt_size = DOMU_DTB_SIZE;
> +
> + kinfo->phandle_intc = GUEST_PHANDLE_GIC;
> +
> +#ifdef CONFIG_GRANT_TABLE
> + kinfo->gnttab_start = GUEST_GNTTAB_BASE;
> + kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
> +#endif
> +
> + addrcells = GUEST_ROOT_ADDRESS_CELLS;
> + sizecells = GUEST_ROOT_SIZE_CELLS;
> +
> + /* Account for domU passthrough DT size */
> + if ( kinfo->dtb_bootmodule )
> + fdt_size += kinfo->dtb_bootmodule->size;
> +
> + /* Cap to max DT size if needed */
> + fdt_size = min(fdt_size, SZ_2M);
> +
> + kinfo->fdt = xmalloc_bytes(fdt_size);
> + if ( kinfo->fdt == NULL )
> + return -ENOMEM;
> +
> + ret = fdt_create(kinfo->fdt, fdt_size);
> + if ( ret < 0 )
> + goto err;
> +
> + ret = fdt_finish_reservemap(kinfo->fdt);
> + if ( ret < 0 )
> + goto err;
> +
> + ret = fdt_begin_node(kinfo->fdt, "");
> + if ( ret < 0 )
> + goto err;
> +
> + ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
> + if ( ret )
> + goto err;
> +
> + ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
> + if ( ret )
> + goto err;
> +
> + ret = make_chosen_node(kinfo);
> + if ( ret )
> + goto err;
> +
> + ret = make_cpus_node(d, kinfo->fdt);
> + if ( ret )
> + goto err;
> +
> + ret = make_memory_node(kinfo, addrcells, sizecells,
> + kernel_info_get_mem(kinfo));
> + if ( ret )
> + goto err;
> +
> +#ifdef CONFIG_STATIC_SHM
This should not be necessary because there is a stub implementation of
make_resv_memory_node available in static-shmem.h for the
!CONFIG_STATIC_SHM case.
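
To spell the stub pattern out, here is a self-contained sketch; the struct is a stand-in, the signature is simplified, and I am assuming the stub simply returns 0:

```c
/* Stand-in for Xen's struct kernel_info. */
struct kernel_info { int unused; };

#ifdef CONFIG_STATIC_SHM
/* Real implementation lives elsewhere when the feature is enabled. */
int make_resv_memory_node(const struct kernel_info *kinfo,
                          int addrcells, int sizecells);
#else
/* Assumed shape of the stub: does nothing and reports success. */
static inline int make_resv_memory_node(const struct kernel_info *kinfo,
                                        int addrcells, int sizecells)
{
    (void)kinfo;
    (void)addrcells;
    (void)sizecells;
    return 0;
}
#endif

/* The caller then needs no #ifdef around the call. */
static int prepare_dtb_sketch(const struct kernel_info *kinfo)
{
    int ret = make_resv_memory_node(kinfo, 2, 2);

    if ( ret )
        return ret;

    return 0;
}
```

This only works if the header carrying the stub is visible to the common file, so dropping the #ifdef here goes together with how <asm/static-shmem.h> ends up being included.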
> + ret = make_resv_memory_node(kinfo, addrcells, sizecells);
> + if ( ret )
> + goto err;
> +#endif
> +
> + /*
> + * domain_handle_dtb_bootmodule has to be called before the rest of
> + * the device tree is generated because it depends on the value of
> + * the field phandle_intc.
> + */
> + if ( kinfo->dtb_bootmodule )
> + {
> + ret = domain_handle_dtb_bootmodule(d, kinfo);
> + if ( ret )
> + goto err;
> + }
> +
> + ret = make_intc_domU_node(kinfo);
> + if ( ret )
> + goto err;
> +
> + ret = make_timer_node(kinfo);
> + if ( ret )
> + goto err;
> +
> + if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
> + {
> + ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
> + if ( ret )
> + goto err;
> + }
> +
> + ret = make_arch_nodes(kinfo);
> + if ( ret )
> + goto err;
> +
> + ret = fdt_end_node(kinfo->fdt);
> + if ( ret < 0 )
> + goto err;
> +
> + ret = fdt_finish(kinfo->fdt);
> + if ( ret < 0 )
> + goto err;
> +
> + return 0;
> +
> + err:
> + printk("Device tree generation failed (%d).\n", ret);
> + xfree(kinfo->fdt);
> +
> + return -EINVAL;
> +}
> +
> +#define XENSTORE_PFN_OFFSET 1
> +static int __init alloc_xenstore_page(struct domain *d)
> +{
> + struct page_info *xenstore_pg;
> + struct xenstore_domain_interface *interface;
> + mfn_t mfn;
> + gfn_t gfn;
> + int rc;
> +
> + if ( (UINT_MAX - d->max_pages) < 1 )
> + {
> + printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
> + d);
> + return -EINVAL;
> + }
> +
> + d->max_pages += 1;
> + xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
> + if ( xenstore_pg == NULL && is_64bit_domain(d) )
> + xenstore_pg = alloc_domheap_page(d, 0);
> + if ( xenstore_pg == NULL )
> + return -ENOMEM;
> +
> + mfn = page_to_mfn(xenstore_pg);
> + if ( !mfn_x(mfn) )
> + return -ENOMEM;
> +
> + if ( !is_domain_direct_mapped(d) )
> + gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
> + (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
> + else
> + gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
> +
> + rc = guest_physmap_add_page(d, gfn, mfn, 0);
> + if ( rc )
> + {
> + free_domheap_page(xenstore_pg);
> + return rc;
> + }
> +
> +#ifdef CONFIG_HVM
> + d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
> +#endif
> + interface = map_domain_page(mfn);
> + interface->connection = XENSTORE_RECONNECT;
> + unmap_domain_page(interface);
> +
> + return 0;
> +}
> +
> +static int __init alloc_xenstore_params(struct kernel_info *kinfo)
> +{
> + struct domain *d = kinfo->d;
> + int rc = 0;
> +
> +#ifdef CONFIG_HVM
> + if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
> + == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
> + d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
> + else
> +#endif
> + if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
> + {
> + rc = alloc_xenstore_page(d);
> + if ( rc < 0 )
> + return rc;
> + }
> +
> + return rc;
> +}
> +
> +static void __init domain_vcpu_affinity(struct domain *d,
> + const struct dt_device_node *node)
> +{
> + struct dt_device_node *np;
> +
> + dt_for_each_child_node(node, np)
> + {
> + const char *hard_affinity_str = NULL;
> + uint32_t val;
> + int rc;
> + struct vcpu *v;
> + cpumask_t affinity;
> +
> + if ( !dt_device_is_compatible(np, "xen,vcpu") )
> + continue;
> +
> + if ( !dt_property_read_u32(np, "id", &val) )
> + panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
> +
> + if ( val >= d->max_vcpus )
> + panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
> + dt_node_name(node), d->max_vcpus);
> +
> + v = d->vcpu[val];
> + rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
> + if ( rc < 0 )
> + continue;
> +
> + cpumask_clear(&affinity);
> + while ( *hard_affinity_str != '\0' )
> + {
> + unsigned int start, end;
> +
> + start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
> +
> + if ( *hard_affinity_str == '-' ) /* Range */
> + {
> + hard_affinity_str++;
> + end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
> + }
> + else /* Single value */
> + end = start;
> +
> + if ( end >= nr_cpu_ids )
> + panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
> +
> + for ( ; start <= end; start++ )
> + cpumask_set_cpu(start, &affinity);
> +
> + if ( *hard_affinity_str == ',' )
> + hard_affinity_str++;
> + else if ( *hard_affinity_str != '\0' )
> + break;
> + }
> +
> + rc = vcpu_set_hard_affinity(v, &affinity);
> + if ( rc )
> + panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
> + v->vcpu_id, rc, dt_node_name(node));
> + }
> +}
> +
> +#ifdef CONFIG_ARCH_PAGING_MEMPOOL
> +static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
> + unsigned int smp_cpus)
> +{
> + /*
> + * Keep in sync with libxl__get_required_paging_memory().
> + * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
> + * plus 128 pages to cover extended regions.
> + */
> + unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
> +
> + BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
> +
> + return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
> +}
> +
> +static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
> + const struct dt_device_node *node)
> +{
> + unsigned long p2m_pages;
> + uint32_t p2m_mem_mb;
> + int rc;
> +
> + rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
> + /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
> + p2m_pages = rc ?
> + p2m_mem_mb << (20 - PAGE_SHIFT) :
> + domain_p2m_pages(mem, d->max_vcpus);
> +
> + spin_lock(&d->arch.paging.lock);
> + rc = p2m_set_allocation(d, p2m_pages, NULL);
> + spin_unlock(&d->arch.paging.lock);
> +
> + return rc;
> +}
> +#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
> +static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
> + const struct dt_device_node *node)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
> +
> +static int __init construct_domU(struct domain *d,
> + const struct dt_device_node *node)
> +{
> + struct kernel_info kinfo = KERNEL_INFO_INIT;
> + const char *dom0less_enhanced;
> + int rc;
> + u64 mem;
> +
> + rc = dt_property_read_u64(node, "memory", &mem);
> + if ( !rc )
> + {
> + printk("Error building DomU: cannot read \"memory\" property\n");
> + return -EINVAL;
> + }
> + kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
> +
> + rc = domain_p2m_set_allocation(d, mem, node);
> + if ( rc != 0 )
> + return rc;
> +
> + printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
> + d->max_vcpus, mem);
> +
> + rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
> + if ( rc == -EILSEQ ||
> + rc == -ENODATA ||
> + (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
> + {
> + need_xenstore = true;
> + kinfo.dom0less_feature = DOM0LESS_ENHANCED;
> + }
> + else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
> + {
> + need_xenstore = true;
> + kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
> + }
> + else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
> + kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
> +
> + if ( vcpu_create(d, 0) == NULL )
> + return -ENOMEM;
> +
> + d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
> +
> + kinfo.d = d;
> +
> + rc = kernel_probe(&kinfo, node);
> + if ( rc < 0 )
> + return rc;
> +
> + set_domain_type(d, &kinfo);
> +
> + if ( !dt_find_property(node, "xen,static-mem", NULL) )
> + allocate_memory(d, &kinfo);
> +#ifdef CONFIG_STATIC_MEMORY
This should not be needed either, thanks to the stub implementations of
allocate_static_memory() and assign_static_memory_11().
> + else if ( !is_domain_direct_mapped(d) )
> + allocate_static_memory(d, &kinfo, node);
> + else
> + assign_static_memory_11(d, &kinfo, node);
> +#endif
> +
> +#ifdef CONFIG_STATIC_SHM
There is a stub for process_shm too
> + rc = process_shm(d, &kinfo, node);
> + if ( rc < 0 )
> + return rc;
> +#endif
> +
> + rc = init_vuart(d, &kinfo, node);
> + if ( rc < 0 )
> + return rc;
> +
> + rc = prepare_dtb_domU(d, &kinfo);
> + if ( rc < 0 )
> + return rc;
> +
> + rc = construct_domain(d, &kinfo);
> + if ( rc < 0 )
> + return rc;
> +
> + domain_vcpu_affinity(d, node);
> +
> + return alloc_xenstore_params(&kinfo);
> +}
> +
> void __init create_domUs(void)
> {
> struct dt_device_node *node;
> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
> index f095135caa..c00bb853d6 100644
> --- a/xen/include/asm-generic/dom0less-build.h
> +++ b/xen/include/asm-generic/dom0less-build.h
> @@ -11,10 +11,7 @@
>
> struct domain;
> struct dt_device_node;
> -
> -/* TODO: remove both when construct_domU() will be moved to common. */
> -#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
> -extern bool need_xenstore;
> +struct kernel_info;
>
> /*
> * List of possible features for dom0less domUs
> @@ -48,12 +45,21 @@ void create_domUs(void);
> bool is_dom0less_mode(void);
> void set_xs_domain(struct domain *d);
>
> -int construct_domU(struct domain *d, const struct dt_device_node *node);
> -
> void arch_create_domUs(struct dt_device_node *node,
> struct xen_domctl_createdomain *d_cfg,
> unsigned int flags);
>
> +int init_vuart(struct domain *d, struct kernel_info *kinfo,
> + const struct dt_device_node *node);
> +
> +int make_intc_domU_node(struct kernel_info *kinfo);
> +int make_arch_nodes(struct kernel_info *kinfo);
> +
> +void set_domain_type(struct domain *d, struct kernel_info *kinfo);
> +
> +int init_intc_phandle(struct kernel_info *kinfo, const char *name,
> + const int node_next, const void *pfdt);
> +
> #else /* !CONFIG_DOM0LESS_BOOT */
>
> static inline void create_domUs(void) {}
> --
> 2.49.0
>
* Re: [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c
2025-05-02 20:53 ` Stefano Stabellini
@ 2025-05-05 10:46 ` Oleksii Kurochko
2025-05-05 10:56 ` Oleksii Kurochko
2025-05-05 17:35 ` Stefano Stabellini
0 siblings, 2 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 10:46 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/2/25 10:53 PM, Stefano Stabellini wrote:
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>> Part of Arm's dom0less-build.c could be common between architectures which
>> use device tree files to create guest domains. Therefore, move some parts of
>> Arm's dom0less-build.c to common code with minor changes.
>>
>> As part of these changes, the following are introduced:
>> - Introduce make_arch_nodes() to cover arch-specific nodes. For example, in
>> the case of Arm, these are the PSCI and vpl011 nodes.
>> - Introduce set_domain_type() to abstract how the domain type is set. For
>> example, RISC-V won't have this member of the arch_domain structure, as vCPUs
>> will always have the same bitness as the hypervisor. On Arm, it is possible
>> for Arm64 to create both 32-bit and 64-bit domains.
>> - Introduce init_vuart() to cover the details of virtual UART initialization.
>> - Introduce init_intc_phandle() to cover some details of interrupt controller
>> phandle initialization. For example, RISC-V could have different names for
>> the interrupt controller node (APLIC, PLIC, IMSIC, etc.), but the code in
>> domain_handle_dtb_bootmodule() could handle only one interrupt controller
>> node name.
>> - s/make_gic_domU_node/make_intc_domU_node as GIC is Arm specific naming and
>> add prototype of make_intc_domU_node() to dom0less-build.h
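
To illustrate the init_intc_phandle() point, a hypothetical sketch of how an architecture with several possible interrupt controller node names might implement the hook. This is not code from the series: the node names are assumptions, the struct is a stand-in, strcmp() replaces dt_node_cmp(), and the phandle is passed in directly instead of being read with fdt_get_phandle(pfdt, node_next):

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for Xen's struct kernel_info; only the field used here. */
struct kernel_info {
    uint32_t phandle_intc;
};

/* Assumed node names an arch might accept in the partial device tree. */
static const char *const intc_names[] = { "aplic", "plic", "imsic" };

/*
 * Same return convention as the Arm hook in the patch:
 *   0 - the node is an interrupt controller and has been handled,
 *   1 - not an interrupt controller, the caller keeps scanning.
 */
static int init_intc_phandle_sketch(struct kernel_info *kinfo,
                                    const char *name, uint32_t phandle)
{
    size_t i;

    for ( i = 0; i < sizeof(intc_names) / sizeof(intc_names[0]); i++ )
    {
        if ( strcmp(name, intc_names[i]) != 0 )
            continue;

        /* A zero phandle means "none found"; keep the default then. */
        if ( phandle != 0 )
            kinfo->phandle_intc = phandle;

        return 0;
    }

    return 1;
}
```

The common domain_handle_dtb_bootmodule() only looks at the return value, so it does not need to know which names an architecture accepts.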
>>
>> The following functions are moved to xen/common/device-tree:
>> - Functions which are moved as is:
>> - domain_p2m_pages().
>> - handle_passthrough_prop().
>> - handle_prop_pfdt().
>> - scan_pfdt_node().
>> - check_partial_fdt().
>> - Functions which are moved with some minor changes:
>> - alloc_xenstore_evtchn():
>> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
>> - prepare_dtb_domU():
>> - ifdef-ing access to gnttab_{start,size} by CONFIG_GRANT_TABLE.
>> - s/make_gic_domU_node/make_intc_domU_node.
>> - Add call of make_arch_nodes().
>> - domain_handle_dtb_bootmodule():
>> - hide details of interrupt controller phandle initialization by calling
>> init_intc_phandle().
>> - Update the comment above init_intc_phandle(): s/gic/interrupt controller.
>> - construct_domU():
>> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
>> - Call init_vuart() to hide Arm's vpl011_init() details there.
>> - Add call of set_domain_type() instead of setting kinfo->arch.type explicitly.
>>
>> Some parts of dom0less-build.c are wrapped by #ifdef CONFIG_STATIC_{SHMEM,MEMORY}
>> as not all archs support these configs.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> FYI for a possible follow-up patch (doesn't have to be done in this
> patch), the following functions could now be static:
>
> alloc_dom0_vcpu0
> dom0_max_vcpus
I will make them static in a follow-up patch in the next version of the patch series.
>
>
>
>> ---
>> Change in v3:
>> - Align construct_domU() with the current staging.
>> - Align alloc_xenstore_params() with the current staging.
>> - Move definition of XENSTORE_PFN_LATE_ALLOC to common and
>> declaration of need_xenstore to common.
>> ---
>> Change in v2:
>> - Wrap by #ifdef CONFIG_STATIC_* inclusions of <asm/static-memory.h> and
>> <asm/static-shmem.h>. Wrap also the code which uses something from the
>> mentioned headers.
>> - Add handling of legacy case in construct_domU().
>> - Use xen/fdt-kernel.h and xen/fdt-domain-build.h instead of asm/*.
>> - Update the commit message.
>> ---
>> xen/arch/arm/dom0less-build.c | 714 ++---------------------
>> xen/common/device-tree/dom0less-build.c | 699 ++++++++++++++++++++++
>> xen/include/asm-generic/dom0less-build.h | 18 +-
>> 3 files changed, 751 insertions(+), 680 deletions(-)
>>
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index 0310579863..627c212b3b 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -25,8 +25,6 @@
>> #include <asm/static-memory.h>
>> #include <asm/static-shmem.h>
>>
>> -bool __initdata need_xenstore;
>> -
>> #ifdef CONFIG_VGICV2
>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>> {
>> @@ -152,7 +150,7 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
>> }
>> #endif
>>
>> -static int __init make_gic_domU_node(struct kernel_info *kinfo)
>> +int __init make_intc_domU_node(struct kernel_info *kinfo)
>> {
>> switch ( kinfo->d->arch.vgic.version )
>> {
>> @@ -218,708 +216,60 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
>> }
>> #endif
>>
>> -/*
>> - * Scan device tree properties for passthrough specific information.
>> - * Returns < 0 on error
>> - * 0 on success
>> - */
>> -static int __init handle_passthrough_prop(struct kernel_info *kinfo,
>> - const struct fdt_property *xen_reg,
>> - const struct fdt_property *xen_path,
>> - bool xen_force,
>> - uint32_t address_cells,
>> - uint32_t size_cells)
>> -{
>> - const __be32 *cell;
>> - unsigned int i, len;
>> - struct dt_device_node *node;
>> - int res;
>> - paddr_t mstart, size, gstart;
>> -
>> - /* xen,reg specifies where to map the MMIO region */
>> - cell = (const __be32 *)xen_reg->data;
>> - len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
>> - sizeof(uint32_t));
>> -
>> - for ( i = 0; i < len; i++ )
>> - {
>> - device_tree_get_reg(&cell, address_cells, size_cells,
>> - &mstart, &size);
>> - gstart = dt_next_cell(address_cells, &cell);
>> -
>> - if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
>> - {
>> - printk(XENLOG_ERR
>> - "DomU passthrough config has not page aligned addresses/sizes\n");
>> - return -EINVAL;
>> - }
>> -
>> - res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
>> - paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
>> - if ( res )
>> - {
>> - printk(XENLOG_ERR "Unable to permit to dom%d access to"
>> - " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
>> - kinfo->d->domain_id,
>> - mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
>> - return res;
>> - }
>> -
>> - res = map_regions_p2mt(kinfo->d,
>> - gaddr_to_gfn(gstart),
>> - PFN_DOWN(size),
>> - maddr_to_mfn(mstart),
>> - p2m_mmio_direct_dev);
>> - if ( res < 0 )
>> - {
>> - printk(XENLOG_ERR
>> - "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
>> - mstart, gstart);
>> - return -EFAULT;
>> - }
>> - }
>> -
>> - /*
>> - * If xen_force, we let the user assign a MMIO region with no
>> - * associated path.
>> - */
>> - if ( xen_path == NULL )
>> - return xen_force ? 0 : -EINVAL;
>> -
>> - /*
>> - * xen,path specifies the corresponding node in the host DT.
>> - * Both interrupt mappings and IOMMU settings are based on it,
>> - * as they are done based on the corresponding host DT node.
>> - */
>> - node = dt_find_node_by_path(xen_path->data);
>> - if ( node == NULL )
>> - {
>> - printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
>> - xen_path->data);
>> - return -EINVAL;
>> - }
>> -
>> - res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
>> - if ( res < 0 )
>> - return res;
>> -
>> - res = iommu_add_dt_device(node);
>> - if ( res < 0 )
>> - return res;
>> -
>> - /* If xen_force, we allow assignment of devices without IOMMU protection. */
>> - if ( xen_force && !dt_device_is_protected(node) )
>> - return 0;
>> -
>> - return iommu_assign_dt_device(kinfo->d, node);
>> -}
>> -
>> -static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>> - const void *pfdt, int nodeoff,
>> - uint32_t address_cells, uint32_t size_cells,
>> - bool scan_passthrough_prop)
>> -{
>> - void *fdt = kinfo->fdt;
>> - int propoff, nameoff, res;
>> - const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
>> - const char *name;
>> - bool found, xen_force = false;
>> -
>> - for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
>> - propoff >= 0;
>> - propoff = fdt_next_property_offset(pfdt, propoff) )
>> - {
>> - if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
>> - return -FDT_ERR_INTERNAL;
>> -
>> - found = false;
>> - nameoff = fdt32_to_cpu(prop->nameoff);
>> - name = fdt_string(pfdt, nameoff);
>> -
>> - if ( scan_passthrough_prop )
>> - {
>> - if ( dt_prop_cmp("xen,reg", name) == 0 )
>> - {
>> - xen_reg = prop;
>> - found = true;
>> - }
>> - else if ( dt_prop_cmp("xen,path", name) == 0 )
>> - {
>> - xen_path = prop;
>> - found = true;
>> - }
>> - else if ( dt_prop_cmp("xen,force-assign-without-iommu",
>> - name) == 0 )
>> - {
>> - xen_force = true;
>> - found = true;
>> - }
>> - }
>> -
>> - /*
>> - * Copy properties other than the ones above: xen,reg, xen,path,
>> - * and xen,force-assign-without-iommu.
>> - */
>> - if ( !found )
>> - {
>> - res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
>> - if ( res )
>> - return res;
>> - }
>> - }
>> -
>> - /*
>> - * Only handle passthrough properties if both xen,reg and xen,path
>> - * are present, or if xen,force-assign-without-iommu is specified.
>> - */
>> - if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
>> - {
>> - res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
>> - address_cells, size_cells);
>> - if ( res < 0 )
>> - {
>> - printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
>> - return res;
>> - }
>> - }
>> - else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
>> - {
>> - printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
>> - kinfo->d);
>> - return -EINVAL;
>> - }
>> -
>> - /* FDT_ERR_NOTFOUND => There is no more properties for this node */
>> - return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
>> -}
>> -
>> -static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
>> - int nodeoff,
>> - uint32_t address_cells, uint32_t size_cells,
>> - bool scan_passthrough_prop)
>> -{
>> - int rc = 0;
>> - void *fdt = kinfo->fdt;
>> - int node_next;
>> -
>> - rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
>> - if ( rc )
>> - return rc;
>> -
>> - rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
>> - scan_passthrough_prop);
>> - if ( rc )
>> - return rc;
>> -
>> - address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
>> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
>> - size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
>> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
>> -
>> - node_next = fdt_first_subnode(pfdt, nodeoff);
>> - while ( node_next > 0 )
>> - {
>> - rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
>> - scan_passthrough_prop);
>> - if ( rc )
>> - return rc;
>> -
>> - node_next = fdt_next_subnode(pfdt, node_next);
>> - }
>> -
>> - return fdt_end_node(fdt);
>> -}
>> -
>> -static int __init check_partial_fdt(void *pfdt, size_t size)
>> +int __init make_arch_nodes(struct kernel_info *kinfo)
>> {
>> - int res;
>> -
>> - if ( fdt_magic(pfdt) != FDT_MAGIC )
>> - {
>> - dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
>> - return -EINVAL;
>> - }
>> -
>> - res = fdt_check_header(pfdt);
>> - if ( res )
>> - {
>> - dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
>> - return -EINVAL;
>> - }
>> -
>> - if ( fdt_totalsize(pfdt) > size )
>> - {
>> - dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
>> - return -EINVAL;
>> - }
>> -
>> - return 0;
>> -}
>> -
>> -static int __init domain_handle_dtb_bootmodule(struct domain *d,
>> - struct kernel_info *kinfo)
>> -{
>> - void *pfdt;
>> - int res, node_next;
>> -
>> - pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
>> - kinfo->dtb_bootmodule->size);
>> - if ( pfdt == NULL )
>> - return -EFAULT;
>> -
>> - res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
>> - if ( res < 0 )
>> - goto out;
>> -
>> - for ( node_next = fdt_first_subnode(pfdt, 0);
>> - node_next > 0;
>> - node_next = fdt_next_subnode(pfdt, node_next) )
>> - {
>> - const char *name = fdt_get_name(pfdt, node_next, NULL);
>> -
>> - if ( name == NULL )
>> - continue;
>> -
>> - /*
>> - * Only scan /gic /aliases /passthrough, ignore the rest.
>> - * They don't have to be parsed in order.
>> - *
>> - * Take the GIC phandle value from the special /gic node in the
>> - * DTB fragment.
>> - */
>> - if ( dt_node_cmp(name, "gic") == 0 )
>> - {
>> - uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
>> -
>> - if ( phandle_intc != 0 )
>> - kinfo->phandle_intc = phandle_intc;
>> - continue;
>> - }
>> -
>> - if ( dt_node_cmp(name, "aliases") == 0 )
>> - {
>> - res = scan_pfdt_node(kinfo, pfdt, node_next,
>> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
>> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
>> - false);
>> - if ( res )
>> - goto out;
>> - continue;
>> - }
>> - if ( dt_node_cmp(name, "passthrough") == 0 )
>> - {
>> - res = scan_pfdt_node(kinfo, pfdt, node_next,
>> - DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
>> - DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
>> - true);
>> - if ( res )
>> - goto out;
>> - continue;
>> - }
>> - }
>> -
>> - out:
>> - iounmap(pfdt);
>> -
>> - return res;
>> -}
>> -
>> -/*
>> - * The max size for DT is 2MB. However, the generated DT is small (not including
>> - * domU passthrough DT nodes whose size we account separately), 4KB are enough
>> - * for now, but we might have to increase it in the future.
>> - */
>> -#define DOMU_DTB_SIZE 4096
>> -static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> -{
>> - int addrcells, sizecells;
>> - int ret, fdt_size = DOMU_DTB_SIZE;
>> -
>> - kinfo->phandle_intc = GUEST_PHANDLE_GIC;
>> - kinfo->gnttab_start = GUEST_GNTTAB_BASE;
>> - kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
>> -
>> - addrcells = GUEST_ROOT_ADDRESS_CELLS;
>> - sizecells = GUEST_ROOT_SIZE_CELLS;
>> -
>> - /* Account for domU passthrough DT size */
>> - if ( kinfo->dtb_bootmodule )
>> - fdt_size += kinfo->dtb_bootmodule->size;
>> -
>> - /* Cap to max DT size if needed */
>> - fdt_size = min(fdt_size, SZ_2M);
>> -
>> - kinfo->fdt = xmalloc_bytes(fdt_size);
>> - if ( kinfo->fdt == NULL )
>> - return -ENOMEM;
>> -
>> - ret = fdt_create(kinfo->fdt, fdt_size);
>> - if ( ret < 0 )
>> - goto err;
>> -
>> - ret = fdt_finish_reservemap(kinfo->fdt);
>> - if ( ret < 0 )
>> - goto err;
>> -
>> - ret = fdt_begin_node(kinfo->fdt, "");
>> - if ( ret < 0 )
>> - goto err;
>> -
>> - ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
>> - if ( ret )
>> - goto err;
>> -
>> - ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
>> - if ( ret )
>> - goto err;
>> -
>> - ret = make_chosen_node(kinfo);
>> - if ( ret )
>> - goto err;
>> + int ret;
>>
>> ret = make_psci_node(kinfo->fdt);
>> if ( ret )
>> - goto err;
>> -
>> - ret = make_cpus_node(d, kinfo->fdt);
>> - if ( ret )
>> - goto err;
>> -
>> - ret = make_memory_node(kinfo, addrcells, sizecells,
>> - kernel_info_get_mem(kinfo));
>> - if ( ret )
>> - goto err;
>> -
>> - ret = make_resv_memory_node(kinfo, addrcells, sizecells);
>> - if ( ret )
>> - goto err;
>> -
>> - /*
>> - * domain_handle_dtb_bootmodule has to be called before the rest of
>> - * the device tree is generated because it depends on the value of
>> - * the field phandle_intc.
>> - */
>> - if ( kinfo->dtb_bootmodule )
>> - {
>> - ret = domain_handle_dtb_bootmodule(d, kinfo);
>> - if ( ret )
>> - goto err;
>> - }
>> -
>> - ret = make_gic_domU_node(kinfo);
>> - if ( ret )
>> - goto err;
>> -
>> - ret = make_timer_node(kinfo);
>> - if ( ret )
>> - goto err;
>> + return -EINVAL;
>>
>> if ( kinfo->vuart )
>> {
>> - ret = -EINVAL;
>> #ifdef CONFIG_SBSA_VUART_CONSOLE
>> ret = make_vpl011_uart_node(kinfo);
>> #endif
>> if ( ret )
>> - goto err;
>> - }
>> -
>> - if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
>> - {
>> - ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
>> - if ( ret )
>> - goto err;
>> + return -EINVAL;
>> }
>>
>> - ret = fdt_end_node(kinfo->fdt);
>> - if ( ret < 0 )
>> - goto err;
>> -
>> - ret = fdt_finish(kinfo->fdt);
>> - if ( ret < 0 )
>> - goto err;
>> -
>> return 0;
>> -
>> - err:
>> - printk("Device tree generation failed (%d).\n", ret);
>> - xfree(kinfo->fdt);
>> -
>> - return -EINVAL;
>> }
>>
>> -#define XENSTORE_PFN_OFFSET 1
>> -static int __init alloc_xenstore_page(struct domain *d)
>> +/* TODO: make arch.type generic ? */
>> +#ifdef CONFIG_ARM_64
>> +void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
>> {
>> - struct page_info *xenstore_pg;
>> - struct xenstore_domain_interface *interface;
>> - mfn_t mfn;
>> - gfn_t gfn;
>> - int rc;
>> -
>> - if ( (UINT_MAX - d->max_pages) < 1 )
>> - {
>> - printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
>> - d);
>> - return -EINVAL;
>> - }
>> -
>> - d->max_pages += 1;
>> - xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
>> - if ( xenstore_pg == NULL && is_64bit_domain(d) )
>> - xenstore_pg = alloc_domheap_page(d, 0);
>> - if ( xenstore_pg == NULL )
>> - return -ENOMEM;
>> -
>> - mfn = page_to_mfn(xenstore_pg);
>> - if ( !mfn_x(mfn) )
>> - return -ENOMEM;
>> -
>> - if ( !is_domain_direct_mapped(d) )
>> - gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
>> - (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
>> - else
>> - gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
>> -
>> - rc = guest_physmap_add_page(d, gfn, mfn, 0);
>> - if ( rc )
>> - {
>> - free_domheap_page(xenstore_pg);
>> - return rc;
>> - }
>> -
>> - d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
>> - interface = map_domain_page(mfn);
>> - interface->connection = XENSTORE_RECONNECT;
>> - unmap_domain_page(interface);
>> -
>> - return 0;
>> + /* type must be set before allocate memory */
>> + d->arch.type = kinfo->arch.type;
>> }
>> -
>> -static int __init alloc_xenstore_params(struct kernel_info *kinfo)
>> +#else
>> +void __init set_domain_type(struct domain *d, struct kernel_info *kinfo)
>> {
>> - struct domain *d = kinfo->d;
>> - int rc = 0;
>> -
>> - if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
>> - == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
>> - d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
>> - else if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
>> - {
>> - rc = alloc_xenstore_page(d);
>> - if ( rc < 0 )
>> - return rc;
>> - }
>> -
>> - return rc;
>> + /* Nothing to do */
>> }
>> +#endif
>>
>> -static void __init domain_vcpu_affinity(struct domain *d,
>> - const struct dt_device_node *node)
>> +int __init init_vuart(struct domain *d, struct kernel_info *kinfo,
>> + const struct dt_device_node *node)
>> {
>> - struct dt_device_node *np;
>> -
>> - dt_for_each_child_node(node, np)
>> - {
>> - const char *hard_affinity_str = NULL;
>> - uint32_t val;
>> - int rc;
>> - struct vcpu *v;
>> - cpumask_t affinity;
>> -
>> - if ( !dt_device_is_compatible(np, "xen,vcpu") )
>> - continue;
>> -
>> - if ( !dt_property_read_u32(np, "id", &val) )
>> - panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
>> -
>> - if ( val >= d->max_vcpus )
>> - panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
>> - dt_node_name(node), d->max_vcpus);
>> -
>> - v = d->vcpu[val];
>> - rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
>> - if ( rc < 0 )
>> - continue;
>> -
>> - cpumask_clear(&affinity);
>> - while ( *hard_affinity_str != '\0' )
>> - {
>> - unsigned int start, end;
>> -
>> - start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
>> -
>> - if ( *hard_affinity_str == '-' ) /* Range */
>> - {
>> - hard_affinity_str++;
>> - end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
>> - }
>> - else /* Single value */
>> - end = start;
>> -
>> - if ( end >= nr_cpu_ids )
>> - panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
>> -
>> - for ( ; start <= end; start++ )
>> - cpumask_set_cpu(start, &affinity);
>> -
>> - if ( *hard_affinity_str == ',' )
>> - hard_affinity_str++;
>> - else if ( *hard_affinity_str != '\0' )
>> - break;
>> - }
>> + int rc = 0;
>>
>> - rc = vcpu_set_hard_affinity(v, &affinity);
>> - if ( rc )
>> - panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
>> - v->vcpu_id, rc, dt_node_name(node));
>> - }
>> -}
>> + kinfo->vuart = dt_property_read_bool(node, "vpl011");
>>
>> -#ifdef CONFIG_ARCH_PAGING_MEMPOOL
>> -static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
>> - unsigned int smp_cpus)
>> -{
>> /*
>> - * Keep in sync with libxl__get_required_paging_memory().
>> - * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
>> - * plus 128 pages to cover extended regions.
>> + * Base address and irq number are needed when creating vpl011 device
>> + * tree node in prepare_dtb_domU, so initialization on related variables
>> + * shall be done first.
>> */
>> - unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
>> -
>> - BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
>> -
>> - return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
>> -}
>> -
>> -static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
>> - const struct dt_device_node *node)
>> -{
>> - unsigned long p2m_pages;
>> - uint32_t p2m_mem_mb;
>> - int rc;
>> -
>> - rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
>> - /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
>> - p2m_pages = rc ?
>> - p2m_mem_mb << (20 - PAGE_SHIFT) :
>> - domain_p2m_pages(mem, d->max_vcpus);
>> -
>> - spin_lock(&d->arch.paging.lock);
>> - rc = p2m_set_allocation(d, p2m_pages, NULL);
>> - spin_unlock(&d->arch.paging.lock);
>> -
>> - return rc;
>> -}
>> -#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
>> -static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
>> - const struct dt_device_node *node)
>> -{
>> - return 0;
>> -}
>> -#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
>> -
>> -int __init construct_domU(struct domain *d,
>> - const struct dt_device_node *node)
>> -{
>> - struct kernel_info kinfo = KERNEL_INFO_INIT;
>> - const char *dom0less_enhanced;
>> - int rc;
>> - u64 mem;
>> -
>> - rc = dt_property_read_u64(node, "memory", &mem);
>> - if ( !rc )
>> - {
>> - printk("Error building DomU: cannot read \"memory\" property\n");
>> - return -EINVAL;
>> - }
>> - kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
>> -
>> - rc = domain_p2m_set_allocation(d, mem, node);
>> - if ( rc != 0 )
>> - return rc;
>> -
>> - printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
>> - d->max_vcpus, mem);
>> -
>> - kinfo.vuart = dt_property_read_bool(node, "vpl011");
>> - if ( kinfo.vuart && is_hardware_domain(d) )
>> - panic("hardware domain cannot specify vpl011\n");
>> -
>> - rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
>> - if ( rc == -EILSEQ ||
>> - rc == -ENODATA ||
>> - (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
>> - {
>> - need_xenstore = true;
>> - kinfo.dom0less_feature = DOM0LESS_ENHANCED;
>> - }
>> - else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
>> - {
>> - need_xenstore = true;
>> - kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
>> - }
>> - else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
>> - kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
>> -
>> - if ( vcpu_create(d, 0) == NULL )
>> - return -ENOMEM;
>> -
>> - d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
>> -
>> - kinfo.d = d;
>> -
>> - rc = kernel_probe(&kinfo, node);
>> - if ( rc < 0 )
>> - return rc;
>> -
>> -#ifdef CONFIG_ARM_64
>> - /* type must be set before allocate memory */
>> - d->arch.type = kinfo.arch.type;
>> -#endif
>> - if ( is_hardware_domain(d) )
>> - {
>> - rc = construct_hwdom(&kinfo, node);
>> - if ( rc < 0 )
>> - return rc;
>> - }
> I think we should retain this chunk in the code movement. It is OK if it
> is behind a #ifdef CONFIG_ARM.
I'll retain the is_hardware_domain() handling. I think it can go without an #ifdef; it seems
to me it is a good thing to re-use construct_hwdom() when creating the h/w domain.
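
To illustrate the shape I have in mind, here is a minimal, self-contained sketch
of the dispatch only. It is not the code from the patch: every mock_* name, the
struct layouts and the BUILT_* values are hypothetical stand-ins, and the real
construct_hwdom() and construct_domU() take different arguments.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's struct domain / struct kernel_info. */
struct mock_domain {
    bool is_hwdom;
};

enum mock_builder { BUILT_NONE, BUILT_HWDOM, BUILT_DOMU };

struct mock_kernel_info {
    struct mock_domain *d;
    enum mock_builder built_by;
};

/* Mock of the is_hardware_domain() predicate. */
static bool mock_is_hardware_domain(const struct mock_domain *d)
{
    return d->is_hwdom;
}

/* Mock of the hardware-domain builder; only records that it ran. */
static int mock_construct_hwdom(struct mock_kernel_info *kinfo)
{
    kinfo->built_by = BUILT_HWDOM;
    return 0;
}

/* Mock of the regular domU path (memory allocation, vuart, DTB, ...). */
static int mock_construct_regular_domu(struct mock_kernel_info *kinfo)
{
    kinfo->built_by = BUILT_DOMU;
    return 0;
}

/*
 * Common entry point: the hardware-domain branch is kept without any
 * arch-specific #ifdef, and everything else takes the regular domU path.
 */
static int mock_construct_domU(struct mock_domain *d,
                               struct mock_kernel_info *kinfo)
{
    if ( d == NULL || kinfo == NULL )
        return -1;

    kinfo->d = d;

    if ( mock_is_hardware_domain(d) )
        return mock_construct_hwdom(kinfo);

    return mock_construct_regular_domu(kinfo);
}
```

The point is only that the branch itself carries nothing Arm-specific, so an
architecture that provides its own construct_hwdom() could share it.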
>
>
>> - else
>> + if ( kinfo->vuart )
>> {
>> - if ( !dt_find_property(node, "xen,static-mem", NULL) )
>> - allocate_memory(d, &kinfo);
>> - else if ( !is_domain_direct_mapped(d) )
>> - allocate_static_memory(d, &kinfo, node);
>> - else
>> - assign_static_memory_11(d, &kinfo, node);
>> -
>> - rc = process_shm(d, &kinfo, node);
>> - if ( rc < 0 )
>> - return rc;
>> -
>> - /*
>> - * Base address and irq number are needed when creating vpl011 device
>> - * tree node in prepare_dtb_domU, so initialization on related variables
>> - * shall be done first.
>> - */
>> - if ( kinfo.vuart )
>> - {
>> - rc = domain_vpl011_init(d, NULL);
>> - if ( rc < 0 )
>> - return rc;
>> - }
>> -
>> - rc = prepare_dtb_domU(d, &kinfo);
>> - if ( rc < 0 )
>> - return rc;
>> -
>> - rc = construct_domain(d, &kinfo);
>> + rc = domain_vpl011_init(d, NULL);
>> if ( rc < 0 )
>> return rc;
>> }
>>
>> - domain_vcpu_affinity(d, node);
>> -
>> - return alloc_xenstore_params(&kinfo);
>> + return rc;
>> }
>>
>> void __init arch_create_domUs(struct dt_device_node *node,
>> @@ -995,6 +345,22 @@ void __init arch_create_domUs(struct dt_device_node *node,
>> }
>> }
>>
>> +int __init init_intc_phandle(struct kernel_info *kinfo, const char *name,
>> + const int node_next, const void *pfdt)
>> +{
>> + if ( dt_node_cmp(name, "gic") == 0 )
>> + {
>> + uint32_t phandle_intc = fdt_get_phandle(pfdt, node_next);
>> +
>> + if ( phandle_intc != 0 )
>> + kinfo->phandle_intc = phandle_intc;
>> +
>> + return 0;
>> + }
>> +
>> + return 1;
>> +}
>> +
>> /*
>> * Local variables:
>> * mode: C
>> diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
>> index a01a8b6b1a..c3face5b90 100644
>> --- a/xen/common/device-tree/dom0less-build.c
>> +++ b/xen/common/device-tree/dom0less-build.c
>> @@ -3,24 +3,43 @@
>> #include <xen/bootfdt.h>
>> #include <xen/device_tree.h>
>> #include <xen/domain.h>
>> +#include <xen/domain_page.h>
>> #include <xen/err.h>
>> #include <xen/event.h>
>> +#include <xen/fdt-domain-build.h>
>> +#include <xen/fdt-kernel.h>
>> #include <xen/grant_table.h>
>> #include <xen/init.h>
>> +#include <xen/iocap.h>
>> #include <xen/iommu.h>
>> +#include <xen/libfdt/libfdt.h>
>> #include <xen/llc-coloring.h>
>> +#include <xen/sizes.h>
>> #include <xen/sched.h>
>> #include <xen/stdbool.h>
>> #include <xen/types.h>
>> +#include <xen/vmap.h>
>>
>> #include <public/bootfdt.h>
>> #include <public/domctl.h>
>> #include <public/event_channel.h>
>> +#include <public/io/xs_wire.h>
>>
>> #include <asm/dom0less-build.h>
>> #include <asm/setup.h>
>>
>> +#ifdef CONFIG_STATIC_MEMORY
>> +#include <asm/static-memory.h>
>> +#endif
> #if __has_include ?
>
>
>> +#ifdef CONFIG_STATIC_SHM
>> +#include <asm/static-shmem.h>
>> +#endif
> Same here?
I thought that if we already have some CONFIG_*, it is better to use #ifdef, but I am okay with
changing it to __has_include.
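
For reference, a small standalone sketch of the __has_include pattern being
discussed. It is an illustration only, not the hunk from the patch: the
HAVE_* macros and the <no_such_arch/static-memory.h> path are hypothetical, and
<limits.h> merely stands in for a header that exists. The fallback definition
assumes a compiler without __has_include should treat every header as absent.

```c
#include <stdbool.h>

/*
 * GCC (>= 5), Clang and C23 provide __has_include. On older compilers,
 * fall back to "header not available" so the #if lines still parse.
 */
#ifndef __has_include
#define __has_include(x) 0
#endif

/* A header that is present on hosted toolchains. */
#if __has_include(<limits.h>)
#include <limits.h>
#define HAVE_LIMITS_H 1
#else
#define HAVE_LIMITS_H 0
#endif

/*
 * Hypothetical path standing in for an arch header such as
 * <asm/static-memory.h> on an architecture that does not ship it.
 */
#if __has_include(<no_such_arch/static-memory.h>)
#include <no_such_arch/static-memory.h>
#define HAVE_STATIC_MEMORY_H 1
#else
#define HAVE_STATIC_MEMORY_H 0
#endif

/* Small accessors so callers can test the outcome at run time. */
static inline bool have_limits_h(void)
{
    return HAVE_LIMITS_H;
}

static inline bool have_static_memory_h(void)
{
    return HAVE_STATIC_MEMORY_H;
}
```

The trade-off as I understand it: #ifdef CONFIG_* ties the inclusion to the
Kconfig option, while __has_include ties it to whether the architecture
actually provides the header, so code using the header's contents still needs
its own guard either way.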
>
>
>> +#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
>> +
>> static domid_t __initdata xs_domid = DOMID_INVALID;
>> +static bool __initdata need_xenstore;
>>
>> void __init set_xs_domain(struct domain *d)
>> {
>> @@ -109,6 +128,686 @@ static void __init initialize_domU_xenstore(void)
>> }
>> }
>>
>> +/*
>> + * Scan device tree properties for passthrough specific information.
>> + * Returns < 0 on error
>> + * 0 on success
>> + */
>> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
>> + const struct fdt_property *xen_reg,
>> + const struct fdt_property *xen_path,
>> + bool xen_force,
>> + uint32_t address_cells,
>> + uint32_t size_cells)
>> +{
>> + const __be32 *cell;
>> + unsigned int i, len;
>> + struct dt_device_node *node;
>> + int res;
>> + paddr_t mstart, size, gstart;
>> +
>> + /* xen,reg specifies where to map the MMIO region */
>> + cell = (const __be32 *)xen_reg->data;
>> + len = fdt32_to_cpu(xen_reg->len) / ((address_cells * 2 + size_cells) *
>> + sizeof(uint32_t));
>> +
>> + for ( i = 0; i < len; i++ )
>> + {
>> + device_tree_get_reg(&cell, address_cells, size_cells,
>> + &mstart, &size);
>> + gstart = dt_next_cell(address_cells, &cell);
>> +
>> + if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
>> + {
>> + printk(XENLOG_ERR
>> + "DomU passthrough config has not page aligned addresses/sizes\n");
>> + return -EINVAL;
>> + }
>> +
>> + res = iomem_permit_access(kinfo->d, paddr_to_pfn(mstart),
>> + paddr_to_pfn(PAGE_ALIGN(mstart + size - 1)));
>> + if ( res )
>> + {
>> + printk(XENLOG_ERR "Unable to permit to dom%d access to"
>> + " 0x%"PRIpaddr" - 0x%"PRIpaddr"\n",
>> + kinfo->d->domain_id,
>> + mstart & PAGE_MASK, PAGE_ALIGN(mstart + size) - 1);
>> + return res;
>> + }
>> +
>> + res = map_regions_p2mt(kinfo->d,
>> + gaddr_to_gfn(gstart),
>> + PFN_DOWN(size),
>> + maddr_to_mfn(mstart),
>> + p2m_mmio_direct_dev);
>> + if ( res < 0 )
>> + {
>> + printk(XENLOG_ERR
>> + "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
>> + mstart, gstart);
>> + return -EFAULT;
>> + }
>> + }
>> +
>> + /*
>> + * If xen_force, we let the user assign a MMIO region with no
>> + * associated path.
>> + */
>> + if ( xen_path == NULL )
>> + return xen_force ? 0 : -EINVAL;
>> +
>> + /*
>> + * xen,path specifies the corresponding node in the host DT.
>> + * Both interrupt mappings and IOMMU settings are based on it,
>> + * as they are done based on the corresponding host DT node.
>> + */
>> + node = dt_find_node_by_path(xen_path->data);
>> + if ( node == NULL )
>> + {
>> + printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
>> + xen_path->data);
>> + return -EINVAL;
>> + }
>> +
>> + res = map_device_irqs_to_domain(kinfo->d, node, true, NULL);
>> + if ( res < 0 )
>> + return res;
>> +
>> + res = iommu_add_dt_device(node);
>> + if ( res < 0 )
>> + return res;
>> +
>> + /* If xen_force, we allow assignment of devices without IOMMU protection. */
>> + if ( xen_force && !dt_device_is_protected(node) )
>> + return 0;
>> +
>> + return iommu_assign_dt_device(kinfo->d, node);
>> +}
>> +
>> +static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>> + const void *pfdt, int nodeoff,
>> + uint32_t address_cells, uint32_t size_cells,
>> + bool scan_passthrough_prop)
>> +{
>> + void *fdt = kinfo->fdt;
>> + int propoff, nameoff, res;
>> + const struct fdt_property *prop, *xen_reg = NULL, *xen_path = NULL;
>> + const char *name;
>> + bool found, xen_force = false;
>> +
>> + for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
>> + propoff >= 0;
>> + propoff = fdt_next_property_offset(pfdt, propoff) )
>> + {
>> + if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
>> + return -FDT_ERR_INTERNAL;
>> +
>> + found = false;
>> + nameoff = fdt32_to_cpu(prop->nameoff);
>> + name = fdt_string(pfdt, nameoff);
>> +
>> + if ( scan_passthrough_prop )
>> + {
>> + if ( dt_prop_cmp("xen,reg", name) == 0 )
>> + {
>> + xen_reg = prop;
>> + found = true;
>> + }
>> + else if ( dt_prop_cmp("xen,path", name) == 0 )
>> + {
>> + xen_path = prop;
>> + found = true;
>> + }
>> + else if ( dt_prop_cmp("xen,force-assign-without-iommu",
>> + name) == 0 )
>> + {
>> + xen_force = true;
>> + found = true;
>> + }
>> + }
>> +
>> + /*
>> + * Copy properties other than the ones above: xen,reg, xen,path,
>> + * and xen,force-assign-without-iommu.
>> + */
>> + if ( !found )
>> + {
>> + res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
>> + if ( res )
>> + return res;
>> + }
>> + }
>> +
>> + /*
>> + * Only handle passthrough properties if both xen,reg and xen,path
>> + * are present, or if xen,force-assign-without-iommu is specified.
>> + */
>> + if ( xen_reg != NULL && (xen_path != NULL || xen_force) )
>> + {
>> + res = handle_passthrough_prop(kinfo, xen_reg, xen_path, xen_force,
>> + address_cells, size_cells);
>> + if ( res < 0 )
>> + {
>> + printk(XENLOG_ERR "Failed to assign device to %pd\n", kinfo->d);
>> + return res;
>> + }
>> + }
>> + else if ( (xen_path && !xen_reg) || (xen_reg && !xen_path && !xen_force) )
>> + {
>> + printk(XENLOG_ERR "xen,reg or xen,path missing for %pd\n",
>> + kinfo->d);
>> + return -EINVAL;
>> + }
>> +
>> + /* FDT_ERR_NOTFOUND => There is no more properties for this node */
>> + return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
>> +}
>> +
>> +static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
>> + int nodeoff,
>> + uint32_t address_cells, uint32_t size_cells,
>> + bool scan_passthrough_prop)
>> +{
>> + int rc = 0;
>> + void *fdt = kinfo->fdt;
>> + int node_next;
>> +
>> + rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
>> + if ( rc )
>> + return rc;
>> +
>> + rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
>> + scan_passthrough_prop);
>> + if ( rc )
>> + return rc;
>> +
>> + address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
>> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
>> + size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
>> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
>> +
>> + node_next = fdt_first_subnode(pfdt, nodeoff);
>> + while ( node_next > 0 )
>> + {
>> + rc = scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
>> + scan_passthrough_prop);
>> + if ( rc )
>> + return rc;
>> +
>> + node_next = fdt_next_subnode(pfdt, node_next);
>> + }
>> +
>> + return fdt_end_node(fdt);
>> +}
>> +
>> +static int __init check_partial_fdt(void *pfdt, size_t size)
>> +{
>> + int res;
>> +
>> + if ( fdt_magic(pfdt) != FDT_MAGIC )
>> + {
>> + dprintk(XENLOG_ERR, "Partial FDT is not a valid Flat Device Tree");
>> + return -EINVAL;
>> + }
>> +
>> + res = fdt_check_header(pfdt);
>> + if ( res )
>> + {
>> + dprintk(XENLOG_ERR, "Failed to check the partial FDT (%d)", res);
>> + return -EINVAL;
>> + }
>> +
>> + if ( fdt_totalsize(pfdt) > size )
>> + {
>> + dprintk(XENLOG_ERR, "Partial FDT totalsize is too big");
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int __init domain_handle_dtb_bootmodule(struct domain *d,
>> + struct kernel_info *kinfo)
>> +{
>> + void *pfdt;
>> + int res, node_next;
>> +
>> + pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
>> + kinfo->dtb_bootmodule->size);
>> + if ( pfdt == NULL )
>> + return -EFAULT;
>> +
>> + res = check_partial_fdt(pfdt, kinfo->dtb_bootmodule->size);
>> + if ( res < 0 )
>> + goto out;
>> +
>> + for ( node_next = fdt_first_subnode(pfdt, 0);
>> + node_next > 0;
>> + node_next = fdt_next_subnode(pfdt, node_next) )
>> + {
>> + const char *name = fdt_get_name(pfdt, node_next, NULL);
>> +
>> + if ( name == NULL )
>> + continue;
>> +
>> + /*
>> + * Only scan /$(interrupt_controller) /aliases /passthrough,
>> + * ignore the rest.
>> + * They don't have to be parsed in order.
>> + *
>> + * Take the interrupt controller phandle value from the special
>> + * interrupt controller node in the DTB fragment.
>> + */
>> + if ( init_intc_phandle(kinfo, name, node_next, pfdt) == 0 )
>> + continue;
>> +
>> + if ( dt_node_cmp(name, "aliases") == 0 )
>> + {
>> + res = scan_pfdt_node(kinfo, pfdt, node_next,
>> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
>> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
>> + false);
>> + if ( res )
>> + goto out;
>> + continue;
>> + }
>> + if ( dt_node_cmp(name, "passthrough") == 0 )
>> + {
>> + res = scan_pfdt_node(kinfo, pfdt, node_next,
>> + DT_ROOT_NODE_ADDR_CELLS_DEFAULT,
>> + DT_ROOT_NODE_SIZE_CELLS_DEFAULT,
>> + true);
>> + if ( res )
>> + goto out;
>> + continue;
>> + }
>> + }
>> +
>> + out:
>> + iounmap(pfdt);
>> +
>> + return res;
>> +}
>> +
>> +/*
>> + * The max size for DT is 2MB. However, the generated DT is small (not including
>> + * domU passthrough DT nodes whose size we account separately), 4KB are enough
>> + * for now, but we might have to increase it in the future.
>> + */
>> +#define DOMU_DTB_SIZE 4096
>> +static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>> +{
>> + int addrcells, sizecells;
>> + int ret, fdt_size = DOMU_DTB_SIZE;
>> +
>> + kinfo->phandle_intc = GUEST_PHANDLE_GIC;
>> +
>> +#ifdef CONFIG_GRANT_TABLE
>> + kinfo->gnttab_start = GUEST_GNTTAB_BASE;
>> + kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
>> +#endif
>> +
>> + addrcells = GUEST_ROOT_ADDRESS_CELLS;
>> + sizecells = GUEST_ROOT_SIZE_CELLS;
>> +
>> + /* Account for domU passthrough DT size */
>> + if ( kinfo->dtb_bootmodule )
>> + fdt_size += kinfo->dtb_bootmodule->size;
>> +
>> + /* Cap to max DT size if needed */
>> + fdt_size = min(fdt_size, SZ_2M);
>> +
>> + kinfo->fdt = xmalloc_bytes(fdt_size);
>> + if ( kinfo->fdt == NULL )
>> + return -ENOMEM;
>> +
>> + ret = fdt_create(kinfo->fdt, fdt_size);
>> + if ( ret < 0 )
>> + goto err;
>> +
>> + ret = fdt_finish_reservemap(kinfo->fdt);
>> + if ( ret < 0 )
>> + goto err;
>> +
>> + ret = fdt_begin_node(kinfo->fdt, "");
>> + if ( ret < 0 )
>> + goto err;
>> +
>> + ret = fdt_property_cell(kinfo->fdt, "#address-cells", addrcells);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = fdt_property_cell(kinfo->fdt, "#size-cells", sizecells);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = make_chosen_node(kinfo);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = make_cpus_node(d, kinfo->fdt);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = make_memory_node(kinfo, addrcells, sizecells,
>> + kernel_info_get_mem(kinfo));
>> + if ( ret )
>> + goto err;
>> +
>> +#ifdef CONFIG_STATIC_SHM
> This should not be necessary because there is a stub implementation of
> make_resv_memory_node available in static-shmem.h for the
> !CONFIG_STATIC_SHM case.
But static-shmem.h isn't available on all architectures. Until static-shmem.h is moved to the
asm-generic or xen folder and then reused by an architecture, we have to keep the #ifdef.
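
To make the trade-off concrete, here is a minimal, self-contained sketch of the stub pattern under discussion. It is not the Xen header: every name (demo_kernel_info, demo_make_resv_memory_node, demo_prepare_dtb, DEMO_CONFIG_STATIC_SHM) is hypothetical and only mirrors the shape of the call in prepare_dtb_domU(). It assumes the header that provides the stub can be included by every architecture, which is exactly what is missing for static-shmem.h today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kernel_info; not the Xen definition. */
struct demo_kernel_info {
    int nodes_written;
};

/*
 * What a header visible to all architectures could provide: the real
 * prototype when the feature is built in, a no-op stub otherwise.
 * DEMO_CONFIG_STATIC_SHM is an assumed macro standing in for the Kconfig
 * option; it is left undefined here, so the stub is used.
 */
#ifdef DEMO_CONFIG_STATIC_SHM
int demo_make_resv_memory_node(struct demo_kernel_info *kinfo,
                               int addrcells, int sizecells);
#else
static inline int demo_make_resv_memory_node(struct demo_kernel_info *kinfo,
                                             int addrcells, int sizecells)
{
    (void)kinfo;
    (void)addrcells;
    (void)sizecells;
    return 0; /* Feature disabled: nothing to emit, report success. */
}
#endif

/* With the stub reachable, the caller needs no #ifdef around the call. */
int demo_prepare_dtb(struct demo_kernel_info *kinfo)
{
    int ret;

    assert(kinfo != NULL);

    ret = demo_make_resv_memory_node(kinfo, 2, 2);
    if ( ret )
        return ret;

    kinfo->nodes_written++;
    return 0;
}
```

If the header cannot be included on some architecture, neither the prototype nor the stub is visible there, so the call site has to be guarded by #ifdef instead, as done in this patch.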
>
>
>> + ret = make_resv_memory_node(kinfo, addrcells, sizecells);
>> + if ( ret )
>> + goto err;
>> +#endif
>> +
>> + /*
>> + * domain_handle_dtb_bootmodule has to be called before the rest of
>> + * the device tree is generated because it depends on the value of
>> + * the field phandle_intc.
>> + */
>> + if ( kinfo->dtb_bootmodule )
>> + {
>> + ret = domain_handle_dtb_bootmodule(d, kinfo);
>> + if ( ret )
>> + goto err;
>> + }
>> +
>> + ret = make_intc_domU_node(kinfo);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = make_timer_node(kinfo);
>> + if ( ret )
>> + goto err;
>> +
>> + if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS )
>> + {
>> + ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
>> + if ( ret )
>> + goto err;
>> + }
>> +
>> + ret = make_arch_nodes(kinfo);
>> + if ( ret )
>> + goto err;
>> +
>> + ret = fdt_end_node(kinfo->fdt);
>> + if ( ret < 0 )
>> + goto err;
>> +
>> + ret = fdt_finish(kinfo->fdt);
>> + if ( ret < 0 )
>> + goto err;
>> +
>> + return 0;
>> +
>> + err:
>> + printk("Device tree generation failed (%d).\n", ret);
>> + xfree(kinfo->fdt);
>> +
>> + return -EINVAL;
>> +}
>> +
>> +#define XENSTORE_PFN_OFFSET 1
>> +static int __init alloc_xenstore_page(struct domain *d)
>> +{
>> + struct page_info *xenstore_pg;
>> + struct xenstore_domain_interface *interface;
>> + mfn_t mfn;
>> + gfn_t gfn;
>> + int rc;
>> +
>> + if ( (UINT_MAX - d->max_pages) < 1 )
>> + {
>> + printk(XENLOG_ERR "%pd: Over-allocation for d->max_pages by 1 page.\n",
>> + d);
>> + return -EINVAL;
>> + }
>> +
>> + d->max_pages += 1;
>> + xenstore_pg = alloc_domheap_page(d, MEMF_bits(32));
>> + if ( xenstore_pg == NULL && is_64bit_domain(d) )
>> + xenstore_pg = alloc_domheap_page(d, 0);
>> + if ( xenstore_pg == NULL )
>> + return -ENOMEM;
>> +
>> + mfn = page_to_mfn(xenstore_pg);
>> + if ( !mfn_x(mfn) )
>> + return -ENOMEM;
>> +
>> + if ( !is_domain_direct_mapped(d) )
>> + gfn = gaddr_to_gfn(GUEST_MAGIC_BASE +
>> + (XENSTORE_PFN_OFFSET << PAGE_SHIFT));
>> + else
>> + gfn = gaddr_to_gfn(mfn_to_maddr(mfn));
>> +
>> + rc = guest_physmap_add_page(d, gfn, mfn, 0);
>> + if ( rc )
>> + {
>> + free_domheap_page(xenstore_pg);
>> + return rc;
>> + }
>> +
>> +#ifdef CONFIG_HVM
>> + d->arch.hvm.params[HVM_PARAM_STORE_PFN] = gfn_x(gfn);
>> +#endif
>> + interface = map_domain_page(mfn);
>> + interface->connection = XENSTORE_RECONNECT;
>> + unmap_domain_page(interface);
>> +
>> + return 0;
>> +}
>> +
>> +static int __init alloc_xenstore_params(struct kernel_info *kinfo)
>> +{
>> + struct domain *d = kinfo->d;
>> + int rc = 0;
>> +
>> +#ifdef CONFIG_HVM
>> + if ( (kinfo->dom0less_feature & (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY))
>> + == (DOM0LESS_XENSTORE | DOM0LESS_XS_LEGACY) )
>> + d->arch.hvm.params[HVM_PARAM_STORE_PFN] = XENSTORE_PFN_LATE_ALLOC;
>> + else
>> +#endif
>> + if ( kinfo->dom0less_feature & DOM0LESS_XENSTORE )
>> + {
>> + rc = alloc_xenstore_page(d);
>> + if ( rc < 0 )
>> + return rc;
>> + }
>> +
>> + return rc;
>> +}
>> +
>> +static void __init domain_vcpu_affinity(struct domain *d,
>> + const struct dt_device_node *node)
>> +{
>> + struct dt_device_node *np;
>> +
>> + dt_for_each_child_node(node, np)
>> + {
>> + const char *hard_affinity_str = NULL;
>> + uint32_t val;
>> + int rc;
>> + struct vcpu *v;
>> + cpumask_t affinity;
>> +
>> + if ( !dt_device_is_compatible(np, "xen,vcpu") )
>> + continue;
>> +
>> + if ( !dt_property_read_u32(np, "id", &val) )
>> + panic("Invalid xen,vcpu node for domain %s\n", dt_node_name(node));
>> +
>> + if ( val >= d->max_vcpus )
>> + panic("Invalid vcpu_id %u for domain %s, max_vcpus=%u\n", val,
>> + dt_node_name(node), d->max_vcpus);
>> +
>> + v = d->vcpu[val];
>> + rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
>> + if ( rc < 0 )
>> + continue;
>> +
>> + cpumask_clear(&affinity);
>> + while ( *hard_affinity_str != '\0' )
>> + {
>> + unsigned int start, end;
>> +
>> + start = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
>> +
>> + if ( *hard_affinity_str == '-' ) /* Range */
>> + {
>> + hard_affinity_str++;
>> + end = simple_strtoul(hard_affinity_str, &hard_affinity_str, 0);
>> + }
>> + else /* Single value */
>> + end = start;
>> +
>> + if ( end >= nr_cpu_ids )
>> + panic("Invalid pCPU %u for domain %s\n", end, dt_node_name(node));
>> +
>> + for ( ; start <= end; start++ )
>> + cpumask_set_cpu(start, &affinity);
>> +
>> + if ( *hard_affinity_str == ',' )
>> + hard_affinity_str++;
>> + else if ( *hard_affinity_str != '\0' )
>> + break;
>> + }
>> +
>> + rc = vcpu_set_hard_affinity(v, &affinity);
>> + if ( rc )
>> + panic("vcpu%d: failed (rc=%d) to set hard affinity for domain %s\n",
>> + v->vcpu_id, rc, dt_node_name(node));
>> + }
>> +}
>> +
>> +#ifdef CONFIG_ARCH_PAGING_MEMPOOL
>> +static unsigned long __init domain_p2m_pages(unsigned long maxmem_kb,
>> + unsigned int smp_cpus)
>> +{
>> + /*
>> + * Keep in sync with libxl__get_required_paging_memory().
>> + * 256 pages (1MB) per vcpu, plus 1 page per MiB of RAM for the P2M map,
>> + * plus 128 pages to cover extended regions.
>> + */
>> + unsigned long memkb = 4 * (256 * smp_cpus + (maxmem_kb / 1024) + 128);
>> +
>> + BUILD_BUG_ON(PAGE_SIZE != SZ_4K);
>> +
>> + return DIV_ROUND_UP(memkb, 1024) << (20 - PAGE_SHIFT);
>> +}
>> +
>> +static int __init domain_p2m_set_allocation(struct domain *d, uint64_t mem,
>> + const struct dt_device_node *node)
>> +{
>> + unsigned long p2m_pages;
>> + uint32_t p2m_mem_mb;
>> + int rc;
>> +
>> + rc = dt_property_read_u32(node, "xen,domain-p2m-mem-mb", &p2m_mem_mb);
>> + /* If xen,domain-p2m-mem-mb is not specified, use the default value. */
>> + p2m_pages = rc ?
>> + p2m_mem_mb << (20 - PAGE_SHIFT) :
>> + domain_p2m_pages(mem, d->max_vcpus);
>> +
>> + spin_lock(&d->arch.paging.lock);
>> + rc = p2m_set_allocation(d, p2m_pages, NULL);
>> + spin_unlock(&d->arch.paging.lock);
>> +
>> + return rc;
>> +}
>> +#else /* !CONFIG_ARCH_PAGING_MEMPOOL */
>> +static inline int domain_p2m_set_allocation(struct domain *d, uint64_t mem,
>> + const struct dt_device_node *node)
>> +{
>> + return 0;
>> +}
>> +#endif /* CONFIG_ARCH_PAGING_MEMPOOL */
>> +
>> +static int __init construct_domU(struct domain *d,
>> + const struct dt_device_node *node)
>> +{
>> + struct kernel_info kinfo = KERNEL_INFO_INIT;
>> + const char *dom0less_enhanced;
>> + int rc;
>> + u64 mem;
>> +
>> + rc = dt_property_read_u64(node, "memory", &mem);
>> + if ( !rc )
>> + {
>> + printk("Error building DomU: cannot read \"memory\" property\n");
>> + return -EINVAL;
>> + }
>> + kinfo.unassigned_mem = (paddr_t)mem * SZ_1K;
>> +
>> + rc = domain_p2m_set_allocation(d, mem, node);
>> + if ( rc != 0 )
>> + return rc;
>> +
>> + printk("*** LOADING DOMU cpus=%u memory=%#"PRIx64"KB ***\n",
>> + d->max_vcpus, mem);
>> +
>> + rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
>> + if ( rc == -EILSEQ ||
>> + rc == -ENODATA ||
>> + (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
>> + {
>> + need_xenstore = true;
>> + kinfo.dom0less_feature = DOM0LESS_ENHANCED;
>> + }
>> + else if ( rc == 0 && !strcmp(dom0less_enhanced, "legacy") )
>> + {
>> + need_xenstore = true;
>> + kinfo.dom0less_feature = DOM0LESS_ENHANCED_LEGACY;
>> + }
>> + else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") )
>> + kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS;
>> +
>> + if ( vcpu_create(d, 0) == NULL )
>> + return -ENOMEM;
>> +
>> + d->max_pages = ((paddr_t)mem * SZ_1K) >> PAGE_SHIFT;
>> +
>> + kinfo.d = d;
>> +
>> + rc = kernel_probe(&kinfo, node);
>> + if ( rc < 0 )
>> + return rc;
>> +
>> + set_domain_type(d, &kinfo);
>> +
>> + if ( !dt_find_property(node, "xen,static-mem", NULL) )
>> + allocate_memory(d, &kinfo);
>> +#ifdef CONFIG_STATIC_MEMORY
> Also this should not be needed thanks to the stub implementation of
> allocate_static_memory and assign_static_memory_11
>
>
>> + else if ( !is_domain_direct_mapped(d) )
>> + allocate_static_memory(d, &kinfo, node);
>> + else
>> + assign_static_memory_11(d, &kinfo, node);
>> +#endif
>> +
>> +#ifdef CONFIG_STATIC_SHM
> There is a stub for process_shm too
The same as with make_resv_memory_node(): the static-shmem.h header isn't available for
all archs.
I can bring back my patches which move static-shmem.h to asm-generic and then drop all the #ifdef-s connected to it:
https://lore.kernel.org/xen-devel/0203b98aa6a42aa69e22e7c973320add3ff4bb5d.1736334615.git.oleksii.kurochko@gmail.com/
Let me know whether it is better to do that now, or to drop the #ifdef-ing later, when an architecture requires the
static-shmem or static-mem features.
~ Oleksii
>
>> + rc = process_shm(d, &kinfo, node);
>> + if ( rc < 0 )
>> + return rc;
>> +#endif
>> +
>> + rc = init_vuart(d, &kinfo, node);
>> + if ( rc < 0 )
>> + return rc;
>> +
>> + rc = prepare_dtb_domU(d, &kinfo);
>> + if ( rc < 0 )
>> + return rc;
>> +
>> + rc = construct_domain(d, &kinfo);
>> + if ( rc < 0 )
>> + return rc;
>> +
>> + domain_vcpu_affinity(d, node);
>> +
>> + return alloc_xenstore_params(&kinfo);
>> +}
>> +
>> void __init create_domUs(void)
>> {
>> struct dt_device_node *node;
>> diff --git a/xen/include/asm-generic/dom0less-build.h b/xen/include/asm-generic/dom0less-build.h
>> index f095135caa..c00bb853d6 100644
>> --- a/xen/include/asm-generic/dom0less-build.h
>> +++ b/xen/include/asm-generic/dom0less-build.h
>> @@ -11,10 +11,7 @@
>>
>> struct domain;
>> struct dt_device_node;
>> -
>> -/* TODO: remove both when construct_domU() will be moved to common. */
>> -#define XENSTORE_PFN_LATE_ALLOC UINT64_MAX
>> -extern bool need_xenstore;
>> +struct kernel_info;
>>
>> /*
>> * List of possible features for dom0less domUs
>> @@ -48,12 +45,21 @@ void create_domUs(void);
>> bool is_dom0less_mode(void);
>> void set_xs_domain(struct domain *d);
>>
>> -int construct_domU(struct domain *d, const struct dt_device_node *node);
>> -
>> void arch_create_domUs(struct dt_device_node *node,
>> struct xen_domctl_createdomain *d_cfg,
>> unsigned int flags);
>>
>> +int init_vuart(struct domain *d, struct kernel_info *kinfo,
>> + const struct dt_device_node *node);
>> +
>> +int make_intc_domU_node(struct kernel_info *kinfo);
>> +int make_arch_nodes(struct kernel_info *kinfo);
>> +
>> +void set_domain_type(struct domain *d, struct kernel_info *kinfo);
>> +
>> +int init_intc_phandle(struct kernel_info *kinfo, const char *name,
>> + const int node_next, const void *pfdt);
>> +
>> #else /* !CONFIG_DOM0LESS_BOOT */
>>
>> static inline void create_domUs(void) {}
>> --
>> 2.49.0
>>
* Re: [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c
2025-05-05 10:46 ` Oleksii Kurochko
@ 2025-05-05 10:56 ` Oleksii Kurochko
2025-05-05 17:35 ` Stefano Stabellini
1 sibling, 0 replies; 30+ messages in thread
From: Oleksii Kurochko @ 2025-05-05 10:56 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Jan Beulich,
Roger Pau Monné
On 5/5/25 12:46 PM, Oleksii Kurochko wrote:
>
>
> On 5/2/25 10:53 PM, Stefano Stabellini wrote:
>> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>>> Part of Arm's dom0less-build.c could be common between architectures which are
>>> using device tree files to create guest domains. Thereby move some parts of
>>> Arm's dom0less-build.c to common code with minor changes.
>>>
>>> As part of these changes, the following are introduced:
>>> - Introduce make_arch_nodes() to cover arch-specific nodes. For example, in
>>> case of Arm, it is PSCI and vpl011 nodes.
>>> - Introduce set_domain_type() to abstract how the domain type is set. For
>>> example, RISC-V won't have this member of the arch_domain structure, as
>>> vCPUs will always have the same bitness as the hypervisor. In case of Arm,
>>> it is possible that Arm64 could create 32-bit and 64-bit domains.
>>> - Introduce init_vuart() to cover details of virtual uart initialization.
>>> - Introduce init_intc_phandle() to cover some details of interrupt controller
>>> phandle initialization. As an example, RISC-V could have different name for
>>> interrupt controller node ( APLIC, PLIC, IMSIC, etc ) but the code in
>>> domain_handle_dtb_bootmodule() could handle only one interrupt controller
>>> node name.
>>> - s/make_gic_domU_node/make_intc_domU_node as GIC is Arm specific naming and
>>> add prototype of make_intc_domU_node() to dom0less-build.h
>>>
>>> The following functions are moved to xen/common/device-tree:
>>> - Functions which are moved as is:
>>> - domain_p2m_pages().
>>> - handle_passthrough_prop().
>>> - handle_prop_pfdt().
>>> - scan_pfdt_node().
>>> - check_partial_fdt().
>>> - Functions which are moved with some minor changes:
>>> - alloc_xenstore_evtchn():
>>> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
>>> - prepare_dtb_domU():
>>> - ifdef-ing access to gnttab_{start,size} by CONFIG_GRANT_TABLE.
>>> - s/make_gic_domU_node/make_intc_domU_node.
>>> - Add call of make_arch_nodes().
>>> - domain_handle_dtb_bootmodule():
>>> - hide details of interrupt controller phandle initialization by calling
>>> init_intc_phandle().
>>> - Update the comment above init_intc_phandle(): s/gic/interrupt controller.
>>> - construct_domU():
>>> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
>>> - Call init_vuart() to hide Arm's vpl011_init() details there.
>>> - Add call of set_domain_type() instead of setting kinfo->arch.type explicitly.
>>>
>>> Some parts of dom0less-build.c are wrapped by #ifdef CONFIG_STATIC_{SHM,MEMORY}
>>> as not all archs support these configs.
>>>
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> FYI for a possible follow-up patch (doesn't have to be done in this
>> patch), the following functions could now be static:
>>
>> alloc_dom0_vcpu0
>> dom0_max_vcpus
> I will make them static in follow-up patch in the next patch series version.
Oh, I just noticed that we can't make them static, as there is a non-static declaration of them in
xen/domain.h.
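
A small sketch of the C rule behind this, with hypothetical names (demo_max_vcpus and demo_clamp_vcpus are illustrative and not the Xen functions): once a prototype without `static` is visible, the definition has to keep external linkage.

```c
#include <assert.h>

/*
 * Stands in for a prototype living in a shared header (hypothetical name).
 * Other translation units may rely on it.
 */
unsigned int demo_max_vcpus(void);

/*
 * Writing `static unsigned int demo_max_vcpus(void)` here instead would be
 * rejected by GCC ("static declaration of 'demo_max_vcpus' follows
 * non-static declaration"), so the definition stays non-static for as long
 * as the prototype above exists.
 */
unsigned int demo_max_vcpus(void)
{
    return 4; /* Arbitrary value for the sketch. */
}

/* A caller that only sees the prototype, as code in another file would. */
unsigned int demo_clamp_vcpus(unsigned int requested)
{
    unsigned int limit = demo_max_vcpus();

    assert(limit > 0);
    return requested < limit ? requested : limit;
}
```

Making such a function static would therefore also mean removing its prototype from the shared header, which is only possible once nothing outside the file uses it.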
~ Oleksii
* Re: [PATCH v3 8/8] xen/common: dom0less: introduce common dom0less-build.c
2025-05-05 10:46 ` Oleksii Kurochko
2025-05-05 10:56 ` Oleksii Kurochko
@ 2025-05-05 17:35 ` Stefano Stabellini
1 sibling, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2025-05-05 17:35 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Stefano Stabellini, xen-devel, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Jan Beulich, Roger Pau Monné
Hi Oleksii,
FYI I know you might not be able to disable HTML in your email client
replies, but just as a heads up, my email client doesn't support HTML at
all so my replies will have your text and my older text mixed up.
On Mon, 5 May 2025, Oleksii Kurochko wrote:
> On 5/2/25 10:53 PM, Stefano Stabellini wrote:
>
> On Fri, 2 May 2025, Oleksii Kurochko wrote:
>
> Part of Arm's dom0less-build.c could be common between architectures which are
> using device tree files to create guest domains. Thereby move some parts of
> Arm's dom0less-build.c to common code with minor changes.
>
> As part of these changes, the following are introduced:
> - Introduce make_arch_nodes() to cover arch-specific nodes. For example, in
> case of Arm, it is PSCI and vpl011 nodes.
> - Introduce set_domain_type() to abstract how the domain type is set. For
> example, RISC-V won't have this member of the arch_domain structure, as
> vCPUs will always have the same bitness as the hypervisor. In case of Arm,
> it is possible that Arm64 could create 32-bit and 64-bit domains.
> - Introduce init_vuart() to cover details of virtual uart initialization.
> - Introduce init_intc_phandle() to cover some details of interrupt controller
> phandle initialization. As an example, RISC-V could have different name for
> interrupt controller node ( APLIC, PLIC, IMSIC, etc ) but the code in
> domain_handle_dtb_bootmodule() could handle only one interrupt controller
> node name.
> - s/make_gic_domU_node/make_intc_domU_node as GIC is Arm specific naming and
> add prototype of make_intc_domU_node() to dom0less-build.h
>
> The following functions are moved to xen/common/device-tree:
> - Functions which are moved as is:
> - domain_p2m_pages().
> - handle_passthrough_prop().
> - handle_prop_pfdt().
> - scan_pfdt_node().
> - check_partial_fdt().
> - Functions which are moved with some minor changes:
> - alloc_xenstore_evtchn():
> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
> - prepare_dtb_domU():
> - ifdef-ing access to gnttab_{start,size} by CONFIG_GRANT_TABLE.
> - s/make_gic_domU_node/make_intc_domU_node.
> - Add call of make_arch_nodes().
> - domain_handle_dtb_bootmodule():
> - hide details of interrupt controller phandle initialization by calling
> init_intc_phandle().
> - Update the comment above init_intc_phandle(): s/gic/interrupt controller.
> - construct_domU():
> - ifdef-ing by CONFIG_HVM accesses to hvm.params.
> - Call init_vuart() to hide Arm's vpl011_init() details there.
> - Add call of set_domain_type() instead of setting kinfo->arch.type explicitly.
>
> Some parts of dom0less-build.c are wrapped by #ifdef CONFIG_STATIC_{SHM,MEMORY}
> as not all archs support these configs.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
[...]
> + ret = make_memory_node(kinfo, addrcells, sizecells,
> + kernel_info_get_mem(kinfo));
> + if ( ret )
> + goto err;
> +
> +#ifdef CONFIG_STATIC_SHM
>
> This should not be necessary because there is a stub implementation of
> make_resv_memory_node available in static-shmem.h for the
> !CONFIG_STATIC_SHM case.
>
> But static-shmem.h isn't available on all architectures. Until static-shmem.h is moved to the
> asm-generic or xen folder and then reused by an architecture, we have to keep the #ifdef.
OK let's keep it as is so that we don't need to move static-shmem.h
> + ret = make_resv_memory_node(kinfo, addrcells, sizecells);
> + if ( ret )
> + goto err;
> +#endif
[...]
> + if ( !dt_find_property(node, "xen,static-mem", NULL) )
> + allocate_memory(d, &kinfo);
> +#ifdef CONFIG_STATIC_MEMORY
>
> Also this should not be needed thanks to the stub implementation of
> allocate_static_memory and assign_static_memory_11
>
>
> + else if ( !is_domain_direct_mapped(d) )
> + allocate_static_memory(d, &kinfo, node);
> + else
> + assign_static_memory_11(d, &kinfo, node);
> +#endif
> +
> +#ifdef CONFIG_STATIC_SHM
>
> There is a stub for process_shm too
>
> The same as with make_resv_memory_node(): the static-shmem.h header isn't available for
> all archs.
> I can bring back my patches which move static-shmem.h to asm-generic and then drop all the #ifdef-s connected to it:
> https://lore.kernel.org/xen-devel/0203b98aa6a42aa69e22e7c973320add3ff4bb5d.1736334615.git.oleksii.kurochko@gmail.com/
>
> Let me know whether it is better to do that now, or to drop the #ifdef-ing later, when an architecture requires the
> static-shmem or static-mem features.
I see Jan's point that they are advanced features probably not needed
initially. So maybe it is better to start with something simpler. I
think it is OK to keep the patch as is.