* [PATCH v11 01/12] xen/common: add cache coloring common code
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-03 9:55 ` Michal Orzel
2024-12-02 16:59 ` [PATCH v11 02/12] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
` (10 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini, Michal Orzel
Last Level Cache (LLC) coloring allows the cache to be partitioned into
smaller chunks called cache colors.
Since not all architectures can actually implement it, add a HAS_LLC_COLORING
Kconfig option.
LLC_COLORS_ORDER Kconfig option has a range maximum of 10 (2^10 = 1024)
because that's the number of colors that fit in a 4 KiB page when integers
are 4 bytes long.
LLC colors are a property of the domain, so struct domain has to be extended.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Acked-by: Michal Orzel <michal.orzel@amd.com>
---
v11:
- __COLORING_H__ -> __XEN_LLC_COLORING_H__ in llc-coloring.h
- added SPDX tag to cache-coloring.rst
- llc-coloring=off now takes precedence over other cmdline options
- removed useless #includes
v10:
- fixed commit message to use LLC_COLORS_ORDER
- added documentation to index.rst
- moved check on CONFIG_NUMA in arch/arm/Kconfig (next patch)
- fixed copyright line
- fixed array type for colors parameter in print_colors()
- added check on (way_size & ~PAGE_MASK)
v9:
- dropped _MAX_ from CONFIG_MAX_LLC_COLORS_ORDER
v8:
- minor documentation fixes
- "llc-coloring=on" is inferred from "llc-nr-ways" and "llc-size" usage
- turned CONFIG_NR_LLC_COLORS to CONFIG_MAX_LLC_COLORS_ORDER, base-2 exponent
- moved Kconfig options to common/Kconfig
- don't crash if computed max_nr_colors is too large
v7:
- SUPPORT.md changes added to this patch
- extended documentation to better address applicability of cache coloring
- "llc-nr-ways" and "llc-size" params introduced in favor of "llc-way-size"
- moved dump_llc_coloring_info() call in 'm' keyhandler (pagealloc_info())
v6:
- moved almost all code in common
- moved documentation in this patch
- reintroduced range for CONFIG_NR_LLC_COLORS
- reintroduced some stub functions to reduce the number of checks on
llc_coloring_enabled
- moved domain_llc_coloring_free() in same patch where allocation happens
- turned "d->llc_colors" to pointer-to-const
- llc_coloring_init() now returns void and panics if errors are found
v5:
- used - instead of _ for filenames
- removed domain_create_llc_colored()
- removed stub functions
- coloring domain fields are now #ifdef protected
v4:
- Kconfig options moved to xen/arch
- removed range for CONFIG_NR_LLC_COLORS
- added "llc_coloring_enabled" global to later implement the boot-time
switch
- added domain_create_llc_colored() to be able to pass colors
- added is_domain_llc_colored() macro
---
SUPPORT.md | 7 ++
docs/index.rst | 1 +
docs/misc/cache-coloring.rst | 118 +++++++++++++++++++++++++++++
docs/misc/xen-command-line.pandoc | 37 +++++++++
xen/common/Kconfig | 21 ++++++
xen/common/Makefile | 1 +
xen/common/keyhandler.c | 3 +
xen/common/llc-coloring.c | 121 ++++++++++++++++++++++++++++++
xen/common/page_alloc.c | 3 +
xen/include/xen/llc-coloring.h | 36 +++++++++
xen/include/xen/sched.h | 5 ++
11 files changed, 353 insertions(+)
create mode 100644 docs/misc/cache-coloring.rst
create mode 100644 xen/common/llc-coloring.c
create mode 100644 xen/include/xen/llc-coloring.h
diff --git a/SUPPORT.md b/SUPPORT.md
index 82239d0294..998faf5635 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -401,6 +401,13 @@ by maintaining multiple physical to machine (p2m) memory mappings.
Status, x86 HVM: Tech Preview
Status, ARM: Tech Preview
+### Cache coloring
+
+Allows reserving Last Level Cache (LLC) partitions for Dom0, DomUs and Xen
+itself.
+
+ Status, Arm64: Experimental
+
## Resource Management
### CPU Pools
diff --git a/docs/index.rst b/docs/index.rst
index 1d44796d72..1bb8d02ea3 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -66,6 +66,7 @@ Documents in need of some rearranging.
misc/xen-makefiles/makefiles
misra/index
fusa/index
+ misc/cache-coloring
Miscellanea
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
new file mode 100644
index 0000000000..371f21a0e7
--- /dev/null
+++ b/docs/misc/cache-coloring.rst
@@ -0,0 +1,118 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Xen cache coloring user guide
+=============================
+
+Cache coloring support in Xen allows reserving Last Level Cache (LLC)
+partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
+Cache coloring realizes per-set cache partitioning in software and is applicable
+to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.
+
+To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
+
+If needed, change the maximum number of colors with
+``CONFIG_LLC_COLORS_ORDER=<n>``.
+
+Runtime configuration is done via `Command line parameters`_.
+
+Background
+**********
+
+Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
+to each core (hence using multiple cache units), while the last level is shared
+among all of them. Such configuration implies that memory operations on one
+core (e.g. running a DomU) are able to generate interference on another core
+(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
+in software and mitigates this, guaranteeing more predictable performances for
+memory accesses.
+Software-based cache coloring is particularly useful in those situations where
+no hardware mechanisms (e.g., DSU-based way partitioning) are available to
+partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
+feature a L2 LLC cache shared among all cores.
+
+The key concept underlying cache coloring is a fragmentation of the memory
+space into a set of sub-spaces called colors that are mapped to disjoint cache
+partitions. Technically, the whole memory space is first divided into a number
+of subsequent regions. Then each region is in turn divided into a number of
+subsequent sub-colors. The generic i-th color is then obtained by all the
+i-th sub-colors in each region.
+
+::
+
+ Region j Region j+1
+ ..................... ............
+ . . .
+ . .
+ _ _ _______________ _ _____________________ _ _
+ | | | | | | |
+ | c_0 | c_1 | | c_n | c_0 | c_1 |
+ _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
+ : :
+ : :... ... .
+ : color 0
+ :........................... ... .
+ :
+ . . ..................................:
+
+How colors are actually defined depends on the function that maps memory to
+cache lines. In case of physically-indexed, physically-tagged caches with linear
+mapping, the set index is found by extracting some contiguous bits from the
+physical address. This allows colors to be defined as shown in the figure:
+they appear in memory as subsequent blocks of equal size and repeat after
+``n`` different colors, where ``n`` is the total number of colors.
+
+If some kind of bit shuffling appears in the mapping function, then colors
+assume a different layout in memory. Caches of this kind aren't supported by
+the current implementation.
+
+**Note**: Finding the exact cache mapping function can be a really difficult
+task since it's not always documented in the CPU manual. As noted above,
+Cortex-A53, A57 and A72 are known to work with the current implementation.
+
+How to compute the number of colors
+###################################
+
+Taking the linear mapping from physical memory to cache lines for granted, the
+number of available colors for a specific platform is computed using three
+parameters:
+
+- the size of the LLC.
+- the number of the LLC ways.
+- the page size used by Xen.
+
+The first two parameters can be found in the processor manual, while the third
+one is the minimum mapping granularity. Dividing the cache size by the number of
+its ways we obtain the size of a way. Dividing this number by the page size,
+the number of total cache colors is found. So for example an Arm Cortex-A53
+with a 16-way associative 1 MiB LLC can isolate up to 16 colors when pages are
+4 KiB in size.
+
+Effective colors assignment
+###########################
+
+When assigning colors, if one wants to avoid cache interference between two
+domains, different colors need to be used for their memory.
+
+Command line parameters
+***********************
+
+Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+
++----------------------+-------------------------------+
+| **Parameter** | **Description** |
++----------------------+-------------------------------+
+| ``llc-coloring`` | Enable coloring at runtime |
++----------------------+-------------------------------+
+| ``llc-size`` | Set the LLC size |
++----------------------+-------------------------------+
+| ``llc-nr-ways`` | Set the LLC number of ways |
++----------------------+-------------------------------+
+
+Auto-probing of LLC specs
+#########################
+
+LLC size and number of ways are probed automatically by default.
+
+LLC specs can be manually set via the above command line parameters. This
+bypasses any auto-probing and it's used to overcome failing situations, such as
+flawed probing logic, or for debugging/testing purposes.
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 293dbc1a95..abd8dae96f 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1708,6 +1708,43 @@ This option is intended for debugging purposes only. Enable MSR_DEBUGCTL.LBR
in hypervisor context to be able to dump the Last Interrupt/Exception To/From
record with other registers.
+### llc-coloring (arm64)
+> `= <boolean>`
+
+> Default: `false`
+
+Flag to enable or disable LLC coloring support at runtime. This option is
+available only when `CONFIG_LLC_COLORING` is enabled. See the general
+cache coloring documentation for more info.
+
+### llc-nr-ways (arm64)
+> `= <integer>`
+
+> Default: `Obtained from hardware`
+
+Specify the number of ways of the Last Level Cache. This option is available
+only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
+to find the number of supported cache colors. By default the value is
+automatically computed by probing the hardware, but it can be set manually
+when needed, for example when probing fails or, for debugging/testing
+purposes, to emulate platforms with a different number of supported colors.
+If set, "llc-size" must also be set, otherwise the default will be used. Note
+that using both options implies "llc-coloring=on".
+
+### llc-size (arm64)
+> `= <size>`
+
+> Default: `Obtained from hardware`
+
+Specify the size of the Last Level Cache. This option is available only when
+`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
+the number of supported cache colors. By default the value is automatically
+computed by probing the hardware, but it can be set manually when needed, for
+example when probing fails or, for debugging/testing purposes, to emulate
+platforms with a different number of supported colors. If set, "llc-nr-ways"
+must also be set, otherwise the default will be used. Note that using both
+options implies "llc-coloring=on".
+
### lock-depth-size
> `= <integer>`
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 90268d9249..b4ec6893be 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -71,6 +71,9 @@ config HAS_IOPORTS
config HAS_KEXEC
bool
+config HAS_LLC_COLORING
+ bool
+
config HAS_PIRQ
bool
@@ -516,4 +519,22 @@ config TRACEBUFFER
to be collected at run time for debugging or performance analysis.
Memory and execution overhead when not active is minimal.
+config LLC_COLORING
+ bool "Last Level Cache (LLC) coloring" if EXPERT
+ depends on HAS_LLC_COLORING
+
+config LLC_COLORS_ORDER
+ int "Maximum number of LLC colors (base-2 exponent)"
+ range 1 10
+ default 7
+ depends on LLC_COLORING
+ help
+ Controls the build-time size of various arrays associated with LLC
+ coloring. The value is a base-2 exponent. Refer to cache coloring
+ documentation for how to compute the number of colors supported by the
+ platform. This is only an upper bound. The runtime value is autocomputed
+ or manually set via cmdline parameters.
+ The default value corresponds to an 8 MiB 16-way LLC, which should be
+ more than what's needed in the general case.
+
endmenu
diff --git a/xen/common/Makefile b/xen/common/Makefile
index b279b09bfb..cba3b32733 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -25,6 +25,7 @@ obj-y += keyhandler.o
obj-$(CONFIG_KEXEC) += kexec.o
obj-$(CONFIG_KEXEC) += kimage.o
obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
+obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
obj-$(CONFIG_MEM_ACCESS) += mem_access.o
obj-y += memory.o
obj-y += multicall.o
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 6da291b34e..6ea54838d4 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -5,6 +5,7 @@
#include <asm/regs.h>
#include <xen/delay.h>
#include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
#include <xen/param.h>
#include <xen/sections.h>
#include <xen/shutdown.h>
@@ -304,6 +305,8 @@ static void cf_check dump_domains(unsigned char key)
arch_dump_domain_info(d);
+ domain_dump_llc_colors(d);
+
rangeset_domain_printk(d);
dump_pageframe_info(d);
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
new file mode 100644
index 0000000000..54d76e3aca
--- /dev/null
+++ b/xen/common/llc-coloring.c
@@ -0,0 +1,121 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common code
+ *
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2024, Minerva Systems SRL
+ */
+#include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
+#include <xen/param.h>
+
+#define NR_LLC_COLORS (1U << CONFIG_LLC_COLORS_ORDER)
+
+/*
+ * -1: not specified (disabled unless llc-size and llc-nr-ways present)
+ * 0: explicitly disabled through cmdline
+ * 1: explicitly enabled through cmdline
+ */
+static int8_t __ro_after_init llc_coloring_enabled = -1;
+boolean_param("llc-coloring", llc_coloring_enabled);
+
+static unsigned int __initdata llc_size;
+size_param("llc-size", llc_size);
+static unsigned int __initdata llc_nr_ways;
+integer_param("llc-nr-ways", llc_nr_ways);
+/* Number of colors available in the LLC */
+static unsigned int __ro_after_init max_nr_colors;
+
+static void print_colors(const unsigned int colors[], unsigned int num_colors)
+{
+ unsigned int i;
+
+ printk("{ ");
+ for ( i = 0; i < num_colors; i++ )
+ {
+ unsigned int start = colors[i], end = start;
+
+ printk("%u", start);
+
+ for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
+ ;
+
+ if ( start != end )
+ printk("-%u", end);
+
+ if ( i < num_colors - 1 )
+ printk(", ");
+ }
+ printk(" }\n");
+}
+
+void __init llc_coloring_init(void)
+{
+ unsigned int way_size;
+
+ if ( (llc_coloring_enabled < 0) && (llc_size && llc_nr_ways) )
+ {
+ llc_coloring_enabled = true;
+ way_size = llc_size / llc_nr_ways;
+ }
+ else if ( !llc_coloring_enabled )
+ return;
+ else
+ {
+ way_size = get_llc_way_size();
+ if ( !way_size )
+ panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
+ }
+
+ if ( way_size & ~PAGE_MASK )
+ panic("LLC way size must be a multiple of PAGE_SIZE\n");
+
+ /*
+ * The maximum number of colors must be a power of 2 in order to correctly
+ * map them to bits of an address.
+ */
+ max_nr_colors = way_size >> PAGE_SHIFT;
+
+ if ( max_nr_colors & (max_nr_colors - 1) )
+ panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
+
+ if ( max_nr_colors > NR_LLC_COLORS )
+ {
+ printk(XENLOG_WARNING
+ "Number of LLC colors (%u) too big. Using configured max %u\n",
+ max_nr_colors, NR_LLC_COLORS);
+ max_nr_colors = NR_LLC_COLORS;
+ }
+ else if ( max_nr_colors < 2 )
+ panic("Number of LLC colors %u < 2\n", max_nr_colors);
+
+ arch_llc_coloring_init();
+}
+
+void dump_llc_coloring_info(void)
+{
+ if ( !llc_coloring_enabled )
+ return;
+
+ printk("LLC coloring info:\n");
+ printk(" Number of LLC colors supported: %u\n", max_nr_colors);
+}
+
+void domain_dump_llc_colors(const struct domain *d)
+{
+ if ( !llc_coloring_enabled )
+ return;
+
+ printk("%u LLC colors: ", d->num_llc_colors);
+ print_colors(d->llc_colors, d->num_llc_colors);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 92abed6514..55d561e93c 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -126,6 +126,7 @@
#include <xen/irq.h>
#include <xen/keyhandler.h>
#include <xen/lib.h>
+#include <xen/llc-coloring.h>
#include <xen/mm.h>
#include <xen/nodemask.h>
#include <xen/numa.h>
@@ -2644,6 +2645,8 @@ static void cf_check pagealloc_info(unsigned char key)
}
printk(" Dom heap: %lukB free\n", total << (PAGE_SHIFT-10));
+
+ dump_llc_coloring_info();
}
static __init int cf_check pagealloc_keyhandler_init(void)
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
new file mode 100644
index 0000000000..0acd8d0ad6
--- /dev/null
+++ b/xen/include/xen/llc-coloring.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common header
+ *
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2024, Minerva Systems SRL
+ */
+#ifndef __XEN_LLC_COLORING_H__
+#define __XEN_LLC_COLORING_H__
+
+struct domain;
+
+#ifdef CONFIG_LLC_COLORING
+void llc_coloring_init(void);
+void dump_llc_coloring_info(void);
+void domain_dump_llc_colors(const struct domain *d);
+#else
+static inline void llc_coloring_init(void) {}
+static inline void dump_llc_coloring_info(void) {}
+static inline void domain_dump_llc_colors(const struct domain *d) {}
+#endif
+
+unsigned int get_llc_way_size(void);
+void arch_llc_coloring_init(void);
+
+#endif /* __XEN_LLC_COLORING_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 76e39378b3..bc798a2f61 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -637,6 +637,11 @@ struct domain
/* Holding CDF_* constant. Internal flags for domain creation. */
unsigned int cdf;
+
+#ifdef CONFIG_LLC_COLORING
+ unsigned int num_llc_colors;
+ const unsigned int *llc_colors;
+#endif
};
static inline struct page_list_head *page_to_list(
--
2.43.0
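
For reference, the color computation described in the "How to compute the number of colors" section of the documentation above can be sketched as a standalone function. This is only an illustration, not the code from the patch: llc_nr_colors(), the SKETCH_* constants and the convention of returning 0 for unusable geometry (where llc_coloring_init() panics) are assumptions made so that the example can be compiled and tested outside Xen. 4 KiB pages and the default LLC_COLORS_ORDER of 7 are assumed as well.

```c
/* Assumed for the sketch: 4 KiB pages and the default LLC_COLORS_ORDER. */
#define SKETCH_PAGE_SHIFT        12u
#define SKETCH_PAGE_SIZE         (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_LLC_COLORS_ORDER  7u
#define SKETCH_NR_LLC_COLORS     (1u << SKETCH_LLC_COLORS_ORDER)

/*
 * Hypothetical helper: returns the number of usable LLC colors, or 0 if the
 * given geometry cannot be used for coloring.
 */
static unsigned int llc_nr_colors(unsigned int llc_size, unsigned int nr_ways)
{
    unsigned int way_size, nr_colors;

    if ( !llc_size || !nr_ways )
        return 0;

    /* Size of a single way: cache size divided by associativity. */
    way_size = llc_size / nr_ways;

    /* A way must hold a whole number of pages. */
    if ( way_size & (SKETCH_PAGE_SIZE - 1) )
        return 0;

    /* One color per page that fits in a way. */
    nr_colors = way_size >> SKETCH_PAGE_SHIFT;

    /* At least 2 colors, and a power of 2 so colors map to address bits. */
    if ( nr_colors < 2 || (nr_colors & (nr_colors - 1)) )
        return 0;

    /* Clamp to the build-time maximum instead of failing. */
    if ( nr_colors > SKETCH_NR_LLC_COLORS )
        nr_colors = SKETCH_NR_LLC_COLORS;

    return nr_colors;
}
```

Under these assumptions a 1 MiB 16-way LLC yields 16 colors and an 8 MiB 16-way LLC yields 128, matching the figures in the documentation and in the Kconfig help text. Anything larger is clamped to 128.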
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 01/12] xen/common: add cache coloring common code
2024-12-02 16:59 ` [PATCH v11 01/12] xen/common: add cache coloring common code Carlo Nonato
@ 2024-12-03 9:55 ` Michal Orzel
2024-12-05 8:00 ` Michal Orzel
2024-12-09 13:35 ` Jan Beulich
0 siblings, 2 replies; 40+ messages in thread
From: Michal Orzel @ 2024-12-03 9:55 UTC (permalink / raw)
To: Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Jan Beulich,
Julien Grall, Stefano Stabellini
On 02/12/2024 17:59, Carlo Nonato wrote:
>
>
> Last Level Cache (LLC) coloring allows to partition the cache in smaller
> chunks called cache colors.
>
> Since not all architectures can actually implement it, add a HAS_LLC_COLORING
> Kconfig option.
> LLC_COLORS_ORDER Kconfig option has a range maximum of 10 (2^10 = 1024)
> because that's the number of colors that fit in a 4 KiB page when integers
> are 4 bytes long.
>
> LLC colors are a property of the domain, so struct domain has to be extended.
>
> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Acked-by: Michal Orzel <michal.orzel@amd.com>
[...]
>
> +### llc-coloring (arm64)
> +> `= <boolean>`
> +
> +> Default: `false`
By default, it is disabled. If CONFIG_ is enabled but ...
[...]
> + * -1: not specified (disabled unless llc-size and llc-nr-ways present)
the user doesn't specify any llc-* options, LLC feature should be disabled.
In your case llc_coloring_enabled is -1 and, due to the 'if ( llc_coloring_enabled ... )' checks
all around the code base, LLC coloring will be enabled even though it should not be.
You can either set it to 0 if (llc_coloring_enabled < 0) and the other llc-* options have not been provided
(this would require modifying this comment to give it a different meaning depending on the context), or
you could do something like this:
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 2d6aed5fb4ac..560fe03aa86b 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -18,8 +18,10 @@
* 0: explicitly disabled through cmdline
* 1: explicitly enabled through cmdline
*/
-int8_t __ro_after_init llc_coloring_enabled = -1;
-boolean_param("llc-coloring", llc_coloring_enabled);
+int8_t __init opt_llc_coloring = -1;
+boolean_param("llc-coloring", opt_llc_coloring);
+
+bool __ro_after_init llc_coloring_enabled = false;
static unsigned int __initdata llc_size;
size_param("llc-size", llc_size);
@@ -147,15 +149,17 @@ void __init llc_coloring_init(void)
{
unsigned int way_size, i;
- if ( (llc_coloring_enabled < 0) && (llc_size && llc_nr_ways) )
+ if ( (opt_llc_coloring < 0) && (llc_size && llc_nr_ways) )
{
llc_coloring_enabled = true;
way_size = llc_size / llc_nr_ways;
}
- else if ( !llc_coloring_enabled )
+ else if ( !opt_llc_coloring )
return;
else
{
+ llc_coloring_enabled = true;
+
way_size = get_llc_way_size();
if ( !way_size )
panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
I think that this would be better in terms of readability.
~Michal
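
The difference between testing the raw tri-state and testing a resolved flag can be shown in isolation. The following is a minimal sketch, not Xen code: resolve_llc_coloring() and naive_check() are hypothetical names, and the resolution rule is an assumption based on the documented semantics (an explicit on/off wins, otherwise coloring is enabled only when both llc-size and llc-nr-ways are given).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn the command line tri-state
 * (-1 unset, 0 explicitly off, 1 explicitly on) into the effective state.
 */
static bool resolve_llc_coloring(int8_t opt, unsigned int llc_size,
                                 unsigned int llc_nr_ways)
{
    /* Unset: enabled only if both geometry options were provided. */
    if ( opt < 0 )
        return llc_size && llc_nr_ways;

    /* Explicit setting takes precedence over everything else. */
    return opt > 0;
}

/* The pitfall: a plain truth test treats "unset" (-1) as enabled. */
static bool naive_check(int8_t opt)
{
    return opt;
}
```

Note that `!opt` alone does not catch the unset value either, since `!(-1)` evaluates to 0. The unset case has to be tested explicitly.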
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 01/12] xen/common: add cache coloring common code
2024-12-03 9:55 ` Michal Orzel
@ 2024-12-05 8:00 ` Michal Orzel
2024-12-09 13:35 ` Jan Beulich
1 sibling, 0 replies; 40+ messages in thread
From: Michal Orzel @ 2024-12-05 8:00 UTC (permalink / raw)
To: Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Jan Beulich,
Julien Grall, Stefano Stabellini
On 03/12/2024 10:55, Michal Orzel wrote:
>
>
> On 02/12/2024 17:59, Carlo Nonato wrote:
>>
>>
>> Last Level Cache (LLC) coloring allows to partition the cache in smaller
>> chunks called cache colors.
>>
>> Since not all architectures can actually implement it, add a HAS_LLC_COLORING
>> Kconfig option.
>> LLC_COLORS_ORDER Kconfig option has a range maximum of 10 (2^10 = 1024)
>> because that's the number of colors that fit in a 4 KiB page when integers
>> are 4 bytes long.
>>
>> LLC colors are a property of the domain, so struct domain has to be extended.
>>
>> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>>
>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
>> Acked-by: Michal Orzel <michal.orzel@amd.com>
>
> [...]
>>
>> +### llc-coloring (arm64)
>> +> `= <boolean>`
>> +
>> +> Default: `false`
> By default, it is disabled. If CONFIG_ is enabled but ...
>
> [...]
>
>> + * -1: not specified (disabled unless llc-size and llc-nr-ways present)
> the user doesn't specify any llc-* options, LLC feature should be disabled.
> In your case llc_coloring_enabled is -1 and due to 'if ( llc_coloring_enabled ... )' checks
> all around the code base, the LLC will be enabled even though it should not.
>
> You can either set it to 0 if (llc_coloring_enabled < 0) and other llc-* options have not been provided
> (this would require modifying this comment to provide different meaning depending on the context) or
> you could do sth like that:
>
> diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
> index 2d6aed5fb4ac..560fe03aa86b 100644
> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -18,8 +18,10 @@
> * 0: explicitly disabled through cmdline
> * 1: explicitly enabled through cmdline
> */
> -int8_t __ro_after_init llc_coloring_enabled = -1;
> -boolean_param("llc-coloring", llc_coloring_enabled);
> +int8_t __init opt_llc_coloring = -1;
> +boolean_param("llc-coloring", opt_llc_coloring);
> +
> +bool __ro_after_init llc_coloring_enabled = false;
>
> static unsigned int __initdata llc_size;
> size_param("llc-size", llc_size);
> @@ -147,15 +149,17 @@ void __init llc_coloring_init(void)
> {
> unsigned int way_size, i;
>
> - if ( (llc_coloring_enabled < 0) && (llc_size && llc_nr_ways) )
> + if ( (opt_llc_coloring < 0) && (llc_size && llc_nr_ways) )
> {
> llc_coloring_enabled = true;
> way_size = llc_size / llc_nr_ways;
> }
> - else if ( !llc_coloring_enabled )
> + else if ( !opt_llc_coloring )
> return;
> else
> {
> + llc_coloring_enabled = true;
> +
> way_size = get_llc_way_size();
> if ( !way_size )
> panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
>
> I think that this would be better in terms of readability.
>
> ~Michal
>
>
On top of my previous comments, an attempt to build patch 2 with LLC_COLORING enabled results in a few
build errors. In general, this should be avoided to allow bisection. There are 3 issues:
- error: invalid use of undefined type 'const struct domain'
You need to include xen/sched.h
- error: unknown type name 'int8_t'
You need to include xen/types.h in a header (regardless of whether you stick to int8_t or bool)
- implicit declaration of function 'isb'
~Michal
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 01/12] xen/common: add cache coloring common code
2024-12-03 9:55 ` Michal Orzel
2024-12-05 8:00 ` Michal Orzel
@ 2024-12-09 13:35 ` Jan Beulich
1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2024-12-09 13:35 UTC (permalink / raw)
To: Michal Orzel, Carlo Nonato
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Julien Grall,
Stefano Stabellini, xen-devel
On 03.12.2024 10:55, Michal Orzel wrote:
>
>
> On 02/12/2024 17:59, Carlo Nonato wrote:
>>
>>
>> Last Level Cache (LLC) coloring allows to partition the cache in smaller
>> chunks called cache colors.
>>
>> Since not all architectures can actually implement it, add a HAS_LLC_COLORING
>> Kconfig option.
>> LLC_COLORS_ORDER Kconfig option has a range maximum of 10 (2^10 = 1024)
>> because that's the number of colors that fit in a 4 KiB page when integers
>> are 4 bytes long.
>>
>> LLC colors are a property of the domain, so struct domain has to be extended.
>>
>> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>>
>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
>> Acked-by: Michal Orzel <michal.orzel@amd.com>
>
> [...]
>>
>> +### llc-coloring (arm64)
>> +> `= <boolean>`
>> +
>> +> Default: `false`
> By default, it is disabled. If CONFIG_ is enabled but ...
>
> [...]
>
>> + * -1: not specified (disabled unless llc-size and llc-nr-ways present)
> the user doesn't specify any llc-* options, LLC feature should be disabled.
> In your case llc_coloring_enabled is -1 and due to 'if ( llc_coloring_enabled ... )' checks
> all around the code base, the LLC will be enabled even though it should not.
>
> You can either set it to 0 if (llc_coloring_enabled < 0) and other llc-* options have not been provided
> (this would require modifying this comment to provide different meaning depending on the context) or
> you could do sth like that:
I agree the below is going to be better in terms of both readability and
consistency. A few minor comments though:
> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -18,8 +18,10 @@
> * 0: explicitly disabled through cmdline
> * 1: explicitly enabled through cmdline
> */
> -int8_t __ro_after_init llc_coloring_enabled = -1;
> -boolean_param("llc-coloring", llc_coloring_enabled);
> +int8_t __init opt_llc_coloring = -1;
__initdata
> +boolean_param("llc-coloring", opt_llc_coloring);
> +
> +bool __ro_after_init llc_coloring_enabled = false;
>
> static unsigned int __initdata llc_size;
> size_param("llc-size", llc_size);
> @@ -147,15 +149,17 @@ void __init llc_coloring_init(void)
> {
> unsigned int way_size, i;
>
> - if ( (llc_coloring_enabled < 0) && (llc_size && llc_nr_ways) )
> + if ( (opt_llc_coloring < 0) && (llc_size && llc_nr_ways) )
Excess parentheses (&& doesn't need parenthesizing against another &&).
> {
> llc_coloring_enabled = true;
This becomes appropriate only with the variable's type changing back
to bool.
Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v11 02/12] xen/arm: add initial support for LLC coloring on arm64
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 01/12] xen/common: add cache coloring common code Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-05 8:04 ` Michal Orzel
2024-12-02 16:59 ` [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
` (9 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
LLC coloring needs to know the last level cache layout in order to make the
best use of it. This can be probed by inspecting the CLIDR_EL1 register,
so the Last Level is defined as the last level visible through this register.
Note that this excludes system caches on some platforms.
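
To illustrate the probing idea, the cache type fields of CLIDR_EL1 can be decoded as below. This is a hedged sketch that works on a plain integer instead of reading the system register, so it can run anywhere. clidr_ctype() and clidr_last_level() are hypothetical helpers, not the functions added by this patch, and treating the highest level with separate I/D or unified caches as the last level is an assumption based on the v6 changelog entry. Each Ctype<n> field is 3 bits wide and the field for level n starts at bit 3 * (n - 1).

```c
#include <stdint.h>

#define CLIDR_CTYPE_BITS  3u   /* width of each Ctype<n> field */
#define CLIDR_CTYPE_MASK  0x7u
#define CLIDR_MAX_LEVELS  7u   /* Ctype1 .. Ctype7 */

/* Architectural Ctype<n> encodings. */
#define CTYPE_NONE      0u  /* no cache at this level */
#define CTYPE_INSN      1u  /* instruction cache only */
#define CTYPE_DATA      2u  /* data cache only */
#define CTYPE_SEPARATE  3u  /* separate instruction and data caches */
#define CTYPE_UNIFIED   4u  /* unified cache */

/* Hypothetical helper: extract Ctype<level> for level in 1..7. */
static unsigned int clidr_ctype(uint64_t clidr, unsigned int level)
{
    return (clidr >> (CLIDR_CTYPE_BITS * (level - 1))) & CLIDR_CTYPE_MASK;
}

/*
 * Hypothetical helper: highest level reporting separate I/D or unified
 * caches, or 0 if no such level exists.
 */
static unsigned int clidr_last_level(uint64_t clidr)
{
    unsigned int level;

    for ( level = CLIDR_MAX_LEVELS; level >= 1; level-- )
    {
        unsigned int ctype = clidr_ctype(clidr, level);

        if ( ctype == CTYPE_SEPARATE || ctype == CTYPE_UNIFIED )
            return level;
    }

    return 0;
}
```

For a sample value of 0x23 (level 1 separate I/D, level 2 unified) the last level found is 2. The way size would then have to be derived from the cache geometry of that level, which is not shown here.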
Static memory allocation and cache coloring are incompatible because static
memory can't be guaranteed to use only colors assigned to the domain.
Panic during DomUs creation when both are enabled.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v11:
- removed useless #define from processor.h
v10:
- moved CONFIG_NUMA check in arch/arm/Kconfig
v9:
- no changes
v8:
- no changes
v7:
- only minor changes
v6:
- get_llc_way_size() now checks for at least separate I/D caches
v5:
- used - instead of _ for filenames
- moved static-mem check in this patch
- moved dom0 colors parsing in next patch
- moved color allocation and configuration in next patch
- moved check_colors() in next patch
- colors are now printed in short form
v4:
- added "llc-coloring" cmdline option for the boot-time switch
- dom0 colors are now checked during domain init as for any other domain
- fixed processor.h masks bit width
- check for overflow in parse_color_config()
- check_colors() now checks also that colors are sorted and unique
---
docs/misc/cache-coloring.rst | 14 +++++
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/Makefile | 1 +
xen/arch/arm/dom0less-build.c | 6 +++
xen/arch/arm/include/asm/processor.h | 15 ++++++
xen/arch/arm/llc-coloring.c | 79 ++++++++++++++++++++++++++++
xen/arch/arm/setup.c | 3 ++
xen/common/llc-coloring.c | 2 +-
xen/include/xen/llc-coloring.h | 4 ++
9 files changed, 124 insertions(+), 1 deletion(-)
create mode 100644 xen/arch/arm/llc-coloring.c
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 371f21a0e7..12972dbb2c 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -113,6 +113,20 @@ Auto-probing of LLC specs
LLC size and number of ways are probed automatically by default.
+In the Arm implementation, this is done by inspecting the CLIDR_EL1 register.
+This means that other system caches that aren't visible there are ignored.
+
LLC specs can be manually set via the above command line parameters. This
bypasses any auto-probing and it's used to overcome failing situations, such as
flawed probing logic, or for debugging/testing purposes.
+
+Known issues and limitations
+****************************
+
+"xen,static-mem" isn't supported when coloring is enabled
+#########################################################
+
+In the domain configuration, "xen,static-mem" allows memory to be statically
+allocated to the domain. This isn't possible when LLC coloring is enabled,
+because that memory can't be guaranteed to use only colors assigned to the
+domain.
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 23bbc91aad..4ec9ef8334 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_64
depends on !ARM_32
select 64BIT
select HAS_FAST_MULTIPLY
+ select HAS_LLC_COLORING if !NUMA
config ARM
def_bool y
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index e4ad1ce851..ccbfc61f88 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
obj-y += irq.o
obj-y += kernel.init.o
obj-$(CONFIG_LIVEPATCH) += livepatch.o
+obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
obj-$(CONFIG_MEM_ACCESS) += mem_access.o
obj-y += mm.o
obj-y += monitor.o
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index f328a044e9..d93a85434e 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -5,6 +5,7 @@
#include <xen/grant_table.h>
#include <xen/iocap.h>
#include <xen/libfdt/libfdt.h>
+#include <xen/llc-coloring.h>
#include <xen/sched.h>
#include <xen/serial.h>
#include <xen/sizes.h>
@@ -890,7 +891,12 @@ void __init create_domUs(void)
panic("No more domain IDs available\n");
if ( dt_find_property(node, "xen,static-mem", NULL) )
+ {
+ if ( llc_coloring_enabled )
+ panic("LLC coloring and static memory are incompatible\n");
+
flags |= CDF_staticmem;
+ }
if ( dt_property_read_bool(node, "direct-map") )
{
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 8e02410465..60b587db69 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -18,6 +18,21 @@
#define CTR_IDC_SHIFT 28
#define CTR_DIC_SHIFT 29
+/* CCSIDR Current Cache Size ID Register */
+#define CCSIDR_LINESIZE_MASK _AC(0x7, UL)
+#define CCSIDR_NUMSETS_SHIFT 13
+#define CCSIDR_NUMSETS_MASK _AC(0x3fff, UL)
+#define CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX 32
+#define CCSIDR_NUMSETS_MASK_FEAT_CCIDX _AC(0xffffff, UL)
+
+/* CSSELR Cache Size Selection Register */
+#define CSSELR_LEVEL_SHIFT 1
+
+/* CLIDR Cache Level ID Register */
+#define CLIDR_CTYPEn_SHIFT(n) (3 * ((n) - 1))
+#define CLIDR_CTYPEn_MASK _AC(0x7, UL)
+#define CLIDR_CTYPEn_LEVELS 7
+
#define ICACHE_POLICY_VPIPT 0
#define ICACHE_POLICY_AIVIVT 1
#define ICACHE_POLICY_VIPT 2
diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
new file mode 100644
index 0000000000..6c8fa6b576
--- /dev/null
+++ b/xen/arch/arm/llc-coloring.c
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring support for ARM
+ *
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2024, Minerva Systems SRL
+ */
+#include <xen/init.h>
+#include <xen/llc-coloring.h>
+#include <xen/types.h>
+
+#include <asm/processor.h>
+#include <asm/sysregs.h>
+
+/* Return the LLC way size by probing the hardware */
+unsigned int __init get_llc_way_size(void)
+{
+ register_t ccsidr_el1;
+ register_t clidr_el1 = READ_SYSREG(CLIDR_EL1);
+ register_t csselr_el1 = READ_SYSREG(CSSELR_EL1);
+ register_t id_aa64mmfr2_el1 = READ_SYSREG(ID_AA64MMFR2_EL1);
+ uint32_t ccsidr_numsets_shift = CCSIDR_NUMSETS_SHIFT;
+ uint32_t ccsidr_numsets_mask = CCSIDR_NUMSETS_MASK;
+ unsigned int n, line_size, num_sets;
+
+ for ( n = CLIDR_CTYPEn_LEVELS; n != 0; n-- )
+ {
+ uint8_t ctype_n = (clidr_el1 >> CLIDR_CTYPEn_SHIFT(n)) &
+ CLIDR_CTYPEn_MASK;
+
+ /* Unified cache (see Arm ARM DDI 0487J.a D19.2.27) */
+ if ( ctype_n == 0b100 )
+ break;
+ }
+
+ if ( n == 0 )
+ return 0;
+
+ WRITE_SYSREG((n - 1) << CSSELR_LEVEL_SHIFT, CSSELR_EL1);
+ isb();
+
+ ccsidr_el1 = READ_SYSREG(CCSIDR_EL1);
+
+ /* Arm ARM: (Log2(Number of bytes in cache line)) - 4 */
+ line_size = 1U << ((ccsidr_el1 & CCSIDR_LINESIZE_MASK) + 4);
+
+ /* If FEAT_CCIDX is enabled, CCSIDR_EL1 has a different bit layout */
+ if ( (id_aa64mmfr2_el1 >> ID_AA64MMFR2_CCIDX_SHIFT) & 0x7 )
+ {
+ ccsidr_numsets_shift = CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX;
+ ccsidr_numsets_mask = CCSIDR_NUMSETS_MASK_FEAT_CCIDX;
+ }
+
+ /* Arm ARM: (Number of sets in cache) - 1 */
+ num_sets = ((ccsidr_el1 >> ccsidr_numsets_shift) & ccsidr_numsets_mask) + 1;
+
+ printk(XENLOG_INFO "LLC found: L%u (line size: %u bytes, sets num: %u)\n",
+ n, line_size, num_sets);
+
+ /* Restore value in CSSELR_EL1 */
+ WRITE_SYSREG(csselr_el1, CSSELR_EL1);
+ isb();
+
+ return line_size * num_sets;
+}
+
+void __init arch_llc_coloring_init(void)
+{
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 2e27af4560..568a49b274 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -12,6 +12,7 @@
#include <xen/device_tree.h>
#include <xen/domain_page.h>
#include <xen/grant_table.h>
+#include <xen/llc-coloring.h>
#include <xen/types.h>
#include <xen/string.h>
#include <xen/serial.h>
@@ -326,6 +327,8 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
printk("Command line: %s\n", cmdline);
cmdline_parse(cmdline);
+ llc_coloring_init();
+
setup_mm();
vm_init();
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 54d76e3aca..5139890e3d 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -16,7 +16,7 @@
* 0: explicitly disabled through cmdline
* 1: explicitly enabled through cmdline
*/
-static int8_t __ro_after_init llc_coloring_enabled = -1;
+int8_t __ro_after_init llc_coloring_enabled = -1;
boolean_param("llc-coloring", llc_coloring_enabled);
static unsigned int __initdata llc_size;
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index 0acd8d0ad6..ee0c58ab1c 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -11,10 +11,14 @@
struct domain;
#ifdef CONFIG_LLC_COLORING
+extern int8_t llc_coloring_enabled;
+
void llc_coloring_init(void);
void dump_llc_coloring_info(void);
void domain_dump_llc_colors(const struct domain *d);
#else
+#define llc_coloring_enabled false
+
static inline void llc_coloring_init(void) {}
static inline void dump_llc_coloring_info(void) {}
static inline void domain_dump_llc_colors(const struct domain *d) {}
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 02/12] xen/arm: add initial support for LLC coloring on arm64
2024-12-02 16:59 ` [PATCH v11 02/12] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
@ 2024-12-05 8:04 ` Michal Orzel
0 siblings, 0 replies; 40+ messages in thread
From: Michal Orzel @ 2024-12-05 8:04 UTC (permalink / raw)
To: Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Jan Beulich,
Julien Grall, Stefano Stabellini, Bertrand Marquis,
Volodymyr Babchuk
On 02/12/2024 17:59, Carlo Nonato wrote:
>
>
> LLC coloring needs to know the last level cache layout in order to make the
> best use of it. This can be probed by inspecting the CLIDR_EL1 register,
> so the Last Level is defined as the last level visible by this register.
> Note that this excludes system caches in some platforms.
>
> Static memory allocation and cache coloring are incompatible because static
> memory can't be guaranteed to use only colors assigned to the domain.
> Panic during DomUs creation when both are enabled.
>
> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
[...]
> + isb();
Building this patch with LLC_COLORING=y fails with:
error: implicit declaration of function 'isb'
You need to include <asm/system.h>.
Please always make sure that each patch builds successfully.
~Michal
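For readers following the probing logic in get_llc_way_size(), the CCSIDR_EL1
decoding can be modelled in user space. This is a minimal sketch, not the Xen
implementation: the helper name llc_way_size_from_ccsidr() and the has_ccidx
flag are hypothetical, the register value is passed in rather than read with
READ_SYSREG(), and the shifts and masks are copied from the processor.h hunk
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts and masks copied from the processor.h hunk of the patch. */
#define CCSIDR_LINESIZE_MASK            UINT64_C(0x7)
#define CCSIDR_NUMSETS_SHIFT            13
#define CCSIDR_NUMSETS_MASK             UINT64_C(0x3fff)
#define CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX 32
#define CCSIDR_NUMSETS_MASK_FEAT_CCIDX  UINT64_C(0xffffff)

/*
 * Hypothetical user-space helper: compute the way size in bytes from a raw
 * CCSIDR_EL1 value. has_ccidx stands in for the ID_AA64MMFR2_EL1 check.
 */
static unsigned int llc_way_size_from_ccsidr(uint64_t ccsidr, int has_ccidx)
{
    unsigned int shift = has_ccidx ? CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX
                                   : CCSIDR_NUMSETS_SHIFT;
    uint64_t mask = has_ccidx ? CCSIDR_NUMSETS_MASK_FEAT_CCIDX
                              : CCSIDR_NUMSETS_MASK;
    /* The LineSize field holds log2(bytes per line) - 4. */
    unsigned int line_size = 1U << ((ccsidr & CCSIDR_LINESIZE_MASK) + 4);
    /* The NumSets field holds (number of sets) - 1. */
    unsigned int num_sets = (unsigned int)((ccsidr >> shift) & mask) + 1;

    /* A 3-bit LineSize field cannot encode more than 2^(7 + 4) bytes. */
    assert(line_size <= 2048);

    return line_size * num_sets;
}
```

With a LineSize field of 2 (64-byte lines) and a NumSets field of 1023
(1024 sets) in the non-CCIDX layout, the helper returns 64 * 1024 = 65536 bytes.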
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 01/12] xen/common: add cache coloring common code Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 02/12] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-05 9:40 ` Michal Orzel
2024-12-02 16:59 ` [PATCH v11 04/12] xen/arm: add Dom0 cache coloring support Carlo Nonato
` (8 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Cache coloring requires Dom0 not to be direct-mapped because of its
non-contiguous mapping nature, so allocate_memory() is needed in this case.
8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
moved allocate_memory() to dom0less-build.c. In order to use it
in Dom0 construction, bring it back to domain_build.c and declare it in
domain_build.h.
Take the opportunity to adapt the implementation of allocate_memory() so
that it uses the host layout when called on the hwdom, via
find_unallocated_memory().
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v11:
- GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
- hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
- added a comment in allocate_memory() when skipping small banks
v10:
- fixed a compilation bug that happened when dom0less support was disabled
v9:
- no changes
v8:
- patch adapted to new changes to allocate_memory()
v7:
- allocate_memory() now uses the host layout when called on the hwdom
v6:
- new patch
---
xen/arch/arm/dom0less-build.c | 44 -----------
xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
xen/arch/arm/include/asm/domain_build.h | 1 +
3 files changed, 94 insertions(+), 48 deletions(-)
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index d93a85434e..67b1503647 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
return ( !dom0found && domUfound );
}
-static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
-{
- struct membanks *mem = kernel_info_get_mem(kinfo);
- unsigned int i;
- paddr_t bank_size;
-
- printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(kinfo->unassigned_mem >> 20), d);
-
- mem->nr_banks = 0;
- bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
- if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
- bank_size) )
- goto fail;
-
- bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
- if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
- bank_size) )
- goto fail;
-
- if ( kinfo->unassigned_mem )
- goto fail;
-
- for( i = 0; i < mem->nr_banks; i++ )
- {
- printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
- d,
- i,
- mem->bank[i].start,
- mem->bank[i].start + mem->bank[i].size,
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(mem->bank[i].size >> 20));
- }
-
- return;
-
-fail:
- panic("Failed to allocate requested domain memory."
- /* Don't want format this as PRIpaddr (16 digit hex) */
- " %ldKB unallocated. Fix the VMs configurations.\n",
- (unsigned long)kinfo->unassigned_mem >> 10);
-}
-
#ifdef CONFIG_VGICV2
static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
{
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 2c30792de8..2b8cba9b2f 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
}
}
-#ifdef CONFIG_DOM0LESS_BOOT
bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
alloc_domheap_mem_cb cb, void *extra)
{
@@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
return true;
}
-#endif
/*
* When PCI passthrough is available we want to keep the
@@ -1003,6 +1001,94 @@ out:
return res;
}
+void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
+{
+ struct membanks *mem = kernel_info_get_mem(kinfo);
+ unsigned int i, nr_banks = GUEST_RAM_BANKS;
+ paddr_t bank_start, bank_size;
+ struct membanks *hwdom_free_mem = NULL;
+ const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
+ const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
+
+ printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(kinfo->unassigned_mem >> 20), d);
+
+ mem->nr_banks = 0;
+ /*
+ * Use host memory layout for hwdom. Only case for this is when LLC coloring
+ * is enabled.
+ */
+ if ( is_hardware_domain(d) )
+ {
+ ASSERT(llc_coloring_enabled);
+
+ hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
+ NR_MEM_BANKS);
+ if ( !hwdom_free_mem )
+ goto fail;
+
+ hwdom_free_mem->max_banks = NR_MEM_BANKS;
+
+ if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
+ goto fail;
+
+ nr_banks = hwdom_free_mem->nr_banks;
+ }
+
+ for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
+ {
+ if ( is_hardware_domain(d) )
+ {
+ bank_start = hwdom_free_mem->bank[i].start;
+ bank_size = hwdom_free_mem->bank[i].size;
+
+ /*
+ * Skip banks that are too small. The first bank must contain
+ * dom0 kernel + ramdisk + dtb and 128 MB is the same limit used
+ * in allocate_memory_11().
+ */
+ if ( bank_size < min_t(paddr_t, kinfo->unassigned_mem, MB(128)) )
+ continue;
+ }
+ else
+ {
+ if ( i >= GUEST_RAM_BANKS )
+ goto fail;
+
+ bank_start = bankbase[i];
+ bank_size = banksize[i];
+ }
+
+ bank_size = MIN(bank_size, kinfo->unassigned_mem);
+ if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
+ goto fail;
+ }
+
+ if ( kinfo->unassigned_mem )
+ goto fail;
+
+ for( i = 0; i < mem->nr_banks; i++ )
+ {
+ printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
+ d,
+ i,
+ mem->bank[i].start,
+ mem->bank[i].start + mem->bank[i].size,
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(mem->bank[i].size >> 20));
+ }
+
+ xfree(hwdom_free_mem);
+ return;
+
+fail:
+ panic("Failed to allocate requested domain memory."
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ " %ldKB unallocated. Fix the VMs configurations.\n",
+ (unsigned long)kinfo->unassigned_mem >> 10);
+}
+
static int __init handle_pci_range(const struct dt_device_node *dev,
uint64_t addr, uint64_t len, void *data)
{
@@ -1223,7 +1309,7 @@ int __init make_hypervisor_node(struct domain *d,
ext_regions->max_banks = NR_MEM_BANKS;
- if ( is_domain_direct_mapped(d) )
+ if ( domain_use_host_layout(d) )
{
if ( !is_iommu_enabled(d) )
res = find_unallocated_memory(kinfo, ext_regions);
@@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
/* type must be set before allocate_memory */
d->arch.type = kinfo.type;
#endif
- allocate_memory_11(d, &kinfo);
+ if ( is_domain_direct_mapped(d) )
+ allocate_memory_11(d, &kinfo);
+ else
+ allocate_memory(d, &kinfo);
find_gnttab_region(d, &kinfo);
rc = process_shm_chosen(d, &kinfo);
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index e712afbc7f..5d77af2e8b 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -11,6 +11,7 @@ bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
alloc_domheap_mem_cb cb, void *extra);
bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
paddr_t tot_size);
+void allocate_memory(struct domain *d, struct kernel_info *kinfo);
int construct_domain(struct domain *d, struct kernel_info *kinfo);
int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
int make_chosen_node(const struct kernel_info *kinfo);
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-02 16:59 ` [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
@ 2024-12-05 9:40 ` Michal Orzel
2024-12-06 18:37 ` Julien Grall
0 siblings, 1 reply; 40+ messages in thread
From: Michal Orzel @ 2024-12-05 9:40 UTC (permalink / raw)
To: Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Volodymyr Babchuk
On 02/12/2024 17:59, Carlo Nonato wrote:
>
>
> Cache coloring requires Dom0 not to be direct-mapped because of its non
> contiguous mapping nature, so allocate_memory() is needed in this case.
> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> moved allocate_memory() in dom0less_build.c. In order to use it
> in Dom0 construction bring it back to domain_build.c and declare it in
> domain_build.h.
>
> Take the opportunity to adapt the implementation of allocate_memory() so
> that it uses the host layout when called on the hwdom, via
> find_unallocated_memory().
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> ---
> v11:
> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> - added a comment in allocate_memory() when skipping small banks
> v10:
> - fixed a compilation bug that happened when dom0less support was disabled
> v9:
> - no changes
> v8:
> - patch adapted to new changes to allocate_memory()
> v7:
> - allocate_memory() now uses the host layout when called on the hwdom
> v6:
> - new patch
> ---
> xen/arch/arm/dom0less-build.c | 44 -----------
> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
> xen/arch/arm/include/asm/domain_build.h | 1 +
> 3 files changed, 94 insertions(+), 48 deletions(-)
>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index d93a85434e..67b1503647 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
> return ( !dom0found && domUfound );
> }
>
> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> -{
> - struct membanks *mem = kernel_info_get_mem(kinfo);
> - unsigned int i;
> - paddr_t bank_size;
> -
> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> -
> - mem->nr_banks = 0;
> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> - bank_size) )
> - goto fail;
> -
> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> - bank_size) )
> - goto fail;
> -
> - if ( kinfo->unassigned_mem )
> - goto fail;
> -
> - for( i = 0; i < mem->nr_banks; i++ )
> - {
> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> - d,
> - i,
> - mem->bank[i].start,
> - mem->bank[i].start + mem->bank[i].size,
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - (unsigned long)(mem->bank[i].size >> 20));
> - }
> -
> - return;
> -
> -fail:
> - panic("Failed to allocate requested domain memory."
> - /* Don't want format this as PRIpaddr (16 digit hex) */
> - " %ldKB unallocated. Fix the VMs configurations.\n",
> - (unsigned long)kinfo->unassigned_mem >> 10);
> -}
> -
> #ifdef CONFIG_VGICV2
> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> {
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 2c30792de8..2b8cba9b2f 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
> }
> }
>
> -#ifdef CONFIG_DOM0LESS_BOOT
> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> alloc_domheap_mem_cb cb, void *extra)
> {
> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>
> return true;
> }
> -#endif
>
> /*
> * When PCI passthrough is available we want to keep the
> @@ -1003,6 +1001,94 @@ out:
> return res;
> }
>
> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> +{
> + struct membanks *mem = kernel_info_get_mem(kinfo);
> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> + paddr_t bank_start, bank_size;
Limit the scope
> + struct membanks *hwdom_free_mem = NULL;
> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
Limit the scope
> +
> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> + /* Don't want format this as PRIpaddr (16 digit hex) */
> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> +
> + mem->nr_banks = 0;
> + /*
> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> + * is enabled.
> + */
> + if ( is_hardware_domain(d) )
> + {
> + ASSERT(llc_coloring_enabled);
This patch does not build because the declaration of llc_coloring_enabled is not visible. You must include <xen/llc-coloring.h>.
> +
> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> + NR_MEM_BANKS);
> + if ( !hwdom_free_mem )
> + goto fail;
> +
> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> +
> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
My remarks about using find_unallocated_memory() 1:1 have not been addressed. You did not even
change the comments inside the function. The problem is that the function is specifically designed
for finding extended regions and assumes it is called at a certain point, i.e. dom0 RAM allocated, gnttab
region allocated, etc. My opinion is that we should attempt to make the function generic so that in your
case you can choose which regions to exclude, and even define your own function to grab free regions (at the moment
add_ext_regions grabs banks >= 64M but you still discard banks < 128M, so it's a bit wasteful).
My very short attempt to make the function as generic as possible in the first iteration:
https://paste.debian.net/1338334/
For coloring, you could define your own add_free_regions and only pass RSVD and GNTTAB banks to be excluded.
As said before, I am still waiting for other Arm maintainers to provide their own opinion.
~Michal
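The effect of the small-bank skip in the hwdom loop of allocate_memory() can be
seen with a user-space model. This is a rough sketch under stated assumptions,
not Xen code: struct bank, assign_from_banks() and the skipped counter are
hypothetical, and every allocation is assumed to succeed in full (the real
allocate_bank_memory() can fail).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* Hypothetical stand-in for one entry of struct membanks. */
struct bank {
    uint64_t start;
    uint64_t size;
};

/*
 * Model of the hwdom path: walk the free banks, skip any bank smaller than
 * min(unassigned, 128 MiB), and take as much as needed from the others.
 * Returns the memory still unassigned; *skipped accumulates the skipped bytes.
 */
static uint64_t assign_from_banks(const struct bank *banks, size_t nr,
                                  uint64_t unassigned, uint64_t *skipped)
{
    size_t i;

    *skipped = 0;

    for ( i = 0; i < nr && unassigned > 0; i++ )
    {
        uint64_t min_size = unassigned < MB(128) ? unassigned : MB(128);
        uint64_t take;

        if ( banks[i].size < min_size )
        {
            /* Same rule as the patch: the bank is ignored entirely. */
            *skipped += banks[i].size;
            continue;
        }

        take = banks[i].size < unassigned ? banks[i].size : unassigned;
        unassigned -= take;
    }

    return unassigned;
}
```

In this model, a 150 MB request against two free banks of 96 MB and 100 MB
leaves all 150 MB unassigned, although 196 MB is free. In the patch that
outcome reaches the fail label and panics.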
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-05 9:40 ` Michal Orzel
@ 2024-12-06 18:37 ` Julien Grall
2024-12-07 15:04 ` Michal Orzel
0 siblings, 1 reply; 40+ messages in thread
From: Julien Grall @ 2024-12-06 18:37 UTC (permalink / raw)
To: Michal Orzel, Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Volodymyr Babchuk
Hi,
Sorry for the late answer.
On 05/12/2024 09:40, Michal Orzel wrote:
>
>
> On 02/12/2024 17:59, Carlo Nonato wrote:
>>
>>
>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>> contiguous mapping nature, so allocate_memory() is needed in this case.
>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>> moved allocate_memory() in dom0less_build.c. In order to use it
>> in Dom0 construction bring it back to domain_build.c and declare it in
>> domain_build.h.
>>
>> Take the opportunity to adapt the implementation of allocate_memory() so
>> that it uses the host layout when called on the hwdom, via
>> find_unallocated_memory().
>>
>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>> ---
>> v11:
>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>> - added a comment in allocate_memory() when skipping small banks
>> v10:
>> - fixed a compilation bug that happened when dom0less support was disabled
>> v9:
>> - no changes
>> v8:
>> - patch adapted to new changes to allocate_memory()
>> v7:
>> - allocate_memory() now uses the host layout when called on the hwdom
>> v6:
>> - new patch
>> ---
>> xen/arch/arm/dom0less-build.c | 44 -----------
>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>> xen/arch/arm/include/asm/domain_build.h | 1 +
>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>
>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>> index d93a85434e..67b1503647 100644
>> --- a/xen/arch/arm/dom0less-build.c
>> +++ b/xen/arch/arm/dom0less-build.c
>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>> return ( !dom0found && domUfound );
>> }
>>
>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>> -{
>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>> - unsigned int i;
>> - paddr_t bank_size;
>> -
>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>> -
>> - mem->nr_banks = 0;
>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>> - bank_size) )
>> - goto fail;
>> -
>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>> - bank_size) )
>> - goto fail;
>> -
>> - if ( kinfo->unassigned_mem )
>> - goto fail;
>> -
>> - for( i = 0; i < mem->nr_banks; i++ )
>> - {
>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>> - d,
>> - i,
>> - mem->bank[i].start,
>> - mem->bank[i].start + mem->bank[i].size,
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - (unsigned long)(mem->bank[i].size >> 20));
>> - }
>> -
>> - return;
>> -
>> -fail:
>> - panic("Failed to allocate requested domain memory."
>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>> - (unsigned long)kinfo->unassigned_mem >> 10);
>> -}
>> -
>> #ifdef CONFIG_VGICV2
>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>> {
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 2c30792de8..2b8cba9b2f 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>> }
>> }
>>
>> -#ifdef CONFIG_DOM0LESS_BOOT
>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>> alloc_domheap_mem_cb cb, void *extra)
>> {
>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>
>> return true;
>> }
>> -#endif
>>
>> /*
>> * When PCI passthrough is available we want to keep the
>> @@ -1003,6 +1001,94 @@ out:
>> return res;
>> }
>>
>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>> +{
>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>> + paddr_t bank_start, bank_size;
> Limit the scope
>
>> + struct membanks *hwdom_free_mem = NULL;
>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> Limit the scope
>
>> +
>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>> +
>> + mem->nr_banks = 0;
>> + /*
>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>> + * is enabled.
>> + */
>> + if ( is_hardware_domain(d) )
>> + {
>> + ASSERT(llc_coloring_enabled);
> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
Piggybacking on this comment: AFAICT, the code below would also work in
the non-cache-coloring case. So what is the assert for?
>
>> +
>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>> + NR_MEM_BANKS);
>> + if ( !hwdom_free_mem )
>> + goto fail;
>> +
>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>> +
>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
> change the comments inside the function. The problem is that the function is specifically designed
> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
> region allocated, etc.
So I agree that the function should be updated if we plan to use it for
other purposes.
> My opinion is that we should attempt to make the function generic so that in your
> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
>
> My very short attempt to make the function as generic as possible in the first iteration:
> https://paste.debian.net/1338334/
This looks better, but I wonder why we still need to exclude the
static regions. Wouldn't it be sufficient to exclude just the reserved regions?
>
> For coloring, you could define your own add_free_regions and only pass RSVD and GNTTAB banks to be excluded.
>
> As said before, I still wait for other Arm maintainers to provide their own opinion.
>
> ~Michal
>
--
Julien Grall
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-06 18:37 ` Julien Grall
@ 2024-12-07 15:04 ` Michal Orzel
2024-12-09 9:47 ` Carlo Nonato
2024-12-09 19:17 ` Julien Grall
0 siblings, 2 replies; 40+ messages in thread
From: Michal Orzel @ 2024-12-07 15:04 UTC (permalink / raw)
To: Julien Grall, Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Volodymyr Babchuk
On 06/12/2024 19:37, Julien Grall wrote:
>
>
> Hi,
>
> Sorry for the late answer.
>
> On 05/12/2024 09:40, Michal Orzel wrote:
>>
>>
>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>
>>>
>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>> domain_build.h.
>>>
>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>> that it uses the host layout when called on the hwdom, via
>>> find_unallocated_memory().
>>>
>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>> ---
>>> v11:
>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>> - added a comment in allocate_memory() when skipping small banks
>>> v10:
>>> - fixed a compilation bug that happened when dom0less support was disabled
>>> v9:
>>> - no changes
>>> v8:
>>> - patch adapted to new changes to allocate_memory()
>>> v7:
>>> - allocate_memory() now uses the host layout when called on the hwdom
>>> v6:
>>> - new patch
>>> ---
>>> xen/arch/arm/dom0less-build.c | 44 -----------
>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>>> xen/arch/arm/include/asm/domain_build.h | 1 +
>>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>>> index d93a85434e..67b1503647 100644
>>> --- a/xen/arch/arm/dom0less-build.c
>>> +++ b/xen/arch/arm/dom0less-build.c
>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>> return ( !dom0found && domUfound );
>>> }
>>>
>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>> -{
>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>>> - unsigned int i;
>>> - paddr_t bank_size;
>>> -
>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>> -
>>> - mem->nr_banks = 0;
>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>> - bank_size) )
>>> - goto fail;
>>> -
>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>> - bank_size) )
>>> - goto fail;
>>> -
>>> - if ( kinfo->unassigned_mem )
>>> - goto fail;
>>> -
>>> - for( i = 0; i < mem->nr_banks; i++ )
>>> - {
>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>>> - d,
>>> - i,
>>> - mem->bank[i].start,
>>> - mem->bank[i].start + mem->bank[i].size,
>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>> - (unsigned long)(mem->bank[i].size >> 20));
>>> - }
>>> -
>>> - return;
>>> -
>>> -fail:
>>> - panic("Failed to allocate requested domain memory."
>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>>> - (unsigned long)kinfo->unassigned_mem >> 10);
>>> -}
>>> -
>>> #ifdef CONFIG_VGICV2
>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>> {
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 2c30792de8..2b8cba9b2f 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>>> }
>>> }
>>>
>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>>> alloc_domheap_mem_cb cb, void *extra)
>>> {
>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>>
>>> return true;
>>> }
>>> -#endif
>>>
>>> /*
>>> * When PCI passthrough is available we want to keep the
>>> @@ -1003,6 +1001,94 @@ out:
>>> return res;
>>> }
>>>
>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>> +{
>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>> + paddr_t bank_start, bank_size;
>> Limit the scope
>>
>>> + struct membanks *hwdom_free_mem = NULL;
>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>> Limit the scope
>>
>>> +
>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>> +
>>> + mem->nr_banks = 0;
>>> + /*
>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>>> + * is enabled.
>>> + */
>>> + if ( is_hardware_domain(d) )
>>> + {
>>> + ASSERT(llc_coloring_enabled);
>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
>
> Piggybacking on this comment. AFAICT, the code below would also work in
> the non cache coloring case. So what is the assert for?
>
>>
>>> +
>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>> + NR_MEM_BANKS);
>>> + if ( !hwdom_free_mem )
>>> + goto fail;
>>> +
>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>> +
>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
>> change the comments inside the function. The problem is that the function is specifically designed
>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
>> region allocated, etc.
>
> So I agree that the function should be updated if we plan to use it for
> other purposes.
>
>> My opinion is that we should attempt to make the function generic so that in your
>> case you can choose which regions to exclude, and even define your own function to grab free regions (at the moment
>> add_ext_regions grabs banks >= 64M but you still discard banks >= 128M, so it's a bit wasteful).
>>
>> My very short attempt to make the function as generic as possible in the first iteration:
>> https://paste.debian.net/1338334/
>
> This looks better, but I wonder why we still need to exclude the
> static regions? Wouldn't it be sufficient to exclude just reserved regions?
Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.

If you and Carlo are ok with my proposed solution for making the function generic, I can send it as a prerequisite
patch for Carlo's series.
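
To picture the "generic" idea discussed here, where the caller chooses which bank lists (e.g. bootinfo.reserved_mem, bootinfo.shmem) to exclude, below is a minimal standalone sketch. It is not the code from the paste linked above and not Xen's find_unallocated_memory(): struct region, struct region_list, subtract() and find_free_regions() are hypothetical names, and a plain array-based interval subtraction stands in for whatever Xen actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified types; the real code works on struct membanks. */
struct region { uint64_t start, size; };
struct region_list { const struct region *r; size_t nr; };

/*
 * Remove [ex->start, ex->start + ex->size) from every entry of regs[].
 * An entry split in two gets its tail appended at the end.
 * Returns the new entry count, or -1 if regs[] would overflow.
 */
static int subtract(struct region *regs, int nr, int max,
                    const struct region *ex)
{
    uint64_t ex_end = ex->start + ex->size;
    int i, n = nr;

    for ( i = 0; i < n; i++ )
    {
        uint64_t s = regs[i].start, e = s + regs[i].size;

        if ( ex_end <= s || ex->start >= e )
            continue;                        /* no overlap */

        if ( ex->start > s && ex_end < e )   /* hole in the middle: split */
        {
            if ( n == max )
                return -1;
            regs[i].size = ex->start - s;
            regs[n].start = ex_end;
            regs[n].size = e - ex_end;
            n++;
        }
        else if ( ex->start > s )            /* tail is cut off */
            regs[i].size = ex->start - s;
        else if ( ex_end < e )               /* head is cut off */
        {
            regs[i].start = ex_end;
            regs[i].size = e - ex_end;
        }
        else                                 /* fully covered */
            regs[i].size = 0;
    }

    return n;
}

/*
 * Start from the RAM banks, then subtract every bank of every exclusion
 * list chosen by the caller. Returns the number of free regions written
 * to out[], or -1 on overflow.
 */
static int find_free_regions(const struct region_list *ram,
                             const struct region_list *excl, size_t nr_excl,
                             struct region *out, int max)
{
    int n = 0, i, k;
    size_t l, j;

    assert(max > 0);

    for ( j = 0; j < ram->nr; j++ )
    {
        if ( n == max )
            return -1;
        out[n++] = ram->r[j];
    }

    for ( l = 0; l < nr_excl; l++ )
        for ( j = 0; j < excl[l].nr; j++ )
        {
            n = subtract(out, n, max, &excl[l].r[j]);
            if ( n < 0 )
                return -1;
        }

    /* Drop entries that became empty. */
    for ( i = 0, k = 0; i < n; i++ )
        if ( out[i].size )
            out[k++] = out[i];

    return k;
}
```

With one RAM bank at 0x40000000 of size 0x40000000, a reserved bank at 0x50000000 of size 0x10000000 and a shared-memory bank at 0x40000000 of size 0x01000000, this sketch leaves two free regions, starting at 0x41000000 and 0x60000000. Passing the exclusion lists as a parameter is what lets a caller include or omit a given list.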
~Michal
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-07 15:04 ` Michal Orzel
@ 2024-12-09 9:47 ` Carlo Nonato
2024-12-09 19:17 ` Julien Grall
1 sibling, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-09 9:47 UTC (permalink / raw)
To: Michal Orzel
Cc: Julien Grall, xen-devel, andrea.bastoni, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
Hi Michal, Julien
On Sat, Dec 7, 2024 at 4:05 PM Michal Orzel <michal.orzel@amd.com> wrote:
>
> On 06/12/2024 19:37, Julien Grall wrote:
> >
> >
> > Hi,
> >
> > Sorry for the late answer.
> >
> > On 05/12/2024 09:40, Michal Orzel wrote:
> >>
> >>
> >> On 02/12/2024 17:59, Carlo Nonato wrote:
> >>>
> >>>
> >>> Cache coloring requires Dom0 not to be direct-mapped because of its non
> >>> contiguous mapping nature, so allocate_memory() is needed in this case.
> >>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> >>> moved allocate_memory() in dom0less_build.c. In order to use it
> >>> in Dom0 construction bring it back to domain_build.c and declare it in
> >>> domain_build.h.
> >>>
> >>> Take the opportunity to adapt the implementation of allocate_memory() so
> >>> that it uses the host layout when called on the hwdom, via
> >>> find_unallocated_memory().
> >>>
> >>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>> ---
> >>> v11:
> >>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
> >>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> >>> - added a comment in allocate_memory() when skipping small banks
> >>> v10:
> >>> - fixed a compilation bug that happened when dom0less support was disabled
> >>> v9:
> >>> - no changes
> >>> v8:
> >>> - patch adapted to new changes to allocate_memory()
> >>> v7:
> >>> - allocate_memory() now uses the host layout when called on the hwdom
> >>> v6:
> >>> - new patch
> >>> ---
> >>> xen/arch/arm/dom0less-build.c | 44 -----------
> >>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
> >>> xen/arch/arm/include/asm/domain_build.h | 1 +
> >>> 3 files changed, 94 insertions(+), 48 deletions(-)
> >>>
> >>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> >>> index d93a85434e..67b1503647 100644
> >>> --- a/xen/arch/arm/dom0less-build.c
> >>> +++ b/xen/arch/arm/dom0less-build.c
> >>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
> >>> return ( !dom0found && domUfound );
> >>> }
> >>>
> >>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>> -{
> >>> - struct membanks *mem = kernel_info_get_mem(kinfo);
> >>> - unsigned int i;
> >>> - paddr_t bank_size;
> >>> -
> >>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>> -
> >>> - mem->nr_banks = 0;
> >>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> >>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> >>> - bank_size) )
> >>> - goto fail;
> >>> -
> >>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> >>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> >>> - bank_size) )
> >>> - goto fail;
> >>> -
> >>> - if ( kinfo->unassigned_mem )
> >>> - goto fail;
> >>> -
> >>> - for( i = 0; i < mem->nr_banks; i++ )
> >>> - {
> >>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> >>> - d,
> >>> - i,
> >>> - mem->bank[i].start,
> >>> - mem->bank[i].start + mem->bank[i].size,
> >>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>> - (unsigned long)(mem->bank[i].size >> 20));
> >>> - }
> >>> -
> >>> - return;
> >>> -
> >>> -fail:
> >>> - panic("Failed to allocate requested domain memory."
> >>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>> - " %ldKB unallocated. Fix the VMs configurations.\n",
> >>> - (unsigned long)kinfo->unassigned_mem >> 10);
> >>> -}
> >>> -
> >>> #ifdef CONFIG_VGICV2
> >>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> >>> {
> >>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>> index 2c30792de8..2b8cba9b2f 100644
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
> >>> }
> >>> }
> >>>
> >>> -#ifdef CONFIG_DOM0LESS_BOOT
> >>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> >>> alloc_domheap_mem_cb cb, void *extra)
> >>> {
> >>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> >>>
> >>> return true;
> >>> }
> >>> -#endif
> >>>
> >>> /*
> >>> * When PCI passthrough is available we want to keep the
> >>> @@ -1003,6 +1001,94 @@ out:
> >>> return res;
> >>> }
> >>>
> >>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>> +{
> >>> + struct membanks *mem = kernel_info_get_mem(kinfo);
> >>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> >>> + paddr_t bank_start, bank_size;
> >> Limit the scope
> >>
> >>> + struct membanks *hwdom_free_mem = NULL;
> >>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> >>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> >> Limit the scope
> >>
> >>> +
> >>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>> + /* Don't want format this as PRIpaddr (16 digit hex) */
> >>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>> +
> >>> + mem->nr_banks = 0;
> >>> + /*
> >>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> >>> + * is enabled.
> >>> + */
> >>> + if ( is_hardware_domain(d) )
> >>> + {
> >>> + ASSERT(llc_coloring_enabled);
> >> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
> >
> > Piggybacking on this comment. AFAICT, the code below would also work in
> > the non cache coloring case. So what is the assert for?
> >
> >>
> >>> +
> >>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> >>> + NR_MEM_BANKS);
> >>> + if ( !hwdom_free_mem )
> >>> + goto fail;
> >>> +
> >>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> >>> +
> >>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
> >> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
> >> change the comments inside the function. The problem is that the function is specifically designed
> >> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
> >> region allocated, etc.
Answering Michal: sorry about that. Since we were waiting for comments and I
wanted to keep the revision alive (it has happened too many times that we
(minervasys) left the discussion hanging for too long), I sent v11
even though it was incomplete. I should have at least added comments, you're
right.
> > So I agree that the function should be updated if we plan to use it for
> > other purposes.
> >
> >> My opinion is that we should attempt to make the function generic so that in your
> >> case you can choose which regions to exclude, and even define your own function to grab free regions (at the moment
> >> add_ext_regions grabs banks >= 64M but you still discard banks >= 128M, so it's a bit wasteful).
> >>
> >> My very short attempt to make the function as generic as possible in the first iteration:
> >> https://paste.debian.net/1338334/
> >
> > This looks better, but I wonder why we still need to exclude the
> > static regions? Wouldn't it be sufficient to exclude just reserved regions?
> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
>
> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
> patch for Carlo series.
I'm ok with that.
> ~Michal
Thanks both.
- Carlo
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-07 15:04 ` Michal Orzel
2024-12-09 9:47 ` Carlo Nonato
@ 2024-12-09 19:17 ` Julien Grall
2024-12-12 17:48 ` Carlo Nonato
1 sibling, 1 reply; 40+ messages in thread
From: Julien Grall @ 2024-12-09 19:17 UTC (permalink / raw)
To: Michal Orzel, Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Volodymyr Babchuk
Hi Michal,
On 07/12/2024 15:04, Michal Orzel wrote:
>
>
> On 06/12/2024 19:37, Julien Grall wrote:
>>
>>
>> Hi,
>>
>> Sorry for the late answer.
>>
>> On 05/12/2024 09:40, Michal Orzel wrote:
>>>
>>>
>>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>>
>>>>
>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>>> domain_build.h.
>>>>
>>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>>> that it uses the host layout when called on the hwdom, via
>>>> find_unallocated_memory().
>>>>
>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>> ---
>>>> v11:
>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>>> - added a comment in allocate_memory() when skipping small banks
>>>> v10:
>>>> - fixed a compilation bug that happened when dom0less support was disabled
>>>> v9:
>>>> - no changes
>>>> v8:
>>>> - patch adapted to new changes to allocate_memory()
>>>> v7:
>>>> - allocate_memory() now uses the host layout when called on the hwdom
>>>> v6:
>>>> - new patch
>>>> ---
>>>> xen/arch/arm/dom0less-build.c | 44 -----------
>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>>>> index d93a85434e..67b1503647 100644
>>>> --- a/xen/arch/arm/dom0less-build.c
>>>> +++ b/xen/arch/arm/dom0less-build.c
>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>>> return ( !dom0found && domUfound );
>>>> }
>>>>
>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>> -{
>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>>>> - unsigned int i;
>>>> - paddr_t bank_size;
>>>> -
>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>> -
>>>> - mem->nr_banks = 0;
>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>>> - bank_size) )
>>>> - goto fail;
>>>> -
>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>>> - bank_size) )
>>>> - goto fail;
>>>> -
>>>> - if ( kinfo->unassigned_mem )
>>>> - goto fail;
>>>> -
>>>> - for( i = 0; i < mem->nr_banks; i++ )
>>>> - {
>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>>>> - d,
>>>> - i,
>>>> - mem->bank[i].start,
>>>> - mem->bank[i].start + mem->bank[i].size,
>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>> - (unsigned long)(mem->bank[i].size >> 20));
>>>> - }
>>>> -
>>>> - return;
>>>> -
>>>> -fail:
>>>> - panic("Failed to allocate requested domain memory."
>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
>>>> -}
>>>> -
>>>> #ifdef CONFIG_VGICV2
>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>>> {
>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>> index 2c30792de8..2b8cba9b2f 100644
>>>> --- a/xen/arch/arm/domain_build.c
>>>> +++ b/xen/arch/arm/domain_build.c
>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>>>> }
>>>> }
>>>>
>>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>>>> alloc_domheap_mem_cb cb, void *extra)
>>>> {
>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>>>
>>>> return true;
>>>> }
>>>> -#endif
>>>>
>>>> /*
>>>> * When PCI passthrough is available we want to keep the
>>>> @@ -1003,6 +1001,94 @@ out:
>>>> return res;
>>>> }
>>>>
>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>> +{
>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>>> + paddr_t bank_start, bank_size;
>>> Limit the scope
>>>
>>>> + struct membanks *hwdom_free_mem = NULL;
>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>> Limit the scope
>>>
>>>> +
>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>> +
>>>> + mem->nr_banks = 0;
>>>> + /*
>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>>>> + * is enabled.
>>>> + */
>>>> + if ( is_hardware_domain(d) )
>>>> + {
>>>> + ASSERT(llc_coloring_enabled);
>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
>>
>> Piggybacking on this comment. AFAICT, the code below would also work in
>> the non cache coloring case. So what is the assert for?
>>
>>>
>>>> +
>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>>> + NR_MEM_BANKS);
>>>> + if ( !hwdom_free_mem )
>>>> + goto fail;
>>>> +
>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>>> +
>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
>>> change the comments inside the function. The problem is that the function is specifically designed
>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
>>> region allocated, etc.
>>
>> So I agree that the function should be updated if we plan to use it for
>> other purposes.
>>
>>> My opinion is that we should attempt to make the function generic so that in your
>>> case you can choose which regions to exclude, and even define your own function to grab free regions (at the moment
>>> add_ext_regions grabs banks >= 64M but you still discard banks >= 128M, so it's a bit wasteful).
>>>
>>> My very short attempt to make the function as generic as possible in the first iteration:
>>> https://paste.debian.net/1338334/
>>
>> This looks better, but I wonder why we still need to exclude the
>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
Oh, I missed the fact that you now pass "mem_banks" as a parameter. I thought
they would still get excluded in the cache coloring case.
>
> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
> patch for Carlo series.
I am fine with the approach.
Cheers,
--
Julien Grall
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-09 19:17 ` Julien Grall
@ 2024-12-12 17:48 ` Carlo Nonato
2024-12-12 18:22 ` Andrea Bastoni
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-12 17:48 UTC (permalink / raw)
To: Julien Grall
Cc: Michal Orzel, xen-devel, andrea.bastoni, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
Hi,
On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
>
> Hi Michal,
>
> On 07/12/2024 15:04, Michal Orzel wrote:
> >
> >
> > On 06/12/2024 19:37, Julien Grall wrote:
> >>
> >>
> >> Hi,
> >>
> >> Sorry for the late answer.
> >>
> >> On 05/12/2024 09:40, Michal Orzel wrote:
> >>>
> >>>
> >>> On 02/12/2024 17:59, Carlo Nonato wrote:
> >>>>
> >>>>
> >>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
> >>>> contiguous mapping nature, so allocate_memory() is needed in this case.
> >>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> >>>> moved allocate_memory() in dom0less_build.c. In order to use it
> >>>> in Dom0 construction bring it back to domain_build.c and declare it in
> >>>> domain_build.h.
> >>>>
> >>>> Take the opportunity to adapt the implementation of allocate_memory() so
> >>>> that it uses the host layout when called on the hwdom, via
> >>>> find_unallocated_memory().
> >>>>
> >>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>>> ---
> >>>> v11:
> >>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
> >>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> >>>> - added a comment in allocate_memory() when skipping small banks
> >>>> v10:
> >>>> - fixed a compilation bug that happened when dom0less support was disabled
> >>>> v9:
> >>>> - no changes
> >>>> v8:
> >>>> - patch adapted to new changes to allocate_memory()
> >>>> v7:
> >>>> - allocate_memory() now uses the host layout when called on the hwdom
> >>>> v6:
> >>>> - new patch
> >>>> ---
> >>>> xen/arch/arm/dom0less-build.c | 44 -----------
> >>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
> >>>> xen/arch/arm/include/asm/domain_build.h | 1 +
> >>>> 3 files changed, 94 insertions(+), 48 deletions(-)
> >>>>
> >>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> >>>> index d93a85434e..67b1503647 100644
> >>>> --- a/xen/arch/arm/dom0less-build.c
> >>>> +++ b/xen/arch/arm/dom0less-build.c
> >>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
> >>>> return ( !dom0found && domUfound );
> >>>> }
> >>>>
> >>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>> -{
> >>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>> - unsigned int i;
> >>>> - paddr_t bank_size;
> >>>> -
> >>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>> -
> >>>> - mem->nr_banks = 0;
> >>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> >>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> >>>> - bank_size) )
> >>>> - goto fail;
> >>>> -
> >>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> >>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> >>>> - bank_size) )
> >>>> - goto fail;
> >>>> -
> >>>> - if ( kinfo->unassigned_mem )
> >>>> - goto fail;
> >>>> -
> >>>> - for( i = 0; i < mem->nr_banks; i++ )
> >>>> - {
> >>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> >>>> - d,
> >>>> - i,
> >>>> - mem->bank[i].start,
> >>>> - mem->bank[i].start + mem->bank[i].size,
> >>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>> - (unsigned long)(mem->bank[i].size >> 20));
> >>>> - }
> >>>> -
> >>>> - return;
> >>>> -
> >>>> -fail:
> >>>> - panic("Failed to allocate requested domain memory."
> >>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
> >>>> - (unsigned long)kinfo->unassigned_mem >> 10);
> >>>> -}
> >>>> -
> >>>> #ifdef CONFIG_VGICV2
> >>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> >>>> {
> >>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>>> index 2c30792de8..2b8cba9b2f 100644
> >>>> --- a/xen/arch/arm/domain_build.c
> >>>> +++ b/xen/arch/arm/domain_build.c
> >>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
> >>>> }
> >>>> }
> >>>>
> >>>> -#ifdef CONFIG_DOM0LESS_BOOT
> >>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> >>>> alloc_domheap_mem_cb cb, void *extra)
> >>>> {
> >>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> >>>>
> >>>> return true;
> >>>> }
> >>>> -#endif
> >>>>
> >>>> /*
> >>>> * When PCI passthrough is available we want to keep the
> >>>> @@ -1003,6 +1001,94 @@ out:
> >>>> return res;
> >>>> }
> >>>>
> >>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>> +{
> >>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> >>>> + paddr_t bank_start, bank_size;
> >>> Limit the scope
> >>>
> >>>> + struct membanks *hwdom_free_mem = NULL;
> >>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> >>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> >>> Limit the scope
> >>>
> >>>> +
> >>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>> +
> >>>> + mem->nr_banks = 0;
> >>>> + /*
> >>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> >>>> + * is enabled.
> >>>> + */
> >>>> + if ( is_hardware_domain(d) )
> >>>> + {
> >>>> + ASSERT(llc_coloring_enabled);
> >>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
> >>
> >> Piggybacking on this comment. AFAICT, the code below would also work in
> >> the non cache coloring case. So what is the assert for?
> >>
> >>>
> >>>> +
> >>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> >>>> + NR_MEM_BANKS);
> >>>> + if ( !hwdom_free_mem )
> >>>> + goto fail;
> >>>> +
> >>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> >>>> +
> >>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
> >>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
> >>> change the comments inside the function. The problem is that the function is specifically designed
> >>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
> >>> region allocated, etc.
> >>
> >> So I agree that the function should be updated if we plan to use it for
> >> other purposes.
> >>
> >>> My opinion is that we should attempt to make the function generic so that in your
> >>> case you can choose which regions to exclude, and even define your own function to grab free regions (at the moment
> >>> add_ext_regions grabs banks >= 64M but you still discard banks >= 128M, so it's a bit wasteful).
> >>>
> >>> My very short attempt to make the function as generic as possible in the first iteration:
> >>> https://paste.debian.net/1338334/
> >>
> >> This looks better, but I wonder why we still need to exclude the
> >> static regions? Wouldn't it be sufficient to exclude just reserved regions?
> > Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
> > They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
>
> Oh, I missed the fact that you now pass "mem_banks" as a parameter. I thought
> they would still get excluded in the cache coloring case.
>
> >
> > If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
> > patch for Carlo series.
>
> I am fine with the approach.
>
> Cheers,
>
> --
> Julien Grall
>
> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
> /* type must be set before allocate_memory */
> d->arch.type = kinfo.type;
> #endif
> - allocate_memory_11(d, &kinfo);
> + if ( is_domain_direct_mapped(d) )
> + allocate_memory_11(d, &kinfo);
> + else
> + allocate_memory(d, &kinfo);
> find_gnttab_region(d, &kinfo);
Since find_gnttab_region() is called after allocate_memory(), the kinfo->gnttab_*
fields aren't initialized yet, so the call to find_unallocated_memory() with
gnttab as a region to exclude fails. This ends in a crash, since memory for
dom0 can't be allocated.
Could the solution be to call find_gnttab_region() before the above if?
Or should I call it before allocate_memory() in one case, but still after
allocate_memory_11() in the other?
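
The ordering dependency can be pictured with a small standalone mock. Nothing below is Xen code: struct kinfo_sketch and all the *_sketch() and construct_*() functions are hypothetical stand-ins. They only model the constraint "the gnttab fields must be set before the non direct-mapped path reads them", and the mock assumes the direct-mapped path does not read them. The sketch compares the v11 order with the first option (lookup before the if).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few kernel_info fields that matter here. */
struct kinfo_sketch {
    bool direct_mapped;
    bool gnttab_set;          /* models "kinfo->gnttab_* are initialized" */
    uint64_t gnttab_start, gnttab_size;
    bool mem_allocated;
};

/* Models find_gnttab_region(): fills in the gnttab fields. */
static void find_gnttab_region_sketch(struct kinfo_sketch *k)
{
    k->gnttab_start = 0x38000000ULL;   /* arbitrary illustrative values */
    k->gnttab_size = 0x20000ULL;
    k->gnttab_set = true;
}

/* Models allocate_memory_11(); in this mock it ignores the gnttab fields. */
static bool allocate_memory_11_sketch(struct kinfo_sketch *k)
{
    k->mem_allocated = true;
    return true;
}

/*
 * Models allocate_memory() on the hwdom: it excludes the gnttab region
 * from the free memory, so it needs the gnttab fields to be set.
 */
static bool allocate_memory_sketch(struct kinfo_sketch *k)
{
    if ( !k->gnttab_set )
        return false;         /* the real code ends up crashing here */

    k->mem_allocated = true;
    return true;
}

/* Order of the v11 hunk: allocate first, look up gnttab afterwards. */
static bool construct_v11_order(struct kinfo_sketch *k)
{
    bool ok;

    assert(k);
    ok = k->direct_mapped ? allocate_memory_11_sketch(k)
                          : allocate_memory_sketch(k);
    find_gnttab_region_sketch(k);

    return ok;
}

/* First option from the question: gnttab lookup before the if. */
static bool construct_gnttab_first(struct kinfo_sketch *k)
{
    assert(k);
    find_gnttab_region_sketch(k);

    return k->direct_mapped ? allocate_memory_11_sketch(k)
                            : allocate_memory_sketch(k);
}
```

In this mock the v11 order fails only for the non direct-mapped case, and moving the lookup before the if fixes it. Whether the real direct-mapped path has a reason to keep find_gnttab_region() after allocate_memory_11() is the open question above, which the mock does not answer.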
Thanks.
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-12 17:48 ` Carlo Nonato
@ 2024-12-12 18:22 ` Andrea Bastoni
2024-12-13 9:45 ` Michal Orzel
0 siblings, 1 reply; 40+ messages in thread
From: Andrea Bastoni @ 2024-12-12 18:22 UTC (permalink / raw)
To: Carlo Nonato, Julien Grall
Cc: Michal Orzel, xen-devel, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Volodymyr Babchuk
On 12/12/2024 18:48, Carlo Nonato wrote:
> Hi,
>
> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
>>
>> Hi Michal,
>>
>> On 07/12/2024 15:04, Michal Orzel wrote:
>>>
>>>
>>> On 06/12/2024 19:37, Julien Grall wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Sorry for the late answer.
>>>>
>>>> On 05/12/2024 09:40, Michal Orzel wrote:
>>>>>
>>>>>
>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>>>>
>>>>>>
>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>>>>> domain_build.h.
>>>>>>
>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>>>>> that it uses the host layout when called on the hwdom, via
>>>>>> find_unallocated_memory().
>>>>>>
>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>>> ---
>>>>>> v11:
>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>>>>> - added a comment in allocate_memory() when skipping small banks
>>>>>> v10:
>>>>>> - fixed a compilation bug that happened when dom0less support was disabled
>>>>>> v9:
>>>>>> - no changes
>>>>>> v8:
>>>>>> - patch adapted to new changes to allocate_memory()
>>>>>> v7:
>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
>>>>>> v6:
>>>>>> - new patch
>>>>>> ---
>>>>>> xen/arch/arm/dom0less-build.c | 44 -----------
>>>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>>>>>> index d93a85434e..67b1503647 100644
>>>>>> --- a/xen/arch/arm/dom0less-build.c
>>>>>> +++ b/xen/arch/arm/dom0less-build.c
>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>>>>> return ( !dom0found && domUfound );
>>>>>> }
>>>>>>
>>>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>> -{
>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>> - unsigned int i;
>>>>>> - paddr_t bank_size;
>>>>>> -
>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>> -
>>>>>> - mem->nr_banks = 0;
>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>>>>> - bank_size) )
>>>>>> - goto fail;
>>>>>> -
>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>>>>> - bank_size) )
>>>>>> - goto fail;
>>>>>> -
>>>>>> - if ( kinfo->unassigned_mem )
>>>>>> - goto fail;
>>>>>> -
>>>>>> - for( i = 0; i < mem->nr_banks; i++ )
>>>>>> - {
>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>>>>>> - d,
>>>>>> - i,
>>>>>> - mem->bank[i].start,
>>>>>> - mem->bank[i].start + mem->bank[i].size,
>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>> - (unsigned long)(mem->bank[i].size >> 20));
>>>>>> - }
>>>>>> -
>>>>>> - return;
>>>>>> -
>>>>>> -fail:
>>>>>> - panic("Failed to allocate requested domain memory."
>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
>>>>>> -}
>>>>>> -
>>>>>> #ifdef CONFIG_VGICV2
>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>>>>> {
>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>> index 2c30792de8..2b8cba9b2f 100644
>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>>>>>> alloc_domheap_mem_cb cb, void *extra)
>>>>>> {
>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>>>>>
>>>>>> return true;
>>>>>> }
>>>>>> -#endif
>>>>>>
>>>>>> /*
>>>>>> * When PCI passthrough is available we want to keep the
>>>>>> @@ -1003,6 +1001,94 @@ out:
>>>>>> return res;
>>>>>> }
>>>>>>
>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>> +{
>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>>>>> + paddr_t bank_start, bank_size;
>>>>> Limit the scope
>>>>>
>>>>>> + struct membanks *hwdom_free_mem = NULL;
>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>>>> Limit the scope
>>>>>
>>>>>> +
>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>> +
>>>>>> + mem->nr_banks = 0;
>>>>>> + /*
>>>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>>>>>> + * is enabled.
>>>>>> + */
>>>>>> + if ( is_hardware_domain(d) )
>>>>>> + {
>>>>>> + ASSERT(llc_coloring_enabled);
>>>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
>>>>
>>>> Piggying back on this comment. AFAICT, the code below would work also in
>>>> the non cache coloring case. So what's the assert is for?
>>>>
>>>>>
>>>>>> +
>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>>>>> + NR_MEM_BANKS);
>>>>>> + if ( !hwdom_free_mem )
>>>>>> + goto fail;
>>>>>> +
>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>>>>> +
>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
>>>>> change the comments inside the function. The problem is that the function is specifically designed
>>>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
>>>>> region allocated, etc.
>>>>
>>>> So I agree that the function should be updated if we plan to use it for
>>>> other purpose.
>>>>
>>>> My opinion is that we should attempt to make the function generic so
>>>> that in your
>>>>> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
>>>>>
>>>>> My very short attempt to make the function as generic as possible in the first iteration:
>>>>> https://paste.debian.net/1338334/
>>>>
>>>> This looks better, but I wonder why we need still need to exclude the
>>>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
>>> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
>>> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
>>
>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
>> they would still get excluded for cache coloring case.
>>
>>>
>>> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
>>> patch for Carlo series.
>>
>> I am fine with the approach.
>>
>> Cheers,
>>
>> --
>> Julien Grall
>>
>
>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
>> /* type must be set before allocate_memory */
>> d->arch.type = kinfo.type;
>> #endif
>> - allocate_memory_11(d, &kinfo);
>> + if ( is_domain_direct_mapped(d) )
>> + allocate_memory_11(d, &kinfo);
>> + else
>> + allocate_memory(d, &kinfo);
>> find_gnttab_region(d, &kinfo);
>
> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
> fields aren't initialized and the call to find_unallocated_memory() with
> gnttab as the region to exclude, fails ending in a crash since memory for
> dom0 can't be allocated.
>
> Can the solution be to call find_gnttab_region() before the above if?
The function is named "find", but currently it only initializes kinfo->gnttab_start
and kinfo->gnttab_size, and we tested that moving it before allocate_memory*() doesn't
cause any fallout.
If we move it before allocate_memory*(), would it be better to rename it, e.g. to init_gnttab_region()?
Thanks,
Andrea
> Or should I just call it before allocate_memory() in one case, but still after
> allocate_memory_11() in the other?
>
> Thanks.
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-12 18:22 ` Andrea Bastoni
@ 2024-12-13 9:45 ` Michal Orzel
2024-12-13 10:26 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Michal Orzel @ 2024-12-13 9:45 UTC (permalink / raw)
To: Andrea Bastoni, Carlo Nonato, Julien Grall
Cc: xen-devel, marco.solieri, Stefano Stabellini, Bertrand Marquis,
Volodymyr Babchuk
Hi Carlo, Andrea,
On 12/12/2024 19:22, Andrea Bastoni wrote:
>
>
> On 12/12/2024 18:48, Carlo Nonato wrote:
>> Hi,
>>
>> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
>>>
>>> Hi Michal,
>>>
>>> On 07/12/2024 15:04, Michal Orzel wrote:
>>>>
>>>>
>>>> On 06/12/2024 19:37, Julien Grall wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Sorry for the late answer.
>>>>>
>>>>> On 05/12/2024 09:40, Michal Orzel wrote:
>>>>>>
>>>>>>
>>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>>>>>
>>>>>>>
>>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>>>>>> domain_build.h.
>>>>>>>
>>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>>>>>> that it uses the host layout when called on the hwdom, via
>>>>>>> find_unallocated_memory().
>>>>>>>
>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>>>> ---
>>>>>>> v11:
>>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>>>>>> - added a comment in allocate_memory() when skipping small banks
>>>>>>> v10:
>>>>>>> - fixed a compilation bug that happened when dom0less support was disabled
>>>>>>> v9:
>>>>>>> - no changes
>>>>>>> v8:
>>>>>>> - patch adapted to new changes to allocate_memory()
>>>>>>> v7:
>>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
>>>>>>> v6:
>>>>>>> - new patch
>>>>>>> ---
>>>>>>> xen/arch/arm/dom0less-build.c | 44 -----------
>>>>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
>>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>>>>>>
>>>>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>>>>>>> index d93a85434e..67b1503647 100644
>>>>>>> --- a/xen/arch/arm/dom0less-build.c
>>>>>>> +++ b/xen/arch/arm/dom0less-build.c
>>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>>>>>> return ( !dom0found && domUfound );
>>>>>>> }
>>>>>>>
>>>>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>>> -{
>>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>> - unsigned int i;
>>>>>>> - paddr_t bank_size;
>>>>>>> -
>>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>> -
>>>>>>> - mem->nr_banks = 0;
>>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>>>>>> - bank_size) )
>>>>>>> - goto fail;
>>>>>>> -
>>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>>>>>> - bank_size) )
>>>>>>> - goto fail;
>>>>>>> -
>>>>>>> - if ( kinfo->unassigned_mem )
>>>>>>> - goto fail;
>>>>>>> -
>>>>>>> - for( i = 0; i < mem->nr_banks; i++ )
>>>>>>> - {
>>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>>>>>>> - d,
>>>>>>> - i,
>>>>>>> - mem->bank[i].start,
>>>>>>> - mem->bank[i].start + mem->bank[i].size,
>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> - (unsigned long)(mem->bank[i].size >> 20));
>>>>>>> - }
>>>>>>> -
>>>>>>> - return;
>>>>>>> -
>>>>>>> -fail:
>>>>>>> - panic("Failed to allocate requested domain memory."
>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
>>>>>>> -}
>>>>>>> -
>>>>>>> #ifdef CONFIG_VGICV2
>>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>>>>>> {
>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>>> index 2c30792de8..2b8cba9b2f 100644
>>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>>>>>>> alloc_domheap_mem_cb cb, void *extra)
>>>>>>> {
>>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>>>>>>
>>>>>>> return true;
>>>>>>> }
>>>>>>> -#endif
>>>>>>>
>>>>>>> /*
>>>>>>> * When PCI passthrough is available we want to keep the
>>>>>>> @@ -1003,6 +1001,94 @@ out:
>>>>>>> return res;
>>>>>>> }
>>>>>>>
>>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>>> +{
>>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>>>>>> + paddr_t bank_start, bank_size;
>>>>>> Limit the scope
>>>>>>
>>>>>>> + struct membanks *hwdom_free_mem = NULL;
>>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>>>>> Limit the scope
>>>>>>
>>>>>>> +
>>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>> +
>>>>>>> + mem->nr_banks = 0;
>>>>>>> + /*
>>>>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>>>>>>> + * is enabled.
>>>>>>> + */
>>>>>>> + if ( is_hardware_domain(d) )
>>>>>>> + {
>>>>>>> + ASSERT(llc_coloring_enabled);
>>>>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
>>>>>
>>>>> Piggying back on this comment. AFAICT, the code below would work also in
>>>>> the non cache coloring case. So what's the assert is for?
>>>>>
>>>>>>
>>>>>>> +
>>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>>>>>> + NR_MEM_BANKS);
>>>>>>> + if ( !hwdom_free_mem )
>>>>>>> + goto fail;
>>>>>>> +
>>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>>>>>> +
>>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
>>>>>> change the comments inside the function. The problem is that the function is specifically designed
>>>>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
>>>>>> region allocated, etc.
>>>>>
>>>>> So I agree that the function should be updated if we plan to use it for
>>>>> other purpose.
>>>>>
>>>>> My opinion is that we should attempt to make the function generic so
>>>>> that in your
>>>>>> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
>>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
>>>>>>
>>>>>> My very short attempt to make the function as generic as possible in the first iteration:
>>>>>> https://paste.debian.net/1338334/
>>>>>
>>>>> This looks better, but I wonder why we need still need to exclude the
>>>>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
>>>> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
>>>> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
>>>
>>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
>>> they would still get excluded for cache coloring case.
>>>
>>>>
>>>> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
>>>> patch for Carlo series.
>>>
>>> I am fine with the approach.
>>>
>>> Cheers,
>>>
>>> --
>>> Julien Grall
>>>
>>
>>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
>>> /* type must be set before allocate_memory */
>>> d->arch.type = kinfo.type;
>>> #endif
>>> - allocate_memory_11(d, &kinfo);
>>> + if ( is_domain_direct_mapped(d) )
>>> + allocate_memory_11(d, &kinfo);
>>> + else
>>> + allocate_memory(d, &kinfo);
>>> find_gnttab_region(d, &kinfo);
>>
>> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
>> fields aren't initialized and the call to find_unallocated_memory() with
>> gnttab as the region to exclude, fails ending in a crash since memory for
>> dom0 can't be allocated.
>>
>> Can the solution be to call find_gnttab_region() before the above if?
>
> The function is called find, but currently it only initializes kinfo->gnttab_start
> and kinfo->gnttab_size and we tested that moving it before allocate_memory* doesn't
> cause fallouts.
>
> If moving before allocate_memory*, would it be better to rename it e.g., init_gnttab_region()?
>
> Thanks,
> Andrea
>
>> Or should I just call it before allocate_memory() in one case, but still after
>> allocate_memory_11() in the other?
>>
>> Thanks.
>
AFAICT there is nothing stopping us from moving find_gnttab_region() before allocate_*. This function initializes
the gnttab region with the PA of Xen. In the normal case, because Xen is added as a bootmodule, it will never be mapped in the dom0 memory map
and the placement does not matter. In the LLC case, it will point to the relocated address of Xen, which needs to be known before
calling find_unallocated_memory(). Don't rename it; leave it as is and just move it before allocate_*.
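To make the ordering dependency concrete, here is a minimal standalone sketch. It is not the Xen code: struct kinfo_sketch, the gnttab_valid flag, the sketch_* functions and the address/size constants are all hypothetical stand-ins, chosen only to show why the gnttab fields have to be filled in before the allocation path tries to exclude them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for Xen's struct kernel_info. */
struct kinfo_sketch {
    uint64_t gnttab_start;
    uint64_t gnttab_size;
    bool gnttab_valid;      /* illustration only, not a real Xen field */
};

/* Stand-in for find_gnttab_region(): it only fills in the gnttab fields. */
static void sketch_find_gnttab_region(struct kinfo_sketch *kinfo)
{
    kinfo->gnttab_start = 0x40000000ULL;   /* arbitrary example address */
    kinfo->gnttab_size = 0x20000ULL;       /* arbitrary example size */
    kinfo->gnttab_valid = true;
}

/*
 * Stand-in for the allocation path that excludes the gnttab region when
 * searching for free memory. It fails if the region is not known yet.
 */
static bool sketch_allocate_memory(const struct kinfo_sketch *kinfo)
{
    return kinfo->gnttab_valid && kinfo->gnttab_size != 0;
}

/* Runs the two steps in either order and reports whether allocation worked. */
static bool sketch_construct_dom0(struct kinfo_sketch *kinfo, bool gnttab_first)
{
    bool ok;

    assert(kinfo != NULL);

    if ( gnttab_first )
        sketch_find_gnttab_region(kinfo);

    ok = sketch_allocate_memory(kinfo);

    if ( !gnttab_first )
        sketch_find_gnttab_region(kinfo);

    return ok;
}
```

With gnttab_first set, the allocation step sees an initialized region and succeeds; with the original ordering it fails, which mirrors the crash reported above.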
@Carlo:
My prerequisite patch has been merged, so you're good to respin the series (unless you are waiting for some feedback, in which case do let me know).
To prevent too many respins: you're going to call find_unallocated_memory() for LLC, passing resmem and gnttab as the regions to exclude. If you're going
to reuse add_ext_regions, you need to rename it and fix the comments to make it more generic. As for the size, the decision is yours. One solution
would be to modify add_ext_regions to take the min bank size as a parameter (64MB for extended regions, X for LLC dom0). In your code, you write that
the first bank must contain dom0, dtb and ramdisk, and you chose 128MB. However, looking at the code, you seem to discard banks < 128MB for all the banks,
not only for the first one. This is the part for which I don't have a ready solution. Maybe you could define your own add_free_region function and sort
the banks, so that you take the largest possible bank first for dom0. This could simplify things.
You can also ask others for their opinion.
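As a rough illustration of the "sort the banks, largest first" idea, a standalone sketch could look like the following. Everything here is an assumption made for the example: struct bank_sketch, cmp_bank_size_desc() and sort_banks_largest_first() are hypothetical names, and the hosted C library qsort() is used only so that the snippet builds outside the hypervisor, where a different sorting helper would be needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified free-memory bank descriptor. */
struct bank_sketch {
    uint64_t start;
    uint64_t size;
};

/* qsort() comparator ordering banks by decreasing size. */
static int cmp_bank_size_desc(const void *a, const void *b)
{
    const struct bank_sketch *x = a;
    const struct bank_sketch *y = b;

    if ( x->size < y->size )
        return 1;
    if ( x->size > y->size )
        return -1;
    return 0;
}

/*
 * Sort the banks so that the largest one comes first, then check that
 * only this first bank satisfies the minimum size needed for the kernel,
 * dtb and ramdisk. Smaller banks are kept rather than discarded.
 */
static bool sort_banks_largest_first(struct bank_sketch *banks, size_t nr,
                                     uint64_t first_bank_min)
{
    assert(banks != NULL || nr == 0);

    if ( nr == 0 )
        return false;

    qsort(banks, nr, sizeof(*banks), cmp_bank_size_desc);

    return banks[0].size >= first_bank_min;
}
```

With this shape, the size requirement is enforced on the first bank only, so the remaining banks can still be used for the rest of the dom0 memory.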
We are approaching the Dec 20th deadline, and I want this series to be in, as it's been on the list for too many years. I'm willing to accept a sub-optimal solution
for now (so far it will be used only for LLC, and LLC, as an experimental feature, will be the only victim of a non-optimal algorithm), and we can think of a better one
after the release. But still, even the sub-optimal solution must make sense.
~Michal
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 9:45 ` Michal Orzel
@ 2024-12-13 10:26 ` Carlo Nonato
2024-12-13 10:56 ` Michal Orzel
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-13 10:26 UTC (permalink / raw)
To: Michal Orzel
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
Hi Michal,
On Fri, Dec 13, 2024 at 10:46 AM Michal Orzel <michal.orzel@amd.com> wrote:
>
> Hi Carlo, Andrea,
>
> On 12/12/2024 19:22, Andrea Bastoni wrote:
> >
> >
> > On 12/12/2024 18:48, Carlo Nonato wrote:
> >> Hi,
> >>
> >> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
> >>>
> >>> Hi Michal,
> >>>
> >>> On 07/12/2024 15:04, Michal Orzel wrote:
> >>>>
> >>>>
> >>>> On 06/12/2024 19:37, Julien Grall wrote:
> >>>>>
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Sorry for the late answer.
> >>>>>
> >>>>> On 05/12/2024 09:40, Michal Orzel wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
> >>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
> >>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> >>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
> >>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
> >>>>>>> domain_build.h.
> >>>>>>>
> >>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
> >>>>>>> that it uses the host layout when called on the hwdom, via
> >>>>>>> find_unallocated_memory().
> >>>>>>>
> >>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>>>>>> ---
> >>>>>>> v11:
> >>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
> >>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> >>>>>>> - added a comment in allocate_memory() when skipping small banks
> >>>>>>> v10:
> >>>>>>> - fixed a compilation bug that happened when dom0less support was disabled
> >>>>>>> v9:
> >>>>>>> - no changes
> >>>>>>> v8:
> >>>>>>> - patch adapted to new changes to allocate_memory()
> >>>>>>> v7:
> >>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
> >>>>>>> v6:
> >>>>>>> - new patch
> >>>>>>> ---
> >>>>>>> xen/arch/arm/dom0less-build.c | 44 -----------
> >>>>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
> >>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
> >>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> >>>>>>> index d93a85434e..67b1503647 100644
> >>>>>>> --- a/xen/arch/arm/dom0less-build.c
> >>>>>>> +++ b/xen/arch/arm/dom0less-build.c
> >>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
> >>>>>>> return ( !dom0found && domUfound );
> >>>>>>> }
> >>>>>>>
> >>>>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>>>>> -{
> >>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>>>>> - unsigned int i;
> >>>>>>> - paddr_t bank_size;
> >>>>>>> -
> >>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>>>>> -
> >>>>>>> - mem->nr_banks = 0;
> >>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> >>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> >>>>>>> - bank_size) )
> >>>>>>> - goto fail;
> >>>>>>> -
> >>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> >>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> >>>>>>> - bank_size) )
> >>>>>>> - goto fail;
> >>>>>>> -
> >>>>>>> - if ( kinfo->unassigned_mem )
> >>>>>>> - goto fail;
> >>>>>>> -
> >>>>>>> - for( i = 0; i < mem->nr_banks; i++ )
> >>>>>>> - {
> >>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> >>>>>>> - d,
> >>>>>>> - i,
> >>>>>>> - mem->bank[i].start,
> >>>>>>> - mem->bank[i].start + mem->bank[i].size,
> >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>> - (unsigned long)(mem->bank[i].size >> 20));
> >>>>>>> - }
> >>>>>>> -
> >>>>>>> - return;
> >>>>>>> -
> >>>>>>> -fail:
> >>>>>>> - panic("Failed to allocate requested domain memory."
> >>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
> >>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
> >>>>>>> -}
> >>>>>>> -
> >>>>>>> #ifdef CONFIG_VGICV2
> >>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> >>>>>>> {
> >>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>>>>>> index 2c30792de8..2b8cba9b2f 100644
> >>>>>>> --- a/xen/arch/arm/domain_build.c
> >>>>>>> +++ b/xen/arch/arm/domain_build.c
> >>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
> >>>>>>> }
> >>>>>>> }
> >>>>>>>
> >>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
> >>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> >>>>>>> alloc_domheap_mem_cb cb, void *extra)
> >>>>>>> {
> >>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> >>>>>>>
> >>>>>>> return true;
> >>>>>>> }
> >>>>>>> -#endif
> >>>>>>>
> >>>>>>> /*
> >>>>>>> * When PCI passthrough is available we want to keep the
> >>>>>>> @@ -1003,6 +1001,94 @@ out:
> >>>>>>> return res;
> >>>>>>> }
> >>>>>>>
> >>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>>>>> +{
> >>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> >>>>>>> + paddr_t bank_start, bank_size;
> >>>>>> Limit the scope
> >>>>>>
> >>>>>>> + struct membanks *hwdom_free_mem = NULL;
> >>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> >>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> >>>>>> Limit the scope
> >>>>>>
> >>>>>>> +
> >>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>>>>> +
> >>>>>>> + mem->nr_banks = 0;
> >>>>>>> + /*
> >>>>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> >>>>>>> + * is enabled.
> >>>>>>> + */
> >>>>>>> + if ( is_hardware_domain(d) )
> >>>>>>> + {
> >>>>>>> + ASSERT(llc_coloring_enabled);
> >>>>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
> >>>>>
> >>>>> Piggying back on this comment. AFAICT, the code below would work also in
> >>>>> the non cache coloring case. So what's the assert is for?
> >>>>>
> >>>>>>
> >>>>>>> +
> >>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> >>>>>>> + NR_MEM_BANKS);
> >>>>>>> + if ( !hwdom_free_mem )
> >>>>>>> + goto fail;
> >>>>>>> +
> >>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> >>>>>>> +
> >>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
> >>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
> >>>>>> change the comments inside the function. The problem is that the function is specifically designed
> >>>>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
> >>>>>> region allocated, etc.
> >>>>>
> >>>>> So I agree that the function should be updated if we plan to use it for
> >>>>> other purpose.
> >>>>>
> >>>>> My opinion is that we should attempt to make the function generic so
> >>>>> that in your
> >>>>>> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
> >>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
> >>>>>>
> >>>>>> My very short attempt to make the function as generic as possible in the first iteration:
> >>>>>> https://paste.debian.net/1338334/
> >>>>>
> >>>>> This looks better, but I wonder why we need still need to exclude the
> >>>>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
> >>>> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
> >>>> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
> >>>
> >>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
> >>> they would still get excluded for cache coloring case.
> >>>
> >>>>
> >>>> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
> >>>> patch for Carlo series.
> >>>
> >>> I am fine with the approach.
> >>>
> >>> Cheers,
> >>>
> >>> --
> >>> Julien Grall
> >>>
> >>
> >>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
> >>> /* type must be set before allocate_memory */
> >>> d->arch.type = kinfo.type;
> >>> #endif
> >>> - allocate_memory_11(d, &kinfo);
> >>> + if ( is_domain_direct_mapped(d) )
> >>> + allocate_memory_11(d, &kinfo);
> >>> + else
> >>> + allocate_memory(d, &kinfo);
> >>> find_gnttab_region(d, &kinfo);
> >>
> >> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
> >> fields aren't initialized and the call to find_unallocated_memory() with
> >> gnttab as the region to exclude, fails ending in a crash since memory for
> >> dom0 can't be allocated.
> >>
> >> Can the solution be to call find_gnttab_region() before the above if?
> >
> > The function is called find, but currently it only initializes kinfo->gnttab_start
> > and kinfo->gnttab_size and we tested that moving it before allocate_memory* doesn't
> > cause fallouts.
> >
> > If moving before allocate_memory*, would it be better to rename it e.g., init_gnttab_region()?
> >
> > Thanks,
> > Andrea
> >
> >> Or should I just call it before allocate_memory() in one case, but still after
> >> allocate_memory_11() in the other?
> >>
> >> Thanks.
> >
>
> AFAICT there is nothing stopping us from moving find_gnttab_region() before allocate_*. This function initializes
> gnttab region with PA of Xen. In normal case, because Xen is added as bootmodule, it will never be mapped in dom0 memory map
> and the placement does not matter. In LLC case, it will point to relocated address of Xen and it needs to be known before
> calling find_unallocated_memory. Don't rename it, leave as is, just move before allocate_*.
>
> @Carlo:
> My prerequisite patch has been merged, so you're good to respin a series (unless you wait for some feedback in which case do let me know).
> To prevent too many respins, you're going to call find_unallocated_memory for LLC passing resmem and gnttab to be excluded. If you're going
> to reuse add_ext_regions, you need to rename it and fix comments to make it more generic. As for the size, the decision is yours. One solution
> would be to modify add_ext_regions to take min bank size as parameter (64MB for extended regions, X for LLC dom0). In your code, you write that
> the first bank must contain dom0, dtb, ramdisk and you chose 128MB. However, looking at the code, you seem to discard banks < 128 for all the banks,
> not only for the first one. This is the part that I don't have a ready solution. Maybe you could define your own add_free_region function and sort
> the banks, so that you take the largest possible bank first for dom0. This could simplify things.
For the moment I added a __add_ext_regions() helper that also takes a skip_size
parameter. It is called by add_ext_regions() and by a new
add_hwdom_free_regions() callback used in allocate_memory().
I still use 128MB for all the banks. Do you think this is acceptable, maybe
with a FIXME comment noting that the size check should apply only to the first bank, or not?
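Roughly, the shape described above is the following. This is a simplified standalone sketch, not the code from the series: the types, the *_sketch names, the MB_SKETCH() macro and the byte-address parameters are assumptions made for illustration, and the real callbacks in domain_build.c have their own signature and alignment handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_SKETCH(n)      ((uint64_t)(n) << 20)
#define MAX_BANKS_SKETCH  8

/* Hypothetical, simplified versions of the bank containers. */
struct free_bank_sketch {
    uint64_t start;
    uint64_t size;
};

struct membanks_sketch {
    size_t nr_banks;
    struct free_bank_sketch bank[MAX_BANKS_SKETCH];
};

/*
 * Common helper: record [start, end) as a bank unless it is smaller than
 * skip_size. Returns 0 on success or skip, -1 when the array is full.
 */
static int add_regions_common_sketch(uint64_t start, uint64_t end,
                                     struct membanks_sketch *out,
                                     uint64_t skip_size)
{
    assert(out != NULL);

    if ( end <= start || end - start < skip_size )
        return 0;

    if ( out->nr_banks >= MAX_BANKS_SKETCH )
        return -1;

    out->bank[out->nr_banks].start = start;
    out->bank[out->nr_banks].size = end - start;
    out->nr_banks++;

    return 0;
}

/* Callback for extended regions: skips regions smaller than 64MB. */
static int add_ext_regions_sketch(uint64_t start, uint64_t end, void *data)
{
    return add_regions_common_sketch(start, end, data, MB_SKETCH(64));
}

/*
 * Callback for hwdom free memory: skips regions smaller than 128MB.
 * FIXME: the threshold is applied to every bank, not only the first one.
 */
static int add_hwdom_free_regions_sketch(uint64_t start, uint64_t end,
                                         void *data)
{
    return add_regions_common_sketch(start, end, data, MB_SKETCH(128));
}
```

The two callbacks differ only in the threshold they pass down, so a 100MB region is kept as an extended region but skipped for the hwdom.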
> You can also ask others for opinion.
>
> We are approaching Dec 20th deadline, and I want this series to be in as it's been on the list for too many years. I'm willing to accept a sub-optimal solution
> (so far will be used only for LLC, and LLC as experimental feature will be the only victim of not optimal algorithm) for now, and we can think of a better one
> after the release. But still, even the sub-optimal solution must make sense.
>
> ~Michal
>
Thanks.
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 10:26 ` Carlo Nonato
@ 2024-12-13 10:56 ` Michal Orzel
2024-12-13 11:30 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Michal Orzel @ 2024-12-13 10:56 UTC (permalink / raw)
To: Carlo Nonato
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
On 13/12/2024 11:26, Carlo Nonato wrote:
>
>
> Hi Michal,
>
> On Fri, Dec 13, 2024 at 10:46 AM Michal Orzel <michal.orzel@amd.com> wrote:
>>
>> Hi Carlo, Andrea,
>>
>> On 12/12/2024 19:22, Andrea Bastoni wrote:
>>>
>>>
>>> On 12/12/2024 18:48, Carlo Nonato wrote:
>>>> Hi,
>>>>
>>>> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
>>>>>
>>>>> Hi Michal,
>>>>>
>>>>> On 07/12/2024 15:04, Michal Orzel wrote:
>>>>>>
>>>>>>
>>>>>> On 06/12/2024 19:37, Julien Grall wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Sorry for the late answer.
>>>>>>>
>>>>>>> On 05/12/2024 09:40, Michal Orzel wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
>>>>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
>>>>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
>>>>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
>>>>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
>>>>>>>>> domain_build.h.
>>>>>>>>>
>>>>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
>>>>>>>>> that it uses the host layout when called on the hwdom, via
>>>>>>>>> find_unallocated_memory().
>>>>>>>>>
>>>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>>>>>> ---
>>>>>>>>> v11:
>>>>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
>>>>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
>>>>>>>>> - added a comment in allocate_memory() when skipping small banks
>>>>>>>>> v10:
>>>>>>>>> - fixed a compilation bug that happened when dom0less support was disabled
>>>>>>>>> v9:
>>>>>>>>> - no changes
>>>>>>>>> v8:
>>>>>>>>> - patch adapted to new changes to allocate_memory()
>>>>>>>>> v7:
>>>>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
>>>>>>>>> v6:
>>>>>>>>> - new patch
>>>>>>>>> ---
>>>>>>>>> xen/arch/arm/dom0less-build.c | 44 -----------
>>>>>>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
>>>>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
>>>>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
>>>>>>>>> index d93a85434e..67b1503647 100644
>>>>>>>>> --- a/xen/arch/arm/dom0less-build.c
>>>>>>>>> +++ b/xen/arch/arm/dom0less-build.c
>>>>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
>>>>>>>>> return ( !dom0found && domUfound );
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>>>>> -{
>>>>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>>>> - unsigned int i;
>>>>>>>>> - paddr_t bank_size;
>>>>>>>>> -
>>>>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>>>> -
>>>>>>>>> - mem->nr_banks = 0;
>>>>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
>>>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
>>>>>>>>> - bank_size) )
>>>>>>>>> - goto fail;
>>>>>>>>> -
>>>>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
>>>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
>>>>>>>>> - bank_size) )
>>>>>>>>> - goto fail;
>>>>>>>>> -
>>>>>>>>> - if ( kinfo->unassigned_mem )
>>>>>>>>> - goto fail;
>>>>>>>>> -
>>>>>>>>> - for( i = 0; i < mem->nr_banks; i++ )
>>>>>>>>> - {
>>>>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
>>>>>>>>> - d,
>>>>>>>>> - i,
>>>>>>>>> - mem->bank[i].start,
>>>>>>>>> - mem->bank[i].start + mem->bank[i].size,
>>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>>>> - (unsigned long)(mem->bank[i].size >> 20));
>>>>>>>>> - }
>>>>>>>>> -
>>>>>>>>> - return;
>>>>>>>>> -
>>>>>>>>> -fail:
>>>>>>>>> - panic("Failed to allocate requested domain memory."
>>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
>>>>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
>>>>>>>>> -}
>>>>>>>>> -
>>>>>>>>> #ifdef CONFIG_VGICV2
>>>>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
>>>>>>>>> {
>>>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>>>>> index 2c30792de8..2b8cba9b2f 100644
>>>>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
>>>>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
>>>>>>>>> alloc_domheap_mem_cb cb, void *extra)
>>>>>>>>> {
>>>>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
>>>>>>>>>
>>>>>>>>> return true;
>>>>>>>>> }
>>>>>>>>> -#endif
>>>>>>>>>
>>>>>>>>> /*
>>>>>>>>> * When PCI passthrough is available we want to keep the
>>>>>>>>> @@ -1003,6 +1001,94 @@ out:
>>>>>>>>> return res;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
>>>>>>>>> +{
>>>>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
>>>>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
>>>>>>>>> + paddr_t bank_start, bank_size;
>>>>>>>> Limit the scope
>>>>>>>>
>>>>>>>>> + struct membanks *hwdom_free_mem = NULL;
>>>>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>>>>>>> Limit the scope
>>>>>>>>
>>>>>>>>> +
>>>>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
>>>>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
>>>>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
>>>>>>>>> +
>>>>>>>>> + mem->nr_banks = 0;
>>>>>>>>> + /*
>>>>>>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
>>>>>>>>> + * is enabled.
>>>>>>>>> + */
>>>>>>>>> + if ( is_hardware_domain(d) )
>>>>>>>>> + {
>>>>>>>>> + ASSERT(llc_coloring_enabled);
>>>>>>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
>>>>>>>
>>>>>>> Piggying back on this comment. AFAICT, the code below would work also in
>>>>>>> the non cache coloring case. So what's the assert is for?
>>>>>>>
>>>>>>>>
>>>>>>>>> +
>>>>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
>>>>>>>>> + NR_MEM_BANKS);
>>>>>>>>> + if ( !hwdom_free_mem )
>>>>>>>>> + goto fail;
>>>>>>>>> +
>>>>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
>>>>>>>>> +
>>>>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
>>>>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
>>>>>>>> change the comments inside the function. The problem is that the function is specifically designed
>>>>>>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
>>>>>>>> region allocated, etc.
>>>>>>>
>>>>>>> So I agree that the function should be updated if we plan to use it for
>>>>>>> other purpose.
>>>>>>>
>>>>>>> My opinion is that we should attempt to make the function generic so
>>>>>>> that in your
>>>>>>>> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
>>>>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
>>>>>>>>
>>>>>>>> My very short attempt to make the function as generic as possible in the first iteration:
>>>>>>>> https://paste.debian.net/1338334/
>>>>>>>
>>>>>>> This looks better, but I wonder why we need still need to exclude the
>>>>>>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
>>>>>> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
>>>>>> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
>>>>>
>>>>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
>>>>> they would still get excluded for cache coloring case.
>>>>>
>>>>>>
>>>>>> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
>>>>>> patch for Carlo series.
>>>>>
>>>>> I am fine with the approach.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> --
>>>>> Julien Grall
>>>>>
>>>>
>>>>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
>>>>> /* type must be set before allocate_memory */
>>>>> d->arch.type = kinfo.type;
>>>>> #endif
>>>>> - allocate_memory_11(d, &kinfo);
>>>>> + if ( is_domain_direct_mapped(d) )
>>>>> + allocate_memory_11(d, &kinfo);
>>>>> + else
>>>>> + allocate_memory(d, &kinfo);
>>>>> find_gnttab_region(d, &kinfo);
>>>>
>>>> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
>>>> fields aren't initialized and the call to find_unallocated_memory() with
>>>> gnttab as the region to exclude, fails ending in a crash since memory for
>>>> dom0 can't be allocated.
>>>>
>>>> Can the solution be to call find_gnttab_region() before the above if?
>>>
>>> The function is called find, but currently it only initializes kinfo->gnttab_start
>>> and kinfo->gnttab_size and we tested that moving it before allocate_memory* doesn't
>>> cause fallouts.
>>>
>>> If moving before allocate_memory*, would it be better to rename it e.g., init_gnttab_region()?
>>>
>>> Thanks,
>>> Andrea
>>>
>>>> Or should I just call it before allocate_memory() in one case, but still after
>>>> allocate_memory_11() in the other?
>>>>
>>>> Thanks.
>>>
>>
>> AFAICT there is nothing stopping us from moving find_gnttab_region() before allocate_*. This function initializes
>> gnttab region with PA of Xen. In normal case, because Xen is added as bootmodule, it will never be mapped in dom0 memory map
>> and the placement does not matter. In LLC case, it will point to relocated address of Xen and it needs to be known before
>> calling find_unallocated_memory. Don't rename it, leave as is, just move before allocate_*.
>>
>> @Carlo:
>> My prerequisite patch has been merged, so you're good to respin a series (unless you wait for some feedback in which case do let me know).
>> To prevent too many respins, you're going to call find_unallocated_memory for LLC passing resmem and gnttab to be excluded. If you're going
>> to reuse add_ext_regions, you need to rename it and fix comments to make it more generic. As for the size, the decision is yours. One solution
>> would be to modify add_ext_regions to take min bank size as parameter (64MB for extended regions, X for LLC dom0). In your code, you write that
>> the first bank must contain dom0, dtb, ramdisk and you chose 128MB. However, looking at the code, you seem to discard banks < 128 for all the banks,
>> not only for the first one. This is the part that I don't have a ready solution. Maybe you could define your own add_free_region function and sort
>> the banks, so that you take the largest possible bank first for dom0. This could simplify things.
>
> For the moment I added a __add_ext_regions() helper that also takes a skip_size
I'm not sure if MISRA and our guidelines are happy with prefixing a function name with __.
I don't understand the skip_size parameter. In which scenario do you want to use it? Not for
extended regions, and for LLC, even with your current solution, you also want to find banks bigger than
some size.
> parameter. This is called by add_ext_regions() and by a new
> add_hwdom_free_regions() callback used in allocate_memory().
> I still use 128MB for all the banks. Do you think this is acceptable, maybe
> with a FIXME comment cause we should skip only the first bank, or not?
First of all, I'm not convinced by 128MB. This is definitely not a requirement for arm64.
allocate_memory_11 uses it, but its algorithm for finding banks is completely different.
AFAICT, with my suggested solution, i.e. sorting banks in a helper like add_ext_regions used only
for the LLC case, you no longer need to worry about size. You simply start with the biggest possible bank
as the first bank.
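The sorting idea can be sketched as follows. This is only an illustration
under stated assumptions: struct bank is a simplified stand-in for the entries
of struct membanks, and libc qsort() is used so the snippet builds on its own;
hypervisor code would need to use whatever sort helper Xen provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a membanks entry (assumption, not the Xen type). */
typedef uint64_t paddr_t;

struct bank { paddr_t start, size; };

/* qsort() comparator: larger banks sort first. */
static int cmp_bank_size_desc(const void *a, const void *b)
{
    const struct bank *x = a, *y = b;

    if ( x->size == y->size )
        return 0;

    return x->size > y->size ? -1 : 1;
}

/* Reorder the free banks so that bank[0] is the largest one found. */
static void sort_banks_largest_first(struct bank *banks, size_t nr)
{
    qsort(banks, nr, sizeof(*banks), cmp_bank_size_desc);
}
```

With the banks ordered this way, the allocation loop fills the largest bank
first, so no per-bank minimum size is needed to guarantee room for kernel,
dtb and ramdisk as long as any bank is big enough.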
~Michal
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 10:56 ` Michal Orzel
@ 2024-12-13 11:30 ` Carlo Nonato
2024-12-13 11:33 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-13 11:30 UTC (permalink / raw)
To: Michal Orzel
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
On Fri, Dec 13, 2024 at 11:56 AM Michal Orzel <michal.orzel@amd.com> wrote:
>
>
>
> On 13/12/2024 11:26, Carlo Nonato wrote:
> >
> >
> > Hi Michal,
> >
> > On Fri, Dec 13, 2024 at 10:46 AM Michal Orzel <michal.orzel@amd.com> wrote:
> >>
> >> Hi Carlo, Andrea,
> >>
> >> On 12/12/2024 19:22, Andrea Bastoni wrote:
> >>>
> >>>
> >>> On 12/12/2024 18:48, Carlo Nonato wrote:
> >>>> Hi,
> >>>>
> >>>> On Mon, Dec 9, 2024 at 8:17 PM Julien Grall <julien@xen.org> wrote:
> >>>>>
> >>>>> Hi Michal,
> >>>>>
> >>>>> On 07/12/2024 15:04, Michal Orzel wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 06/12/2024 19:37, Julien Grall wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Sorry for the late answer.
> >>>>>>>
> >>>>>>> On 05/12/2024 09:40, Michal Orzel wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 02/12/2024 17:59, Carlo Nonato wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Cache coloring requires Dom0 not to be direct-mapped because of its non
> >>>>>>>>> contiguous mapping nature, so allocate_memory() is needed in this case.
> >>>>>>>>> 8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
> >>>>>>>>> moved allocate_memory() in dom0less_build.c. In order to use it
> >>>>>>>>> in Dom0 construction bring it back to domain_build.c and declare it in
> >>>>>>>>> domain_build.h.
> >>>>>>>>>
> >>>>>>>>> Take the opportunity to adapt the implementation of allocate_memory() so
> >>>>>>>>> that it uses the host layout when called on the hwdom, via
> >>>>>>>>> find_unallocated_memory().
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>>>>>>>> ---
> >>>>>>>>> v11:
> >>>>>>>>> - GUEST_RAM_BANKS instead of hardcoding the number of banks in allocate_memory()
> >>>>>>>>> - hwdom_ext_regions -> hwdom_free_mem in allocate_memory()
> >>>>>>>>> - added a comment in allocate_memory() when skipping small banks
> >>>>>>>>> v10:
> >>>>>>>>> - fixed a compilation bug that happened when dom0less support was disabled
> >>>>>>>>> v9:
> >>>>>>>>> - no changes
> >>>>>>>>> v8:
> >>>>>>>>> - patch adapted to new changes to allocate_memory()
> >>>>>>>>> v7:
> >>>>>>>>> - allocate_memory() now uses the host layout when called on the hwdom
> >>>>>>>>> v6:
> >>>>>>>>> - new patch
> >>>>>>>>> ---
> >>>>>>>>> xen/arch/arm/dom0less-build.c | 44 -----------
> >>>>>>>>> xen/arch/arm/domain_build.c | 97 ++++++++++++++++++++++++-
> >>>>>>>>> xen/arch/arm/include/asm/domain_build.h | 1 +
> >>>>>>>>> 3 files changed, 94 insertions(+), 48 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> >>>>>>>>> index d93a85434e..67b1503647 100644
> >>>>>>>>> --- a/xen/arch/arm/dom0less-build.c
> >>>>>>>>> +++ b/xen/arch/arm/dom0less-build.c
> >>>>>>>>> @@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
> >>>>>>>>> return ( !dom0found && domUfound );
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> -static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>>>>>>> -{
> >>>>>>>>> - struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>>>>>>> - unsigned int i;
> >>>>>>>>> - paddr_t bank_size;
> >>>>>>>>> -
> >>>>>>>>> - printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>>>> - (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>>>>>>> -
> >>>>>>>>> - mem->nr_banks = 0;
> >>>>>>>>> - bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
> >>>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
> >>>>>>>>> - bank_size) )
> >>>>>>>>> - goto fail;
> >>>>>>>>> -
> >>>>>>>>> - bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> >>>>>>>>> - if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> >>>>>>>>> - bank_size) )
> >>>>>>>>> - goto fail;
> >>>>>>>>> -
> >>>>>>>>> - if ( kinfo->unassigned_mem )
> >>>>>>>>> - goto fail;
> >>>>>>>>> -
> >>>>>>>>> - for( i = 0; i < mem->nr_banks; i++ )
> >>>>>>>>> - {
> >>>>>>>>> - printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
> >>>>>>>>> - d,
> >>>>>>>>> - i,
> >>>>>>>>> - mem->bank[i].start,
> >>>>>>>>> - mem->bank[i].start + mem->bank[i].size,
> >>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>>>> - (unsigned long)(mem->bank[i].size >> 20));
> >>>>>>>>> - }
> >>>>>>>>> -
> >>>>>>>>> - return;
> >>>>>>>>> -
> >>>>>>>>> -fail:
> >>>>>>>>> - panic("Failed to allocate requested domain memory."
> >>>>>>>>> - /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>>>> - " %ldKB unallocated. Fix the VMs configurations.\n",
> >>>>>>>>> - (unsigned long)kinfo->unassigned_mem >> 10);
> >>>>>>>>> -}
> >>>>>>>>> -
> >>>>>>>>> #ifdef CONFIG_VGICV2
> >>>>>>>>> static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
> >>>>>>>>> {
> >>>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>>>>>>>> index 2c30792de8..2b8cba9b2f 100644
> >>>>>>>>> --- a/xen/arch/arm/domain_build.c
> >>>>>>>>> +++ b/xen/arch/arm/domain_build.c
> >>>>>>>>> @@ -416,7 +416,6 @@ static void __init allocate_memory_11(struct domain *d,
> >>>>>>>>> }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> -#ifdef CONFIG_DOM0LESS_BOOT
> >>>>>>>>> bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
> >>>>>>>>> alloc_domheap_mem_cb cb, void *extra)
> >>>>>>>>> {
> >>>>>>>>> @@ -508,7 +507,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
> >>>>>>>>>
> >>>>>>>>> return true;
> >>>>>>>>> }
> >>>>>>>>> -#endif
> >>>>>>>>>
> >>>>>>>>> /*
> >>>>>>>>> * When PCI passthrough is available we want to keep the
> >>>>>>>>> @@ -1003,6 +1001,94 @@ out:
> >>>>>>>>> return res;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> +void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
> >>>>>>>>> +{
> >>>>>>>>> + struct membanks *mem = kernel_info_get_mem(kinfo);
> >>>>>>>>> + unsigned int i, nr_banks = GUEST_RAM_BANKS;
> >>>>>>>>> + paddr_t bank_start, bank_size;
> >>>>>>>> Limit the scope
> >>>>>>>>
> >>>>>>>>> + struct membanks *hwdom_free_mem = NULL;
> >>>>>>>>> + const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> >>>>>>>>> + const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> >>>>>>>> Limit the scope
> >>>>>>>>
> >>>>>>>>> +
> >>>>>>>>> + printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
> >>>>>>>>> + /* Don't want format this as PRIpaddr (16 digit hex) */
> >>>>>>>>> + (unsigned long)(kinfo->unassigned_mem >> 20), d);
> >>>>>>>>> +
> >>>>>>>>> + mem->nr_banks = 0;
> >>>>>>>>> + /*
> >>>>>>>>> + * Use host memory layout for hwdom. Only case for this is when LLC coloring
> >>>>>>>>> + * is enabled.
> >>>>>>>>> + */
> >>>>>>>>> + if ( is_hardware_domain(d) )
> >>>>>>>>> + {
> >>>>>>>>> + ASSERT(llc_coloring_enabled);
> >>>>>>>> This patch does not build because of declaration not being visible. You must include <xen/llc-coloring.h>.
> >>>>>>>
> >>>>>>> Piggying back on this comment. AFAICT, the code below would work also in
> >>>>>>> the non cache coloring case. So what's the assert is for?
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> +
> >>>>>>>>> + hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
> >>>>>>>>> + NR_MEM_BANKS);
> >>>>>>>>> + if ( !hwdom_free_mem )
> >>>>>>>>> + goto fail;
> >>>>>>>>> +
> >>>>>>>>> + hwdom_free_mem->max_banks = NR_MEM_BANKS;
> >>>>>>>>> +
> >>>>>>>>> + if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
> >>>>>>>> My remarks for the use of find_unallocated_memory() 1:1 have not been addressed. You did not even
> >>>>>>>> change the comments inside the function. The problem is that the function is specifically designed
> >>>>>>>> for finding extended regions and assumes being called at certain point i.e. dom0 RAM allocated, gnttab
> >>>>>>>> region allocated, etc.
> >>>>>>>
> >>>>>>> So I agree that the function should be updated if we plan to use it for
> >>>>>>> other purpose.
> >>>>>>>
> >>>>>>> My opinion is that we should attempt to make the function generic so
> >>>>>>> that in your
> >>>>>>>> case you can choose which regions to exclude, define even your own function to grab free regions (at the moment
> >>>>>>>> add_ext_regions grabs banks >= 64M but you still discards banks >= 128M, so it's a bit wasteful.
> >>>>>>>>
> >>>>>>>> My very short attempt to make the function as generic as possible in the first iteration:
> >>>>>>>> https://paste.debian.net/1338334/
> >>>>>>>
> >>>>>>> This looks better, but I wonder why we need still need to exclude the
> >>>>>>> static regions? Wouldn't it be sufficient to exclude just reserved regions?
> >>>>>> Static shared memory banks are not part of reserved memory (i.e. bootinfo.reserved_mem) if that's what you're asking.
> >>>>>> They are stored in bootinfo.shmem, hence we need to take them into account when searching for unused address space.
> >>>>>
> >>>>> Oh I missed the fact you now pass "mem_banks" as a parameter. I thought
> >>>>> they would still get excluded for cache coloring case.
> >>>>>
> >>>>>>
> >>>>>> If you and Carlo are ok with my proposed solution for making the function generic, I can send a patch as a prerequisite
> >>>>>> patch for Carlo series.
> >>>>>
> >>>>> I am fine with the approach.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> --
> >>>>> Julien Grall
> >>>>>
> >>>>
> >>>>> @@ -2152,7 +2238,10 @@ static int __init construct_dom0(struct domain *d)
> >>>>> /* type must be set before allocate_memory */
> >>>>> d->arch.type = kinfo.type;
> >>>>> #endif
> >>>>> - allocate_memory_11(d, &kinfo);
> >>>>> + if ( is_domain_direct_mapped(d) )
> >>>>> + allocate_memory_11(d, &kinfo);
> >>>>> + else
> >>>>> + allocate_memory(d, &kinfo);
> >>>>> find_gnttab_region(d, &kinfo);
> >>>>
> >>>> Since find_gnttab_region() is called after allocate_memory(), kinfo->gnttab_*
> >>>> fields aren't initialized and the call to find_unallocated_memory() with
> >>>> gnttab as the region to exclude, fails ending in a crash since memory for
> >>>> dom0 can't be allocated.
> >>>>
> >>>> Can the solution be to call find_gnttab_region() before the above if?
> >>>
> >>> The function is called find, but currently it only initializes kinfo->gnttab_start
> >>> and kinfo->gnttab_size and we tested that moving it before allocate_memory* doesn't
> >>> cause fallouts.
> >>>
> >>> If moving before allocate_memory*, would it be better to rename it e.g., init_gnttab_region()?
> >>>
> >>> Thanks,
> >>> Andrea
> >>>
> >>>> Or should I just call it before allocate_memory() in one case, but still after
> >>>> allocate_memory_11() in the other?
> >>>>
> >>>> Thanks.
> >>>
> >>
> >> AFAICT there is nothing stopping us from moving find_gnttab_region() before allocate_*. This function initializes
> >> gnttab region with PA of Xen. In normal case, because Xen is added as bootmodule, it will never be mapped in dom0 memory map
> >> and the placement does not matter. In LLC case, it will point to relocated address of Xen and it needs to be known before
> >> calling find_unallocated_memory. Don't rename it, leave as is, just move before allocate_*.
> >>
> >> @Carlo:
> >> My prerequisite patch has been merged, so you're good to respin a series (unless you wait for some feedback in which case do let me know).
> >> To prevent too many respins, you're going to call find_unallocated_memory for LLC passing resmem and gnttab to be excluded. If you're going
> >> to reuse add_ext_regions, you need to rename it and fix comments to make it more generic. As for the size, the decision is yours. One solution
> >> would be to modify add_ext_regions to take min bank size as parameter (64MB for extended regions, X for LLC dom0). In your code, you write that
> >> the first bank must contain dom0, dtb, ramdisk and you chose 128MB. However, looking at the code, you seem to discard banks < 128 for all the banks,
> >> not only for the first one. This is the part that I don't have a ready solution. Maybe you could define your own add_free_region function and sort
> >> the banks, so that you take the largest possible bank first for dom0. This could simplify things.
> >
> > For the moment I added a __add_ext_regions() helper that also takes a skip_size
> I'm not sure if MISRA and our guidelines are happy with prefixing with function with __.
> I don't understand the skip_size parameter. In which scenario do you want to use it? Not for
> extended regions and for LLC, even with your current solution, you also want to find banks bigger than
> some size.
>
> > parameter. This is called by add_ext_regions() and by a new
> > add_hwdom_free_regions() callback used in allocate_memory().
> > I still use 128MB for all the banks. Do you think this is acceptable, maybe
> > with a FIXME comment cause we should skip only the first bank, or not?
> First of all, I'm not convinced with 128MB. This is definitely not a requirement for arm64.
> allocate_memory_11 uses it but the algorithm of finding banks is completely different.
>
> AFAICT, with my suggested solution i.e. sorting banks in a helper like add_ext_regions used only
> for LLC case, you no longer need to worry about size. You simply start with the biggest possible bank
> as the first bank.
>
> ~Michal
>
Here's my current patch:
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index d93a85434e..67b1503647 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -49,50 +49,6 @@ bool __init is_dom0less_mode(void)
return ( !dom0found && domUfound );
}
-static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
-{
- struct membanks *mem = kernel_info_get_mem(kinfo);
- unsigned int i;
- paddr_t bank_size;
-
- printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(kinfo->unassigned_mem >> 20), d);
-
- mem->nr_banks = 0;
- bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
- if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
- bank_size) )
- goto fail;
-
- bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
- if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
- bank_size) )
- goto fail;
-
- if ( kinfo->unassigned_mem )
- goto fail;
-
- for( i = 0; i < mem->nr_banks; i++ )
- {
- printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
- d,
- i,
- mem->bank[i].start,
- mem->bank[i].start + mem->bank[i].size,
- /* Don't want format this as PRIpaddr (16 digit hex) */
- (unsigned long)(mem->bank[i].size >> 20));
- }
-
- return;
-
-fail:
- panic("Failed to allocate requested domain memory."
- /* Don't want format this as PRIpaddr (16 digit hex) */
- " %ldKB unallocated. Fix the VMs configurations.\n",
- (unsigned long)kinfo->unassigned_mem >> 10);
-}
-
#ifdef CONFIG_VGICV2
static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
{
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index adf26f2778..59ac45c4e0 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2,6 +2,7 @@
#include <xen/init.h>
#include <xen/compile.h>
#include <xen/lib.h>
+#include <xen/llc-coloring.h>
#include <xen/mm.h>
#include <xen/param.h>
#include <xen/domain_page.h>
@@ -416,7 +417,6 @@ static void __init allocate_memory_11(struct domain *d,
}
}
-#ifdef CONFIG_DOM0LESS_BOOT
bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
alloc_domheap_mem_cb cb, void *extra)
{
@@ -508,7 +508,6 @@ bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
return true;
}
-#endif
/*
* When PCI passthrough is available we want to keep the
@@ -859,8 +858,8 @@ int __init make_memory_node(const struct kernel_info *kinfo, int addrcells,
return res;
}
-int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
- void *data)
+static int __init __add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
+ void *data, paddr_t skip_size)
{
struct membanks *ext_regions = data;
paddr_t start, size;
@@ -885,12 +884,7 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
e += 1;
size = (e - start) & ~(SZ_2M - 1);
- /*
- * Reasonable size. Not too little to pick up small ranges which are
- * not quite useful but increase bookkeeping and not too large
- * to skip a large proportion of unused address space.
- */
- if ( size < MB(64) )
+ if ( size < skip_size )
return 0;
ext_regions->bank[ext_regions->nr_banks].start = start;
@@ -900,6 +894,28 @@ int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
return 0;
}
+static int __init add_hwdom_free_regions(unsigned long s_gfn,
+ unsigned long e_gfn, void *data)
+{
+ /*
+ * Skip banks that are too small. The first bank must contain dom0 kernel +
+ * ramdisk + dtb and 128 MB is the same limit used in allocate_memory_11().
+ */
+ return __add_ext_regions(s_gfn, e_gfn, data, MB(128));
+}
+
+
+int __init add_ext_regions(unsigned long s_gfn, unsigned long e_gfn,
+ void *data)
+{
+ /*
+ * Reasonable size. Not too little to pick up small ranges which are
+ * not quite useful but increase bookkeeping and not too large
+ * to skip a large proportion of unused address space.
+ */
+ return __add_ext_regions(s_gfn, e_gfn, data, MB(64));
+}
+
/*
* Find unused regions of Host address space which can be exposed to domain
* using the host memory layout. In order to calculate regions we exclude every
@@ -977,6 +993,109 @@ out:
return res;
}
+void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
+{
+ struct membanks *mem = kernel_info_get_mem(kinfo);
+ unsigned int i, nr_banks = GUEST_RAM_BANKS;
+ struct membanks *hwdom_free_mem = NULL;
+
+ printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
+ /* Don't want format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(kinfo->unassigned_mem >> 20), d);
+
+ mem->nr_banks = 0;
+ /*
+ * Use host memory layout for hwdom. Only case for this is when LLC coloring
+ * is enabled.
+ */
+ if ( is_hardware_domain(d) )
+ {
+ struct membanks *gnttab = xzalloc_flex_struct(struct membanks, bank, 1);
+ /*
+ * Exclude the following regions:
+ * 1) Remove reserved memory
+ * 2) Grant table assigned to Dom0
+ */
+ const struct membanks *mem_banks[] = {
+ bootinfo_get_reserved_mem(),
+ gnttab,
+ };
+
+ ASSERT(llc_coloring_enabled);
+
+ if ( !gnttab )
+ goto fail;
+
+ gnttab->nr_banks = 1;
+ gnttab->bank[0].start = kinfo->gnttab_start;
+ gnttab->bank[0].size = kinfo->gnttab_size;
+
+ hwdom_free_mem = xzalloc_flex_struct(struct membanks, bank,
+ NR_MEM_BANKS);
+ if ( !hwdom_free_mem )
+ goto fail;
+
+ hwdom_free_mem->max_banks = NR_MEM_BANKS;
+
+ if ( find_unallocated_memory(kinfo, mem_banks, ARRAY_SIZE(mem_banks),
+ hwdom_free_mem, add_hwdom_free_regions) )
+ goto fail;
+
+ nr_banks = hwdom_free_mem->nr_banks;
+ xfree(gnttab);
+ }
+
+ for ( i = 0; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
+ {
+ paddr_t bank_start, bank_size;
+
+ if ( is_hardware_domain(d) )
+ {
+ bank_start = hwdom_free_mem->bank[i].start;
+ bank_size = hwdom_free_mem->bank[i].size;
+ ASSERT(bank_size >= MB(128));
+ }
+ else
+ {
+ const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
+ const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
+
+ if ( i >= GUEST_RAM_BANKS )
+ goto fail;
+
+ bank_start = bankbase[i];
+ bank_size = banksize[i];
+ }
+
+ bank_size = MIN(bank_size, kinfo->unassigned_mem);
+ if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(bank_start), bank_size) )
+ goto fail;
+ }
+
+ if ( kinfo->unassigned_mem )
+ goto fail;
+
+ for ( i = 0; i < mem->nr_banks; i++ )
+ {
+ printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
+ d,
+ i,
+ mem->bank[i].start,
+ mem->bank[i].start + mem->bank[i].size,
+ /* Don't want to format this as PRIpaddr (16 digit hex) */
+ (unsigned long)(mem->bank[i].size >> 20));
+ }
+
+ xfree(hwdom_free_mem);
+ return;
+
+fail:
+ panic("Failed to allocate requested domain memory."
+ /* Don't want to format this as PRIpaddr (16 digit hex) */
+ " %ldKB unallocated. Fix the VMs configurations.\n",
+ (unsigned long)kinfo->unassigned_mem >> 10);
+}
+
static int __init handle_pci_range(const struct dt_device_node *dev,
uint64_t addr, uint64_t len, void *data)
{
@@ -1235,7 +1354,7 @@ int __init make_hypervisor_node(struct domain *d,
ext_regions->max_banks = NR_MEM_BANKS;
- if ( is_domain_direct_mapped(d) )
+ if ( domain_use_host_layout(d) )
{
if ( !is_iommu_enabled(d) )
res = find_host_extended_regions(kinfo, ext_regions);
@@ -2164,8 +2283,11 @@ static int __init construct_dom0(struct domain *d)
/* type must be set before allocate_memory */
d->arch.type = kinfo.type;
#endif
- allocate_memory_11(d, &kinfo);
find_gnttab_region(d, &kinfo);
+ if ( is_domain_direct_mapped(d) )
+ allocate_memory_11(d, &kinfo);
+ else
+ allocate_memory(d, &kinfo);
rc = process_shm_chosen(d, &kinfo);
if ( rc < 0 )
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index e712afbc7f..b0d646e173 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -11,6 +11,7 @@ bool allocate_domheap_memory(struct domain *d, paddr_t tot_size,
alloc_domheap_mem_cb cb, void *extra);
bool allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
paddr_t tot_size);
+void allocate_memory(struct domain *d, struct kernel_info *kinfo);
int construct_domain(struct domain *d, struct kernel_info *kinfo);
int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
int make_chosen_node(const struct kernel_info *kinfo);
@@ -54,6 +55,9 @@ static inline int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
int prepare_acpi(struct domain *d, struct kernel_info *kinfo);
#endif
+typedef int (*add_free_regions_fn)(unsigned long s_gfn, unsigned long e_gfn,
+ void *data);
+
int add_ext_regions(unsigned long s_gfn, unsigned long e_gfn, void *data);
#endif
skip_size can be renamed to threshold_size to make it clearer.
Anyway, I'm not following the suggested solution: when should I sort the
banks, how can I do it in the callback of find_unallocated_memory(), and
what if the biggest bank is smaller than 128MB? Should I not care about that?
Thanks.
- Carlo
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 11:30 ` Carlo Nonato
@ 2024-12-13 11:33 ` Carlo Nonato
2024-12-13 11:47 ` Michal Orzel
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-13 11:33 UTC (permalink / raw)
To: Michal Orzel
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
Using paste.debian:
https://paste.debian.net/1339647/
Thanks.
- Carlo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 11:33 ` Carlo Nonato
@ 2024-12-13 11:47 ` Michal Orzel
2024-12-13 12:45 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Michal Orzel @ 2024-12-13 11:47 UTC (permalink / raw)
To: Carlo Nonato
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
On 13/12/2024 12:33, Carlo Nonato wrote:
>
>
> Using paste.debian:
>
> https://paste.debian.net/1339647/
1. Issue I mentioned with prefixing with double underscore
2. Generic helper should not be named ext_regions
3. s/skip_size/min_bank_size/
And still you need to convince others about 128MB limit because I'm not sure.
Imagine, that our kernel+dtb+ramdisk is > 128MB and your first bank is 128MB. With your
solution this would fail. Now, imagine that you sort your banks and start with the biggest
one. You don't care about its size. It's the biggest one so if it does not fit, then that's not
your problem.
~Michal
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction
2024-12-13 11:47 ` Michal Orzel
@ 2024-12-13 12:45 ` Carlo Nonato
0 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-13 12:45 UTC (permalink / raw)
To: Michal Orzel
Cc: Andrea Bastoni, Julien Grall, xen-devel, marco.solieri,
Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk
On Fri, Dec 13, 2024 at 12:47 PM Michal Orzel <michal.orzel@amd.com> wrote:
>
>
>
> On 13/12/2024 12:33, Carlo Nonato wrote:
> >
> >
> > Using paste.debian:
> >
> > https://paste.debian.net/1339647/
>
> 1. Issue I mentioned with prefixing with double underscore
> 2. Generic helper should not be named ext_regions
> 3. s/skip_size/min_bank_size/
>
> And still you need to convince others about 128MB limit because I'm not sure.
>
> Imagine, that our kernel+dtb+ramdisk is > 128MB and your first bank is 128MB. With your
> solution this would fail. Now, imagine that you sort your banks and start with the biggest
> one. You don't care about its size. It's the biggest one so if it does not fit, then that's not
> your problem.
>
> ~Michal
>
Something like this in add_hwdom_free_regions()?
> /* Find the insert position (descending order). */
> for ( i = 0; i < free_regions->nr_banks ; i++)
> if ( size > free_regions->bank[i].size )
> break;
>
> /* Move the other banks to make space. */
> for ( j = free_regions->nr_banks; j > i ; j-- )
> {
> free_regions->bank[j].start = free_regions->bank[j - 1].start;
> free_regions->bank[j].size = free_regions->bank[j - 1].size;
> }
>
> free_regions->bank[i].start = start;
> free_regions->bank[i].size = size;
> free_regions->nr_banks++;
With that (if I didn't make any mistake) I'm inserting banks in descending size
order. Is it ok?
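For reference, here is a self-contained sketch of the same descending-order
insert, written outside of Xen so it can be compiled and checked on its own.
The names struct bank, struct banks, MAX_BANKS and insert_bank_desc() are made
up for illustration (they only stand in for struct membanks and the callback),
and the capacity check is an addition that the snippet above does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's struct membank / struct membanks. */
#define MAX_BANKS 8

struct bank {
    uint64_t start;
    uint64_t size;
};

struct banks {
    unsigned int nr_banks;
    unsigned int max_banks;
    struct bank bank[MAX_BANKS];
};

/*
 * Insert a region while keeping the array sorted by descending size.
 * Returns 0 on success, -1 if the array is already full.
 */
static int insert_bank_desc(struct banks *b, uint64_t start, uint64_t size)
{
    unsigned int i, j;

    assert(b->max_banks <= MAX_BANKS);

    if ( b->nr_banks >= b->max_banks )
        return -1;

    /* Find the insert position: the first bank smaller than the new one. */
    for ( i = 0; i < b->nr_banks; i++ )
        if ( size > b->bank[i].size )
            break;

    /* Shift the smaller banks down by one slot to make space. */
    for ( j = b->nr_banks; j > i; j-- )
        b->bank[j] = b->bank[j - 1];

    b->bank[i].start = start;
    b->bank[i].size = size;
    b->nr_banks++;

    return 0;
}
```

With the banks kept in this order, bank[0] is always the largest free region,
so the allocation loop would start from it without needing a fixed minimum
size.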
Thanks.
- Carlo
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v11 04/12] xen/arm: add Dom0 cache coloring support
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (2 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 03/12] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 05/12] xen: extend domctl interface for cache coloring Carlo Nonato
` (7 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
- default configuration notion.
- function to check well-formed configurations.
Direct mapping Dom0 isn't possible when coloring is enabled, so
CDF_directmap flag is removed when creating it.
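
To illustrate the syntax only, a standalone sketch of such a parser in plain C
follows. It is not the code added by this patch: parse_colors() is a
hypothetical name, it uses strtoul() in base 10 instead of Xen's
simple_strtoul(), and it is deliberately stricter about stray commas and
spaces. Like the parser in this patch, it does not check colors against the
number of available colors, nor ordering or duplicates; that is assumed to be
the caller's job.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Expand a string such as "0,2-6,15-16" into colors[]. At most max entries
 * are written. Returns 0 on success, -EINVAL on malformed input or overflow
 * of the output array.
 */
static int parse_colors(const char *s, unsigned int colors[],
                        unsigned int max, unsigned int *num)
{
    assert(s && colors && num);

    *num = 0;

    while ( *s != '\0' )
    {
        char *next;
        unsigned long start, end, c;

        /* Every item must begin with a digit: no spaces, signs or commas. */
        if ( !isdigit((unsigned char)*s) )
            return -EINVAL;

        start = strtoul(s, &next, 10);
        s = next;
        end = start;

        if ( *s == '-' ) /* Range, inclusive on both sides */
        {
            s++;
            if ( !isdigit((unsigned char)*s) )
                return -EINVAL;
            end = strtoul(s, &next, 10);
            s = next;
        }

        /*
         * Reject reversed ranges, values that do not fit an unsigned int
         * (this also keeps the loop counter below from wrapping) and ranges
         * that would not fit in the remaining space.
         */
        if ( start > end || end >= UINT_MAX || end - start >= max - *num )
            return -EINVAL;

        for ( c = start; c <= end; c++ )
            colors[(*num)++] = (unsigned int)c;

        if ( *s == ',' )
        {
            s++;
            if ( *s == '\0' ) /* Trailing comma */
                return -EINVAL;
        }
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return 0;
}
```

For example, "1-2,5-8" expands to the six colors 1, 2, 5, 6, 7, 8, while
"3-1" and "1, 2" are rejected.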
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
---
v11:
- minor changes
v10:
- fixed array type for colors parameter in check_colors()
v9:
- moved domain_llc_coloring_free() in next patch cause it's never used for dom0
v8:
- added bound check on dom0_num_colors
- default colors array set just once
v7:
- parse_color_config() doesn't accept leading/trailing commas anymore
- removed alloc_colors() helper
v6:
- moved domain_llc_coloring_free() in this patch
- removed domain_alloc_colors() in favor of a more explicit allocation
- parse_color_config() now accepts the size of the array to be filled
- allocate_memory() moved in another patch
v5:
- Carlo Nonato as the new author
- moved dom0 colors parsing (parse_colors()) in this patch
- added dom0_set_llc_colors() to set dom0 colors after creation
- moved color allocation and checking in this patch
- error handling when allocating color arrays
- FIXME: copy pasted allocate_memory() cause it got moved
v4:
- dom0 colors are dynamically allocated as for any other domain
(colors are duplicated in dom0_colors and in the new array, but logic
is simpler)
---
docs/misc/cache-coloring.rst | 29 ++++++++
docs/misc/xen-command-line.pandoc | 9 +++
xen/arch/arm/domain_build.c | 10 ++-
xen/common/llc-coloring.c | 120 +++++++++++++++++++++++++++++-
xen/include/xen/llc-coloring.h | 1 +
5 files changed, 167 insertions(+), 2 deletions(-)
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 12972dbb2c..7b47d0ed92 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -107,6 +107,35 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+----------------------+-------------------------------+
| ``llc-nr-ways`` | Set the LLC number of ways |
+----------------------+-------------------------------+
+| ``dom0-llc-colors`` | Dom0 color configuration |
++----------------------+-------------------------------+
+
+Colors selection format
+***********************
+
+Regardless of the memory pool that has to be colored (Xen, Dom0/DomUs),
+the color selection can be expressed using the same syntax. In particular a
+comma-separated list of colors or ranges of colors is used.
+Ranges are hyphen-separated intervals (such as `0-4`) and are inclusive on both
+sides.
+
+Note that:
+
+- no spaces are allowed between values.
+- no overlapping ranges or duplicated colors are allowed.
+- values must be written in ascending order.
+
+Examples:
+
++-------------------+-----------------------------+
+| **Configuration** | **Actual selection** |
++-------------------+-----------------------------+
+| 1-2,5-8 | [1, 2, 5, 6, 7, 8] |
++-------------------+-----------------------------+
+| 4-8,10,11,12 | [4, 5, 6, 7, 8, 10, 11, 12] |
++-------------------+-----------------------------+
+| 0 | [0] |
++-------------------+-----------------------------+
Auto-probing of LLC specs
#########################
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index abd8dae96f..bfdc8b0002 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
Specify a list of IO ports to be excluded from dom0 access.
+### dom0-llc-colors (arm64)
+> `= List of [ <integer> | <integer>-<integer> ]`
+
+> Default: `All available LLC colors`
+
+Specify dom0 LLC color configuration. This option is available only when
+`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
+colors are used.
+
### dom0_max_vcpus
Either:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 2b8cba9b2f..83d7585e7e 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2,6 +2,7 @@
#include <xen/init.h>
#include <xen/compile.h>
#include <xen/lib.h>
+#include <xen/llc-coloring.h>
#include <xen/mm.h>
#include <xen/param.h>
#include <xen/domain_page.h>
@@ -2285,6 +2286,7 @@ void __init create_dom0(void)
.max_maptrack_frames = -1,
.grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
};
+ unsigned int flags = CDF_privileged;
int rc;
/* The vGIC for DOM0 is exactly emulating the hardware GIC */
@@ -2312,10 +2314,16 @@ void __init create_dom0(void)
panic("SVE vector length error\n");
}
- dom0 = domain_create(0, &dom0_cfg, CDF_privileged | CDF_directmap);
+ if ( !llc_coloring_enabled )
+ flags |= CDF_directmap;
+
+ dom0 = domain_create(0, &dom0_cfg, flags);
if ( IS_ERR(dom0) )
panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
+ if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )
+ panic("Error initializing LLC coloring for domain 0 (rc = %d)\n", rc);
+
if ( alloc_dom0_vcpu0(dom0) == NULL )
panic("Error creating domain 0 vcpu0\n");
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 5139890e3d..8f076849c1 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -25,6 +25,66 @@ static unsigned int __initdata llc_nr_ways;
integer_param("llc-nr-ways", llc_nr_ways);
/* Number of colors available in the LLC */
static unsigned int __ro_after_init max_nr_colors;
+/* Default coloring configuration */
+static unsigned int __ro_after_init default_colors[NR_LLC_COLORS];
+
+static unsigned int __initdata dom0_colors[NR_LLC_COLORS];
+static unsigned int __initdata dom0_num_colors;
+
+/*
+ * Parse the coloring configuration given in the buf string, following the
+ * syntax below.
+ *
+ * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
+ * RANGE ::= COLOR-COLOR
+ *
+ * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
+ */
+static int __init parse_color_config(const char *buf, unsigned int colors[],
+ unsigned int max_num_colors,
+ unsigned int *num_colors)
+{
+ const char *s = buf;
+
+ *num_colors = 0;
+
+ while ( *s != '\0' )
+ {
+ unsigned int color, start, end;
+
+ start = simple_strtoul(s, &s, 0);
+
+ if ( *s == '-' ) /* Range */
+ {
+ s++;
+ end = simple_strtoul(s, &s, 0);
+ }
+ else /* Single value */
+ end = start;
+
+ if ( start > end || (end - start) > (UINT_MAX - *num_colors) ||
+ (*num_colors + (end - start)) >= max_num_colors )
+ return -EINVAL;
+
+ /* Colors are range checked in check_colors() */
+ for ( color = start; color <= end; color++ )
+ colors[(*num_colors)++] = color;
+
+ if ( *s == ',' )
+ s++;
+ else if ( *s != '\0' )
+ break;
+ }
+
+ return *s ? -EINVAL : 0;
+}
+
+static int __init parse_dom0_colors(const char *s)
+{
+ return parse_color_config(s, dom0_colors, ARRAY_SIZE(dom0_colors),
+ &dom0_num_colors);
+}
+custom_param("dom0-llc-colors", parse_dom0_colors);
static void print_colors(const unsigned int colors[], unsigned int num_colors)
{
@@ -49,9 +109,27 @@ static void print_colors(const unsigned int colors[], unsigned int num_colors)
printk(" }\n");
}
+static bool __init check_colors(const unsigned int colors[],
+ unsigned int num_colors)
+{
+ unsigned int i;
+
+ for ( i = 0; i < num_colors; i++ )
+ {
+ if ( colors[i] >= max_nr_colors )
+ {
+ printk(XENLOG_ERR "LLC color %u >= %u (max allowed)\n", colors[i],
+ max_nr_colors);
+ return false;
+ }
+ }
+
+ return true;
+}
+
void __init llc_coloring_init(void)
{
- unsigned int way_size;
+ unsigned int way_size, i;
if ( (llc_coloring_enabled < 0) && (llc_size && llc_nr_ways) )
{
@@ -89,6 +167,9 @@ void __init llc_coloring_init(void)
else if ( max_nr_colors < 2 )
panic("Number of LLC colors %u < 2\n", max_nr_colors);
+ for ( i = 0; i < max_nr_colors; i++ )
+ default_colors[i] = i;
+
arch_llc_coloring_init();
}
@@ -110,6 +191,43 @@ void domain_dump_llc_colors(const struct domain *d)
print_colors(d->llc_colors, d->num_llc_colors);
}
+static void __init domain_set_default_colors(struct domain *d)
+{
+ printk(XENLOG_WARNING
+ "LLC color config not found for %pd, using all colors\n", d);
+
+ d->llc_colors = default_colors;
+ d->num_llc_colors = max_nr_colors;
+}
+
+int __init dom0_set_llc_colors(struct domain *d)
+{
+ typeof(*dom0_colors) *colors;
+
+ if ( !dom0_num_colors )
+ {
+ domain_set_default_colors(d);
+ return 0;
+ }
+
+ if ( (dom0_num_colors > max_nr_colors) ||
+ !check_colors(dom0_colors, dom0_num_colors) )
+ {
+ printk(XENLOG_ERR "%pd: bad LLC color config\n", d);
+ return -EINVAL;
+ }
+
+ colors = xmalloc_array(typeof(*dom0_colors), dom0_num_colors);
+ if ( !colors )
+ return -ENOMEM;
+
+ memcpy(colors, dom0_colors, sizeof(*colors) * dom0_num_colors);
+ d->llc_colors = colors;
+ d->num_llc_colors = dom0_num_colors;
+
+ return 0;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index ee0c58ab1c..4ce14e4e4a 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -26,6 +26,7 @@ static inline void domain_dump_llc_colors(const struct domain *d) {}
unsigned int get_llc_way_size(void);
void arch_llc_coloring_init(void);
+int dom0_set_llc_colors(struct domain *d);
#endif /* __XEN_LLC_COLORING_H__ */
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v11 05/12] xen: extend domctl interface for cache coloring
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (3 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 04/12] xen/arm: add Dom0 cache coloring support Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 06/12] tools: add support for cache coloring configuration Carlo Nonato
` (6 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini
Add a new domctl hypercall to allow the user to set LLC coloring
configurations. Colors can be set only once, just after domain creation,
since recoloring isn't supported.
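
The "set only once" behaviour can be sketched outside of Xen as follows. This
is a toy model under stated assumptions, not the hypervisor code: struct
toy_domain, toy_set_colors(), toy_free_colors() and MAX_NR_COLORS are
hypothetical names, and malloc()/free() stand in for Xen's allocator. It only
shows the intended error semantics: a second call fails with -EEXIST, an empty
request falls back to the shared default array, and the default array is never
freed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NR_COLORS 4U /* Hypothetical number of available colors */

/* Shared default configuration: all colors. Never freed. */
static const uint32_t default_colors[MAX_NR_COLORS] = { 0, 1, 2, 3 };

struct toy_domain {
    const uint32_t *colors;
    uint32_t num_colors;
};

static int toy_set_colors(struct toy_domain *d, const uint32_t *req,
                          uint32_t num)
{
    uint32_t *copy, i;

    assert(d);

    /* Colors were already assigned: recoloring is not supported. */
    if ( d->num_colors )
        return -EEXIST;

    /* No explicit configuration: use every color. */
    if ( !num )
    {
        d->colors = default_colors;
        d->num_colors = MAX_NR_COLORS;
        return 0;
    }

    if ( num > MAX_NR_COLORS )
        return -EINVAL;

    /* Validate before allocating, so error paths have nothing to free. */
    for ( i = 0; i < num; i++ )
        if ( req[i] >= MAX_NR_COLORS )
            return -EINVAL;

    copy = malloc(num * sizeof(*copy));
    if ( !copy )
        return -ENOMEM;

    memcpy(copy, req, num * sizeof(*copy));
    d->colors = copy;
    d->num_colors = num;

    return 0;
}

static void toy_free_colors(struct toy_domain *d)
{
    assert(d);

    /* Only a private copy is owned by the domain. */
    if ( d->colors != default_colors )
        free((void *)d->colors);

    d->colors = NULL;
    d->num_colors = 0;
}
```

One difference from the patch is worth noting: the sketch validates the
request before allocating, while the hypercall has to copy the array from the
guest first and therefore frees it on the error paths.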
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v11:
- no changes
v10:
- no changes
v9:
- minor printk message changes
- moved domain_llc_coloring_free() in this patch
v8:
- fixed memory leak on error path of domain_set_llc_colors()
v7:
- -EOPNOTSUPP returned in case of hypercall called without llc_coloring_enabled
- domain_set_llc_colors_domctl() renamed to domain_set_llc_colors()
- added padding and input bound checks to domain_set_llc_colors()
- removed alloc_colors() helper usage from domain_set_llc_colors()
v6:
- reverted the XEN_DOMCTL_INTERFACE_VERSION bump
- reverted to uint32 for the guest handle
- explicit padding added to the domctl struct
- rewrote domain_set_llc_colors_domctl() to be more explicit
v5:
- added a new hypercall to set colors
- uint for the guest handle
v4:
- updated XEN_DOMCTL_INTERFACE_VERSION
---
xen/common/domain.c | 3 ++
xen/common/domctl.c | 10 +++++++
xen/common/llc-coloring.c | 55 ++++++++++++++++++++++++++++++++--
xen/include/public/domctl.h | 9 ++++++
xen/include/xen/llc-coloring.h | 5 ++++
5 files changed, 79 insertions(+), 3 deletions(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 92263a4fbd..842a23751a 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -34,6 +34,7 @@
#include <xen/xenoprof.h>
#include <xen/irq.h>
#include <xen/argo.h>
+#include <xen/llc-coloring.h>
#include <asm/p2m.h>
#include <asm/processor.h>
#include <public/sched.h>
@@ -1276,6 +1277,8 @@ void domain_destroy(struct domain *d)
{
BUG_ON(!d->is_dying);
+ domain_llc_coloring_free(d);
+
/* May be already destroyed, or get_domain() can race us. */
if ( atomic_cmpxchg(&d->refcnt, 0, DOMAIN_DESTROYED) != 0 )
return;
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index ea16b75910..6387dddbcd 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -8,6 +8,7 @@
#include <xen/types.h>
#include <xen/lib.h>
+#include <xen/llc-coloring.h>
#include <xen/err.h>
#include <xen/mm.h>
#include <xen/sched.h>
@@ -866,6 +867,15 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
__HYPERVISOR_domctl, "h", u_domctl);
break;
+ case XEN_DOMCTL_set_llc_colors:
+ if ( op->u.set_llc_colors.pad )
+ ret = -EINVAL;
+ else if ( llc_coloring_enabled )
+ ret = domain_set_llc_colors(d, &op->u.set_llc_colors);
+ else
+ ret = -EOPNOTSUPP;
+ break;
+
default:
ret = arch_do_domctl(op, d, u_domctl);
break;
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 8f076849c1..2a0ee695c8 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -5,6 +5,7 @@
* Copyright (C) 2024, Advanced Micro Devices, Inc.
* Copyright (C) 2024, Minerva Systems SRL
*/
+#include <xen/guest_access.h>
#include <xen/keyhandler.h>
#include <xen/llc-coloring.h>
#include <xen/param.h>
@@ -109,8 +110,7 @@ static void print_colors(const unsigned int colors[], unsigned int num_colors)
printk(" }\n");
}
-static bool __init check_colors(const unsigned int colors[],
- unsigned int num_colors)
+static bool check_colors(const unsigned int colors[], unsigned int num_colors)
{
unsigned int i;
@@ -191,7 +191,7 @@ void domain_dump_llc_colors(const struct domain *d)
print_colors(d->llc_colors, d->num_llc_colors);
}
-static void __init domain_set_default_colors(struct domain *d)
+static void domain_set_default_colors(struct domain *d)
{
printk(XENLOG_WARNING
"LLC color config not found for %pd, using all colors\n", d);
@@ -228,6 +228,55 @@ int __init dom0_set_llc_colors(struct domain *d)
return 0;
}
+int domain_set_llc_colors(struct domain *d,
+ const struct xen_domctl_set_llc_colors *config)
+{
+ unsigned int *colors;
+
+ if ( d->num_llc_colors )
+ return -EEXIST;
+
+ if ( !config->num_llc_colors )
+ {
+ domain_set_default_colors(d);
+ return 0;
+ }
+
+ if ( config->num_llc_colors > max_nr_colors )
+ return -EINVAL;
+
+ colors = xmalloc_array(unsigned int, config->num_llc_colors);
+ if ( !colors )
+ return -ENOMEM;
+
+ if ( copy_from_guest(colors, config->llc_colors, config->num_llc_colors) )
+ {
+ xfree(colors);
+ return -EFAULT;
+ }
+
+ if ( !check_colors(colors, config->num_llc_colors) )
+ {
+ printk(XENLOG_ERR "%pd: bad LLC color config\n", d);
+ xfree(colors);
+ return -EINVAL;
+ }
+
+ d->llc_colors = colors;
+ d->num_llc_colors = config->num_llc_colors;
+
+ return 0;
+}
+
+void domain_llc_coloring_free(struct domain *d)
+{
+ if ( !llc_coloring_enabled || d->llc_colors == default_colors )
+ return;
+
+ /* free pointer-to-const using __va(__pa()) */
+ xfree(__va(__pa(d->llc_colors)));
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 353f831e40..e2d392d1e5 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1236,6 +1236,13 @@ struct xen_domctl_dt_overlay {
};
#endif
+struct xen_domctl_set_llc_colors {
+ /* IN LLC coloring parameters */
+ uint32_t num_llc_colors;
+ uint32_t pad;
+ XEN_GUEST_HANDLE_64(uint32) llc_colors;
+};
+
struct xen_domctl {
uint32_t cmd;
#define XEN_DOMCTL_createdomain 1
@@ -1325,6 +1332,7 @@ struct xen_domctl {
#define XEN_DOMCTL_set_paging_mempool_size 86
#define XEN_DOMCTL_dt_overlay 87
#define XEN_DOMCTL_gsi_permission 88
+#define XEN_DOMCTL_set_llc_colors 89
#define XEN_DOMCTL_gdbsx_guestmemio 1000
#define XEN_DOMCTL_gdbsx_pausevcpu 1001
#define XEN_DOMCTL_gdbsx_unpausevcpu 1002
@@ -1391,6 +1399,7 @@ struct xen_domctl {
#if defined(__arm__) || defined(__aarch64__)
struct xen_domctl_dt_overlay dt_overlay;
#endif
+ struct xen_domctl_set_llc_colors set_llc_colors;
uint8_t pad[128];
} u;
};
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index 4ce14e4e4a..cbebe0816c 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -9,6 +9,7 @@
#define __XEN_LLC_COLORING_H__
struct domain;
+struct xen_domctl_set_llc_colors;
#ifdef CONFIG_LLC_COLORING
extern int8_t llc_coloring_enabled;
@@ -16,17 +17,21 @@ extern int8_t llc_coloring_enabled;
void llc_coloring_init(void);
void dump_llc_coloring_info(void);
void domain_dump_llc_colors(const struct domain *d);
+void domain_llc_coloring_free(struct domain *d);
#else
#define llc_coloring_enabled false
static inline void llc_coloring_init(void) {}
static inline void dump_llc_coloring_info(void) {}
static inline void domain_dump_llc_colors(const struct domain *d) {}
+static inline void domain_llc_coloring_free(struct domain *d) {}
#endif
unsigned int get_llc_way_size(void);
void arch_llc_coloring_init(void);
int dom0_set_llc_colors(struct domain *d);
+int domain_set_llc_colors(struct domain *d,
+ const struct xen_domctl_set_llc_colors *config);
#endif /* __XEN_LLC_COLORING_H__ */
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v11 06/12] tools: add support for cache coloring configuration
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (4 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 05/12] xen: extend domctl interface for cache coloring Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-04 14:07 ` Anthony PERARD
2024-12-02 16:59 ` [PATCH v11 07/12] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
` (5 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Anthony PERARD,
Nick Rosbrook, George Dunlap, Juergen Gross
Add a new "llc_colors" parameter that defines the LLC color assignment for
a domain. The user can specify one or more color ranges using the same
syntax used everywhere else for color config described in the
documentation.
The parameter is defined as a list of strings that represent the color
ranges.
Documentation is also added.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v11:
- turned unsigned int to uint32_t in xc_domain_set_llc_colors()
- return -1 in case of error instead of ENOMEM in xc_domain_set_llc_colors()
- added regenerated go bindings
v10:
- no changes
v9:
- turned warning into error in case of coloring not enabled
v8:
- warn the user in case of coloring not supported at hypervisor level
v7:
- removed unneeded NULL check before xc_hypercall_buffer_free() in
xc_domain_set_llc_colors()
v6:
- no edits
v5:
- added LIBXL_HAVE_BUILDINFO_LLC_COLORS
- moved color configuration in xc_domain_set_llc_colors() cause of the new
hypercall
v4:
- removed overlapping color ranges checks during parsing
- moved hypercall buffer initialization in libxenctrl
---
docs/man/xl.cfg.5.pod.in | 6 +++++
tools/golang/xenlight/helpers.gen.go | 16 ++++++++++++
tools/golang/xenlight/types.gen.go | 1 +
tools/include/libxl.h | 5 ++++
tools/include/xenctrl.h | 9 +++++++
tools/libs/ctrl/xc_domain.c | 34 +++++++++++++++++++++++++
tools/libs/light/libxl_create.c | 18 +++++++++++++
tools/libs/light/libxl_types.idl | 1 +
tools/xl/xl_parse.c | 38 +++++++++++++++++++++++++++-
9 files changed, 127 insertions(+), 1 deletion(-)
diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index ac3f88fd57..8e1422104e 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -3074,6 +3074,12 @@ raised.
=over 4
+=item B<llc_colors=[ "RANGE", "RANGE", ...]>
+
+Specify the Last Level Cache (LLC) color configuration for the guest.
+B<RANGE> can be either a single color value or a hyphen-separated closed
+interval of colors (such as "0-4").
+
=item B<nr_spis="NR_SPIS">
An optional integer parameter specifying the number of SPIs (Shared
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index fe5110474d..90846ea8e8 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1097,6 +1097,14 @@ if err := x.Iomem[i].fromC(&v); err != nil {
return fmt.Errorf("converting field Iomem: %v", err) }
}
}
+x.LlcColors = nil
+if n := int(xc.num_llc_colors); n > 0 {
+cLlcColors := (*[1<<28]C.uint32_t)(unsafe.Pointer(xc.llc_colors))[:n:n]
+x.LlcColors = make([]uint32, n)
+for i, v := range cLlcColors {
+x.LlcColors[i] = uint32(v)
+}
+}
if err := x.ClaimMode.fromC(&xc.claim_mode);err != nil {
return fmt.Errorf("converting field ClaimMode: %v", err)
}
@@ -1453,6 +1461,14 @@ return fmt.Errorf("converting field Iomem: %v", err)
}
}
}
+if numLlcColors := len(x.LlcColors); numLlcColors > 0 {
+xc.llc_colors = (*C.uint32_t)(C.malloc(C.size_t(numLlcColors*numLlcColors)))
+xc.num_llc_colors = C.int(numLlcColors)
+cLlcColors := (*[1<<28]C.uint32_t)(unsafe.Pointer(xc.llc_colors))[:numLlcColors:numLlcColors]
+for i,v := range x.LlcColors {
+cLlcColors[i] = C.uint32_t(v)
+}
+}
if err := x.ClaimMode.toC(&xc.claim_mode); err != nil {
return fmt.Errorf("converting field ClaimMode: %v", err)
}
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index c9e45b306f..e7667f1ce3 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -575,6 +575,7 @@ SchedParams DomainSchedParams
Ioports []IoportRange
Irqs []uint32
Iomem []IomemRange
+LlcColors []uint32
ClaimMode Defbool
EventChannels uint32
Kernel string
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 8d32428ea9..f8fe4afd7d 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1379,6 +1379,11 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
*/
#define LIBXL_HAVE_BUILDINFO_HVM_SYSTEM_FIRMWARE
+/*
+ * The libxl_domain_build_info has the llc_colors array.
+ */
+#define LIBXL_HAVE_BUILDINFO_LLC_COLORS 1
+
/*
* ERROR_REMUS_XXX error code only exists from Xen 4.5, Xen 4.6 and it
* is changed to ERROR_CHECKPOINT_XXX in Xen 4.7
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 29617585c5..d6d93a2e8f 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2667,6 +2667,15 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout, uint32
int xc_domain_cacheflush(xc_interface *xch, uint32_t domid,
xen_pfn_t start_pfn, xen_pfn_t nr_pfns);
+/*
+ * Set LLC colors for a domain.
+ * It can only be used directly after domain creation. An attempt to use it
+ * afterwards will result in an error.
+ */
+int xc_domain_set_llc_colors(xc_interface *xch, uint32_t domid,
+ const uint32_t *llc_colors,
+ uint32_t num_llc_colors);
+
#if defined(__arm__) || defined(__aarch64__)
int xc_dt_overlay(xc_interface *xch, void *overlay_fdt,
uint32_t overlay_fdt_size, uint8_t overlay_op);
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index e3538ec0ba..2ddc3f4f42 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -2195,6 +2195,40 @@ int xc_domain_soft_reset(xc_interface *xch,
domctl.domain = domid;
return do_domctl(xch, &domctl);
}
+
+int xc_domain_set_llc_colors(xc_interface *xch, uint32_t domid,
+ const uint32_t *llc_colors,
+ uint32_t num_llc_colors)
+{
+ struct xen_domctl domctl = {};
+ DECLARE_HYPERCALL_BUFFER(uint32_t, local);
+ int ret = -1;
+
+ if ( num_llc_colors )
+ {
+ size_t bytes = sizeof(uint32_t) * num_llc_colors;
+
+ local = xc_hypercall_buffer_alloc(xch, local, bytes);
+ if ( local == NULL )
+ {
+ PERROR("Could not allocate LLC colors for set_llc_colors");
+ goto out;
+ }
+ memcpy(local, llc_colors, bytes);
+ set_xen_guest_handle(domctl.u.set_llc_colors.llc_colors, local);
+ }
+
+ domctl.cmd = XEN_DOMCTL_set_llc_colors;
+ domctl.domain = domid;
+ domctl.u.set_llc_colors.num_llc_colors = num_llc_colors;
+
+ ret = do_domctl(xch, &domctl);
+
+out:
+ xc_hypercall_buffer_free(xch, local);
+
+ return ret;
+}
/*
* Local variables:
* mode: C
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index edeadd57ef..e03599ea99 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -747,6 +747,24 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
/* A new domain now exists */
*domid = local_domid;
+ ret = xc_domain_set_llc_colors(ctx->xch, local_domid,
+ b_info->llc_colors,
+ b_info->num_llc_colors);
+ if (ret < 0) {
+ if (errno == EOPNOTSUPP) {
+ if (b_info->num_llc_colors > 0) {
+ LOGED(ERROR, local_domid,
+ "LLC coloring not enabled in the hypervisor");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+ } else {
+ LOGED(ERROR, local_domid, "LLC colors allocation failed");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+ }
+
rc = libxl__is_domid_recent(gc, local_domid, &recent);
if (rc)
goto out;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 4e65e6fda5..bd4b8721ff 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -616,6 +616,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("ioports", Array(libxl_ioport_range, "num_ioports")),
("irqs", Array(uint32, "num_irqs")),
("iomem", Array(libxl_iomem_range, "num_iomem")),
+ ("llc_colors", Array(uint32, "num_llc_colors")),
("claim_mode", libxl_defbool),
("event_channels", uint32),
("kernel", string),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index e3a4800f6e..3d85be7dd4 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1296,7 +1296,7 @@ void parse_config_data(const char *config_source,
XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
*usbctrls, *usbdevs, *p9devs, *vdispls, *pvcallsifs_devs;
XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
- *mca_caps, *smbios;
+ *mca_caps, *smbios, *llc_colors;
int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
int num_smbios;
int pci_power_mgmt = 0;
@@ -1304,6 +1304,7 @@ void parse_config_data(const char *config_source,
int pci_permissive = 0;
int pci_seize = 0;
int i, e;
+ int num_llc_colors;
char *kernel_basename;
libxl_domain_create_info *c_info = &d_config->c_info;
@@ -1447,6 +1448,41 @@ void parse_config_data(const char *config_source,
if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
b_info->max_memkb = l * 1024;
+ if (!xlu_cfg_get_list(config, "llc_colors", &llc_colors, &num_llc_colors, 0)) {
+ int cur_index = 0;
+
+ b_info->num_llc_colors = 0;
+ for (i = 0; i < num_llc_colors; i++) {
+ uint32_t start = 0, end = 0, k;
+
+ buf = xlu_cfg_get_listitem(llc_colors, i);
+ if (!buf) {
+ fprintf(stderr,
+ "xl: Can't get element %d in LLC color list\n", i);
+ exit(1);
+ }
+
+ if (sscanf(buf, "%" SCNu32 "-%" SCNu32, &start, &end) != 2) {
+ if (sscanf(buf, "%" SCNu32, &start) != 1) {
+ fprintf(stderr, "xl: Invalid LLC color range: %s\n", buf);
+ exit(1);
+ }
+ end = start;
+ } else if (start > end) {
+ fprintf(stderr,
+ "xl: Start LLC color is greater than end: %s\n", buf);
+ exit(1);
+ }
+
+ b_info->num_llc_colors += (end - start) + 1;
+ b_info->llc_colors = (uint32_t *)realloc(b_info->llc_colors,
+ sizeof(*b_info->llc_colors) * b_info->num_llc_colors);
+
+ for (k = start; k <= end; k++)
+ b_info->llc_colors[cur_index++] = k;
+ }
+ }
+
if (!xlu_cfg_get_long (config, "vcpus", &l, 0)) {
vcpus = l;
if (libxl_cpu_bitmap_alloc(ctx, &b_info->avail_vcpus, l)) {
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 06/12] tools: add support for cache coloring configuration
2024-12-02 16:59 ` [PATCH v11 06/12] tools: add support for cache coloring configuration Carlo Nonato
@ 2024-12-04 14:07 ` Anthony PERARD
0 siblings, 0 replies; 40+ messages in thread
From: Anthony PERARD @ 2024-12-04 14:07 UTC (permalink / raw)
To: Carlo Nonato
Cc: xen-devel, andrea.bastoni, marco.solieri, Nick Rosbrook,
George Dunlap, Juergen Gross
On Mon, Dec 02, 2024 at 05:59:15PM +0100, Carlo Nonato wrote:
> Add a new "llc_colors" parameter that defines the LLC color assignment for
> a domain. The user can specify one or more color ranges using the same
> syntax used everywhere else for color config described in the
> documentation.
> The parameter is defined as a list of strings that represent the color
> ranges.
>
> Documentation is also added.
>
> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Thanks,
--
Anthony Perard | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v11 07/12] xen/arm: add support for cache coloring configuration via device-tree
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (5 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 06/12] tools: add support for cache coloring configuration Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
` (4 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Andrew Cooper, Jan Beulich
Add the "llc-colors" Device Tree property to express DomUs and Dom0less
color configurations.
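
As an illustration of the range syntax carried by the property (for example
"4-8,10,11,12"), here is a minimal standalone sketch. It is not the
parse_color_config() used by this series: parse_colors() and its error codes
are hypothetical and only show how such a string could expand into a flat
array of colors.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not Xen code): expand a color string such as
 * "4-8,10,11,12" into colors[]. Returns the number of colors written,
 * or a negative errno value on malformed input or overflow of the array.
 */
static int parse_colors(const char *s, unsigned int *colors, unsigned int max)
{
    unsigned int n = 0;

    assert(s && colors);

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop;

        /* Every item must begin with a digit. */
        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = stop = strtoul(s, &end, 10);

        /* Optional "-<stop>" turns a single color into a range. */
        if ( *end == '-' )
        {
            s = end + 1;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            stop = strtoul(s, &end, 10);
        }

        if ( start > stop )
            return -EINVAL;

        for ( unsigned long c = start; c <= stop; c++ )
        {
            if ( n >= max )
                return -ENOSPC;
            colors[n++] = (unsigned int)c;
        }

        /* Items are comma separated; a trailing comma is rejected. */
        if ( *end == ',' )
        {
            end++;
            if ( *end == '\0' )
                return -EINVAL;
        }
        else if ( *end != '\0' )
            return -EINVAL;

        s = end;
    }

    return (int)n;
}
```

With a 16-entry array, "4-8,10,11,12" expands to the eight colors
4,5,6,7,8,10,11,12, while "8-4" is rejected in this sketch.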
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com> # non-Arm
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
---
v11:
- made clear that llc-colors device-tree property is Arm64-only in booting.txt
v10:
- no changes
v9:
- use best-effort allocation in domain_set_llc_colors_from_str()
v8:
- fixed memory leak on error path of domain_set_llc_colors_from_str()
- realloc colors array after parsing from string to reduce memory usage
v7:
- removed alloc_colors() helper usage from domain_set_llc_colors_from_str()
v6:
- rewrote domain_set_llc_colors_from_str() to be more explicit
v5:
- static-mem check has been moved in a previous patch
- added domain_set_llc_colors_from_str() to set colors after domain creation
---
docs/misc/arm/device-tree/booting.txt | 5 +++
docs/misc/cache-coloring.rst | 48 +++++++++++++++++++++++++++
xen/arch/arm/dom0less-build.c | 10 ++++++
xen/common/llc-coloring.c | 40 ++++++++++++++++++++++
xen/include/xen/llc-coloring.h | 1 +
xen/include/xen/xmalloc.h | 12 +++++++
6 files changed, 116 insertions(+)
diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 3a04f5c57f..9c881baccc 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -162,6 +162,11 @@ with the following properties:
An integer specifying the number of vcpus to allocate to the guest.
+- llc-colors
+ A string specifying the LLC color configuration for the guest.
+ Refer to docs/misc/cache-coloring.rst for syntax. This option is applicable
+ only to Arm64 guests.
+
- vpl011
An empty property to enable/disable a virtual pl011 for the guest to
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 7b47d0ed92..e097e74032 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -14,6 +14,7 @@ If needed, change the maximum number of colors with
``CONFIG_LLC_COLORS_ORDER=<n>``.
Runtime configuration is done via `Command line parameters`_.
+For DomUs follow `DomUs configuration`_.
Background
**********
@@ -149,6 +150,53 @@ LLC specs can be manually set via the above command line parameters. This
bypasses any auto-probing and it's used to overcome failing situations, such as
flawed probing logic, or for debugging/testing purposes.
+DomUs configuration
+*******************
+
+DomUs colors can be set either in the ``xl`` configuration file (documentation
+at `docs/man/xl.cfg.pod.5.in`) or via Device Tree (documentation at
+`docs/misc/arm/device-tree/booting.txt`) using the ``llc-colors`` option.
+For example:
+
+::
+
+ xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G dom0_max_vcpus=1 sched=null llc-coloring=on dom0-llc-colors=2-6";
+ xen,dom0-bootargs = "console=hvc0 earlycon=xen earlyprintk=xen root=/dev/ram0";
+
+ dom0 {
+ compatible = "xen,linux-zimage", "xen,multiboot-module";
+ reg = <0x0 0x1000000 0x0 15858176>;
+ };
+
+ dom0-ramdisk {
+ compatible = "xen,linux-initrd", "xen,multiboot-module";
+ reg = <0x0 0x2000000 0x0 20638062>;
+ };
+
+ domU0 {
+ #address-cells = <0x1>;
+ #size-cells = <0x1>;
+ compatible = "xen,domain";
+ memory = <0x0 0x40000>;
+ llc-colors = "4-8,10,11,12";
+ cpus = <0x1>;
+ vpl011 = <0x1>;
+
+ module@2000000 {
+ compatible = "multiboot,kernel", "multiboot,module";
+ reg = <0x2000000 0xffffff>;
+ bootargs = "console=ttyAMA0";
+ };
+
+ module@30000000 {
+ compatible = "multiboot,ramdisk", "multiboot,module";
+ reg = <0x3000000 0xffffff>;
+ };
+ };
+
+**Note:** If no color configuration is provided for a domain, the default one,
+which corresponds to all available colors is used instead.
+
Known issues and limitations
****************************
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 67b1503647..49d1f14d65 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -817,6 +817,7 @@ void __init create_domUs(void)
bool iommu = false;
const struct dt_device_node *cpupool_node,
*chosen = dt_find_node_by_path("/chosen");
+ const char *llc_colors_str = NULL;
BUG_ON(chosen == NULL);
dt_for_each_child_node(chosen, node)
@@ -965,6 +966,10 @@ void __init create_domUs(void)
#endif
}
+ dt_property_read_string(node, "llc-colors", &llc_colors_str);
+ if ( !llc_coloring_enabled && llc_colors_str )
+ panic("'llc-colors' found, but LLC coloring is disabled\n");
+
/*
* The variable max_init_domid is initialized with zero, so here it's
* very important to use the pre-increment operator to call
@@ -975,6 +980,11 @@ void __init create_domUs(void)
panic("Error creating domain %s (rc = %ld)\n",
dt_node_name(node), PTR_ERR(d));
+ if ( llc_coloring_enabled &&
+ (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
+ panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
+ dt_node_name(node), rc);
+
d->is_console = true;
dt_device_set_used_by(node, d->domain_id);
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 2a0ee695c8..2a85345cf1 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -277,6 +277,46 @@ void domain_llc_coloring_free(struct domain *d)
xfree(__va(__pa(d->llc_colors)));
}
+int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
+{
+ int err;
+ unsigned int *colors, num_colors;
+
+ if ( !str )
+ {
+ domain_set_default_colors(d);
+ return 0;
+ }
+
+ colors = xmalloc_array(unsigned int, max_nr_colors);
+ if ( !colors )
+ return -ENOMEM;
+
+ err = parse_color_config(str, colors, max_nr_colors, &num_colors);
+ if ( err )
+ {
+ printk(XENLOG_ERR "Error parsing LLC color configuration\n");
+ xfree(colors);
+ return err;
+ }
+
+ if ( !check_colors(colors, num_colors) )
+ {
+ printk(XENLOG_ERR "%pd: bad LLC color config\n", d);
+ xfree(colors);
+ return -EINVAL;
+ }
+
+ /* Adjust the size cause it was initially set to max_nr_colors */
+ d->llc_colors = xrealloc_array(colors, num_colors);
+ if ( !d->llc_colors )
+ d->llc_colors = colors;
+
+ d->num_llc_colors = num_colors;
+
+ return 0;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index cbebe0816c..ae8a8825e5 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -32,6 +32,7 @@ void arch_llc_coloring_init(void);
int dom0_set_llc_colors(struct domain *d);
int domain_set_llc_colors(struct domain *d,
const struct xen_domctl_set_llc_colors *config);
+int domain_set_llc_colors_from_str(struct domain *d, const char *str);
#endif /* __XEN_LLC_COLORING_H__ */
diff --git a/xen/include/xen/xmalloc.h b/xen/include/xen/xmalloc.h
index b903fa2e26..f0412fb4e0 100644
--- a/xen/include/xen/xmalloc.h
+++ b/xen/include/xen/xmalloc.h
@@ -37,6 +37,9 @@
((_type *)_xmalloc_array(sizeof(_type), __alignof__(_type), _num))
#define xzalloc_array(_type, _num) \
((_type *)_xzalloc_array(sizeof(_type), __alignof__(_type), _num))
+#define xrealloc_array(_ptr, _num) \
+ ((typeof(_ptr))_xrealloc_array(_ptr, sizeof(typeof(*(_ptr))), \
+ __alignof__(typeof(*(_ptr))), _num))
/* Allocate space for a structure with a flexible array of typed objects. */
#define xzalloc_flex_struct(type, field, nr) \
@@ -98,6 +101,15 @@ static inline void *_xzalloc_array(
return _xzalloc(size * num, align);
}
+static inline void *_xrealloc_array(
+ void *ptr, unsigned long size, unsigned long align, unsigned long num)
+{
+ /* Check for overflow. */
+ if ( size && num > UINT_MAX / size )
+ return NULL;
+ return _xrealloc(ptr, size * num, align);
+}
+
/*
* Pooled allocator interface.
*/
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (6 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 07/12] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 17:33 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 09/12] xen: add cache coloring allocator for domains Carlo Nonato
` (3 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini
PGC_static and PGC_extra need to be preserved when assigning a page.
Define a new macro that groups those flags and use it instead of or'ing
every time.
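
To make the intent concrete, the following standalone sketch mimics the
count_info update done when a page is assigned. The bit positions are made up
for illustration (the real PGC_* values are architecture specific) and
assign_count_info() is a hypothetical helper, not a function in page_alloc.c.

```c
#include <assert.h>

/* Illustrative bit positions only; real PGC_* values differ per arch. */
#define PGC_allocated (1UL << 31)
#define PGC_extra     (1UL << 30)
#define PGC_static    (1UL << 29)

/* Flags that must survive the assignment of a page to a domain. */
#define PGC_preserved (PGC_extra | PGC_static)

/*
 * Hypothetical helper: compute the new count_info of a page being
 * assigned. Only the preserved flags are carried over; the page then
 * becomes allocated with a reference count of 1.
 */
static unsigned long assign_count_info(unsigned long count_info)
{
    /* Same condition as the ASSERT() in assign_pages(). */
    assert(!(count_info & ~PGC_preserved));

    return (count_info & PGC_preserved) | PGC_allocated | 1;
}
```

A page that enters with only PGC_static set leaves with
PGC_static | PGC_allocated and a count of 1.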
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v11:
- removed PGC_broken from PGC_preserved
- removed PGC preservation from mark_page_free()
v10:
- fixed commit message
v9:
- add PGC_broken to PGC_preserved
- clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set
v8:
- fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra
before freeing
v7:
- PGC_preserved used also in mark_page_free()
v6:
- preserved_flags renamed to PGC_preserved
- PGC_preserved is used only in assign_pages()
v5:
- new patch
---
xen/common/page_alloc.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 55d561e93c..e73b404169 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -161,6 +161,7 @@
#endif
#define PGC_no_buddy_merge PGC_static
+#define PGC_preserved (PGC_extra | PGC_static)
#ifndef PGT_TYPE_INFO_INITIALIZER
#define PGT_TYPE_INFO_INITIALIZER 0
@@ -1447,8 +1448,7 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn)
break;
case PGC_state_offlining:
- pg->count_info = (pg->count_info & PGC_broken) |
- PGC_state_offlined;
+ pg->count_info = (pg->count_info & PGC_broken) | PGC_state_offlined;
pg_offlined = true;
break;
@@ -2382,7 +2382,7 @@ int assign_pages(
for ( i = 0; i < nr; i++ )
{
- ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+ ASSERT(!(pg[i].count_info & ~PGC_preserved));
if ( pg[i].count_info & PGC_extra )
extra_pages++;
}
@@ -2442,7 +2442,7 @@ int assign_pages(
page_set_owner(&pg[i], d);
smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
pg[i].count_info =
- (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+ (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
}
@@ -2501,6 +2501,14 @@ struct page_info *alloc_domheap_pages(
}
if ( assign_page(pg, order, d, memflags) )
{
+ if ( memflags & MEMF_no_refcount )
+ {
+ unsigned long i;
+
+ for ( i = 0; i < (1UL << order); i++ )
+ pg[i].count_info &= ~PGC_extra;
+ }
+
free_heap_pages(pg, order, memflags & MEMF_no_scrub);
return NULL;
}
@@ -2555,6 +2563,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
{
ASSERT(d->extra_pages);
d->extra_pages--;
+ pg[i].count_info &= ~PGC_extra;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro
2024-12-02 16:59 ` [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
@ 2024-12-02 17:33 ` Carlo Nonato
2024-12-03 8:54 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 17:33 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Jan Beulich,
Julien Grall, Stefano Stabellini
Hi all,
On Mon, Dec 2, 2024 at 5:59 PM Carlo Nonato
<carlo.nonato@minervasys.tech> wrote:
>
> PGC_static and PGC_extra need to be preserved when assigning a page.
> Define a new macro that groups those flags and use it instead of or'ing
> every time.
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> ---
> v11:
> - removed PGC_broken from PGC_preserved
> - removed PGC preservation from mark_page_free()
> v10:
> - fixed commit message
> v9:
> - add PGC_broken to PGC_preserved
> - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set
> v8:
> - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra
> before freeing
> v7:
> - PGC_preserved used also in mark_page_free()
> v6:
> - preserved_flags renamed to PGC_preserved
> - PGC_preserved is used only in assign_pages()
> v5:
> - new patch
> ---
> xen/common/page_alloc.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 55d561e93c..e73b404169 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -161,6 +161,7 @@
> #endif
>
> #define PGC_no_buddy_merge PGC_static
> +#define PGC_preserved (PGC_extra | PGC_static)
>
> #ifndef PGT_TYPE_INFO_INITIALIZER
> #define PGT_TYPE_INFO_INITIALIZER 0
> @@ -1447,8 +1448,7 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn)
> break;
>
> case PGC_state_offlining:
> - pg->count_info = (pg->count_info & PGC_broken) |
> - PGC_state_offlined;
> + pg->count_info = (pg->count_info & PGC_broken) | PGC_state_offlined;
> pg_offlined = true;
> break;
>
> @@ -2382,7 +2382,7 @@ int assign_pages(
>
> for ( i = 0; i < nr; i++ )
> {
> - ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
> + ASSERT(!(pg[i].count_info & ~PGC_preserved));
> if ( pg[i].count_info & PGC_extra )
> extra_pages++;
> }
> @@ -2442,7 +2442,7 @@ int assign_pages(
> page_set_owner(&pg[i], d);
> smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
> pg[i].count_info =
> - (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
> + (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
>
> page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
> }
> @@ -2501,6 +2501,14 @@ struct page_info *alloc_domheap_pages(
> }
> if ( assign_page(pg, order, d, memflags) )
> {
> + if ( memflags & MEMF_no_refcount )
> + {
> + unsigned long i;
> +
> + for ( i = 0; i < (1UL << order); i++ )
> + pg[i].count_info &= ~PGC_extra;
> + }
> +
> free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> return NULL;
> }
> @@ -2555,6 +2563,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
> {
> ASSERT(d->extra_pages);
> d->extra_pages--;
> + pg[i].count_info &= ~PGC_extra;
> }
> }
>
> --
> 2.43.0
>
Sorry guys, this patch is wrong.
Here's the correct one.
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 55d561e93c..1c0990b180 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -161,6 +161,7 @@
#endif
#define PGC_no_buddy_merge PGC_static
+#define PGC_preserved (PGC_extra | PGC_static)
#ifndef PGT_TYPE_INFO_INITIALIZER
#define PGT_TYPE_INFO_INITIALIZER 0
@@ -2382,7 +2383,7 @@ int assign_pages(
for ( i = 0; i < nr; i++ )
{
- ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+ ASSERT(!(pg[i].count_info & ~PGC_preserved));
if ( pg[i].count_info & PGC_extra )
extra_pages++;
}
@@ -2442,7 +2443,7 @@ int assign_pages(
page_set_owner(&pg[i], d);
smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
pg[i].count_info =
- (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+ (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
}
Sorry about that.
Thanks.
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro
2024-12-02 17:33 ` Carlo Nonato
@ 2024-12-03 8:54 ` Jan Beulich
0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2024-12-03 8:54 UTC (permalink / raw)
To: Carlo Nonato
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Julien Grall,
Stefano Stabellini, xen-devel
On 02.12.2024 18:33, Carlo Nonato wrote:
> Sorry guys, this patch is wrong.
> Here's the correct one.
Which looks okay to me now, just that imo ...
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -161,6 +161,7 @@
> #endif
>
> #define PGC_no_buddy_merge PGC_static
> +#define PGC_preserved (PGC_extra | PGC_static)
... this new #define now wants a comment, to clarify where the constant is
to be used (or specifically not to be used). Unlike for PGC_no_buddy_merge
this can't be easily deduced from the name.
Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v11 09/12] xen: add cache coloring allocator for domains
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (7 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 08/12] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-09 13:41 ` Jan Beulich
2024-12-02 16:59 ` [PATCH v11 10/12] xen/arm: add Xen cache colors command line parameter Carlo Nonato
` (2 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk
Add a new memory page allocator that implements the cache coloring mechanism.
The allocation algorithm enforces equal frequency distribution of cache
partitions, following the coloring configuration of a domain. This allows
for an even utilization of cache sets for every domain.
Pages are stored in a color-indexed array of lists. Those lists are filled
by a simple init function which computes the color of each page.
When a domain requests a page, the allocator extracts the page from the list
with the maximum number of free pages among those that the domain can access,
given its coloring configuration.
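
The selection policy described above can be sketched in isolation as follows.
This is a simplified model under stated assumptions, not the allocator code
from this patch: NR_COLORS, the free_pages[] counters and pick_color() are
hypothetical, and ties are resolved in favor of the first color found. Only
the color computation (low bits of the frame number, masked by the number of
colors minus one) follows the mfn_to_color() macro added by the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_COLORS 8u /* hypothetical; must be a power of two */

/* Color of a page frame: the low bits of its frame number. */
static unsigned int mfn_to_color(unsigned long mfn)
{
    return mfn & (NR_COLORS - 1);
}

/*
 * Hypothetical helper: among the colors assigned to a domain, pick the
 * one whose free list currently holds the most pages. Returns -1 when
 * every list the domain may use is empty.
 */
static int pick_color(const unsigned long free_pages[NR_COLORS],
                      const unsigned int *colors, size_t num_colors)
{
    int best = -1;
    unsigned long best_free = 0;

    for ( size_t i = 0; i < num_colors; i++ )
    {
        unsigned int c = colors[i];

        assert(c < NR_COLORS);

        /* Strict comparison: the first color wins on a tie. */
        if ( free_pages[c] > best_free )
        {
            best_free = free_pages[c];
            best = (int)c;
        }
    }

    return best;
}
```

Always drawing from the fullest list is what keeps the usage of a domain's
colors balanced over time.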
The allocator can only handle requests for order-0 pages. This simplifies the
implementation and, since cache coloring targets only embedded systems, it's
assumed not to be a major problem.
The buddy allocator must coexist with the colored one because the Xen heap
isn't colored. For this reason a new Kconfig option and a command line
parameter are added to let the user set the amount of memory reserved for
the buddy allocator. Even when cache coloring is enabled, this memory
isn't managed by the colored allocator.
Colored heap information is dumped in the dump_heap() debug-key function.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v11:
- CONFIG_BUDDY_ALLOCATOR_SIZE depends on CONFIG_LLC_COLORING
- buddy_alloc_size is defined only if CONFIG_LLC_COLORING
- buddy-alloc-size param is parsed only if CONFIG_LLC_COLORING
v10:
- stated explicit dependency on CONFIG_LLC_COLORING for buddy-alloc-size
- fix for MISRA rule 20.7 parenthesis
v9:
- added ASSERT(order == 0) when freeing a colored page
- moved buddy_alloc_size initialization logic in Kconfig
v8:
- requests that use MEMF_* flags that can't be served now fail
- free_color_heap_page() is called directly from free_heap_pages()
v7:
- requests to alloc_color_heap_page() now fail if MEMF_bits is used
v6:
- colored allocator functions are now static
v5:
- Carlo Nonato as the new author
- the colored allocator balances color usage for each domain and it searches
linearly only in the number of colors (FIXME removed)
- added scrub functionality
- removed stub functions (still requires some macro definition)
- addr_to_color turned to mfn_to_color for easier operations
- removed BUG_ON in init_color_heap_pages() in favor of panic()
- only non empty page lists are logged in dump_color_heap()
v4:
- moved colored allocator code after buddy allocator because it now has
some dependencies on buddy functions
- buddy_alloc_size is now used only by the colored allocator
- fixed a bug that allowed the buddy to merge pages when they were colored
- free_color_heap_page() now calls mark_page_free()
- free_color_heap_page() uses the frametable array for faster searches
- added FIXME comment for the linear search in free_color_heap_page()
- removed alloc_color_domheap_page() to let the colored allocator exploit
some more buddy allocator code
- alloc_color_heap_page() now allocs min address pages first
- reduced the mess in end_boot_allocator(): use the first loop for
init_color_heap_pages()
- fixed page_list_add_prev() (list.h) since it was doing the opposite of
what it was supposed to do
- fixed page_list_add_prev() (non list.h) to check also for next existence
- removed unused page_list_add_next()
- moved p2m code in another patch
---
docs/misc/cache-coloring.rst | 37 ++++++
docs/misc/xen-command-line.pandoc | 14 +++
xen/arch/arm/include/asm/mm.h | 5 +
xen/common/Kconfig | 8 ++
xen/common/llc-coloring.c | 13 ++
xen/common/page_alloc.c | 191 +++++++++++++++++++++++++++++-
xen/include/xen/llc-coloring.h | 3 +
7 files changed, 267 insertions(+), 4 deletions(-)
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index e097e74032..5224b27afe 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -13,6 +13,9 @@ To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
If needed, change the maximum number of colors with
``CONFIG_LLC_COLORS_ORDER=<n>``.
+If needed, change the buddy allocator reserved size with
+``CONFIG_BUDDY_ALLOCATOR_SIZE=<n>``.
+
Runtime configuration is done via `Command line parameters`_.
For DomUs follow `DomUs configuration`_.
@@ -110,6 +113,8 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+----------------------+-------------------------------+
| ``dom0-llc-colors`` | Dom0 color configuration |
+----------------------+-------------------------------+
+| ``buddy-alloc-size`` | Buddy allocator reserved size |
++----------------------+-------------------------------+
Colors selection format
***********************
@@ -197,6 +202,17 @@ For example:
**Note:** If no color configuration is provided for a domain, the default one,
which corresponds to all available colors is used instead.
+Colored allocator and buddy allocator
+*************************************
+
+The colored allocator distributes pages based on color configurations of
+domains so that each domain only gets pages of its own colors.
+The colored allocator is meant as an alternative to the buddy allocator because
+its allocation policy is by definition incompatible with the generic one. Since
+the Xen heap is not colored yet, we need to support the coexistence of the two
+allocators and some memory must be left for the buddy one. Buddy memory
+reservation is configured via Kconfig or via command-line.
+
Known issues and limitations
****************************
@@ -207,3 +223,24 @@ In the domain configuration, "xen,static-mem" allows memory to be statically
allocated to the domain. This isn't possible when LLC coloring is enabled,
because that memory can't be guaranteed to use only colors assigned to the
domain.
+
+Cache coloring is intended only for embedded systems
+####################################################
+
+The current implementation aims to satisfy the need for predictability in
+embedded systems with a small amount of memory to be managed in a colored way.
+Given that, some shortcuts are taken in the development. Expect worse
+performance on larger systems.
+
+Colored allocator can only make use of order-0 pages
+####################################################
+
+The cache coloring technique relies on memory mappings and on the smallest
+mapping granularity to achieve the maximum number of colors (cache partitions)
+possible. This granularity is what is normally called a page and, in Xen
+terminology, the order-0 page is the smallest one. The fairly simple
+colored allocator currently implemented makes use only of such pages.
+It must be said that a more complex one could, in theory, adopt higher order
+pages if the colors selection contained adjacent colors. Two subsequent colors,
+for example, can be represented by an order-1 page, four colors correspond to
+an order-2 page, etc.
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index bfdc8b0002..3a70c49c05 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -270,6 +270,20 @@ and not running softirqs. Reduce this if softirqs are not being run frequently
enough. Setting this to a high value may cause boot failure, particularly if
the NMI watchdog is also enabled.
+### buddy-alloc-size (arm64)
+> `= <size>`
+
+> Default: `64M`
+
+Amount of memory reserved for the buddy allocator when the colored allocator
+is active. This option is available only when `CONFIG_LLC_COLORING` is enabled.
+The colored allocator is meant as an alternative to the buddy allocator,
+because its allocation policy is by definition incompatible with the generic
+one. Since the Xen heap is not colored yet, we need to support the
+coexistence of the two allocators for now. This parameter, which is optional
+and for experts only, sets the amount of memory reserved for the buddy
+allocator.
+
### cet
= List of [ shstk=<bool>, ibt=<bool> ]
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 5abd4b0d1c..c1a5ac7bee 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -145,6 +145,11 @@ struct page_info
#else
#define PGC_static 0
#endif
+#ifdef CONFIG_LLC_COLORING
+/* Page is cache colored */
+#define _PGC_colored PG_shift(4)
+#define PGC_colored PG_mask(1, 4)
+#endif
/* ... */
/* Page is broken? */
#define _PGC_broken PG_shift(7)
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index b4ec6893be..6166327f4d 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -537,4 +537,12 @@ config LLC_COLORS_ORDER
The default value corresponds to an 8 MiB 16-ways LLC, which should be
more than what's needed in the general case.
+config BUDDY_ALLOCATOR_SIZE
+ int "Buddy allocator reserved memory size (MiB)"
+ default "64"
+ depends on LLC_COLORING
+ help
+ Amount of memory reserved for the buddy allocator to serve Xen heap,
+ working alongside the colored one.
+
endmenu
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 2a85345cf1..0f22a9b72c 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -32,6 +32,9 @@ static unsigned int __ro_after_init default_colors[NR_LLC_COLORS];
static unsigned int __initdata dom0_colors[NR_LLC_COLORS];
static unsigned int __initdata dom0_num_colors;
+#define mfn_color_mask (max_nr_colors - 1)
+#define mfn_to_color(mfn) (mfn_x(mfn) & mfn_color_mask)
+
/*
* Parse the coloring configuration given in the buf string, following the
* syntax below.
@@ -317,6 +320,16 @@ int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
return 0;
}
+unsigned int page_to_llc_color(const struct page_info *pg)
+{
+ return mfn_to_color(page_to_mfn(pg));
+}
+
+unsigned int get_max_nr_llc_colors(void)
+{
+ return max_nr_colors;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index e73b404169..a1133cfd89 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -139,6 +139,7 @@
#include <xen/softirq.h>
#include <xen/spinlock.h>
#include <xen/vm_event.h>
+#include <xen/xvmalloc.h>
#include <asm/flushtlb.h>
#include <asm/page.h>
@@ -160,8 +161,12 @@
#define PGC_static 0
#endif
-#define PGC_no_buddy_merge PGC_static
-#define PGC_preserved (PGC_extra | PGC_static)
+#ifndef PGC_colored
+#define PGC_colored 0
+#endif
+
+#define PGC_no_buddy_merge (PGC_static | PGC_colored)
+#define PGC_preserved (PGC_extra | PGC_static | PGC_colored)
#ifndef PGT_TYPE_INFO_INITIALIZER
#define PGT_TYPE_INFO_INITIALIZER 0
@@ -1473,6 +1478,8 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn)
return pg_offlined;
}
+static void free_color_heap_page(struct page_info *pg, bool need_scrub);
+
/* Free 2^@order set of pages. */
static void free_heap_pages(
struct page_info *pg, unsigned int order, bool need_scrub)
@@ -1497,6 +1504,15 @@ static void free_heap_pages(
pg[i].count_info |= PGC_need_scrub;
poison_one_page(&pg[i]);
}
+
+ if ( pg->count_info & PGC_colored )
+ {
+ ASSERT(order == 0);
+
+ free_color_heap_page(pg, need_scrub);
+ spin_unlock(&heap_lock);
+ return;
+ }
}
avail[node][zone] += 1 << order;
@@ -1961,6 +1977,157 @@ static unsigned long avail_heap_pages(
return free_pages;
}
+/*************************
+ * COLORED SIDE-ALLOCATOR
+ *
+ * Pages are grouped by LLC color in lists which are globally referred to as the
+ * color heap. Lists are populated in end_boot_allocator().
+ * After initialization there will be N lists where N is the number of
+ * available colors on the platform.
+ */
+static struct page_list_head *__ro_after_init _color_heap;
+#define color_heap(color) (&_color_heap[color])
+
+static unsigned long *__ro_after_init free_colored_pages;
+
+#ifdef CONFIG_LLC_COLORING
+#define domain_num_llc_colors(d) ((d)->num_llc_colors)
+#define domain_llc_color(d, i) ((d)->llc_colors[i])
+
+/* Memory required for buddy allocator to work with colored one */
+static unsigned long __initdata buddy_alloc_size =
+ MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
+size_param("buddy-alloc-size", buddy_alloc_size);
+#else
+#define domain_num_llc_colors(d) 0
+#define domain_llc_color(d, i) 0
+#endif
+
+static void free_color_heap_page(struct page_info *pg, bool need_scrub)
+{
+ unsigned int color;
+
+ color = page_to_llc_color(pg);
+ free_colored_pages[color]++;
+ /*
+ * Head insertion allows re-using cache-hot pages in configurations without
+ * sharing of colors.
+ */
+ page_list_add(pg, color_heap(color));
+}
+
+static struct page_info *alloc_color_heap_page(unsigned int memflags,
+ const struct domain *d)
+{
+ struct page_info *pg = NULL;
+ unsigned int i, color = 0;
+ unsigned long max = 0;
+ bool need_tlbflush = false;
+ uint32_t tlbflush_timestamp = 0;
+ bool need_scrub;
+
+ if ( memflags & ~(MEMF_no_refcount | MEMF_no_owner | MEMF_no_tlbflush |
+ MEMF_no_icache_flush | MEMF_no_scrub) )
+ return NULL;
+
+ spin_lock(&heap_lock);
+
+ for ( i = 0; i < domain_num_llc_colors(d); i++ )
+ {
+ unsigned long free = free_colored_pages[domain_llc_color(d, i)];
+
+ if ( free > max )
+ {
+ color = domain_llc_color(d, i);
+ pg = page_list_first(color_heap(color));
+ max = free;
+ }
+ }
+
+ if ( !pg )
+ {
+ spin_unlock(&heap_lock);
+ return NULL;
+ }
+
+ need_scrub = pg->count_info & PGC_need_scrub;
+ pg->count_info = PGC_state_inuse | (pg->count_info & PGC_colored);
+ free_colored_pages[color]--;
+ page_list_del(pg, color_heap(color));
+
+ if ( !(memflags & MEMF_no_tlbflush) )
+ accumulate_tlbflush(&need_tlbflush, pg, &tlbflush_timestamp);
+
+ init_free_page_fields(pg);
+
+ spin_unlock(&heap_lock);
+
+ if ( !(memflags & MEMF_no_scrub) )
+ {
+ if ( need_scrub )
+ scrub_one_page(pg);
+ else
+ check_one_page(pg);
+ }
+
+ if ( need_tlbflush )
+ filtered_flush_tlb_mask(tlbflush_timestamp);
+
+ flush_page_to_ram(mfn_x(page_to_mfn(pg)),
+ !(memflags & MEMF_no_icache_flush));
+
+ return pg;
+}
+
+static void __init init_color_heap_pages(struct page_info *pg,
+ unsigned long nr_pages)
+{
+ unsigned long i;
+ bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
+
+#ifdef buddy_alloc_size
+ if ( buddy_alloc_size >= PAGE_SIZE )
+ {
+ unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
+
+ init_heap_pages(pg, buddy_pages);
+ nr_pages -= buddy_pages;
+ buddy_alloc_size -= buddy_pages << PAGE_SHIFT;
+ pg += buddy_pages;
+ }
+#endif
+
+ if ( !_color_heap )
+ {
+ unsigned int max_nr_colors = get_max_nr_llc_colors();
+
+ _color_heap = xvmalloc_array(struct page_list_head, max_nr_colors);
+ free_colored_pages = xvzalloc_array(unsigned long, max_nr_colors);
+ if ( !_color_heap || !free_colored_pages )
+ panic("Can't allocate colored heap. Buddy reserved size is too low");
+
+ for ( i = 0; i < max_nr_colors; i++ )
+ INIT_PAGE_LIST_HEAD(color_heap(i));
+ }
+
+ for ( i = 0; i < nr_pages; i++ )
+ {
+ pg[i].count_info = PGC_colored;
+ free_color_heap_page(&pg[i], need_scrub);
+ }
+}
+
+static void dump_color_heap(void)
+{
+ unsigned int color;
+
+ printk("Dumping color heap info\n");
+ for ( color = 0; color < get_max_nr_llc_colors(); color++ )
+ if ( free_colored_pages[color] > 0 )
+ printk("Color heap[%u]: %lu pages\n",
+ color, free_colored_pages[color]);
+}
+
void __init end_boot_allocator(void)
{
unsigned int i;
@@ -1980,7 +2147,13 @@ void __init end_boot_allocator(void)
for ( i = nr_bootmem_regions; i-- > 0; )
{
struct bootmem_region *r = &bootmem_region_list[i];
- if ( r->s < r->e )
+
+ if ( r->s >= r->e )
+ continue;
+
+ if ( llc_coloring_enabled )
+ init_color_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
+ else
init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
}
nr_bootmem_regions = 0;
@@ -2476,7 +2649,14 @@ struct page_info *alloc_domheap_pages(
if ( memflags & MEMF_no_owner )
memflags |= MEMF_no_refcount;
- if ( !dma_bitsize )
+ /* Only domains are supported for coloring */
+ if ( d && llc_coloring_enabled )
+ {
+ /* Colored allocation must be done on 0 order */
+ if ( order || (pg = alloc_color_heap_page(memflags, d)) == NULL )
+ return NULL;
+ }
+ else if ( !dma_bitsize )
memflags &= ~MEMF_no_dma;
else if ( (dma_zone = bits_to_zone(dma_bitsize)) < zone_hi )
pg = alloc_heap_pages(dma_zone + 1, zone_hi, order, memflags, d);
@@ -2688,6 +2868,9 @@ static void cf_check dump_heap(unsigned char key)
continue;
printk("Node %d has %lu unscrubbed pages\n", i, node_need_scrub[i]);
}
+
+ if ( llc_coloring_enabled )
+ dump_color_heap();
}
static __init int cf_check register_heap_trigger(void)
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index ae8a8825e5..b3f2fa22bc 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -9,6 +9,7 @@
#define __XEN_LLC_COLORING_H__
struct domain;
+struct page_info;
struct xen_domctl_set_llc_colors;
#ifdef CONFIG_LLC_COLORING
@@ -33,6 +34,8 @@ int dom0_set_llc_colors(struct domain *d);
int domain_set_llc_colors(struct domain *d,
const struct xen_domctl_set_llc_colors *config);
int domain_set_llc_colors_from_str(struct domain *d, const char *str);
+unsigned int page_to_llc_color(const struct page_info *pg);
+unsigned int get_max_nr_llc_colors(void);
#endif /* __XEN_LLC_COLORING_H__ */
--
2.43.0
* Re: [PATCH v11 09/12] xen: add cache coloring allocator for domains
2024-12-02 16:59 ` [PATCH v11 09/12] xen: add cache coloring allocator for domains Carlo Nonato
@ 2024-12-09 13:41 ` Jan Beulich
0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2024-12-09 13:41 UTC (permalink / raw)
To: Carlo Nonato
Cc: andrea.bastoni, marco.solieri, Andrew Cooper, Julien Grall,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, xen-devel
On 02.12.2024 17:59, Carlo Nonato wrote:
> +static void __init init_color_heap_pages(struct page_info *pg,
> + unsigned long nr_pages)
> +{
> + unsigned long i;
> + bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
> +
> +#ifdef buddy_alloc_size
Did you mean #ifndef? Or #ifdef CONFIG_LLC_COLORING? The latter may be
more logical here, given that no fallback #define is needed for the
variable. Then
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
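To illustrate the point about the guard: `#ifdef` only tests preprocessor macros, so a guard that names a C variable is never taken and the guarded block is silently dropped. The following is a minimal standalone sketch, not Xen code. `buddy_alloc_size` is a local stand-in for the variable in the patch, `CONFIG_LLC_COLORING` is defined by hand here as an assumption, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the Kconfig symbol; in a real build it comes from the config. */
#define CONFIG_LLC_COLORING 1

/* A plain variable, as in the patch: the preprocessor knows nothing about it. */
static unsigned long buddy_alloc_size = 64UL << 20;

/* Reports whether a block guarded by the variable's name gets compiled. */
static bool guarded_by_variable_name(void)
{
#ifdef buddy_alloc_size
    return true;
#else
    return false;
#endif
}

/* Reports whether a block guarded by the config macro gets compiled. */
static bool guarded_by_config_macro(void)
{
#ifdef CONFIG_LLC_COLORING
    return buddy_alloc_size != 0;
#else
    return false;
#endif
}
```

With the variable-name guard the buddy reservation code would never be built, which is why guarding on the config macro is the more logical choice.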
* [PATCH v11 10/12] xen/arm: add Xen cache colors command line parameter
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (8 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 09/12] xen: add cache coloring allocator for domains Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 11/12] xen/arm: make consider_modules() available for xen relocation Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image Carlo Nonato
11 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Luca Miccio, Andrew Cooper,
Jan Beulich, Julien Grall, Stefano Stabellini, Carlo Nonato
From: Luca Miccio <lucmiccio@gmail.com>
Add a new command line parameter to configure Xen cache colors.
These colors are dumped together with other coloring info.
Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness:
- Xen 1 color latency: 3.1 us
- Xen 2 color latency: 3.1 us
Since this is the most common target for Arm cache coloring, the default
amount of Xen colors is set to one.
More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it.
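The 64 KiB figure follows from the usual coloring arithmetic: the number of colors is the LLC way size divided by the page size, and each color covers an equal share of the cache. The sketch below assumes a 1 MiB, 16-way L2 and 4 KiB pages. That geometry is an assumption chosen because it reproduces the 64 KiB figure; this message does not state it. The helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4 KiB pages */

/* Number of colors = way size / page size, where way size = LLC size / ways. */
static unsigned int nr_colors(uint64_t llc_size, unsigned int nr_ways)
{
    uint64_t way_size = llc_size / nr_ways;

    return (unsigned int)(way_size / PAGE_SIZE_BYTES);
}

/* Cache capacity (in bytes) reserved by a single color. */
static uint64_t bytes_per_color(uint64_t llc_size, unsigned int nr_ways)
{
    return llc_size / nr_colors(llc_size, nr_ways);
}
```

Under these assumptions there are 16 colors of 64 KiB each, so a single Xen color leaves the remaining 15 for domains.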
Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v11:
- no changes
v10:
- no changes
v9:
- no changes
v8:
- added bound check on xen_colors in llc_coloring_init()
v7:
- removed XEN_DEFAULT_COLOR
- XEN_DEFAULT_NUM_COLORS is now used in a for loop to set xen default colors
---
docs/misc/cache-coloring.rst | 2 ++
docs/misc/xen-command-line.pandoc | 10 ++++++++++
xen/common/llc-coloring.c | 29 +++++++++++++++++++++++++++++
3 files changed, 41 insertions(+)
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 5224b27afe..e156062aa2 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -115,6 +115,8 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+----------------------+-------------------------------+
| ``buddy-alloc-size`` | Buddy allocator reserved size |
+----------------------+-------------------------------+
+| ``xen-llc-colors`` | Xen color configuration |
++----------------------+-------------------------------+
Colors selection format
***********************
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 3a70c49c05..992e1f993e 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2923,6 +2923,16 @@ mode.
**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
The latter takes precedence if both are set.**
+### xen-llc-colors (arm64)
+> `= List of [ <integer> | <integer>-<integer> ]`
+
+> Default: `0: the lowermost color`
+
+Specify Xen LLC color configuration. This option is available only when
+`CONFIG_LLC_COLORING` is enabled.
+Two colors are most likely needed on platforms where private caches are
+physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
+
### xenheap_megabytes (arm32)
> `= <size>`
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 0f22a9b72c..2e7c0f505d 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -11,6 +11,7 @@
#include <xen/param.h>
#define NR_LLC_COLORS (1U << CONFIG_LLC_COLORS_ORDER)
+#define XEN_DEFAULT_NUM_COLORS 1
/*
* -1: not specified (disabled unless llc-size and llc-nr-ways present)
@@ -32,6 +33,9 @@ static unsigned int __ro_after_init default_colors[NR_LLC_COLORS];
static unsigned int __initdata dom0_colors[NR_LLC_COLORS];
static unsigned int __initdata dom0_num_colors;
+static unsigned int __ro_after_init xen_colors[NR_LLC_COLORS];
+static unsigned int __ro_after_init xen_num_colors;
+
#define mfn_color_mask (max_nr_colors - 1)
#define mfn_to_color(mfn) (mfn_x(mfn) & mfn_color_mask)
@@ -90,6 +94,13 @@ static int __init parse_dom0_colors(const char *s)
}
custom_param("dom0-llc-colors", parse_dom0_colors);
+static int __init parse_xen_colors(const char *s)
+{
+ return parse_color_config(s, xen_colors, ARRAY_SIZE(xen_colors),
+ &xen_num_colors);
+}
+custom_param("xen-llc-colors", parse_xen_colors);
+
static void print_colors(const unsigned int colors[], unsigned int num_colors)
{
unsigned int i;
@@ -173,6 +184,22 @@ void __init llc_coloring_init(void)
for ( i = 0; i < max_nr_colors; i++ )
default_colors[i] = i;
+ if ( !xen_num_colors )
+ {
+ unsigned int i;
+
+ xen_num_colors = MIN(XEN_DEFAULT_NUM_COLORS, max_nr_colors);
+
+ printk(XENLOG_WARNING
+ "Xen LLC color config not found. Using first %u colors\n",
+ xen_num_colors);
+ for ( i = 0; i < xen_num_colors; i++ )
+ xen_colors[i] = i;
+ }
+ else if ( xen_num_colors > max_nr_colors ||
+ !check_colors(xen_colors, xen_num_colors) )
+ panic("Bad LLC color config for Xen\n");
+
arch_llc_coloring_init();
}
@@ -183,6 +210,8 @@ void dump_llc_coloring_info(void)
printk("LLC coloring info:\n");
printk(" Number of LLC colors supported: %u\n", max_nr_colors);
+ printk(" Xen LLC colors (%u): ", xen_num_colors);
+ print_colors(xen_colors, xen_num_colors);
}
void domain_dump_llc_colors(const struct domain *d)
--
2.43.0
* [PATCH v11 11/12] xen/arm: make consider_modules() available for xen relocation
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (9 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 10/12] xen/arm: add Xen cache colors command line parameter Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 16:59 ` [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image Carlo Nonato
11 siblings, 0 replies; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk
Cache coloring must physically relocate Xen in order to color the hypervisor
and consider_modules() is a key function that is needed to find a new
available physical address.
672d67f339c0 ("xen/arm: Split MMU-specific setup_mm() and related code out")
moved consider_modules() under arm32. Move it to mmu/setup.c and make it
non-static so that it can be used outside.
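For readers unfamiliar with the function, its first step is to align the candidate range and reject it if the aligned range is too small. The sketch below mirrors only that step (the `s`/`e` rounding and the size check in the moved code). It assumes `align` is a power of two and a 64-bit `paddr_t`; the helper name `aligned_range_fits` is hypothetical and does not exist in Xen.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t; /* assumption: 64-bit physical addresses */

/*
 * Round s up and e down to 'align' (a power of two), then check that at
 * least 'size' bytes remain. On success the aligned bounds are returned
 * through out_s and out_e.
 */
static bool aligned_range_fits(paddr_t s, paddr_t e, uint32_t size,
                               paddr_t align, paddr_t *out_s, paddr_t *out_e)
{
    s = (s + align - 1) & ~(align - 1);
    e = e & ~(align - 1);

    if ( s > e || e - s < size )
        return false;

    *out_s = s;
    *out_e = e;

    return true;
}
```

The real function then recurses around every boot module, reserved-memory bank and shared-memory bank that overlaps the range, which this sketch leaves out.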
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
---
v11:
- removed useless #include
v10:
- no changes
v9:
- no changes
v8:
- patch adapted to new changes to consider_modules()
v7:
- moved consider_modules() to arm/mmu/setup.c
v6:
- new patch
---
xen/arch/arm/arm32/mmu/mm.c | 95 +-------------------------------
xen/arch/arm/include/asm/setup.h | 3 +
xen/arch/arm/mmu/setup.c | 94 +++++++++++++++++++++++++++++++
3 files changed, 98 insertions(+), 94 deletions(-)
diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
index 063611412b..903d946f07 100644
--- a/xen/arch/arm/arm32/mmu/mm.c
+++ b/xen/arch/arm/arm32/mmu/mm.c
@@ -7,6 +7,7 @@
#include <xen/param.h>
#include <xen/pfn.h>
#include <asm/fixmap.h>
+#include <asm/setup.h>
#include <asm/static-memory.h>
#include <asm/static-shmem.h>
@@ -31,100 +32,6 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
directmap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;
}
-/*
- * Returns the end address of the highest region in the range s..e
- * with required size and alignment that does not conflict with the
- * modules from first_mod to nr_modules.
- *
- * For non-recursive callers first_mod should normally be 0 (all
- * modules and Xen itself) or 1 (all modules but not Xen).
- */
-static paddr_t __init consider_modules(paddr_t s, paddr_t e,
- uint32_t size, paddr_t align,
- int first_mod)
-{
- const struct membanks *reserved_mem = bootinfo_get_reserved_mem();
-#ifdef CONFIG_STATIC_SHM
- const struct membanks *shmem = bootinfo_get_shmem();
-#endif
- const struct bootmodules *mi = &bootinfo.modules;
- int i;
- int nr;
-
- s = (s+align-1) & ~(align-1);
- e = e & ~(align-1);
-
- if ( s > e || e - s < size )
- return 0;
-
- /* First check the boot modules */
- for ( i = first_mod; i < mi->nr_mods; i++ )
- {
- paddr_t mod_s = mi->module[i].start;
- paddr_t mod_e = mod_s + mi->module[i].size;
-
- if ( s < mod_e && mod_s < e )
- {
- mod_e = consider_modules(mod_e, e, size, align, i+1);
- if ( mod_e )
- return mod_e;
-
- return consider_modules(s, mod_s, size, align, i+1);
- }
- }
-
- /*
- * i is the current bootmodule we are evaluating, across all
- * possible kinds of bootmodules.
- *
- * When retrieving the corresponding reserved-memory addresses, we
- * need to index the reserved_mem bank starting from 0, and only counting
- * the reserved-memory modules. Hence, we need to use i - nr.
- */
- nr = mi->nr_mods;
- for ( ; i - nr < reserved_mem->nr_banks; i++ )
- {
- paddr_t r_s = reserved_mem->bank[i - nr].start;
- paddr_t r_e = r_s + reserved_mem->bank[i - nr].size;
-
- if ( s < r_e && r_s < e )
- {
- r_e = consider_modules(r_e, e, size, align, i + 1);
- if ( r_e )
- return r_e;
-
- return consider_modules(s, r_s, size, align, i + 1);
- }
- }
-
-#ifdef CONFIG_STATIC_SHM
- nr += reserved_mem->nr_banks;
- for ( ; i - nr < shmem->nr_banks; i++ )
- {
- paddr_t r_s, r_e;
-
- r_s = shmem->bank[i - nr].start;
-
- /* Shared memory banks can contain INVALID_PADDR as start */
- if ( INVALID_PADDR == r_s )
- continue;
-
- r_e = r_s + shmem->bank[i - nr].size;
-
- if ( s < r_e && r_s < e )
- {
- r_e = consider_modules(r_e, e, size, align, i + 1);
- if ( r_e )
- return r_e;
-
- return consider_modules(s, r_s, size, align, i + 1);
- }
- }
-#endif
-
- return e;
-}
-
/*
* Find a contiguous region that fits in the static heap region with
* required size and alignment, and return the end address of the region
diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
index 64c227d171..0c560d141f 100644
--- a/xen/arch/arm/include/asm/setup.h
+++ b/xen/arch/arm/include/asm/setup.h
@@ -89,6 +89,9 @@ struct init_info
unsigned int cpuid;
};
+paddr_t consider_modules(paddr_t s, paddr_t e, uint32_t size, paddr_t align,
+ int first_mod);
+
#endif
/*
* Local variables:
diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c
index 9664e85ee6..196974f3e2 100644
--- a/xen/arch/arm/mmu/setup.c
+++ b/xen/arch/arm/mmu/setup.c
@@ -222,6 +222,100 @@ static void xen_pt_enforce_wnx(void)
flush_xen_tlb_local();
}
+/*
+ * Returns the end address of the highest region in the range s..e
+ * with required size and alignment that does not conflict with the
+ * modules from first_mod to nr_modules.
+ *
+ * For non-recursive callers first_mod should normally be 0 (all
+ * modules and Xen itself) or 1 (all modules but not Xen).
+ */
+paddr_t __init consider_modules(paddr_t s, paddr_t e,
+ uint32_t size, paddr_t align,
+ int first_mod)
+{
+ const struct membanks *reserved_mem = bootinfo_get_reserved_mem();
+#ifdef CONFIG_STATIC_SHM
+ const struct membanks *shmem = bootinfo_get_shmem();
+#endif
+ const struct bootmodules *mi = &bootinfo.modules;
+ int i;
+ int nr;
+
+ s = (s+align-1) & ~(align-1);
+ e = e & ~(align-1);
+
+ if ( s > e || e - s < size )
+ return 0;
+
+ /* First check the boot modules */
+ for ( i = first_mod; i < mi->nr_mods; i++ )
+ {
+ paddr_t mod_s = mi->module[i].start;
+ paddr_t mod_e = mod_s + mi->module[i].size;
+
+ if ( s < mod_e && mod_s < e )
+ {
+ mod_e = consider_modules(mod_e, e, size, align, i+1);
+ if ( mod_e )
+ return mod_e;
+
+ return consider_modules(s, mod_s, size, align, i+1);
+ }
+ }
+
+ /*
+ * i is the current bootmodule we are evaluating, across all
+ * possible kinds of bootmodules.
+ *
+ * When retrieving the corresponding reserved-memory addresses, we
+ * need to index the reserved_mem bank starting from 0, and only counting
+ * the reserved-memory modules. Hence, we need to use i - nr.
+ */
+ nr = mi->nr_mods;
+ for ( ; i - nr < reserved_mem->nr_banks; i++ )
+ {
+ paddr_t r_s = reserved_mem->bank[i - nr].start;
+ paddr_t r_e = r_s + reserved_mem->bank[i - nr].size;
+
+ if ( s < r_e && r_s < e )
+ {
+ r_e = consider_modules(r_e, e, size, align, i + 1);
+ if ( r_e )
+ return r_e;
+
+ return consider_modules(s, r_s, size, align, i + 1);
+ }
+ }
+
+#ifdef CONFIG_STATIC_SHM
+ nr += reserved_mem->nr_banks;
+ for ( ; i - nr < shmem->nr_banks; i++ )
+ {
+ paddr_t r_s, r_e;
+
+ r_s = shmem->bank[i - nr].start;
+
+ /* Shared memory banks can contain INVALID_PADDR as start */
+ if ( INVALID_PADDR == r_s )
+ continue;
+
+ r_e = r_s + shmem->bank[i - nr].size;
+
+ if ( s < r_e && r_s < e )
+ {
+ r_e = consider_modules(r_e, e, size, align, i + 1);
+ if ( r_e )
+ return r_e;
+
+ return consider_modules(s, r_s, size, align, i + 1);
+ }
+ }
+#endif
+
+ return e;
+}
+
/*
* Boot-time pagetable setup.
* Changes here may need matching changes in head.S
--
2.43.0
* [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-02 16:59 [PATCH v11 00/12] Arm cache coloring Carlo Nonato
` (10 preceding siblings ...)
2024-12-02 16:59 ` [PATCH v11 11/12] xen/arm: make consider_modules() available for xen relocation Carlo Nonato
@ 2024-12-02 16:59 ` Carlo Nonato
2024-12-02 21:44 ` Julien Grall
11 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-02 16:59 UTC (permalink / raw)
To: xen-devel
Cc: andrea.bastoni, marco.solieri, Carlo Nonato, Stefano Stabellini,
Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Andrew Cooper, Jan Beulich
The Xen image is relocated to a new colored physical space. Some relocation
functionality must be brought back:
- the virtual address of the new space is taken from 0c18fb76323b
("xen/arm: Remove unused BOOT_RELOC_VIRT_START").
- relocate_xen() and get_xen_paddr() are taken from f60658c6ae47
("xen/arm: Stop relocating Xen").
setup_pagetables() must be adapted for coloring and for relocation. Runtime
page tables are used to map the colored space, but they are also linked in
boot tables so that the new space is temporarily available for relocation.
This implies that Xen protection must happen after the copy.
Finally, since the alternative framework needs to remap the Xen text and
inittext sections, this operation must be done in a coloring-aware way.
The function xen_remap_colored() is introduced for that.
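As a rough illustration of what "coloring-aware" means here: a frame's color is given by the low bits of its MFN (see `mfn_to_color()` earlier in the series), so the colored image occupies only those frames whose color belongs to Xen's set, and the frames in between are skipped. The sketch below is a simplified standalone model under that assumption. `next_colored_mfn()` and `color_in_set()` are hypothetical helpers, not the macro or functions used by this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Color of a frame: low bits of the MFN (max_nr_colors is a power of two). */
static unsigned int mfn_color(uint64_t mfn, unsigned int max_nr_colors)
{
    return (unsigned int)(mfn & (max_nr_colors - 1));
}

/* Linear search of a small color set. */
static bool color_in_set(unsigned int color, const unsigned int *colors,
                         size_t n)
{
    for ( size_t i = 0; i < n; i++ )
        if ( colors[i] == color )
            return true;

    return false;
}

/*
 * First MFN >= mfn whose color is in the set. The set must contain at
 * least one color below max_nr_colors, otherwise the loop never ends.
 */
static uint64_t next_colored_mfn(uint64_t mfn, const unsigned int *colors,
                                 size_t n, unsigned int max_nr_colors)
{
    while ( !color_in_set(mfn_color(mfn, max_nr_colors), colors, n) )
        mfn++;

    return mfn;
}
```

With a single Xen color only one frame in every `max_nr_colors` is usable, which is consistent with `xen_colored_map_size()` reserving the image size multiplied by the number of colors.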
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com> # common
---
v11:
- else if -> if in xen_colored_mfn()
v10:
- no changes
v9:
- patch adapted to changes to setup_pagetables()
v8:
- moved xen_colored_map_size() to arm/llc-coloring.c
v7:
- added BUG_ON() checks to arch_llc_coloring_init() and
create_llc_coloring_mappings()
v6:
- squashed with BOOT_RELOC_VIRT_START patch
- consider_modules() moved to another patch
- removed psci and smpboot code because the new idmap work already handles that
- moved xen_remap_colored() to alternative.c since it's only used there
- removed xen_colored_temp[] in favor of xen_xenmap[] usage for mapping
- use of boot_module_find_by_kind() to remove the need of extra parameter in
setup_pagetables()
- moved get_xen_paddr() to arm/llc-coloring.c since it's only used there
v5:
- FIXME: consider_modules copy pasted since it got moved
v4:
- removed set_value_for_secondary() because it was wrongly cleaning cache
- relocate_xen() now calls switch_ttbr_id()
---
xen/arch/arm/alternative.c | 30 +++++++-
xen/arch/arm/arm64/mmu/head.S | 58 +++++++++++++++-
xen/arch/arm/arm64/mmu/mm.c | 28 ++++++--
xen/arch/arm/include/asm/mmu/layout.h | 3 +
xen/arch/arm/llc-coloring.c | 63 +++++++++++++++++
xen/arch/arm/mmu/setup.c | 99 +++++++++++++++++++++++----
xen/arch/arm/setup.c | 10 ++-
xen/common/llc-coloring.c | 18 +++++
xen/include/xen/llc-coloring.h | 14 ++++
9 files changed, 302 insertions(+), 21 deletions(-)
diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index d99b507093..0fcf4e451d 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -9,6 +9,7 @@
#include <xen/init.h>
#include <xen/types.h>
#include <xen/kernel.h>
+#include <xen/llc-coloring.h>
#include <xen/mm.h>
#include <xen/vmap.h>
#include <xen/smp.h>
@@ -191,6 +192,27 @@ static int __apply_alternatives_multi_stop(void *xenmap)
return 0;
}
+static void __init *xen_remap_colored(mfn_t xen_mfn, paddr_t xen_size)
+{
+ unsigned int i;
+ void *xenmap;
+ mfn_t *xen_colored_mfns, mfn;
+
+ xen_colored_mfns = xmalloc_array(mfn_t, xen_size >> PAGE_SHIFT);
+ if ( !xen_colored_mfns )
+ panic("Can't allocate LLC colored MFNs\n");
+
+ for_each_xen_colored_mfn ( xen_mfn, mfn, i )
+ {
+ xen_colored_mfns[i] = mfn;
+ }
+
+ xenmap = vmap(xen_colored_mfns, xen_size >> PAGE_SHIFT);
+ xfree(xen_colored_mfns);
+
+ return xenmap;
+}
+
/*
* This function should only be called during boot and before CPU0 jump
* into the idle_loop.
@@ -209,8 +231,12 @@ void __init apply_alternatives_all(void)
* The text and inittext section are read-only. So re-map Xen to
* be able to patch the code.
*/
- xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
- VMAP_DEFAULT);
+ if ( llc_coloring_enabled )
+ xenmap = xen_remap_colored(xen_mfn, xen_size);
+ else
+ xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
+ VMAP_DEFAULT);
+
/* Re-mapping Xen is not expected to fail during boot. */
BUG_ON(!xenmap);
diff --git a/xen/arch/arm/arm64/mmu/head.S b/xen/arch/arm/arm64/mmu/head.S
index 665a51a337..a1fc9a82f1 100644
--- a/xen/arch/arm/arm64/mmu/head.S
+++ b/xen/arch/arm/arm64/mmu/head.S
@@ -428,6 +428,61 @@ FUNC_LOCAL(fail)
b 1b
END(fail)
+/*
+ * Copy Xen to new location and switch TTBR
+ * x0 ttbr
+ * x1 source address
+ * x2 destination address
+ * x3 length
+ *
+ * Source and destination must be word aligned, length is rounded up
+ * to a 16 byte boundary.
+ *
+ * MUST BE VERY CAREFUL when saving things to RAM over the copy
+ */
+ENTRY(relocate_xen)
+ /*
+ * Copy 16 bytes at a time using:
+ * x9: counter
+ * x10: data
+ * x11: data
+ * x12: source
+ * x13: destination
+ */
+ mov x9, x3
+ mov x12, x1
+ mov x13, x2
+
+1: ldp x10, x11, [x12], #16
+ stp x10, x11, [x13], #16
+
+ subs x9, x9, #16
+ bgt 1b
+
+ /*
+ * Flush destination from dcache using:
+ * x9: counter
+ * x10: step
+ * x11: vaddr
+ *
+ * This is to ensure data is visible to the instruction cache
+ */
+ dsb sy
+
+ mov x9, x3
+ ldr x10, =dcache_line_bytes /* x10 := step */
+ ldr x10, [x10]
+ mov x11, x2
+
+1: dc cvac, x11
+
+ add x11, x11, x10
+ subs x9, x9, x10
+ bgt 1b
+
+ /* No need for dsb/isb because they are already done in switch_ttbr_id */
+ b switch_ttbr_id
+
/*
* Switch TTBR
*
@@ -453,7 +508,8 @@ FUNC(switch_ttbr_id)
/*
* 5) Flush I-cache
- * This should not be necessary but it is kept for safety.
+ * This should not be necessary in the general case, but it's needed
+ * for cache coloring because code is relocated in that case.
*/
ic iallu
isb
diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
index 671eaadbc1..3732d5897e 100644
--- a/xen/arch/arm/arm64/mmu/mm.c
+++ b/xen/arch/arm/arm64/mmu/mm.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/init.h>
+#include <xen/llc-coloring.h>
#include <xen/mm.h>
#include <xen/pfn.h>
@@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
}
extern void switch_ttbr_id(uint64_t ttbr);
+extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
typedef void (switch_ttbr_fn)(uint64_t ttbr);
+typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
void __init switch_ttbr(uint64_t ttbr)
{
- vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
- switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+ vaddr_t vaddr, id_addr;
lpae_t pte;
+ if ( llc_coloring_enabled )
+ vaddr = (vaddr_t)relocate_xen;
+ else
+ vaddr = (vaddr_t)switch_ttbr_id;
+
+ id_addr = virt_to_maddr(vaddr);
+
/* Enable the identity mapping in the boot page tables */
update_identity_mapping(true);
/* Enable the identity mapping in the runtime page tables */
- pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
+ pte = pte_of_xenaddr(vaddr);
pte.pt.table = 1;
pte.pt.xn = 0;
pte.pt.ro = 1;
write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
/* Switch TTBR */
- fn(ttbr);
+ if ( llc_coloring_enabled )
+ {
+ relocate_xen_fn *fn = (relocate_xen_fn *)id_addr;
+
+ fn(ttbr, _start, (void *)BOOT_RELOC_VIRT_START, _end - _start);
+ }
+ else
+ {
+ switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+
+ fn(ttbr);
+ }
/*
* Disable the identity mapping in the runtime page tables.
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index a3b546465b..19c0ec63a5 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -30,6 +30,7 @@
* 10M - 12M Fixmap: special-purpose 4K mapping slots
* 12M - 16M Early boot mapping of FDT
* 16M - 18M Livepatch vmap (if compiled in)
+ * 16M - 24M Cache-colored Xen text, data, bss (temporary, if compiled in)
*
* 1G - 2G VMAP: ioremap and early_ioremap
*
@@ -74,6 +75,8 @@
#define BOOT_FDT_VIRT_START (FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE)
#define BOOT_FDT_VIRT_SIZE _AT(vaddr_t, MB(4))
+#define BOOT_RELOC_VIRT_START (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
+
#ifdef CONFIG_LIVEPATCH
#define LIVEPATCH_VMAP_START (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
#define LIVEPATCH_VMAP_SIZE _AT(vaddr_t, MB(2))
diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
index 6c8fa6b576..8e10a505db 100644
--- a/xen/arch/arm/llc-coloring.c
+++ b/xen/arch/arm/llc-coloring.c
@@ -10,6 +10,7 @@
#include <xen/types.h>
#include <asm/processor.h>
+#include <asm/setup.h>
#include <asm/sysregs.h>
/* Return the LLC way size by probing the hardware */
@@ -64,8 +65,70 @@ unsigned int __init get_llc_way_size(void)
return line_size * num_sets;
}
+/**
+ * get_xen_paddr - get physical address to relocate Xen to
+ *
+ * Xen is relocated to as near to the top of RAM as possible and
+ * aligned to a XEN_PADDR_ALIGN boundary.
+ */
+static paddr_t __init get_xen_paddr(paddr_t xen_size)
+{
+ const struct membanks *mem = bootinfo_get_mem();
+ paddr_t min_size, paddr = 0;
+ unsigned int i;
+
+ min_size = (xen_size + (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1);
+
+ /* Find the highest bank with enough space. */
+ for ( i = 0; i < mem->nr_banks; i++ )
+ {
+ const struct membank *bank = &mem->bank[i];
+ paddr_t s, e;
+
+ if ( bank->size >= min_size )
+ {
+ e = consider_modules(bank->start, bank->start + bank->size,
+ min_size, XEN_PADDR_ALIGN, 0);
+ if ( !e )
+ continue;
+
+#ifdef CONFIG_ARM_32
+ /* Xen must be under 4GB */
+ if ( e > GB(4) )
+ e = GB(4);
+ if ( e < bank->start )
+ continue;
+#endif
+
+ s = e - min_size;
+
+ if ( s > paddr )
+ paddr = s;
+ }
+ }
+
+ if ( !paddr )
+ panic("Not enough memory to relocate Xen\n");
+
+ printk("Placing Xen at 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+ paddr, paddr + min_size);
+
+ return paddr;
+}
+
+static paddr_t __init xen_colored_map_size(void)
+{
+ return ROUNDUP((_end - _start) * get_max_nr_llc_colors(), XEN_PADDR_ALIGN);
+}
+
void __init arch_llc_coloring_init(void)
{
+ struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
+
+ BUG_ON(!xen_bootmodule);
+
+ xen_bootmodule->size = xen_colored_map_size();
+ xen_bootmodule->start = get_xen_paddr(xen_bootmodule->size);
}
/*
diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c
index 196974f3e2..1d8d7eb70c 100644
--- a/xen/arch/arm/mmu/setup.c
+++ b/xen/arch/arm/mmu/setup.c
@@ -7,6 +7,7 @@
#include <xen/init.h>
#include <xen/libfdt/libfdt.h>
+#include <xen/llc-coloring.h>
#include <xen/sections.h>
#include <xen/sizes.h>
#include <xen/vmap.h>
@@ -20,6 +21,9 @@
#undef virt_to_mfn
#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
+#define virt_to_reloc_virt(virt) \
+ (((vaddr_t)virt) - XEN_VIRT_START + BOOT_RELOC_VIRT_START)
+
/* Main runtime page tables */
/*
@@ -69,6 +73,7 @@ static void __init __maybe_unused build_assertions(void)
/* 2MB aligned regions */
BUILD_BUG_ON(XEN_VIRT_START & ~SECOND_MASK);
BUILD_BUG_ON(FIXMAP_ADDR(0) & ~SECOND_MASK);
+ BUILD_BUG_ON(BOOT_RELOC_VIRT_START & ~SECOND_MASK);
/* 1GB aligned regions */
#ifdef CONFIG_ARM_32
BUILD_BUG_ON(XENHEAP_VIRT_START & ~FIRST_MASK);
@@ -138,6 +143,9 @@ static void __init __maybe_unused build_assertions(void)
lpae_t __init pte_of_xenaddr(vaddr_t va)
{
+ if ( llc_coloring_enabled )
+ va = virt_to_reloc_virt(va);
+
return mfn_to_xen_entry(virt_to_mfn(va), MT_NORMAL);
}
@@ -316,9 +324,44 @@ paddr_t __init consider_modules(paddr_t s, paddr_t e,
return e;
}
+static void __init create_llc_coloring_mappings(void)
+{
+ lpae_t pte;
+ unsigned int i;
+ struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
+ mfn_t start_mfn = maddr_to_mfn(xen_bootmodule->start), mfn;
+
+ for_each_xen_colored_mfn ( start_mfn, mfn, i )
+ {
+ pte = mfn_to_xen_entry(mfn, MT_NORMAL);
+ pte.pt.table = 1; /* level 3 mappings always have this bit set */
+ xen_xenmap[i] = pte;
+ }
+
+ for ( i = 0; i < XEN_NR_ENTRIES(2); i++ )
+ {
+ vaddr_t va = BOOT_RELOC_VIRT_START + (i << XEN_PT_LEVEL_SHIFT(2));
+
+ pte = mfn_to_xen_entry(virt_to_mfn(xen_xenmap +
+ i * XEN_PT_LPAE_ENTRIES),
+ MT_NORMAL);
+ pte.pt.table = 1;
+ write_pte(&boot_second[second_table_offset(va)], pte);
+ }
+}
+
/*
- * Boot-time pagetable setup.
+ * Boot-time pagetable setup with coloring support
* Changes here may need matching changes in head.S
+ *
+ * The cache coloring support consists of:
+ * - Create colored mapping that conforms to Xen color selection in xen_xenmap[]
+ * - Link the mapping in boot page tables using BOOT_RELOC_VIRT_START as vaddr
+ * - pte_of_xenaddr() takes care of translating addresses to the new space
+ * during runtime page tables creation
+ * - Relocate xen and update TTBR with the new address in the colored space
+ * (see switch_ttbr())
+ * - Protect the new space
*/
void __init setup_pagetables(void)
{
@@ -326,6 +369,9 @@ void __init setup_pagetables(void)
lpae_t pte, *p;
int i;
+ if ( llc_coloring_enabled )
+ create_llc_coloring_mappings();
+
arch_setup_page_tables();
#ifdef CONFIG_ARM_64
@@ -353,13 +399,7 @@ void __init setup_pagetables(void)
break;
pte = pte_of_xenaddr(va);
pte.pt.table = 1; /* third level mappings always have this bit set */
- if ( is_kernel_text(va) || is_kernel_inittext(va) )
- {
- pte.pt.xn = 0;
- pte.pt.ro = 1;
- }
- if ( is_kernel_rodata(va) )
- pte.pt.ro = 1;
+ pte.pt.xn = 0; /* Permissions will be enforced later. Allow execution */
xen_xenmap[i] = pte;
}
@@ -385,13 +425,48 @@ void __init setup_pagetables(void)
ttbr = virt_to_maddr(cpu0_pgtable);
#endif
- switch_ttbr(ttbr);
-
- xen_pt_enforce_wnx();
-
#ifdef CONFIG_ARM_32
per_cpu(xen_pgtable, 0) = cpu0_pgtable;
#endif
+
+ if ( llc_coloring_enabled )
+ ttbr = virt_to_maddr(virt_to_reloc_virt(THIS_CPU_PGTABLE));
+
+ switch_ttbr(ttbr);
+
+ /* Protect Xen */
+ for ( i = 0; i < XEN_NR_ENTRIES(3); i++ )
+ {
+ vaddr_t va = XEN_VIRT_START + (i << PAGE_SHIFT);
+ lpae_t *entry = xen_xenmap + i;
+
+ if ( !is_kernel(va) )
+ break;
+
+ pte = read_atomic(entry);
+
+ if ( is_kernel_text(va) || is_kernel_inittext(va) )
+ {
+ pte.pt.xn = 0;
+ pte.pt.ro = 1;
+ } else if ( is_kernel_rodata(va) ) {
+ pte.pt.ro = 1;
+ pte.pt.xn = 1;
+ } else {
+ pte.pt.xn = 1;
+ pte.pt.ro = 0;
+ }
+
+ write_pte(entry, pte);
+ }
+
+ /*
+ * We modified live page-tables. Ensure the TLBs are invalidated
+ * before setting enforcing the WnX permissions.
+ */
+ flush_xen_tlb_local();
+
+ xen_pt_enforce_wnx();
}
void *__init arch_vmap_virt_end(void)
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 568a49b274..5e2c519ce8 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -304,8 +304,6 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
/* Initialize traps early allow us to get backtrace when an error occurred */
init_traps();
- setup_pagetables();
-
smp_clear_cpu_maps();
device_tree_flattened = early_fdt_map(fdt_paddr);
@@ -329,6 +327,14 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)
llc_coloring_init();
+ /*
+ * Page tables must be setup after LLC coloring initialization because
+ * coloring info are required in order to create colored mappings
+ */
+ setup_pagetables();
+ /* Device-tree was mapped in boot page tables, remap it in the new tables */
+ device_tree_flattened = early_fdt_map(fdt_paddr);
+
setup_mm();
vm_init();
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 2e7c0f505d..2d6aed5fb4 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -38,6 +38,8 @@ static unsigned int __ro_after_init xen_num_colors;
#define mfn_color_mask (max_nr_colors - 1)
#define mfn_to_color(mfn) (mfn_x(mfn) & mfn_color_mask)
+#define get_mfn_with_color(mfn, color) \
+ (_mfn((mfn_x(mfn) & ~mfn_color_mask) | (color)))
/*
* Parse the coloring configuration given in the buf string, following the
@@ -359,6 +361,22 @@ unsigned int get_max_nr_llc_colors(void)
return max_nr_colors;
}
+mfn_t __init xen_colored_mfn(mfn_t mfn)
+{
+ unsigned int i, color = mfn_to_color(mfn);
+
+ for ( i = 0; i < xen_num_colors; i++ )
+ {
+ if ( color == xen_colors[i] )
+ return mfn;
+ if ( color < xen_colors[i] )
+ return get_mfn_with_color(mfn, xen_colors[i]);
+ }
+
+ /* Jump to next color space (max_nr_colors mfns) and use the first color */
+ return get_mfn_with_color(mfn_add(mfn, max_nr_colors), xen_colors[0]);
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index b3f2fa22bc..4f10a5310f 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -8,6 +8,8 @@
#ifndef __XEN_LLC_COLORING_H__
#define __XEN_LLC_COLORING_H__
+#include <xen/mm-frame.h>
+
struct domain;
struct page_info;
struct xen_domctl_set_llc_colors;
@@ -28,6 +30,17 @@ static inline void domain_dump_llc_colors(const struct domain *d) {}
static inline void domain_llc_coloring_free(struct domain *d) {}
#endif
+/**
+ * Iterate over each Xen mfn in the colored space.
+ * @start_mfn: the first mfn that needs to be colored.
+ * @mfn: the current mfn.
+ * @i: loop index.
+ */
+#define for_each_xen_colored_mfn(start_mfn, mfn, i) \
+ for ( i = 0, mfn = xen_colored_mfn(start_mfn); \
+ i < (_end - _start) >> PAGE_SHIFT; \
+ i++, mfn = xen_colored_mfn(mfn_add(mfn, 1)) )
+
unsigned int get_llc_way_size(void);
void arch_llc_coloring_init(void);
int dom0_set_llc_colors(struct domain *d);
@@ -36,6 +49,7 @@ int domain_set_llc_colors(struct domain *d,
int domain_set_llc_colors_from_str(struct domain *d, const char *str);
unsigned int page_to_llc_color(const struct page_info *pg);
unsigned int get_max_nr_llc_colors(void);
+mfn_t xen_colored_mfn(mfn_t mfn);
#endif /* __XEN_LLC_COLORING_H__ */
--
2.43.0
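
For readers following the color selection done by xen_colored_mfn() in the
patch above, here is a minimal standalone sketch of the same search. It is a
toy model, not the patch's code: frames are plain integers instead of mfn_t,
and next_colored_frame() with its parameters is a name invented for this
example. It assumes the number of colors is a power of two and that the color
array is sorted in ascending order with at least one entry.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the colored-frame search (hypothetical helper, not Xen code).
 * Returns the first frame >= 'frame' whose color is in colors[].
 * Assumes max_nr_colors is a power of two and colors[] is sorted ascending.
 */
static unsigned long next_colored_frame(unsigned long frame,
                                        const unsigned int *colors,
                                        size_t nr_colors,
                                        unsigned long max_nr_colors)
{
    unsigned long mask = max_nr_colors - 1;
    unsigned long color = frame & mask;   /* low bits select the color */
    size_t i;

    assert(nr_colors > 0);
    assert(max_nr_colors != 0 && (max_nr_colors & mask) == 0);

    for ( i = 0; i < nr_colors; i++ )
    {
        /* The frame already has an allowed color: keep it. */
        if ( color == colors[i] )
            return frame;
        /* Otherwise move up to the next allowed color in this color space. */
        if ( color < colors[i] )
            return (frame & ~mask) | colors[i];
    }

    /* No allowed color left here: jump to the next color space. */
    return ((frame + max_nr_colors) & ~mask) | colors[0];
}
```

With four colors and an allowed set of { 1, 3 }, frames 0, 1, 2, 3 and 4 map
to 1, 1, 3, 3 and 5 respectively, which is the stride the
for_each_xen_colored_mfn() iterator relies on.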
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-02 16:59 ` [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image Carlo Nonato
@ 2024-12-02 21:44 ` Julien Grall
2024-12-03 10:08 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Julien Grall @ 2024-12-02 21:44 UTC (permalink / raw)
To: Carlo Nonato, xen-devel
Cc: andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Jan Beulich
Hi Carlo,
I appreciate this is v11. So I will try to keep the comments to only
what I consider important to fix and some coding style issues.
On 02/12/2024 16:59, Carlo Nonato wrote:
> ---
> xen/arch/arm/alternative.c | 30 +++++++-
> xen/arch/arm/arm64/mmu/head.S | 58 +++++++++++++++-
> xen/arch/arm/arm64/mmu/mm.c | 28 ++++++--
> xen/arch/arm/include/asm/mmu/layout.h | 3 +
> xen/arch/arm/llc-coloring.c | 63 +++++++++++++++++
> xen/arch/arm/mmu/setup.c | 99 +++++++++++++++++++++++----
> xen/arch/arm/setup.c | 10 ++-
> xen/common/llc-coloring.c | 18 +++++
> xen/include/xen/llc-coloring.h | 14 ++++
> 9 files changed, 302 insertions(+), 21 deletions(-)
>
> diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
> index d99b507093..0fcf4e451d 100644
> --- a/xen/arch/arm/alternative.c
> +++ b/xen/arch/arm/alternative.c
> @@ -9,6 +9,7 @@
> #include <xen/init.h>
> #include <xen/types.h>
> #include <xen/kernel.h>
> +#include <xen/llc-coloring.h>
> #include <xen/mm.h>
> #include <xen/vmap.h>
> #include <xen/smp.h>
> @@ -191,6 +192,27 @@ static int __apply_alternatives_multi_stop(void *xenmap)
> return 0;
> }
>
> +static void __init *xen_remap_colored(mfn_t xen_mfn, paddr_t xen_size)
> +{
> + unsigned int i;
> + void *xenmap;
> + mfn_t *xen_colored_mfns, mfn;
> +
> + xen_colored_mfns = xmalloc_array(mfn_t, xen_size >> PAGE_SHIFT);
> + if ( !xen_colored_mfns )
> + panic("Can't allocate LLC colored MFNs\n");
> +
> + for_each_xen_colored_mfn ( xen_mfn, mfn, i )
> + {
> + xen_colored_mfns[i] = mfn;
> + }
NIT: The braces should not be necessary.
> +
> + xenmap = vmap(xen_colored_mfns, xen_size >> PAGE_SHIFT);
> + xfree(xen_colored_mfns);
> +
> + return xenmap;
> +}
> +
> /*
> * This function should only be called during boot and before CPU0 jump
> * into the idle_loop.
> @@ -209,8 +231,12 @@ void __init apply_alternatives_all(void)
> * The text and inittext section are read-only. So re-map Xen to
> * be able to patch the code.
> */
> - xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
> - VMAP_DEFAULT);
> + if ( llc_coloring_enabled )
> + xenmap = xen_remap_colored(xen_mfn, xen_size);
> + else
> + xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
> + VMAP_DEFAULT);
> +
> /* Re-mapping Xen is not expected to fail during boot. */
> BUG_ON(!xenmap);
>
> diff --git a/xen/arch/arm/arm64/mmu/head.S b/xen/arch/arm/arm64/mmu/head.S
> index 665a51a337..a1fc9a82f1 100644
> --- a/xen/arch/arm/arm64/mmu/head.S
> +++ b/xen/arch/arm/arm64/mmu/head.S
> @@ -428,6 +428,61 @@ FUNC_LOCAL(fail)
> b 1b
> END(fail)
>
> +/*
> + * Copy Xen to new location and switch TTBR
> + * x0 ttbr
> + * x1 source address
> + * x2 destination address
> + * x3 length
> + *
> + * Source and destination must be word aligned, length is rounded up
> + * to a 16 byte boundary.
> + *
> + * MUST BE VERY CAREFUL when saving things to RAM over the copy
> + */
> +ENTRY(relocate_xen)
We are trying to get rid of ENTRY. Instead, please use FUNC().
> + /*
> + * Copy 16 bytes at a time using:
> + * x9: counter
> + * x10: data
> + * x11: data
> + * x12: source
> + * x13: destination
> + */
> + mov x9, x3
> + mov x12, x1
> + mov x13, x2
> +
> +1: ldp x10, x11, [x12], #16
> + stp x10, x11, [x13], #16
> +
> + subs x9, x9, #16
> + bgt 1b
> +
> + /*
> + * Flush destination from dcache using:
> + * x9: counter
> + * x10: step
> + * x11: vaddr
> + *
> + * This is to ensure data is visible to the instruction cache
> + */
The comment implies that the only reason we need to flush the cache is
to ensure the data and instruction caches are coherent. But...
> + dsb sy
> +
> + mov x9, x3
> + ldr x10, =dcache_line_bytes /* x10 := step */
> + ldr x10, [x10]
> + mov x11, x2
> +
> +1: dc cvac, x11
... here you use Point of Coherency. Point of Unification should be
sufficient here. This is boot code, so I am not against having stricter
cache maintenance. But it would be good to clarify.
> +
> + add x11, x11, x10
> + subs x9, x9, x10
> + bgt 1b
> +
> + /* No need for dsb/isb because they are alredy done in switch_ttbr_id */
> + b switch_ttbr_id
> +
> /*
> * Switch TTBR
> *
> @@ -453,7 +508,8 @@ FUNC(switch_ttbr_id)
>
> /*
> * 5) Flush I-cache
> - * This should not be necessary but it is kept for safety.
> + * This should not be necessary in the general case, but it's needed
> + * for cache coloring because code is relocated in that case.
> */
> ic iallu
> isb
> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> index 671eaadbc1..3732d5897e 100644
> --- a/xen/arch/arm/arm64/mmu/mm.c
> +++ b/xen/arch/arm/arm64/mmu/mm.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
>
> #include <xen/init.h>
> +#include <xen/llc-coloring.h>
> #include <xen/mm.h>
> #include <xen/pfn.h>
>
> @@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
> }
>
> extern void switch_ttbr_id(uint64_t ttbr);
> +extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
>
> typedef void (switch_ttbr_fn)(uint64_t ttbr);
> +typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
>
> void __init switch_ttbr(uint64_t ttbr)
Given the change below, I think this function needs to be renamed.
Possibly to relocate_and_jump() with a comment explaining that the
relocation only happens for cache-coloring.
> {
> - vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
> - switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> + vaddr_t vaddr, id_addr;
> lpae_t pte;
>
> + if ( llc_coloring_enabled )
> + vaddr = (vaddr_t)relocate_xen;
> + else
> + vaddr = (vaddr_t)switch_ttbr_id;
> +
> + id_addr = virt_to_maddr(vaddr);
> +
> /* Enable the identity mapping in the boot page tables */
> update_identity_mapping(true);
>
> /* Enable the identity mapping in the runtime page tables */
> - pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
> + pte = pte_of_xenaddr(vaddr);
> pte.pt.table = 1;
> pte.pt.xn = 0;
> pte.pt.ro = 1;
> write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
>
> /* Switch TTBR */
This comment needs to be updated.
> - fn(ttbr);
> + if ( llc_coloring_enabled )
> + {
> + relocate_xen_fn *fn = (relocate_xen_fn *)id_addr;
> +
> + fn(ttbr, _start, (void *)BOOT_RELOC_VIRT_START, _end - _start);
> + }
> + else
> + {
> + switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> +
> + fn(ttbr);
> + }
>
> /*
> * Disable the identity mapping in the runtime page tables.
> diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
> index a3b546465b..19c0ec63a5 100644
> --- a/xen/arch/arm/include/asm/mmu/layout.h
> +++ b/xen/arch/arm/include/asm/mmu/layout.h
> @@ -30,6 +30,7 @@
> * 10M - 12M Fixmap: special-purpose 4K mapping slots
> * 12M - 16M Early boot mapping of FDT
> * 16M - 18M Livepatch vmap (if compiled in)
> + * 16M - 24M Cache-colored Xen text, data, bss (temporary, if compiled in)
> *
> * 1G - 2G VMAP: ioremap and early_ioremap
> *
> @@ -74,6 +75,8 @@
> #define BOOT_FDT_VIRT_START (FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE)
> #define BOOT_FDT_VIRT_SIZE _AT(vaddr_t, MB(4))
>
> +#define BOOT_RELOC_VIRT_START (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
> +
> #ifdef CONFIG_LIVEPATCH
> #define LIVEPATCH_VMAP_START (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
> #define LIVEPATCH_VMAP_SIZE _AT(vaddr_t, MB(2))
> diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
> index 6c8fa6b576..8e10a505db 100644
> --- a/xen/arch/arm/llc-coloring.c
> +++ b/xen/arch/arm/llc-coloring.c
> @@ -10,6 +10,7 @@
> #include <xen/types.h>
>
> #include <asm/processor.h>
> +#include <asm/setup.h>
> #include <asm/sysregs.h>
>
> /* Return the LLC way size by probing the hardware */
> @@ -64,8 +65,70 @@ unsigned int __init get_llc_way_size(void)
> return line_size * num_sets;
> }
>
> +/**
> + * get_xen_paddr - get physical address to relocate Xen to
> + *
> + * Xen is relocated to as near to the top of RAM as possible and
> + * aligned to a XEN_PADDR_ALIGN boundary.
> + */
> +static paddr_t __init get_xen_paddr(paddr_t xen_size)
> +{
> + const struct membanks *mem = bootinfo_get_mem();
> + paddr_t min_size, paddr = 0;
> + unsigned int i;
> +
> + min_size = (xen_size + (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1);
Style: Missing space before *and* after '-' in both cases.
But effectively, this is an open-coded version of ROUNDUP().
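
To illustrate the equivalence noted above, here is a minimal standalone
sketch. It assumes the alignment is a power of two. SKETCH_ROUNDUP,
open_coded_align() and macro_align() are names invented for this example,
and the macro only mirrors the common round-up idiom rather than quoting
Xen's ROUNDUP() definition, which should be checked in the tree.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;   /* stand-in for Xen's paddr_t */

/* Common power-of-two round-up idiom (assumed, not copied from Xen). */
#define SKETCH_ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* The expression as written in get_xen_paddr(), with the spacing fixed. */
static paddr_t open_coded_align(paddr_t size, paddr_t align)
{
    /* The mask trick is only valid for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);

    return (size + (align - 1)) & ~(align - 1);
}

/* The same computation expressed through the macro. */
static paddr_t macro_align(paddr_t size, paddr_t align)
{
    return SKETCH_ROUNDUP(size, align);
}
```

Both functions return the same value for any power-of-two alignment, so
using the macro keeps the intent visible and avoids the spacing issue
altogether.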
> +
> + /* Find the highest bank with enough space. */
> + for ( i = 0; i < mem->nr_banks; i++ )
> + {
> + const struct membank *bank = &mem->bank[i];
> + paddr_t s, e;
> +
> + if ( bank->size >= min_size )
> + {
> + e = consider_modules(bank->start, bank->start + bank->size,
> + min_size, XEN_PADDR_ALIGN, 0);
> + if ( !e )
> + continue;
> +
> +#ifdef CONFIG_ARM_32
> + /* Xen must be under 4GB */
> + if ( e > GB(4) )
> + e = GB(4);
> + if ( e < bank->start )
> + continue;
> +#endif
> +
> + s = e - min_size;
> +
> + if ( s > paddr )
> + paddr = s;
> + }
> + }
> +
> + if ( !paddr )
> + panic("Not enough memory to relocate Xen\n");
> +
> + printk("Placing Xen at 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> + paddr, paddr + min_size);
> +
> + return paddr;
> +}
> +
[...]
> +static void __init create_llc_coloring_mappings(void)
> +{
> + lpae_t pte;
> + unsigned int i;
> + struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
> + mfn_t start_mfn = maddr_to_mfn(xen_bootmodule->start), mfn;
> +
> + for_each_xen_colored_mfn ( start_mfn, mfn, i )
> + {
> + pte = mfn_to_xen_entry(mfn, MT_NORMAL);
> + pte.pt.table = 1; /* level 3 mappings always have this bit set */
> + xen_xenmap[i] = pte;
> + }
> +
> + for ( i = 0; i < XEN_NR_ENTRIES(2); i++ )
> + {
> + vaddr_t va = BOOT_RELOC_VIRT_START + (i << XEN_PT_LEVEL_SHIFT(2));
> +
> + pte = mfn_to_xen_entry(virt_to_mfn(xen_xenmap +
> + i * XEN_PT_LPAE_ENTRIES),
> + MT_NORMAL);
> + pte.pt.table = 1;
> + write_pte(&boot_second[second_table_offset(va)], pte);
> + }
> +}
> +
> /*
> - * Boot-time pagetable setup.
> + * Boot-time pagetable setup with coloring support
I am a bit confused by this change. I agree you added support for
cache coloring, but the code is still doing the same thing: preparing
the page-tables regardless of whether cache coloring is enabled or not.
So I would say this update is not warranted.
> * Changes here may need matching changes in head.S
> + *
> + * The cache coloring support consists of:
> + * - Create colored mapping that conforms to Xen color selection in xen_xenmap[]
> + * - Link the mapping in boot page tables using BOOT_RELOC_VIRT_START as vaddr
> + * - pte_of_xenaddr() takes care of translating addresses to the new space
> + * during runtime page tables creation
> + * - Relocate xen and update TTBR with the new address in the colored space
> + * (see switch_ttbr())
> + * - Protect the new space
Similarly here. Most of what is written has nothing to do with cache
coloring. So I think this comment needs to be made a bit more generic.
> */
> void __init setup_pagetables(void)
> {
> @@ -326,6 +369,9 @@ void __init setup_pagetables(void)
> lpae_t pte, *p;
> int i;
>
> + if ( llc_coloring_enabled )
> + create_llc_coloring_mappings();
> +
> arch_setup_page_tables();
>
> #ifdef CONFIG_ARM_64
> @@ -353,13 +399,7 @@ void __init setup_pagetables(void)
> break;
> pte = pte_of_xenaddr(va);
> pte.pt.table = 1; /* third level mappings always have this bit set */
> - if ( is_kernel_text(va) || is_kernel_inittext(va) )
> - {
> - pte.pt.xn = 0;
> - pte.pt.ro = 1;
> - }
> - if ( is_kernel_rodata(va) )
> - pte.pt.ro = 1;
> + pte.pt.xn = 0; /* Permissions will be enforced later. Allow execution */
> xen_xenmap[i] = pte;
> }
>
> @@ -385,13 +425,48 @@ void __init setup_pagetables(void)
> ttbr = virt_to_maddr(cpu0_pgtable);
> #endif
>
> - switch_ttbr(ttbr);
> -
> - xen_pt_enforce_wnx();
> -
> #ifdef CONFIG_ARM_32
> per_cpu(xen_pgtable, 0) = cpu0_pgtable;
> #endif
> +
> + if ( llc_coloring_enabled )
> + ttbr = virt_to_maddr(virt_to_reloc_virt(THIS_CPU_PGTABLE));
The logic is a bit difficult to understand. You first update ttbr above:
ttbr = virt_to_maddr(cpu0_pgtable);
But then overwrite it for cache coloring. virt_to_maddr() is also not a
trivial function.
So I think it would be better to write the following:
#ifdef CONFIG_ARM_32
    per_cpu(xen_pgtable, 0) = cpu0_pgtable;
#endif

    if ( llc_coloring_enabled )
        ttbr = virt_to_maddr(virt_to_reloc_virt(...));
    else
        ttbr = virt_to_maddr(THIS_CPU_PGTABLE);
> +
> + switch_ttbr(ttbr);
> +
> + /* Protect Xen */
> + for ( i = 0; i < XEN_NR_ENTRIES(3); i++ )
> + {
> + vaddr_t va = XEN_VIRT_START + (i << PAGE_SHIFT);
> + lpae_t *entry = xen_xenmap + i;
> +
> + if ( !is_kernel(va) )
> + break;
> +
> + pte = read_atomic(entry);
> +
> + if ( is_kernel_text(va) || is_kernel_inittext(va) )
> + {
> + pte.pt.xn = 0;
> + pte.pt.ro = 1;
> + } else if ( is_kernel_rodata(va) ) {
Coding style:
}
else if
{
...
}
else
{
...
}
> + pte.pt.ro = 1;
> + pte.pt.xn = 1;
> + } else {
> + pte.pt.xn = 1;
> + pte.pt.ro = 0;
> + }
> +
> + write_pte(entry, pte);
> + }
> +
> + /*
> + * We modified live page-tables. Ensure the TLBs are invalidated
> + * before setting enforcing the WnX permissions.
> + */
> + flush_xen_tlb_local();
> +
> + xen_pt_enforce_wnx();
> }
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-02 21:44 ` Julien Grall
@ 2024-12-03 10:08 ` Carlo Nonato
2024-12-03 10:36 ` Julien Grall
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-03 10:08 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Jan Beulich
Hi Julien,
On Mon, Dec 2, 2024 at 10:44 PM Julien Grall <julien@xen.org> wrote:
>
> Hi Carlo,
[...]
> > diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> > index 671eaadbc1..3732d5897e 100644
> > --- a/xen/arch/arm/arm64/mmu/mm.c
> > +++ b/xen/arch/arm/arm64/mmu/mm.c
> > @@ -1,6 +1,7 @@
> > /* SPDX-License-Identifier: GPL-2.0-only */
> >
> > #include <xen/init.h>
> > +#include <xen/llc-coloring.h>
> > #include <xen/mm.h>
> > #include <xen/pfn.h>
> >
> > @@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
> > }
> >
> > extern void switch_ttbr_id(uint64_t ttbr);
> > +extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
> >
> > typedef void (switch_ttbr_fn)(uint64_t ttbr);
> > +typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
> >
> > void __init switch_ttbr(uint64_t ttbr)
>
> Given the change below, I think this function needs to be renamed.
> Possibly to relocate_and_jump() with a comment explaining that the
> relocation only happens for cache-coloring.
Changing the name of switch_ttbr() to relocate_and_jump() seems a bit
misleading to me. First I need to change the name also for arm32 where there's
no relocation at all. Second, relocation is something that happens
conditionally so I don't think it's a good name for the function.
[...]
> Cheers,
>
> --
> Julien Grall
Thanks.
- Carlo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-03 10:08 ` Carlo Nonato
@ 2024-12-03 10:36 ` Julien Grall
2024-12-03 11:37 ` Carlo Nonato
0 siblings, 1 reply; 40+ messages in thread
From: Julien Grall @ 2024-12-03 10:36 UTC (permalink / raw)
To: Carlo Nonato
Cc: xen-devel, andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Jan Beulich
On 03/12/2024 10:08, Carlo Nonato wrote:
> Hi Julien,
>
> On Mon, Dec 2, 2024 at 10:44 PM Julien Grall <julien@xen.org> wrote:
>>
>> Hi Carlo,
>
> [...]
>
>>> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
>>> index 671eaadbc1..3732d5897e 100644
>>> --- a/xen/arch/arm/arm64/mmu/mm.c
>>> +++ b/xen/arch/arm/arm64/mmu/mm.c
>>> @@ -1,6 +1,7 @@
>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>
>>> #include <xen/init.h>
>>> +#include <xen/llc-coloring.h>
>>> #include <xen/mm.h>
>>> #include <xen/pfn.h>
>>>
>>> @@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
>>> }
>>>
>>> extern void switch_ttbr_id(uint64_t ttbr);
>>> +extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
>>>
>>> typedef void (switch_ttbr_fn)(uint64_t ttbr);
>>> +typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
>>>
>>> void __init switch_ttbr(uint64_t ttbr)
>>
>> Given the change below, I think this function needs to be renamed.
>> Possibly to relocate_and_jump() with a comment explaining that the
>> relocation only happens for cache-coloring.
>
> Changing the name of switch_ttbr() to relocate_and_jump() seems a bit
> misleading to me. First I need to change the name also for arm32 where there's
> no relocation at all. Second, relocation is something that happens
> conditionally so I don't think it's a good name for the function.
Feel free to propose a new name. The main thing is the current name
can't stay "switch_ttbr()" because you are doing more than switching the
TTBR.
The other solution is to have a separate call for relocating Xen (which
will fall through to switch_ttbr) and another one for callers that only
need to switch the TTBR.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-03 10:36 ` Julien Grall
@ 2024-12-03 11:37 ` Carlo Nonato
2024-12-04 11:18 ` Julien Grall
0 siblings, 1 reply; 40+ messages in thread
From: Carlo Nonato @ 2024-12-03 11:37 UTC (permalink / raw)
To: Julien Grall
Cc: xen-devel, andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Jan Beulich
Hi Julien,
On Tue, Dec 3, 2024 at 11:36 AM Julien Grall <julien@xen.org> wrote:
>
> On 03/12/2024 10:08, Carlo Nonato wrote:
> > Hi Julien,
> >
> > On Mon, Dec 2, 2024 at 10:44 PM Julien Grall <julien@xen.org> wrote:
> >>
> >> Hi Carlo,
> >
> > [...]
> >
> >>> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> >>> index 671eaadbc1..3732d5897e 100644
> >>> --- a/xen/arch/arm/arm64/mmu/mm.c
> >>> +++ b/xen/arch/arm/arm64/mmu/mm.c
> >>> @@ -1,6 +1,7 @@
> >>> /* SPDX-License-Identifier: GPL-2.0-only */
> >>>
> >>> #include <xen/init.h>
> >>> +#include <xen/llc-coloring.h>
> >>> #include <xen/mm.h>
> >>> #include <xen/pfn.h>
> >>>
> >>> @@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
> >>> }
> >>>
> >>> extern void switch_ttbr_id(uint64_t ttbr);
> >>> +extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
> >>>
> >>> typedef void (switch_ttbr_fn)(uint64_t ttbr);
> >>> +typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
> >>>
> >>> void __init switch_ttbr(uint64_t ttbr)
> >>
> >> Given the change below, I think this function needs to be renamed.
> >> Possibly to relocate_and_jump() with a comment explaining that the
> >> relocation only happens for cache-coloring.
> >
> > Changing the name of switch_ttbr() to relocate_and_jump() seems a bit
> > misleading to me. First I need to change the name also for arm32 where there's
> > no relocation at all. Second, relocation is something that happens
> > conditionally so I don't think it's a good name for the function.
>
> Feel free to propose a new name. The main thing is the current name
> can't stay "switch_ttbr()" because you are doing more than switching the
> TTBR.
>
> The other solution is to have a separate call for relocating Xen (which
> will fall through to switch_ttbr) and another one for callers that only
> need to switch the TTBR.
What about a function like this one, defined in xen/arch/arm/arm64/mmu/mm.c:
typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);

void __init relocate_and_switch_ttbr(uint64_t ttbr) {
    vaddr_t id_addr = virt_to_maddr(relocate_xen);
    relocate_xen_fn *fn = (relocate_xen_fn *)id_addr;
    lpae_t pte;

    /* Enable the identity mapping in the boot page tables */
    update_identity_mapping(true);

    /* Enable the identity mapping in the runtime page tables */
    pte = pte_of_xenaddr((vaddr_t)relocate_xen);
    pte.pt.table = 1;
    pte.pt.xn = 0;
    pte.pt.ro = 1;
    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);

    /* Relocate Xen and switch TTBR */
    fn(ttbr, _start, (void *)BOOT_RELOC_VIRT_START, _end - _start);

    /*
     * Disable the identity mapping in the runtime page tables.
     * Note it is not necessary to disable it in the boot page tables
     * because they are not going to be used by this CPU anymore.
     */
    update_identity_mapping(false);
}
which is actually a clone of switch_ttbr() but it does relocation. I would
then call it in case of coloring in setup_pagetables(). This should go in the
direction you suggested, but it would duplicate a bit of code. What do you
think about it?
> Cheers,
>
> --
> Julien Grall
>
Thanks,
- Carlo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v11 12/12] xen/arm: add cache coloring support for Xen image
2024-12-03 11:37 ` Carlo Nonato
@ 2024-12-04 11:18 ` Julien Grall
0 siblings, 0 replies; 40+ messages in thread
From: Julien Grall @ 2024-12-04 11:18 UTC (permalink / raw)
To: Carlo Nonato
Cc: xen-devel, andrea.bastoni, marco.solieri, Stefano Stabellini,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Jan Beulich
Hi Carlo,
On 03/12/2024 11:37, Carlo Nonato wrote:
> On Tue, Dec 3, 2024 at 11:36 AM Julien Grall <julien@xen.org> wrote:
>>
>> On 03/12/2024 10:08, Carlo Nonato wrote:
>>> Hi Julien,
>>>
>>> On Mon, Dec 2, 2024 at 10:44 PM Julien Grall <julien@xen.org> wrote:
>>>>
>>>> Hi Carlo,
>>>
>>> [...]
>>>
>>>>> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
>>>>> index 671eaadbc1..3732d5897e 100644
>>>>> --- a/xen/arch/arm/arm64/mmu/mm.c
>>>>> +++ b/xen/arch/arm/arm64/mmu/mm.c
>>>>> @@ -1,6 +1,7 @@
>>>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>>>
>>>>> #include <xen/init.h>
>>>>> +#include <xen/llc-coloring.h>
>>>>> #include <xen/mm.h>
>>>>> #include <xen/pfn.h>
>>>>>
>>>>> @@ -138,27 +139,46 @@ void update_boot_mapping(bool enable)
>>>>> }
>>>>>
>>>>> extern void switch_ttbr_id(uint64_t ttbr);
>>>>> +extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
>>>>>
>>>>> typedef void (switch_ttbr_fn)(uint64_t ttbr);
>>>>> +typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
>>>>>
>>>>> void __init switch_ttbr(uint64_t ttbr)
>>>>
>>>> Given the change below, I think this function needs to be renamed.
>>>> Possibly to relocate_and_jump() with a comment explaining that the
>>>> relocation only happens for cache-coloring.
>>>
>>> Changing the name of switch_ttbr() to relocate_and_jump() seems a bit
>>> misleading to me. First I need to change the name also for arm32 where there's
>>> no relocation at all. Second, relocation is something that happens
>>> conditionally so I don't think it's a good name for the function.
>>
>> Feel free to propose a new name. The main thing is the current name
>> can't stay "switch_ttbr()" because you are doing more than switching the
>> TTBR.
>>
>> The other solution is to have a separate call for relocating Xen (which
>> will fall through to switch_ttbr) and another one for callers that only
>> need to switch the TTBR.
>
> What about a function like this one, defined in xen/arch/arm/arm64/mmu/mm.c:
>
> typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
>
> void __init relocate_and_switch_ttbr(uint64_t ttbr) {
> vaddr_t id_addr = virt_to_maddr(relocate_xen);
> relocate_xen_fn *fn = (relocate_xen_fn *)id_addr;
> lpae_t pte;
>
> /* Enable the identity mapping in the boot page tables */
> update_identity_mapping(true);
>
> /* Enable the identity mapping in the runtime page tables */
> pte = pte_of_xenaddr((vaddr_t)relocate_xen);
> pte.pt.table = 1;
> pte.pt.xn = 0;
> pte.pt.ro = 1;
> write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
>
> /* Relocate Xen and switch TTBR */
> fn(ttbr, _start, (void *)BOOT_RELOC_VIRT_START, _end - _start);
>
> /*
> * Disable the identity mapping in the runtime page tables.
> * Note it is not necessary to disable it in the boot page tables
> * because they are not going to be used by this CPU anymore.
> */
> update_identity_mapping(false);
> }
>
> which is actually a clone of switch_ttbr() but it does relocation. I would
> then call it in case of coloring in setup_pagetables(). This should go in the
> direction you suggested, but it would duplicate a bit of code. What do you
> think about it?
I think the duplication is fine here.
It would be possible to reduce the duplication if we introduce a helper
that calls update_identity_mapping(true) and updates the PTE. But I am not
sure it is worth it.
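
For reference, the shape of such a helper could look roughly like the
standalone toy model below. Everything in it is an assumption made for
illustration: the toy_* types and functions are stand-ins so the sketch
compiles on its own, they are not Xen's definitions, and
toy_prepare_identity_call() is a hypothetical name. It only shows the two
steps that switch_ttbr() and relocate_and_switch_ttbr() would share: enabling
the identity mapping, then installing a read-only, executable entry for the
function about to be called through it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; none of these match Xen's real types or layout. */
typedef uint64_t vaddr_t;
typedef uint64_t paddr_t;
typedef struct { bool valid, table, xn, ro; paddr_t base; } toy_pte_t;

#define TOY_VIRT_BASE  0x200000ULL     /* pretend link address */
#define TOY_PHYS_BASE  0x40000000ULL   /* pretend load address */
#define TOY_PAGE_SHIFT 12
#define TOY_ENTRIES    512

static toy_pte_t toy_third_id[TOY_ENTRIES];
static bool toy_identity_enabled;

static void toy_update_identity_mapping(bool enable)
{
    toy_identity_enabled = enable;
}

static paddr_t toy_virt_to_maddr(vaddr_t va)
{
    return va - TOY_VIRT_BASE + TOY_PHYS_BASE;
}

static unsigned int toy_third_table_offset(paddr_t pa)
{
    return (pa >> TOY_PAGE_SHIFT) & (TOY_ENTRIES - 1);
}

/*
 * Hypothetical shared helper: enable the identity mapping, then map the
 * page holding fn_va read-only and executable at its identity address.
 * Returns the identity address the caller should jump to.
 */
static paddr_t toy_prepare_identity_call(vaddr_t fn_va)
{
    paddr_t id_addr = toy_virt_to_maddr(fn_va);
    toy_pte_t pte = {
        .valid = true,
        .table = true,   /* third-level entries always set this bit */
        .xn = false,     /* must be executable */
        .ro = true,      /* and read-only */
        .base = id_addr & ~((1ULL << TOY_PAGE_SHIFT) - 1),
    };

    toy_update_identity_mapping(true);
    toy_third_id[toy_third_table_offset(id_addr)] = pte;

    return id_addr;
}
```

Each caller would then cast the returned address to its own function
pointer type, make the call, and disable the identity mapping afterwards.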
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 40+ messages in thread