* [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode
@ 2025-05-21 16:03 Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu() Oleksii Kurochko
` (13 more replies)
0 siblings, 14 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Doug Goldstein
The patch series introduces basic UART support (in interrupt mode) and support of
interrupts for hypervisor mode.
To implement this the following has been added:
- APLIC and IMSIC initialization.
- Introduction of the intc_hw_operations abstraction.
- Introduce some APLIC and IMSIC operations.
- Introduce init_IRQ(), platform_get_irq() and setup_irq() functions.
- Update do_trap() handler to handle IRQ_S_EXT.
- Introduce some other functions such as: get_s_time(), smp_prepare_boot_cpu(),
ioremap().
- Enable UART.
CI tests: https://gitlab.com/xen-project/people/olkur/xen/-/pipelines/1829069921
---
Changes in V2:
- Merged to staging:
xen/riscv: initialize bitmap to zero in riscv_fill_hwcap_from_isa_string()
xen/asm-generic: introduce asm-generic/irq-dt.h
- All other changes are patch-specific. Please check each patch separately.
---
Oleksii Kurochko (14):
xen/riscv: introduce smp_prepare_boot_cpu()
xen/riscv: introduce support of Svpbmt extension and make it mandatory
xen/riscv: add ioremap_*() variants using ioremap_attr()
xen/riscv: introduce init_IRQ()
xen/riscv: introduce platform_get_irq()
xen/riscv: dt_processor_hartid() implementation
xen/riscv: introduce register_intc_ops() and intc_hw_ops.
xen/riscv: imsic_init() implementation
xen/riscv: aplic_init() implementation
xen/riscv: introduce intc_init() and helpers
xen/riscv: implementation of aplic and imsic operations
xen/riscv: add external interrupt handling for hypervisor mode
xen/riscv: implement setup_irq()
xen/riscv: add basic UART support
automation/scripts/qemu-smoke-riscv64.sh | 1 +
docs/misc/riscv/booting.txt | 4 +
xen/arch/riscv/Kconfig | 5 +
xen/arch/riscv/Makefile | 3 +
xen/arch/riscv/aplic-priv.h | 38 ++
xen/arch/riscv/aplic.c | 300 ++++++++++++++
xen/arch/riscv/cpufeature.c | 2 +
xen/arch/riscv/imsic.c | 474 +++++++++++++++++++++++
xen/arch/riscv/include/asm/Makefile | 1 +
xen/arch/riscv/include/asm/aplic.h | 73 ++++
xen/arch/riscv/include/asm/cpufeature.h | 1 +
xen/arch/riscv/include/asm/fixmap.h | 2 +-
xen/arch/riscv/include/asm/imsic.h | 84 ++++
xen/arch/riscv/include/asm/intc.h | 32 ++
xen/arch/riscv/include/asm/io.h | 10 +-
xen/arch/riscv/include/asm/irq.h | 24 ++
xen/arch/riscv/include/asm/mm-types.h | 8 +
xen/arch/riscv/include/asm/page.h | 23 +-
xen/arch/riscv/include/asm/smp.h | 17 +
xen/arch/riscv/intc.c | 55 +++
xen/arch/riscv/irq.c | 223 +++++++++++
xen/arch/riscv/mm.c | 30 ++
xen/arch/riscv/pt.c | 20 +-
xen/arch/riscv/setup.c | 21 +-
xen/arch/riscv/smpboot.c | 85 ++++
xen/arch/riscv/stubs.c | 11 -
xen/arch/riscv/traps.c | 19 +
xen/arch/riscv/xen.lds.S | 2 +
xen/drivers/char/Kconfig | 3 +-
29 files changed, 1537 insertions(+), 34 deletions(-)
create mode 100644 xen/arch/riscv/aplic-priv.h
create mode 100644 xen/arch/riscv/imsic.c
create mode 100644 xen/arch/riscv/include/asm/aplic.h
create mode 100644 xen/arch/riscv/include/asm/imsic.h
create mode 100644 xen/arch/riscv/include/asm/mm-types.h
create mode 100644 xen/arch/riscv/irq.c
create mode 100644 xen/arch/riscv/smpboot.c
--
2.49.0
* [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu()
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:22 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory Oleksii Kurochko
` (12 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Initialize cpu_{possible, online}_map by using smp_prepare_boot_cpu().
Drop DEFINE_PER_CPU(unsigned int, cpu_id) from stubs.c as this variable isn't
expected to be used in RISC-V at all.
Move declaration of cpu_{possible,online}_map from stubs.c to smpboot.c
as smpboot.c is now introduced.
Other definitions are kept in stubs.c as they are not initialized and not
needed at the moment.
Drop cpu_present_map as it is enough to have cpu_possible_map. Also, ask the
linker to provide a symbol for cpu_present_map as common code references it.
Move call of set_processor_id(0) to smp_prepare_boot_cpu().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Move call of set_processor_id(0) inside smp_prepare_boot_cpu().
- Update the commit message.
---
Changes in v2:
- Add __read_mostly for cpu_online_map.
- Add __ro_after_init for cpu_possible_map.
- Drop cpu_present_map and cpumask_copy(&cpu_present_map, &cpu_possible_map);
- Drop cpumask_clear() for cpu_{possible,online}_map.
- Ask the linker to provide the symbol for cpu_present_map as common code uses
it.
- s/smp_clear_cpu_maps/smp_prepare_boot_cpu.
- Include <xen/smp.h> in setup.c as smp_prepare_boot_cpu() is declared in that
header now.
Also, drop inclusion of asm/smp.h in setup.c as xen/smp.h already includes
asm/smp.h.
- Update the commit message.
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/setup.c | 4 ++--
xen/arch/riscv/smpboot.c | 16 ++++++++++++++++
xen/arch/riscv/stubs.c | 6 ------
xen/arch/riscv/xen.lds.S | 2 ++
5 files changed, 21 insertions(+), 8 deletions(-)
create mode 100644 xen/arch/riscv/smpboot.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index d882c57528..f42cf3dfa6 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -10,6 +10,7 @@ obj-y += sbi.o
obj-y += setup.o
obj-y += shutdown.o
obj-y += smp.o
+obj-y += smpboot.o
obj-y += stubs.o
obj-y += time.o
obj-y += traps.o
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 4e416f6e44..a9c0c61fb6 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -8,6 +8,7 @@
#include <xen/init.h>
#include <xen/mm.h>
#include <xen/shutdown.h>
+#include <xen/smp.h>
#include <xen/vmap.h>
#include <xen/xvmalloc.h>
@@ -19,7 +20,6 @@
#include <asm/intc.h>
#include <asm/sbi.h>
#include <asm/setup.h>
-#include <asm/smp.h>
#include <asm/traps.h>
/* Xen stack for bringing up the first CPU. */
@@ -72,7 +72,7 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
remove_identity_mapping();
- set_processor_id(0);
+ smp_prepare_boot_cpu();
set_cpuid_to_hartid(0, bootcpu_id);
diff --git a/xen/arch/riscv/smpboot.c b/xen/arch/riscv/smpboot.c
new file mode 100644
index 0000000000..0f9c2cc54a
--- /dev/null
+++ b/xen/arch/riscv/smpboot.c
@@ -0,0 +1,16 @@
+#include <xen/cpumask.h>
+#include <xen/init.h>
+#include <xen/sections.h>
+
+#include <asm/current.h>
+
+cpumask_t __read_mostly cpu_online_map;
+cpumask_t __ro_after_init cpu_possible_map;
+
+void __init smp_prepare_boot_cpu(void)
+{
+ set_processor_id(0);
+
+ cpumask_set_cpu(0, &cpu_possible_map);
+ cpumask_set_cpu(0, &cpu_online_map);
+}
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 83416d3350..fdcf91054e 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -11,12 +11,6 @@
/* smpboot.c */
-cpumask_t cpu_online_map;
-cpumask_t cpu_present_map;
-cpumask_t cpu_possible_map;
-
-/* ID of the PCPU we're running on */
-DEFINE_PER_CPU(unsigned int, cpu_id);
/* XXX these seem awfully x86ish... */
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
index 818aa43669..8c3c06de01 100644
--- a/xen/arch/riscv/xen.lds.S
+++ b/xen/arch/riscv/xen.lds.S
@@ -165,6 +165,8 @@ SECTIONS
ELF_DETAILS_SECTIONS
}
+PROVIDE(cpu_present_map = cpu_possible_map);
+
ASSERT(IS_ALIGNED(__bss_start, POINTER_ALIGN), "__bss_start is misaligned")
ASSERT(IS_ALIGNED(__bss_end, POINTER_ALIGN), "__bss_end is misaligned")
--
2.49.0
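As a reading aid for the patch above, here is a minimal standalone sketch of the boot-CPU bookkeeping it describes. Everything prefixed with sketch_ is a hypothetical stand-in (a 64-bit bitmap instead of Xen's cpumask_t, a plain variable instead of the real processor-ID storage); it only models the idea that statically allocated masks start out zeroed, so marking CPU0 is all that is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Xen's cpumask_t and helpers; not the real API. */
#define SKETCH_NR_CPUS 64

typedef struct { unsigned long long bits; } sketch_cpumask_t;

/* Static storage is zero-initialized, so no explicit clearing is required. */
static sketch_cpumask_t sketch_cpu_online_map;
static sketch_cpumask_t sketch_cpu_possible_map;
static unsigned int sketch_processor_id = ~0U;

static void sketch_cpumask_set_cpu(unsigned int cpu, sketch_cpumask_t *mask)
{
    assert(cpu < SKETCH_NR_CPUS);
    mask->bits |= 1ULL << cpu;
}

static bool sketch_cpumask_test_cpu(unsigned int cpu,
                                    const sketch_cpumask_t *mask)
{
    assert(cpu < SKETCH_NR_CPUS);
    return (mask->bits >> cpu) & 1ULL;
}

/* Same order as in the patch: set the processor ID, then mark CPU0. */
static void sketch_smp_prepare_boot_cpu(void)
{
    sketch_processor_id = 0;

    sketch_cpumask_set_cpu(0, &sketch_cpu_possible_map);
    sketch_cpumask_set_cpu(0, &sketch_cpu_online_map);
}
```

After calling sketch_smp_prepare_boot_cpu(), only bit 0 is set in both masks; secondary CPUs would be added later by code that is not part of this patch.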
* [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu() Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:26 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr() Oleksii Kurochko
` (11 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Doug Goldstein, Stefano Stabellini,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Alistair Francis,
Bob Eshleman, Connor Davis
The Svpbmt extension is necessary for changing the memory type of a page. A
memory type is a combination of attributes that indicate the cacheability,
idempotency, and ordering properties for accesses to that page.
As a part of the patch the following is introduced:
- Svpbmt memory type definitions: PTE_PBMT_{NOCACHE,IO}.
- PAGE_HYPERVISOR_{NOCACHE,WC}.
- RISCV_ISA_EXT_svpbmt and add a check in runtime that Svpbmt is
supported by platform.
- Update riscv/booting.txt with information about Svpbmt.
- Update logic of pt_update_entry() to take into account PBMT bits.
Use 'unsigned long' for pte_attr_t as the PBMT bits are 61 and 62 and don't
fit into 'unsigned int'. Also, update function prototypes which use
'unsigned int' for flags/attributes.
Enable Svpbmt for testing in QEMU as Svpbmt is now mandatory for
Xen to work.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
- Remove dependency "depends on RISCV_64" for HAS_SVPBMT as it is selected by
CONFIG_RISCV_64.
- Move HAS_SVPBMT's help text to commit message as it doesn't make much sense
to have it for a prompt-less option.
- Move definition of PTE_PMBT_{NOCACHE,IO} up closer to arch-specific
definitions.
- Update the commit message and subject.
- Add a comment above PAGE_HYPERVISOR_NOCACHE.
---
Changes in v2:
- new patch.
---
automation/scripts/qemu-smoke-riscv64.sh | 1 +
docs/misc/riscv/booting.txt | 4 ++++
xen/arch/riscv/Kconfig | 4 ++++
xen/arch/riscv/cpufeature.c | 2 ++
xen/arch/riscv/include/asm/cpufeature.h | 1 +
xen/arch/riscv/include/asm/fixmap.h | 2 +-
xen/arch/riscv/include/asm/mm-types.h | 8 ++++++++
xen/arch/riscv/include/asm/page.h | 23 ++++++++++++++++++++++-
xen/arch/riscv/pt.c | 20 +++++++++++---------
9 files changed, 54 insertions(+), 11 deletions(-)
create mode 100644 xen/arch/riscv/include/asm/mm-types.h
diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh
index b2e112c942..25f9e4190e 100755
--- a/automation/scripts/qemu-smoke-riscv64.sh
+++ b/automation/scripts/qemu-smoke-riscv64.sh
@@ -7,6 +7,7 @@ rm -f smoke.serial
export TEST_CMD="qemu-system-riscv64 \
-M virt,aia=aplic-imsic \
+ -cpu rv64,svpbmt=on \
-smp 1 \
-nographic \
-m 2g \
diff --git a/docs/misc/riscv/booting.txt b/docs/misc/riscv/booting.txt
index 3a8474a27d..e100bde575 100644
--- a/docs/misc/riscv/booting.txt
+++ b/docs/misc/riscv/booting.txt
@@ -18,3 +18,7 @@ Xen is run:
- Zihintpause:
On a system that doesn't have this extension, cpu_relax() should be
implemented properly.
+- SVPBMT is mandatory to enable changing the memory attributes of a page.
+ For platforms that do not support SVPBMT, it is necessary to introduce a
+ similar mechanism as described in:
+ https://lore.kernel.org/all/20241102000843.1301099-1-samuel.holland@sifive.com/
diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index d882e0a059..62c5b7ba34 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -10,11 +10,15 @@ config RISCV
config RISCV_64
def_bool y
select 64BIT
+ select HAS_SVPBMT
config ARCH_DEFCONFIG
string
default "arch/riscv/configs/tiny64_defconfig"
+config HAS_SVPBMT
+ bool
+
menu "Architecture Features"
source "arch/Kconfig"
diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
index 3246a03624..b7d5ec6580 100644
--- a/xen/arch/riscv/cpufeature.c
+++ b/xen/arch/riscv/cpufeature.c
@@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
RISCV_ISA_EXT_DATA(zbs),
RISCV_ISA_EXT_DATA(smaia),
RISCV_ISA_EXT_DATA(ssaia),
+ RISCV_ISA_EXT_DATA(svpbmt),
};
static const struct riscv_isa_ext_data __initconst required_extensions[] = {
@@ -151,6 +152,7 @@ static const struct riscv_isa_ext_data __initconst required_extensions[] = {
RISCV_ISA_EXT_DATA(zifencei),
RISCV_ISA_EXT_DATA(zihintpause),
RISCV_ISA_EXT_DATA(zbb),
+ RISCV_ISA_EXT_DATA(svpbmt),
};
static bool __init is_lowercase_extension_name(const char *str)
diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
index 1015b6ee44..768b84b769 100644
--- a/xen/arch/riscv/include/asm/cpufeature.h
+++ b/xen/arch/riscv/include/asm/cpufeature.h
@@ -37,6 +37,7 @@ enum riscv_isa_ext_id {
RISCV_ISA_EXT_zbs,
RISCV_ISA_EXT_smaia,
RISCV_ISA_EXT_ssaia,
+ RISCV_ISA_EXT_svpbmt,
RISCV_ISA_EXT_MAX
};
diff --git a/xen/arch/riscv/include/asm/fixmap.h b/xen/arch/riscv/include/asm/fixmap.h
index e399a15f53..5990c964aa 100644
--- a/xen/arch/riscv/include/asm/fixmap.h
+++ b/xen/arch/riscv/include/asm/fixmap.h
@@ -33,7 +33,7 @@
extern pte_t xen_fixmap[];
/* Map a page in a fixmap entry */
-void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags);
+void set_fixmap(unsigned int map, mfn_t mfn, pte_attr_t flags);
/* Remove a mapping from a fixmap entry */
void clear_fixmap(unsigned int map);
diff --git a/xen/arch/riscv/include/asm/mm-types.h b/xen/arch/riscv/include/asm/mm-types.h
new file mode 100644
index 0000000000..fa512064b8
--- /dev/null
+++ b/xen/arch/riscv/include/asm/mm-types.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef ASM_RISCV_MM_TYPES_H
+#define ASM_RISCV_MM_TYPES_H
+
+typedef unsigned long pte_attr_t;
+
+#endif /* ASM_RISCV_MM_TYPES_H */
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index bf8988f657..81b91b63d8 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -37,6 +37,16 @@
#define PTE_ACCESSED BIT(6, UL)
#define PTE_DIRTY BIT(7, UL)
#define PTE_RSW (BIT(8, UL) | BIT(9, UL))
+/*
+ * [62:61] Svpbmt Memory Type definitions:
+ *
+ * 00 - PMA Normal Cacheable, No change to implied PMA memory type
+ * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
+ * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
+ * 11 - Rsvd Reserved for future standard use
+ */
+#define PTE_PMBT_NOCACHE BIT(61, UL)
+#define PTE_PMBT_IO BIT(62, UL)
#define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
#define PTE_TABLE (PTE_VALID)
@@ -46,6 +56,15 @@
#define PAGE_HYPERVISOR_RX (PTE_VALID | PTE_READABLE | PTE_EXECUTABLE)
#define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW
+/*
+ * PAGE_HYPERVISOR_NOCACHE is used for ioremap().
+ *
+ * Both PTE_PMBT_IO and PTE_PMBT_NOCACHE are non-cacheable, but the difference
+ * is that IO is non-idempotent and strongly ordered, which makes it a good
+ * candidate for mapping IOMEM.
+ */
+#define PAGE_HYPERVISOR_NOCACHE (PAGE_HYPERVISOR_RW | PTE_PMBT_IO)
+#define PAGE_HYPERVISOR_WC (PAGE_HYPERVISOR_RW | PTE_PMBT_NOCACHE)
/*
* The PTE format does not contain the following bits within itself;
@@ -58,6 +77,8 @@
#define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE | PTE_EXECUTABLE)
+#define PTE_PBMT_MASK (PTE_PMBT_NOCACHE | PTE_PMBT_IO)
+
/* Calculate the offsets into the pagetables for a given VA */
#define pt_linear_offset(lvl, va) ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
@@ -202,7 +223,7 @@ static inline pte_t read_pte(const pte_t *p)
return read_atomic(p);
}
-static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags)
+static inline pte_t pte_from_mfn(mfn_t mfn, pte_attr_t flags)
{
unsigned long pte = (mfn_x(mfn) << PTE_PPN_SHIFT) | flags;
return (pte_t){ .pte = pte };
diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c
index 918b1b91ab..82c8c73c3e 100644
--- a/xen/arch/riscv/pt.c
+++ b/xen/arch/riscv/pt.c
@@ -25,7 +25,7 @@ static inline mfn_t get_root_page(void)
* See the comment about the possible combination of (mfn, flags) in
* the comment above pt_update().
*/
-static bool pt_check_entry(pte_t entry, mfn_t mfn, unsigned int flags)
+static bool pt_check_entry(pte_t entry, mfn_t mfn, pte_attr_t flags)
{
/* Sanity check when modifying an entry. */
if ( (flags & PTE_VALID) && mfn_eq(mfn, INVALID_MFN) )
@@ -260,7 +260,7 @@ pte_t pt_walk(vaddr_t va, unsigned int *pte_level)
*/
static int pt_update_entry(mfn_t root, vaddr_t virt,
mfn_t mfn, unsigned int *target,
- unsigned int flags)
+ pte_attr_t flags)
{
int rc;
/*
@@ -328,17 +328,19 @@ static int pt_update_entry(mfn_t root, vaddr_t virt,
pte.pte = 0;
else
{
+ const pte_attr_t attrs = PTE_ACCESS_MASK | PTE_PBMT_MASK;
+
/* We are inserting a mapping => Create new pte. */
if ( !mfn_eq(mfn, INVALID_MFN) )
pte = pte_from_mfn(mfn, PTE_VALID);
- else /* We are updating the permission => Copy the current pte. */
+ else /* We are updating the attributes => Copy the current pte. */
{
pte = *ptep;
- pte.pte &= ~PTE_ACCESS_MASK;
+ pte.pte &= ~attrs;
}
- /* update permission according to the flags */
- pte.pte |= (flags & PTE_ACCESS_MASK) | PTE_ACCESSED | PTE_DIRTY;
+ /* Update attributes of PTE according to the flags. */
+ pte.pte |= (flags & attrs) | PTE_ACCESSED | PTE_DIRTY;
}
write_pte(ptep, pte);
@@ -353,7 +355,7 @@ static int pt_update_entry(mfn_t root, vaddr_t virt,
/* Return the level where mapping should be done */
static int pt_mapping_level(unsigned long vfn, mfn_t mfn, unsigned long nr,
- unsigned int flags)
+ pte_attr_t flags)
{
unsigned int level = 0;
unsigned long mask;
@@ -407,7 +409,7 @@ static DEFINE_SPINLOCK(pt_lock);
* inserting will be done.
*/
static int pt_update(vaddr_t virt, mfn_t mfn,
- unsigned long nr_mfns, unsigned int flags)
+ unsigned long nr_mfns, pte_attr_t flags)
{
int rc = 0;
unsigned long vfn = PFN_DOWN(virt);
@@ -535,7 +537,7 @@ int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
}
/* Map a 4k page in a fixmap entry */
-void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags)
+void set_fixmap(unsigned int map, mfn_t mfn, pte_attr_t flags)
{
if ( map_pages_to_xen(FIXMAP_ADDR(map), mfn, 1, flags | PTE_SMALL) != 0 )
BUG();
--
2.49.0
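To illustrate why the patch above widens the flags type, here is a small standalone sketch. All SK_/sk_ names are hypothetical and only mirror the bit positions quoted in the patch (PBMT in bits 62:61, access bits in 3:1); it is not the Xen code. It shows the attribute-update step (clear old access and PBMT bits, then OR in the new ones) and what a 32-bit flags parameter would silently drop.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; SK_ names are hypothetical, not Xen's macros. */
#define SK_BIT(n)            (UINT64_C(1) << (n))
#define SK_PTE_VALID         SK_BIT(0)
#define SK_PTE_READABLE      SK_BIT(1)
#define SK_PTE_WRITABLE      SK_BIT(2)
#define SK_PTE_EXECUTABLE    SK_BIT(3)
#define SK_PTE_ACCESSED      SK_BIT(6)
#define SK_PTE_DIRTY         SK_BIT(7)
#define SK_PTE_PBMT_NOCACHE  SK_BIT(61)
#define SK_PTE_PBMT_IO       SK_BIT(62)

#define SK_PTE_ACCESS_MASK \
    (SK_PTE_READABLE | SK_PTE_WRITABLE | SK_PTE_EXECUTABLE)
#define SK_PTE_PBMT_MASK     (SK_PTE_PBMT_NOCACHE | SK_PTE_PBMT_IO)

typedef uint64_t sk_pte_attr_t;

/* Replace the access and PBMT bits of an existing PTE, keeping the rest. */
static uint64_t sk_update_attrs(uint64_t pte, sk_pte_attr_t flags)
{
    const sk_pte_attr_t attrs = SK_PTE_ACCESS_MASK | SK_PTE_PBMT_MASK;

    /* The encoding with both PBMT bits set (11) is reserved. */
    assert((flags & SK_PTE_PBMT_MASK) != SK_PTE_PBMT_MASK);

    pte &= ~attrs;
    pte |= (flags & attrs) | SK_PTE_ACCESSED | SK_PTE_DIRTY;

    return pte;
}

/* What a 32-bit flags parameter would keep of the same attributes. */
static uint32_t sk_truncate_to_u32(sk_pte_attr_t flags)
{
    return (uint32_t)flags;
}
```

Passing SK_PTE_READABLE | SK_PTE_PBMT_IO through sk_truncate_to_u32() leaves only the readable bit, which is the loss the switch to a 64-bit pte_attr_t avoids.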
* [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr()
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu() Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:33 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 04/14] xen/riscv: introduce init_IRQ() Oleksii Kurochko
` (10 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Introduce ioremap_attr() as a shared helper to implement architecture-specific
ioremap variants:
- ioremap_cache()
- ioremap_wc()
These functions use __vmap() internally and apply appropriate memory attributes
for RISC-V.
These functions are not implemented as static inline functions or macros, as
that would require including asm/page.h in asm/io.h, which would lead to
compilation issues.
Also, remove the unused ioremap_wt() macro from asm/io.h, as write-through
mappings are not expected to be used on RISC-V.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
- Drop ioremap_nocache() as ioremap() duplicates functionality of
ioremap_nocache().
- Add __iomem to definition of ioremap(), ioremap_attr().
- Rename start to pa for all ioremap*() functions.
- Update the commit message ioremap_nocache().
---
Changes in v2:
- Update the commit subject + message.
- move out Svpbmt changes to separate patch.
- Drop #ifdef SVPBMT for ioremap().
- Redefine ioremap_* in io.h.
- Introduce ioremap_attr().
---
xen/arch/riscv/include/asm/io.h | 10 ++--------
xen/arch/riscv/mm.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
index 8bab4ffa03..9aeafd6b3b 100644
--- a/xen/arch/riscv/include/asm/io.h
+++ b/xen/arch/riscv/include/asm/io.h
@@ -41,14 +41,8 @@
#include <xen/macros.h>
#include <xen/types.h>
-/*
- * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
- * change the properties of memory regions. This should be fixed by the
- * upcoming platform spec.
- */
-#define ioremap_nocache(addr, size) ioremap(addr, size)
-#define ioremap_wc(addr, size) ioremap(addr, size)
-#define ioremap_wt(addr, size) ioremap(addr, size)
+void __iomem *ioremap_cache(paddr_t pa, size_t len);
+void __iomem *ioremap_wc(paddr_t pa, size_t len);
/* Generic IO read/write. These perform native-endian accesses. */
static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index d3ece9f132..4047d67c0e 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -11,6 +11,7 @@
#include <xen/pfn.h>
#include <xen/sections.h>
#include <xen/sizes.h>
+#include <xen/vmap.h>
#include <asm/early_printk.h>
#include <asm/csr.h>
@@ -583,3 +584,32 @@ void *__init arch_vmap_virt_end(void)
{
return (void *)(VMAP_VIRT_START + VMAP_VIRT_SIZE);
}
+
+static void __iomem *ioremap_attr(paddr_t pa, size_t len,
+ pte_attr_t attributes)
+{
+ mfn_t mfn = _mfn(PFN_DOWN(pa));
+ unsigned int offs = pa & (PAGE_SIZE - 1);
+ unsigned int nr = PFN_UP(offs + len);
+ void *ptr = __vmap(&mfn, nr, 1, 1, attributes, VMAP_DEFAULT);
+
+ if ( ptr == NULL )
+ return NULL;
+
+ return ptr + offs;
+}
+
+void __iomem *ioremap_cache(paddr_t pa, size_t len)
+{
+ return ioremap_attr(pa, len, PAGE_HYPERVISOR);
+}
+
+void __iomem *ioremap_wc(paddr_t pa, size_t len)
+{
+ return ioremap_attr(pa, len, PAGE_HYPERVISOR_WC);
+}
+
+void __iomem *ioremap(paddr_t pa, size_t len)
+{
+ return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
+}
--
2.49.0
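The page arithmetic in ioremap_attr() above can be checked in isolation. The following is a standalone sketch, assuming 4 KiB pages; the sk_ names and the struct are hypothetical and exist only to expose the three intermediate values (first frame, offset within the page, number of pages to map).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  (UINT64_C(1) << SK_PAGE_SHIFT)

/* Hypothetical helper type holding the intermediate values. */
struct sk_ioremap_geom {
    uint64_t first_pfn;  /* frame containing pa, like PFN_DOWN(pa) */
    unsigned int offs;   /* offset of pa inside that frame */
    unsigned int nr;     /* pages needed, like PFN_UP(offs + len) */
};

static struct sk_ioremap_geom sk_ioremap_geometry(uint64_t pa, size_t len)
{
    struct sk_ioremap_geom g;

    assert(len > 0);

    g.first_pfn = pa >> SK_PAGE_SHIFT;
    g.offs = (unsigned int)(pa & (SK_PAGE_SIZE - 1));
    /* Round (offs + len) up to whole pages. */
    g.nr = (unsigned int)((g.offs + len + SK_PAGE_SIZE - 1) >> SK_PAGE_SHIFT);

    return g;
}
```

For example, a 0x1000-byte region starting at 0x10000100 is not page aligned, so it spans two pages, and the returned pointer has to be advanced by the 0x100 offset, which is what the final "ptr + offs" in the patch does.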
* [PATCH v3 04/14] xen/riscv: introduce init_IRQ()
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (2 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr() Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:38 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 05/14] xen/riscv: introduce platform_get_irq() Oleksii Kurochko
` (9 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Implement init_IRQ() to initialize various IRQs.
Currently, this function initializes the irq_desc[] array,
which stores IRQ descriptors containing various information
about each IRQ, such as the type of hardware handling, whether
the IRQ is disabled, etc.
The initialization is basic at this point and includes setting
IRQ_TYPE_INVALID as the IRQ type and assigning the IRQ number (which
is just the corresponding index into the irq_desc[] array) to
desc->irq.
Additionally, the function init_irq_data() is introduced to
initialize the IRQ descriptors for all IRQs in the system.
Reuse defines of IRQ_TYPE_* from asm-generic/irq-dt.h.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
- Add an explanatory comment about NR_IRQS definitions.
- Init desc->irq and desc->action before call of init_one_irq_desc().
- Drop "desc->action = NULL" as irq_desc[] is zero-initialized.
- Update the commit message: drop mention of NULLing of desc->action.
- Drop year in Copyright.
---
Changes in v2:
- Move an introduction of IRQ_TYPE_* defines to the separate patch.
- Reuse asm-generic/irq-dt.h.
- Use 'unsigned int' for local irq variable inside init_irq_data().
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/include/asm/Makefile | 1 +
xen/arch/riscv/include/asm/irq.h | 15 ++++++++++
xen/arch/riscv/irq.c | 45 +++++++++++++++++++++++++++++
xen/arch/riscv/setup.c | 3 ++
xen/arch/riscv/stubs.c | 5 ----
6 files changed, 65 insertions(+), 5 deletions(-)
create mode 100644 xen/arch/riscv/irq.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index f42cf3dfa6..a1c145c506 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -3,6 +3,7 @@ obj-y += cpufeature.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-y += entry.o
obj-y += intc.o
+obj-y += irq.o
obj-y += mm.o
obj-y += pt.o
obj-$(CONFIG_RISCV_64) += riscv64/
diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
index c989a7f89b..bfdf186c68 100644
--- a/xen/arch/riscv/include/asm/Makefile
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -5,6 +5,7 @@ generic-y += div64.h
generic-y += hardirq.h
generic-y += hypercall.h
generic-y += iocap.h
+generic-y += irq-dt.h
generic-y += paging.h
generic-y += percpu.h
generic-y += perfc_defn.h
diff --git a/xen/arch/riscv/include/asm/irq.h b/xen/arch/riscv/include/asm/irq.h
index 2a48da2651..ea555afd1a 100644
--- a/xen/arch/riscv/include/asm/irq.h
+++ b/xen/arch/riscv/include/asm/irq.h
@@ -3,6 +3,19 @@
#define ASM__RISCV__IRQ_H
#include <xen/bug.h>
+#include <xen/device_tree.h>
+
+#include <asm/irq-dt.h>
+
+/*
+ * According to the AIA spec:
+ * The maximum number of interrupt sources an APLIC may support is 1023.
+ *
+ * The same is true for PLIC.
+ *
+ * Interrupt Source 0 is reserved and shall never generate an interrupt.
+ */
+#define NR_IRQS 1024
/* TODO */
#define nr_irqs 0U
@@ -25,6 +38,8 @@ static inline void arch_move_irqs(struct vcpu *v)
BUG_ON("unimplemented");
}
+void init_IRQ(void);
+
#endif /* ASM__RISCV__IRQ_H */
/*
diff --git a/xen/arch/riscv/irq.c b/xen/arch/riscv/irq.c
new file mode 100644
index 0000000000..b5ae7a114b
--- /dev/null
+++ b/xen/arch/riscv/irq.c
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+/*
+ * RISC-V Interrupt support
+ *
+ * Copyright (c) Vates
+ */
+
+#include <xen/bug.h>
+#include <xen/init.h>
+#include <xen/irq.h>
+
+static irq_desc_t irq_desc[NR_IRQS];
+
+int arch_init_one_irq_desc(struct irq_desc *desc)
+{
+ desc->arch.type = IRQ_TYPE_INVALID;
+
+ return 0;
+}
+
+static int __init init_irq_data(void)
+{
+ unsigned int irq;
+
+ for ( irq = 0; irq < NR_IRQS; irq++ )
+ {
+ struct irq_desc *desc = irq_to_desc(irq);
+ int rc;
+
+ desc->irq = irq;
+
+ rc = init_one_irq_desc(desc);
+ if ( rc )
+ return rc;
+ }
+
+ return 0;
+}
+
+void __init init_IRQ(void)
+{
+ if ( init_irq_data() < 0 )
+ panic("initialization of IRQ data failed\n");
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index a9c0c61fb6..8bcd19218d 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -6,6 +6,7 @@
#include <xen/compile.h>
#include <xen/device_tree.h>
#include <xen/init.h>
+#include <xen/irq.h>
#include <xen/mm.h>
#include <xen/shutdown.h>
#include <xen/smp.h>
@@ -125,6 +126,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
panic("Booting using ACPI isn't supported\n");
}
+ init_IRQ();
+
riscv_fill_hwcap();
preinit_xen_time();
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index fdcf91054e..e396b67cd3 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -107,11 +107,6 @@ void irq_ack_none(struct irq_desc *desc)
BUG_ON("unimplemented");
}
-int arch_init_one_irq_desc(struct irq_desc *desc)
-{
- BUG_ON("unimplemented");
-}
-
void smp_send_state_dump(unsigned int cpu)
{
BUG_ON("unimplemented");
--
2.49.0
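The descriptor setup described in the patch above boils down to a loop over a fixed-size table. Below is a standalone sketch of that loop; the sk_ types and the numeric value of SK_IRQ_TYPE_INVALID are placeholders chosen for illustration and are not taken from Xen's headers.

```c
#include <assert.h>

#define SK_NR_IRQS          1024
#define SK_IRQ_TYPE_INVALID 0x10u  /* placeholder value for this sketch */

/* Hypothetical, heavily reduced IRQ descriptor. */
struct sk_irq_desc {
    unsigned int irq;   /* index of this descriptor in the table */
    unsigned int type;  /* trigger type, unknown until a device claims it */
};

static struct sk_irq_desc sk_irq_desc[SK_NR_IRQS];

static struct sk_irq_desc *sk_irq_to_desc(unsigned int irq)
{
    assert(irq < SK_NR_IRQS);
    return &sk_irq_desc[irq];
}

/* Stand-in for the arch hook: mark the type as not yet configured. */
static int sk_init_one_irq_desc(struct sk_irq_desc *desc)
{
    desc->type = SK_IRQ_TYPE_INVALID;
    return 0;
}

/* Assign each descriptor its own index, then run the per-descriptor hook. */
static int sk_init_irq_data(void)
{
    for ( unsigned int irq = 0; irq < SK_NR_IRQS; irq++ )
    {
        struct sk_irq_desc *desc = sk_irq_to_desc(irq);
        int rc;

        desc->irq = irq;

        rc = sk_init_one_irq_desc(desc);
        if ( rc )
            return rc;
    }

    return 0;
}
```

Returning the first non-zero rc lets the caller decide how to fail; in the patch, init_IRQ() turns any failure into a panic.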
* [PATCH v3 05/14] xen/riscv: introduce platform_get_irq()
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (3 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 04/14] xen/riscv: introduce init_IRQ() Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:41 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation Oleksii Kurochko
` (8 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
platform_get_irq() receives information about a device's IRQ (type
and IRQ number) from the device tree node and uses this information
to update the IRQ descriptor in the irq_desc[] array.
Introduce dt_irq_xlate and initialize it with aplic_irq_xlate() as
it is used by dt_device_get_irq(), which is called by
platform_get_irq().
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Drop parentheses in return inside irq_validate_new_type().
- Add a check in platform_get_irq() that dt_irq.irq is less
than NR_IRQS.
Also, add BUILD_BUG_ON(NR_IRQS > INT_MAX).
---
Changes in V2:
- Add cf_check for aplic_irq_xlate().
- Indent label in irq_set_type().
- Return proper -E... values for platform_get_irq().
---
xen/arch/riscv/aplic.c | 20 ++++++++++++++
xen/arch/riscv/include/asm/irq.h | 3 ++
xen/arch/riscv/irq.c | 47 ++++++++++++++++++++++++++++++++
3 files changed, 70 insertions(+)
diff --git a/xen/arch/riscv/aplic.c b/xen/arch/riscv/aplic.c
index caba8f8993..10ae81f7ac 100644
--- a/xen/arch/riscv/aplic.c
+++ b/xen/arch/riscv/aplic.c
@@ -11,6 +11,7 @@
#include <xen/errno.h>
#include <xen/init.h>
+#include <xen/irq.h>
#include <xen/sections.h>
#include <xen/types.h>
@@ -21,6 +22,23 @@ static struct intc_info __ro_after_init aplic_info = {
.hw_version = INTC_APLIC,
};
+static int cf_check aplic_irq_xlate(const uint32_t *intspec,
+ unsigned int intsize,
+ unsigned int *out_hwirq,
+ unsigned int *out_type)
+{
+ if ( intsize < 2 )
+ return -EINVAL;
+
+ /* Mapping 1:1 */
+ *out_hwirq = intspec[0];
+
+ if ( out_type )
+ *out_type = intspec[1] & IRQ_TYPE_SENSE_MASK;
+
+ return 0;
+}
+
static int __init aplic_preinit(struct dt_device_node *node, const void *dat)
{
if ( aplic_info.node )
@@ -35,6 +53,8 @@ static int __init aplic_preinit(struct dt_device_node *node, const void *dat)
aplic_info.node = node;
+ dt_irq_xlate = aplic_irq_xlate;
+
return 0;
}
diff --git a/xen/arch/riscv/include/asm/irq.h b/xen/arch/riscv/include/asm/irq.h
index ea555afd1a..84c3c2904d 100644
--- a/xen/arch/riscv/include/asm/irq.h
+++ b/xen/arch/riscv/include/asm/irq.h
@@ -38,6 +38,9 @@ static inline void arch_move_irqs(struct vcpu *v)
BUG_ON("unimplemented");
}
+struct dt_device_node;
+int platform_get_irq(const struct dt_device_node *device, int index);
+
void init_IRQ(void);
#endif /* ASM__RISCV__IRQ_H */
diff --git a/xen/arch/riscv/irq.c b/xen/arch/riscv/irq.c
index b5ae7a114b..669ef3ae9e 100644
--- a/xen/arch/riscv/irq.c
+++ b/xen/arch/riscv/irq.c
@@ -7,11 +7,58 @@
*/
#include <xen/bug.h>
+#include <xen/device_tree.h>
+#include <xen/errno.h>
#include <xen/init.h>
#include <xen/irq.h>
static irq_desc_t irq_desc[NR_IRQS];
+static bool irq_validate_new_type(unsigned int curr, unsigned int new)
+{
+ return curr == IRQ_TYPE_INVALID || curr == new;
+}
+
+static int irq_set_type(unsigned int irq, unsigned int type)
+{
+ unsigned long flags;
+ struct irq_desc *desc = irq_to_desc(irq);
+ int ret = -EBUSY;
+
+ spin_lock_irqsave(&desc->lock, flags);
+
+ if ( !irq_validate_new_type(desc->arch.type, type) )
+ goto err;
+
+ desc->arch.type = type;
+
+ ret = 0;
+
+ err:
+ spin_unlock_irqrestore(&desc->lock, flags);
+
+ return ret;
+}
+
+int platform_get_irq(const struct dt_device_node *device, int index)
+{
+ struct dt_irq dt_irq;
+ int ret;
+
+ if ( (ret = dt_device_get_irq(device, index, &dt_irq)) != 0 )
+ return ret;
+
+ BUILD_BUG_ON(NR_IRQS > INT_MAX);
+
+ if ( dt_irq.irq >= NR_IRQS )
+ panic("irq%d is bigger than NR_IRQS(%d)\n", dt_irq.irq, NR_IRQS);
+
+ if ( (ret = irq_set_type(dt_irq.irq, dt_irq.type)) != 0 )
+ return ret;
+
+ return dt_irq.irq;
+}
+
int arch_init_one_irq_desc(struct irq_desc *desc)
{
desc->arch.type = IRQ_TYPE_INVALID;
--
2.49.0
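For readers following along, the two pieces this patch combines (decoding a two-cell interrupt specifier and refusing to change an IRQ type once it is set) can be sketched in standalone C. This is only an illustration under assumptions: the sketch_* names and the two SKETCH_IRQ_TYPE_* values are hypothetical stand-ins, not Xen's definitions, and the locking done by irq_set_type() is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IRQ_TYPE_* constants used by the patch. */
#define SKETCH_IRQ_TYPE_INVALID    0x10u
#define SKETCH_IRQ_TYPE_SENSE_MASK 0x0fu

/* Decode a two-cell interrupt specifier: <source-number trigger-type>. */
static int sketch_irq_xlate(const uint32_t *intspec, unsigned int intsize,
                            unsigned int *out_hwirq, unsigned int *out_type)
{
    if ( intsize < 2 )
        return -EINVAL;

    /* The hardware IRQ number is taken 1:1 from the first cell. */
    *out_hwirq = intspec[0];

    /* The type is optional for the caller; keep only the sense bits. */
    if ( out_type )
        *out_type = intspec[1] & SKETCH_IRQ_TYPE_SENSE_MASK;

    return 0;
}

/* A type may be set once; any later request has to match it. */
static int sketch_irq_set_type(unsigned int *curr, unsigned int new_type)
{
    if ( *curr != SKETCH_IRQ_TYPE_INVALID && *curr != new_type )
        return -EBUSY;

    *curr = new_type;

    return 0;
}
```

The point of the second helper is that two devices sharing one interrupt line cannot silently disagree about its trigger type: the second caller gets -EBUSY instead.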
* [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (4 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 05/14] xen/riscv: introduce platform_get_irq() Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 7:50 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops Oleksii Kurochko
` (7 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Implement dt_processor_hartid() to get the hart ID of the given device
tree node, checking that the CPU is available and that the node has a
proper riscv,isa property.
As a helper, dt_get_hartid() is introduced to deal specifically with the
reg property of a CPU device node.
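As a rough illustration of what reading the reg property involves, here is a standalone sketch that folds big-endian 32-bit cells into one value and returns ~0UL when the property is missing or too short. The sketch_* names and the raw byte-buffer interface are assumptions made for this example; the patch itself relies on dt_get_property(), dt_n_addr_cells() and dt_read_number().

```c
#include <stddef.h>
#include <stdint.h>

/* Same "not found" marker the patch uses for a missing hartid. */
#define SKETCH_NO_HARTID (~0UL)

/* Fold 'cells' big-endian 32-bit cells into one number. */
static unsigned long sketch_read_number(const uint8_t *prop,
                                        unsigned int cells)
{
    unsigned long val = 0;

    for ( unsigned int i = 0; i < cells * 4; i++ )
        val = (val << 8) | prop[i];

    return val;
}

/*
 * Return the hartid stored in a raw "reg" property of 'len' bytes, given
 * the parent's #address-cells, or SKETCH_NO_HARTID if it can't be read.
 */
static unsigned long sketch_get_hartid(const uint8_t *reg, size_t len,
                                       unsigned int addr_cells)
{
    if ( !reg || !addr_cells || ((size_t)addr_cells * 4 > len) )
        return SKETCH_NO_HARTID;

    return sketch_read_number(reg, addr_cells);
}
```

Using ~0UL as the failure marker keeps the return type unsigned long, which matters because a hartid is as wide as the mhartid register.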
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- s/dt_processor_cpuid/dt_processor_hartid.
- s/dt_get_cpuid/dt_get_hartid.
- use ~0UL in dt_get_cpuid() and in the comment above it.
- change type for local variable ac in dt_get_cpuid() to unsigned int.
- Update the return errors for dt_processor_cpuid().
- Update the commit message + subject: s/cpuid/hartid.
---
Changes in V2:
- s/of_get_cpu_hwid()/dt_get_cpu_id().
- Update prototype of dt_get_cpu_hwid(), use pointer-to-const for cpun arg.
- Add empty line before last return in dt_get_cpu_hwid().
- s/riscv_of_processor_hartid/dt_processor_cpuid().
- Use pointer-to-const for node argument of dt_processor_cpuid().
- Use for hart_id unsigned long type as according to the spec for RV128
mhartid register will be 128 bit long.
- Update commit message and subject.
- use 'CPU' instead of 'HART'.
- Drop thread argument of dt_get_cpu_id() (of_get_cpu_hwid) as it is
expected to be always 0 according to RISC-V's DTS binding.
---
xen/arch/riscv/include/asm/smp.h | 4 ++
xen/arch/riscv/smpboot.c | 69 ++++++++++++++++++++++++++++++++
2 files changed, 73 insertions(+)
diff --git a/xen/arch/riscv/include/asm/smp.h b/xen/arch/riscv/include/asm/smp.h
index 5e170b57b3..eb58b6576b 100644
--- a/xen/arch/riscv/include/asm/smp.h
+++ b/xen/arch/riscv/include/asm/smp.h
@@ -26,6 +26,10 @@ static inline void set_cpuid_to_hartid(unsigned long cpuid,
void setup_tp(unsigned int cpuid);
+struct dt_device_node;
+int dt_processor_hartid(const struct dt_device_node *node,
+ unsigned long *hartid);
+
#endif
/*
diff --git a/xen/arch/riscv/smpboot.c b/xen/arch/riscv/smpboot.c
index 0f9c2cc54a..b8d18fc3ea 100644
--- a/xen/arch/riscv/smpboot.c
+++ b/xen/arch/riscv/smpboot.c
@@ -1,5 +1,8 @@
#include <xen/cpumask.h>
+#include <xen/device_tree.h>
+#include <xen/errno.h>
#include <xen/init.h>
+#include <xen/types.h>
#include <xen/sections.h>
#include <asm/current.h>
@@ -14,3 +17,69 @@ void __init smp_prepare_boot_cpu(void)
cpumask_set_cpu(0, &cpu_possible_map);
cpumask_set_cpu(0, &cpu_online_map);
}
+
+/**
+ * dt_get_hartid - Get the hartid from a CPU device node
+ *
+ * @cpun: CPU device node to read the hartid from
+ *
+ * Return: The hartid for the CPU node or ~0UL if not found.
+ */
+static unsigned long dt_get_hartid(const struct dt_device_node *cpun)
+{
+ const __be32 *cell;
+ unsigned int ac;
+ uint32_t len;
+
+ ac = dt_n_addr_cells(cpun);
+ cell = dt_get_property(cpun, "reg", &len);
+ if ( !cell || !ac || ((sizeof(*cell) * ac) > len) )
+ return ~0UL;
+
+ return dt_read_number(cell, ac);
+}
+
+/*
+ * Stores the hartid of the given device tree node in *hartid and returns 0,
+ * or a negative error code if the node isn't an enabled, valid RISC-V hart.
+ */
+int dt_processor_hartid(const struct dt_device_node *node,
+ unsigned long *hartid)
+{
+ const char *isa;
+ int ret;
+
+ if ( !dt_device_is_compatible(node, "riscv") )
+ {
+ printk("Found incompatible CPU\n");
+ return -ENODEV;
+ }
+
+ *hartid = dt_get_hartid(node);
+ if ( *hartid == ~0UL )
+ {
+ printk("Found CPU without CPU ID\n");
+ return -ENODATA;
+ }
+
+ if ( !dt_device_is_available(node) )
+ {
+ printk("CPU with hartid=%lu is not available\n", *hartid);
+ return -ENODEV;
+ }
+
+ if ( (ret = dt_property_read_string(node, "riscv,isa", &isa)) )
+ {
+ printk("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hartid);
+ return ret;
+ }
+
+ if ( isa[0] != 'r' || isa[1] != 'v' )
+ {
+ printk("CPU with hartid=%lu has an invalid ISA of \"%s\"\n", *hartid,
+ isa);
+ return -EINVAL;
+ }
+
+ return 0;
+}
--
2.49.0
* [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops.
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (5 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 13:49 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 08/14] xen/riscv: imsic_init() implementation Oleksii Kurochko
` (6 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
Introduce the intc_hw_operations structure to encapsulate interrupt
controller-specific data and operations. This structure includes:
- A pointer to interrupt controller information (`intc_info`)
- Callbacks to initialize the controller and set IRQ type/priority
- A reference to an interrupt controller descriptor (`host_irq_type`)
- The number of interrupt controller IRQs
Add function register_intc_ops() to register the structure mentioned above.
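The registration pattern described above can be sketched in standalone C as follows. Everything named sketch_* is hypothetical and the structure is trimmed down to a few members; it is meant only to show how a controller driver hands its callbacks to common code, not to mirror the exact layout of intc_hw_operations.

```c
#include <stddef.h>

/* Trimmed-down, hypothetical counterpart of intc_hw_operations. */
struct sketch_intc_ops {
    const char *name;                 /* identifies the controller */
    int (*init)(void);                /* initialize controller and boot CPU */
    void (*set_irq_priority)(unsigned int irq, unsigned int priority);
};

/* Written once during boot, only read afterwards. */
static const struct sketch_intc_ops *sketch_ops;

static void sketch_register_intc_ops(const struct sketch_intc_ops *ops)
{
    sketch_ops = ops;
}

/* Common code dispatches through whatever the driver registered. */
static int sketch_intc_init(void)
{
    if ( !sketch_ops || !sketch_ops->init )
        return -1;

    return sketch_ops->init();
}

/* Example driver side: a dummy controller that counts init calls. */
static int sketch_aplic_init_calls;

static int sketch_aplic_init(void)
{
    sketch_aplic_init_calls++;

    return 0;
}

static const struct sketch_intc_ops sketch_aplic_ops = {
    .name = "aplic",
    .init = sketch_aplic_init,
};
```

Keeping the pointer const and assigning it exactly once is what makes an attribute like __ro_after_init applicable in the real code.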
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Drop inclusion of xen/irq.h in asm/intc.h as forward declaration is enough
for types used in asm/intc.h.
- Drop forward declaration for dt_device_node and hw_irq_controller.
- Declare intc_hw_ops as:
const struct intc_hw_operations * __ro_after_init intc_hw_ops;
---
Changes in V2:
- Declare host_irq_type member of intc_hw_operations as pointer-to-const.
- Moved up forward declaration of irq_desc.
- Use attribute __init for register_intc_ops().
- Use attribute __ro_after_init for intc_hw_ops variable.
- Update prototype of register_intc_ops() because of the change mentioned in
the previous point.
---
xen/arch/riscv/include/asm/intc.h | 19 +++++++++++++++++++
xen/arch/riscv/intc.c | 9 +++++++++
2 files changed, 28 insertions(+)
diff --git a/xen/arch/riscv/include/asm/intc.h b/xen/arch/riscv/include/asm/intc.h
index 52ba196d87..860737f965 100644
--- a/xen/arch/riscv/include/asm/intc.h
+++ b/xen/arch/riscv/include/asm/intc.h
@@ -12,11 +12,30 @@ enum intc_version {
INTC_APLIC,
};
+struct irq_desc;
+
struct intc_info {
enum intc_version hw_version;
const struct dt_device_node *node;
};
+struct intc_hw_operations {
+ /* Hold intc hw information */
+ const struct intc_info *info;
+ /* Initialize the intc and the boot CPU */
+ int (*init)(void);
+
+ /* hw_irq_controller to enable/disable/eoi host irq */
+ const struct hw_interrupt_type *host_irq_type;
+
+ /* Set IRQ type */
+ void (*set_irq_type)(struct irq_desc *desc, unsigned int type);
+ /* Set IRQ priority */
+ void (*set_irq_priority)(struct irq_desc *desc, unsigned int priority);
+};
+
void intc_preinit(void);
+void register_intc_ops(const struct intc_hw_operations *ops);
+
#endif /* ASM__RISCV__INTERRUPT_CONTOLLER_H */
diff --git a/xen/arch/riscv/intc.c b/xen/arch/riscv/intc.c
index 4061a3c457..1ecd651bf3 100644
--- a/xen/arch/riscv/intc.c
+++ b/xen/arch/riscv/intc.c
@@ -5,6 +5,15 @@
#include <xen/init.h>
#include <xen/lib.h>
+#include <asm/intc.h>
+
+static const struct intc_hw_operations *__ro_after_init intc_hw_ops;
+
+void __init register_intc_ops(const struct intc_hw_operations *ops)
+{
+ intc_hw_ops = ops;
+}
+
void __init intc_preinit(void)
{
if ( acpi_disabled )
--
2.49.0
* [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (6 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 14:46 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 09/14] xen/riscv: aplic_init() implementation Oleksii Kurochko
` (5 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
imsic_init() is introduced to parse device tree node, which has the following
bindings [2], and based on the parsed information update IMSIC configuration
which is stored in imsic_cfg.
The following helpers are introduced for imsic_init() usage:
- imsic_parse_node() parses IMSIC node from DTS
- imsic_get_parent_hartid() returns the hart ( CPU ) ID of the given device
tree node.
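One part of the parsed configuration is a base address that every MMIO register set is expected to share once the guest, hart and group index bits are masked off. A minimal standalone sketch of that masking, assuming 4 KiB IMSIC pages and using hypothetical sketch_* names, could look like this:

```c
#include <stdint.h>

/* Assumed IMSIC interrupt file page size: 4 KiB. */
#define SKETCH_IMSIC_PAGE_SHIFT 12

/*
 * Strip the per-guest, per-hart and per-group index bits from an MMIO
 * address so that only the part common to all register sets remains.
 */
static uint64_t sketch_imsic_common_base(uint64_t addr,
                                         unsigned int guest_bits,
                                         unsigned int hart_bits,
                                         unsigned int group_bits,
                                         unsigned int group_shift)
{
    /* Low bits: page offset, guest index and hart index. */
    addr &= ~((UINT64_C(1) << (guest_bits + hart_bits +
                               SKETCH_IMSIC_PAGE_SHIFT)) - 1);

    /* Group index field, which starts at group_shift. */
    addr &= ~(((UINT64_C(1) << group_bits) - 1) << group_shift);

    return addr;
}
```

For instance, with two hart index bits and no guest or group bits, 0x28001000 reduces to 0x28000000, so two register sets that differ only in the hart index compare equal.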
This patch is based on the code from [1].
Since Microchip originally developed imsic.{c,h}, an internal discussion with
them led to the decision to use the MIT license.
[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/0b1a94f2bc3bb1a81cd26bb75f0bf578f84cb4d4
[2] https://elixir.bootlin.com/linux/v6.12/source/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Drop year in imsic.h in copyrights.
- Correct indentation in imsic_parse_node() and imsic_init()
where for imsic_cfg.base_addr a mask is applied.
- Use unsigned int instead of uint32_t for local variable nr_parent_irqs,
index, nr_handlers in imsic_init().
- Fix a leak of earlier successful allocations when imsic_init()
returns with an error.
- Excess blank in printk() message: "%s: unable to parse MMIO regset %d\n".
- Introduce hartid_to_cpuid() and use it in the check:
if ( hartid_to_cpuid(cpuid) >= num_possible_cpus() )
in imsic_init().
- Use "%u" for unsigned int in printk(...).
- Fix for loop condition: nr_mmios -> "j < nr_mmios".
- [imsic_init()] Drop usage of j in nested loop. It is enough to have only
index.
- Change 0x%lx to %#lx for formatting of an address in printk().
- [imsic_init()] Rename local variable cpuid to hartid.
- s/imsic_get_parent_cpuid/imsic_get_parent_hartid, s/cpuid/hartid for an
imsic_get_parent_hartid()'s argument.
- Declare cpus member of struct imsic_mmios as cpumask_t.
- [imsic_init()] Allocate imsic_mmios.cpus by using of alloc_cpumask_var().
- [imsic_init()] Use cpumask_set_cpu() instead of bitmap_set().
- [imsic_parse_node()] add check for correctness of "interrupts-extended" property.
- [imsic_parse_node()] Use dt_node_name() or dt_full_node_name() instead of
dereference of struct dt_node.
- [imsic_parse_node()] Add cleanup label and update 'rc = XXX; goto cleanup'
instead of 'return rc' as now we have to cleanup dynamically allocated irq_range
array.
- Add comments above imsic_init() and imsic_parse_node().
- s/xen/arch/riscv/imsic.h/xen/arch/riscv/include/asm/imsic.h in the comment of
imsic.h.
---
Changes in V2:
- Drop years in copyrights.
- s/riscv_of_processor_hartid/dt_processor_cpuid.
- s/imsic_get_parent_hartid/imsic_get_parent_cpuid.
Rename argument hartid to cpuid.
Make node argument const.
Return res instead of -EINVAL for the failure case of dt_processor_cpuid().
Drop local variable hart and use cpuid argument instead.
Drop useless return res;
- imsic_parse_node() changes:
- Make node argument const.
- Check the return value of dt_property_read_u32() directly instead of
saving it to rc variable.
- Update tmp usage, use short form "-=".
- Update a check (imsic_cfg.nr_ids >= IMSIC_MAX_ID) to (imsic_cfg.nr_ids > IMSIC_MAX_ID)
as IMSIC_MAX_ID is changed to maximum valid value, not just the first out-of-range.
- Use `rc` to return value instead of explicitly use -EINVAL.
- Use do {} while() to find number of MMIO register sets.
- Set IMSIC_MAX_ID to 2047 (maximum possible IRQ number).
- imsic_init() changes:
- Use unsigned int in for's expression1.
- s/xfree/XFREE.
- Allocate msi and cpus array dynamically.
- Drop forward declaration before declaration of imsic_get_config() in asm/imsic.h
as it is not used as parameter type.
- Align declaration of imsic_init with definition.
- s/harts/cpus in imsic_mmios.
Also, change type from bool harts[NR_CPUS] to unsigned long *cpus.
- Allocate msi member of imsic_config dynamically to save some memory.
- Code style fixes.
- Update the commit message.
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/imsic.c | 354 +++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/imsic.h | 65 ++++++
xen/arch/riscv/include/asm/smp.h | 13 ++
4 files changed, 433 insertions(+)
create mode 100644 xen/arch/riscv/imsic.c
create mode 100644 xen/arch/riscv/include/asm/imsic.h
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index a1c145c506..e2b8aa42c8 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -2,6 +2,7 @@ obj-y += aplic.o
obj-y += cpufeature.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-y += entry.o
+obj-y += imsic.o
obj-y += intc.o
obj-y += irq.o
obj-y += mm.o
diff --git a/xen/arch/riscv/imsic.c b/xen/arch/riscv/imsic.c
new file mode 100644
index 0000000000..9f8b492e97
--- /dev/null
+++ b/xen/arch/riscv/imsic.c
@@ -0,0 +1,354 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/imsic.c
+ *
+ * RISC-V Incoming MSI Controller support
+ *
+ * (c) Microchip Technology Inc.
+ * (c) Vates
+ */
+
+#include <xen/bitops.h>
+#include <xen/const.h>
+#include <xen/cpumask.h>
+#include <xen/device_tree.h>
+#include <xen/errno.h>
+#include <xen/init.h>
+#include <xen/macros.h>
+#include <xen/smp.h>
+#include <xen/spinlock.h>
+#include <xen/xmalloc.h>
+
+#include <asm/imsic.h>
+
+static struct imsic_config imsic_cfg;
+
+/* Callers aren't expected to change imsic_cfg so return const. */
+const struct imsic_config *imsic_get_config(void)
+{
+ return &imsic_cfg;
+}
+
+static int __init imsic_get_parent_hartid(const struct dt_device_node *node,
+ unsigned int index,
+ unsigned long *hartid)
+{
+ int res;
+ struct dt_phandle_args args;
+
+ res = dt_parse_phandle_with_args(node, "interrupts-extended",
+ "#interrupt-cells", index, &args);
+ if ( !res )
+ res = dt_processor_hartid(args.np->parent, hartid);
+
+ return res;
+}
+
+/*
+ * Parses IMSIC DT node.
+ *
+ * Returns 0 if initialization is successful, a negative value on failure,
+ * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
+ * which should be ignored by the hypervisor.
+ */
+static int imsic_parse_node(const struct dt_device_node *node,
+ unsigned int *nr_parent_irqs)
+{
+ int rc;
+ unsigned int tmp;
+ paddr_t base_addr;
+ uint32_t *irq_range;
+
+ *nr_parent_irqs = dt_number_of_irq(node);
+ if ( !*nr_parent_irqs )
+ panic("%s: irq_num can't be 0. Check %s node\n", __func__,
+ dt_node_full_name(node));
+
+ irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
+ if ( !irq_range )
+ panic("%s: irq_range[] allocation failed\n", __func__);
+
+ if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
+ irq_range, *nr_parent_irqs * 2)) )
+ panic("%s: unable to find interrupts-extended in %s node: %d\n",
+ __func__, dt_node_full_name(node), rc);
+
+ if ( irq_range[1] == IRQ_M_EXT )
+ {
+ /* Machine mode imsic node, ignore it. */
+ rc = IRQ_M_EXT;
+ goto cleanup;
+ }
+
+ /* Check that interrupts-extended property is well-formed. */
+ for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
+ {
+ if ( irq_range[i + 1] != irq_range[1] )
+ panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
+ }
+
+ if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
+ &imsic_cfg.guest_index_bits) )
+ imsic_cfg.guest_index_bits = 0;
+ tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+ if ( tmp < imsic_cfg.guest_index_bits )
+ {
+ printk(XENLOG_ERR "%s: guest index bits too big\n",
+ dt_node_name(node));
+ rc = -ENOENT;
+ goto cleanup;
+ }
+
+ /* Find number of HART index bits */
+ if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
+ &imsic_cfg.hart_index_bits) )
+ {
+ /* Assume default value */
+ imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
+ if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
+ imsic_cfg.hart_index_bits++;
+ }
+ tmp -= imsic_cfg.guest_index_bits;
+ if ( tmp < imsic_cfg.hart_index_bits )
+ {
+ printk(XENLOG_ERR "%s: HART index bits too big\n",
+ dt_node_name(node));
+ rc = -ENOENT;
+ goto cleanup;
+ }
+
+ /* Find number of group index bits */
+ if ( !dt_property_read_u32(node, "riscv,group-index-bits",
+ &imsic_cfg.group_index_bits) )
+ imsic_cfg.group_index_bits = 0;
+ tmp -= imsic_cfg.hart_index_bits;
+ if ( tmp < imsic_cfg.group_index_bits )
+ {
+ printk(XENLOG_ERR "%s: group index bits too big\n",
+ dt_node_name(node));
+ rc = -ENOENT;
+ goto cleanup;
+ }
+
+ /* Find first bit position of group index */
+ tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
+ if ( !dt_property_read_u32(node, "riscv,group-index-shift",
+ &imsic_cfg.group_index_shift) )
+ imsic_cfg.group_index_shift = tmp;
+ if ( imsic_cfg.group_index_shift < tmp )
+ {
+ printk(XENLOG_ERR "%s: group index shift too small\n",
+ dt_node_name(node));
+ rc = -ENOENT;
+ goto cleanup;
+ }
+ tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
+ if ( tmp >= BITS_PER_LONG )
+ {
+ printk(XENLOG_ERR "%s: group index shift too big\n",
+ dt_node_name(node));
+ rc = -EINVAL;
+ goto cleanup;
+ }
+
+ /* Find number of interrupt identities */
+ if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
+ {
+ printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
+ node->name);
+ rc = -ENOENT;
+ goto cleanup;
+ }
+
+ if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
+ (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
+ ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )
+ {
+ printk(XENLOG_ERR "%s: invalid number of interrupt identities\n",
+ node->name);
+ rc = -EINVAL;
+ goto cleanup;
+ }
+
+ /* Compute base address */
+ imsic_cfg.nr_mmios = 0;
+ rc = dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "%s: first MMIO resource not found: %d\n",
+ dt_node_name(node), rc);
+ goto cleanup;
+ }
+
+ imsic_cfg.base_addr = base_addr;
+ imsic_cfg.base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+ imsic_cfg.hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+ imsic_cfg.base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+ imsic_cfg.group_index_shift);
+
+ /* Find number of MMIO register sets */
+ do {
+ imsic_cfg.nr_mmios++;
+ } while ( !dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL) );
+
+ cleanup:
+ xfree(irq_range);
+
+ return rc;
+}
+
+/*
+ * Initialize the imsic_cfg structure based on the IMSIC DT node.
+ *
+ * Returns 0 if initialization is successful, a negative value on failure,
+ * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
+ * which should be ignored by the hypervisor.
+ */
+int __init imsic_init(const struct dt_device_node *node)
+{
+ int rc;
+ unsigned long reloff, hartid;
+ unsigned int nr_parent_irqs, index, nr_handlers = 0;
+ paddr_t base_addr;
+ unsigned int nr_mmios;
+
+ /* Parse IMSIC node */
+ rc = imsic_parse_node(node, &nr_parent_irqs);
+ /*
+ * If machine mode imsic node => ignore it.
+ * If rc < 0 => parsing of IMSIC DT node failed.
+ */
+ if ( (rc == IRQ_M_EXT) || rc )
+ return rc;
+
+ nr_mmios = imsic_cfg.nr_mmios;
+
+ /* Allocate MMIO resource array */
+ imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
+ if ( !imsic_cfg.mmios )
+ {
+ rc = -ENOMEM;
+ goto imsic_init_err;
+ }
+
+ imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
+ if ( !imsic_cfg.msi )
+ {
+ rc = -ENOMEM;
+ goto imsic_init_err;
+ }
+
+ /* Check MMIO register sets */
+ for ( unsigned int i = 0; i < nr_mmios; i++ )
+ {
+ if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
+ {
+ rc = -ENOMEM;
+ goto imsic_init_err;
+ }
+
+ rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
+ &imsic_cfg.mmios[i].size);
+ if ( rc )
+ {
+ printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
+ node->name, i);
+ goto imsic_init_err;
+ }
+
+ base_addr = imsic_cfg.mmios[i].base_addr;
+ base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+ imsic_cfg.hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+ base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+ imsic_cfg.group_index_shift);
+ if ( base_addr != imsic_cfg.base_addr )
+ {
+ rc = -EINVAL;
+ printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
+ node->name, i);
+ goto imsic_init_err;
+ }
+ }
+
+ /* Configure handlers for target CPUs */
+ for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
+ {
+ unsigned long xen_cpuid;
+
+ rc = imsic_get_parent_hartid(node, i, &hartid);
+ if ( rc )
+ {
+ printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
+ node->name, i);
+ continue;
+ }
+
+ xen_cpuid = hartid_to_cpuid(hartid);
+
+ if ( xen_cpuid >= num_possible_cpus() )
+ {
+ printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
+ node->name, hartid, i);
+ continue;
+ }
+
+ /* Find MMIO location of MSI page */
+ reloff = i * BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ;
+ for ( index = 0; index < nr_mmios; index++ )
+ {
+ if ( reloff < imsic_cfg.mmios[index].size )
+ break;
+
+ /*
+ * MMIO region size may not be aligned to
+ * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+ * if holes are present.
+ */
+ reloff -= ROUNDUP(imsic_cfg.mmios[index].size,
+ BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ);
+ }
+
+ if ( index == nr_mmios )
+ {
+ printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
+ node->name, i);
+ continue;
+ }
+
+ if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )
+ {
+ printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
+ node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
+ imsic_cfg.msi[xen_cpuid].offset = 0;
+ imsic_cfg.msi[xen_cpuid].base_addr = 0;
+ continue;
+ }
+
+ cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
+
+ imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
+ imsic_cfg.msi[xen_cpuid].offset = reloff;
+
+ nr_handlers++;
+ }
+
+ if ( !nr_handlers )
+ {
+ printk(XENLOG_ERR "%s: No CPU handlers found\n", node->name);
+ rc = -ENODEV;
+ goto imsic_init_err;
+ }
+
+ return 0;
+
+ imsic_init_err:
+ for ( unsigned int i = 0; i < nr_mmios; i++ )
+ free_cpumask_var(imsic_cfg.mmios[i].cpus);
+ XFREE(imsic_cfg.mmios);
+ XFREE(imsic_cfg.msi);
+
+ return rc;
+}
diff --git a/xen/arch/riscv/include/asm/imsic.h b/xen/arch/riscv/include/asm/imsic.h
new file mode 100644
index 0000000000..0d17881884
--- /dev/null
+++ b/xen/arch/riscv/include/asm/imsic.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/include/asm/imsic.h
+ *
+ * RISC-V Incoming MSI Controller support
+ *
+ * (c) Microchip Technology Inc.
+ */
+
+#ifndef ASM__RISCV__IMSIC_H
+#define ASM__RISCV__IMSIC_H
+
+#include <xen/types.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT 12
+#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
+
+#define IMSIC_MIN_ID 63
+#define IMSIC_MAX_ID 2047
+
+struct imsic_msi {
+ paddr_t base_addr;
+ unsigned long offset;
+};
+
+struct imsic_mmios {
+ paddr_t base_addr;
+ unsigned long size;
+ cpumask_var_t cpus;
+};
+
+struct imsic_config {
+ /* Base address */
+ paddr_t base_addr;
+
+ /* Bits representing Guest index, HART index, and Group index */
+ unsigned int guest_index_bits;
+ unsigned int hart_index_bits;
+ unsigned int group_index_bits;
+ unsigned int group_index_shift;
+
+ /* IMSIC phandle */
+ unsigned int phandle;
+
+ /* Number of parent irq */
+ unsigned int nr_parent_irqs;
+
+ /* Number of interrupt identities */
+ unsigned int nr_ids;
+
+ /* MMIOs */
+ unsigned int nr_mmios;
+ struct imsic_mmios *mmios;
+
+ /* MSI */
+ struct imsic_msi *msi;
+};
+
+struct dt_device_node;
+int imsic_init(const struct dt_device_node *node);
+
+const struct imsic_config *imsic_get_config(void);
+
+#endif /* ASM__RISCV__IMSIC_H */
diff --git a/xen/arch/riscv/include/asm/smp.h b/xen/arch/riscv/include/asm/smp.h
index eb58b6576b..33ee5ec06b 100644
--- a/xen/arch/riscv/include/asm/smp.h
+++ b/xen/arch/riscv/include/asm/smp.h
@@ -3,6 +3,7 @@
#define ASM__RISCV__SMP_H
#include <xen/cpumask.h>
+#include <xen/macros.h>
#include <xen/percpu.h>
#include <asm/current.h>
@@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
return pcpu_info[cpuid].hart_id;
}
+static inline unsigned long hartid_to_cpuid(unsigned long hartid)
+{
+ for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
+ {
+ if ( hartid == cpuid_to_hartid(cpuid) )
+ return cpuid;
+ }
+
+ /* hartid isn't valid for some reason */
+ return NR_CPUS;
+}
+
static inline void set_cpuid_to_hartid(unsigned long cpuid,
unsigned long hartid)
{
--
2.49.0
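The "Find MMIO location of MSI page" loop in imsic_init() can be easier to follow in isolation. The sketch below walks a list of MMIO region sizes to find which region holds the interrupt file of the i-th parent interrupt and at what offset. The sketch_* names and the plain array of sizes are assumptions made for this example; the patch keeps the sizes in imsic_cfg.mmios[].

```c
#define SKETCH_IMSIC_PAGE_SZ 4096UL   /* assumed 4 KiB interrupt file page */

/* Round 'val' up to the next multiple of 'align' (align must be non-zero). */
static unsigned long sketch_roundup(unsigned long val, unsigned long align)
{
    return ((val + align - 1) / align) * align;
}

/*
 * Return the index of the MMIO region containing the MSI page of parent
 * interrupt 'irq_index' and store the offset inside that region in
 * '*offset'. Return -1 if no region covers it.
 */
static int sketch_find_msi_page(const unsigned long *sizes,
                                unsigned int nr_mmios,
                                unsigned int guest_bits,
                                unsigned int irq_index,
                                unsigned long *offset)
{
    /* Each hart owns one page per guest file, 2^guest_bits pages in all. */
    unsigned long stride = (1UL << guest_bits) * SKETCH_IMSIC_PAGE_SZ;
    unsigned long reloff = irq_index * stride;

    for ( unsigned int i = 0; i < nr_mmios; i++ )
    {
        if ( reloff < sizes[i] )
        {
            *offset = reloff;
            return (int)i;
        }

        /* Region sizes may not be stride-aligned if holes are present. */
        reloff -= sketch_roundup(sizes[i], stride);
    }

    return -1;
}
```

Because reloff is always a multiple of the stride, it is never smaller than the rounded-up size of a region that has just been skipped, so the subtraction cannot wrap.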
* [PATCH v3 09/14] xen/riscv: aplic_init() implementation
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (7 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 08/14] xen/riscv: imsic_init() implementation Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 15:26 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers Oleksii Kurochko
` (4 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
aplic_init() does the following:
- Checks that an IMSIC is present in the device tree (by checking the
msi-parent property of the APLIC node), as the current implementation of
AIA supports only the MSI method.
- Initializes the IMSIC based on the IMSIC device tree node.
- Reads the APLIC's paddr start/end and size.
- Maps aplic.regs.
- Sets up the initial state of APLIC interrupts (disables all interrupts,
sets the interrupt type and default priority, configures APLIC domaincfg)
by calling aplic_init_hw_interrupts().
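The last step in the list above can be sketched against a mock register block in standalone C. Note the assumptions: the structure below is a simplified, hypothetical layout with only eight sources rather than the real APLIC register map, and the sketch_* names do not exist in Xen. The two domaincfg bit positions (IE at bit 8, DM at bit 2) follow the AIA specification.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_SOURCES      8          /* hypothetical, real max is 1023 */
#define SKETCH_DEFAULT_PRIORITY 1
#define SKETCH_DOMAINCFG_IE     (1u << 8)  /* interrupt enable */
#define SKETCH_DOMAINCFG_DM     (1u << 2)  /* delivery mode: MSI */

/* Simplified mock register block, NOT the real APLIC memory layout. */
struct sketch_aplic_regs {
    uint32_t domaincfg;
    uint32_t sourcecfg[SKETCH_NUM_SOURCES];
    uint32_t clrie[1];
    uint32_t target[SKETCH_NUM_SOURCES];
};

static void sketch_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
}

/* 'num_irqs' must not exceed SKETCH_NUM_SOURCES. */
static void sketch_aplic_init_hw(volatile struct sketch_aplic_regs *regs,
                                 unsigned int num_irqs)
{
    unsigned int i;

    /* Disable all interrupts: each written 1 bit clears one enable bit. */
    for ( i = 0; i < sizeof(regs->clrie) / sizeof(regs->clrie[0]); i++ )
        sketch_writel(~0U, &regs->clrie[i]);

    for ( i = 0; i < num_irqs; i++ )
    {
        /* Mark the source as inactive until a type is configured. */
        sketch_writel(0, &regs->sourcecfg[i]);
        /* The priority field must not be zero, so use a default of 1. */
        sketch_writel(SKETCH_DEFAULT_PRIORITY, &regs->target[i]);
    }

    /* Enable the domain and select MSI delivery. */
    sketch_writel(SKETCH_DOMAINCFG_IE | SKETCH_DOMAINCFG_DM,
                  &regs->domaincfg);
}
```

Writing domaincfg last means no interrupt can be forwarded while the per-source registers are still being brought to a known state.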
aplic_init() is based on the code from [1] and [2].
Since Microchip originally developed aplic.c, an internal discussion with
them led to the decision to use the MIT license.
[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/7cfb4bd4748ca268142497ac5c327d2766fb342d
[2] https://gitlab.com/xen-project/people/olkur/xen/-/commit/392a531bfad39bf4656ce8128e004b241b8b3f3e
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Correct the comment on top of aplic-priv.h:
xen/arch/riscv/aplic.h -> .../aplic-priv.h
- Add __iomem for regs member of aplic_priv structure.
- [aplic_init_hw_interrupts] Use ~0U instead of -1U in aplic_init_hw_interrupts()
to disable all interrupts.
- [aplic_init_hw_interrupts] Start 'i' (for-cycle variable) from 0, not from 1.
- [aplic_init()] Declare imsic_node as pointer-to-const.
- Use decimal for arrays in struct aplic_regs.
- [aplic_init()] Check that aplic_info.num_irqs are less than 1023.
- [aplic_init()] Drop out check of IMSIC's node interrupt-extended property
from aplic_init().
---
Changes in V2:
- use __ro_after_init for aplic_ops.
- s/nr_irqs/num_irqs.
- s/dt_processor_hartid/dt_processor_cpuid.
- Drop confusing comment in aplic_init_hw_interrupts().
- Code style fixes.
- Drop years for Copyright.
- Revert changes which drop nr_irq define from asm/irq.h,
it shouldn't be, at least, part of this patch.
- Drop paddr_end from struct aplic_regs. It is enough to have pair
(paddr_start, size).
- Make struct aplic_priv of asm/aplic.h private by moving it to
riscv/aplic-priv.h.
- Add the comment above the initialization of APLIC's target register.
- use writel() to access APLIC's registers.
- use 'unsigned int' for local variable i in aplic_init_hw_interrupts.
- Add the check that all modes in interrupts-extended property of
imsic node are equal. And drop rc != EOVERFLOW when interrupts-extended
property is read.
- Add cf_check to aplic_init().
- Fix a cycle of clrie register initialization in aplic_init_hw_interrupts().
Previous implementation leads to out-of-boundary.
- Declare member num_irqs in struct intc_info as it is used by APLIC code.
---
xen/arch/riscv/aplic-priv.h | 34 +++++++++++
xen/arch/riscv/aplic.c | 98 ++++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/aplic.h | 64 +++++++++++++++++++
xen/arch/riscv/include/asm/intc.h | 3 +
4 files changed, 199 insertions(+)
create mode 100644 xen/arch/riscv/aplic-priv.h
create mode 100644 xen/arch/riscv/include/asm/aplic.h
diff --git a/xen/arch/riscv/aplic-priv.h b/xen/arch/riscv/aplic-priv.h
new file mode 100644
index 0000000000..e5f9f5fd90
--- /dev/null
+++ b/xen/arch/riscv/aplic-priv.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/aplic-priv.h
+ *
+ * Private part of aplic.h header.
+ *
+ * RISC-V Advanced Platform-Level Interrupt Controller support
+ *
+ * Copyright (c) Microchip.
+ * Copyright (c) Vates.
+ */
+
+#ifndef ASM__RISCV_PRIV_APLIC_H
+#define ASM__RISCV_PRIV_APLIC_H
+
+#include <xen/types.h>
+
+#include <asm/aplic.h>
+#include <asm/imsic.h>
+
+struct aplic_priv {
+ /* base physical address and size */
+ paddr_t paddr_start;
+ size_t size;
+
+ /* registers */
+ volatile struct aplic_regs __iomem *regs;
+
+ /* imsic configuration */
+ const struct imsic_config *imsic_cfg;
+};
+
+#endif /* ASM__RISCV_PRIV_APLIC_H */
diff --git a/xen/arch/riscv/aplic.c b/xen/arch/riscv/aplic.c
index 10ae81f7ac..069d157723 100644
--- a/xen/arch/riscv/aplic.c
+++ b/xen/arch/riscv/aplic.c
@@ -9,19 +9,113 @@
* Copyright (c) 2024-2025 Vates
*/
+#include <xen/device_tree.h>
#include <xen/errno.h>
#include <xen/init.h>
#include <xen/irq.h>
+#include <xen/mm.h>
#include <xen/sections.h>
#include <xen/types.h>
+#include <xen/vmap.h>
+
+#include "aplic-priv.h"
#include <asm/device.h>
+#include <asm/imsic.h>
#include <asm/intc.h>
+#include <asm/riscv_encoding.h>
+
+#define APLIC_DEFAULT_PRIORITY 1
+
+/* The maximum number of wired interrupt sources supported by APLIC domain. */
+#define APLIC_MAX_NUM_WIRED_IRQ_SOURCES 1023
+
+static struct aplic_priv aplic;
static struct intc_info __ro_after_init aplic_info = {
.hw_version = INTC_APLIC,
};
+static void __init aplic_init_hw_interrupts(void)
+{
+ unsigned int i;
+
+ /* Disable all interrupts */
+ for ( i = 0; i < ARRAY_SIZE(aplic.regs->clrie); i++ )
+ writel(~0U, &aplic.regs->clrie[i]);
+
+ /* Set interrupt type and default priority for all interrupts */
+ for ( i = 0; i < aplic_info.num_irqs; i++ )
+ {
+ writel(0, &aplic.regs->sourcecfg[i]);
+ /*
+ * Low bits of target register contains Interrupt Priority bits which
+ * can't be zero according to AIA spec.
+ * Thereby they are initialized to APLIC_DEFAULT_PRIORITY.
+ */
+ writel(APLIC_DEFAULT_PRIORITY, &aplic.regs->target[i]);
+ }
+
+ writel(APLIC_DOMAINCFG_IE | APLIC_DOMAINCFG_DM, &aplic.regs->domaincfg);
+}
+
+static int __init cf_check aplic_init(void)
+{
+ dt_phandle imsic_phandle;
+ const __be32 *prop;
+ uint64_t size, paddr;
+ const struct dt_device_node *imsic_node;
+ const struct dt_device_node *node = aplic_info.node;
+ int rc;
+
+ /* Check for associated imsic node */
+ if ( !dt_property_read_u32(node, "msi-parent", &imsic_phandle) )
+ panic("%s: IDC mode not supported\n", node->full_name);
+
+ imsic_node = dt_find_node_by_phandle(imsic_phandle);
+ if ( !imsic_node )
+ panic("%s: unable to find IMSIC node\n", node->full_name);
+
+ rc = imsic_init(imsic_node);
+ if ( rc == IRQ_M_EXT )
+ /* Machine mode imsic node, ignore this aplic node */
+ return 0;
+ else if ( rc )
+ panic("%s: Failed to initialize IMSIC\n", node->full_name);
+
+ /* Find out number of interrupt sources */
+ if ( !dt_property_read_u32(node, "riscv,num-sources",
+ &aplic_info.num_irqs) )
+ panic("%s: failed to get number of interrupt sources\n",
+ node->full_name);
+
+ if ( aplic_info.num_irqs > APLIC_MAX_NUM_WIRED_IRQ_SOURCES )
+ panic("%s: too big number of riscv,num-sources: %u\n",
+ __func__, aplic_info.num_irqs);
+
+ prop = dt_get_property(node, "reg", NULL);
+ dt_get_range(&prop, node, &paddr, &size);
+ if ( !paddr )
+ panic("%s: first MMIO resource not found\n", node->full_name);
+
+ aplic.paddr_start = paddr;
+ aplic.size = size;
+
+ aplic.regs = ioremap(paddr, size);
+ if ( !aplic.regs )
+ panic("%s: unable to map\n", node->full_name);
+
+ /* Setup initial state APLIC interrupts */
+ aplic_init_hw_interrupts();
+
+ return 0;
+}
+
+static struct intc_hw_operations __ro_after_init aplic_ops = {
+ .info = &aplic_info,
+ .init = aplic_init,
+};
+
static int cf_check aplic_irq_xlate(const uint32_t *intspec,
unsigned int intsize,
unsigned int *out_hwirq,
@@ -53,8 +147,12 @@ static int __init aplic_preinit(struct dt_device_node *node, const void *dat)
aplic_info.node = node;
+ aplic.imsic_cfg = imsic_get_config();
+
dt_irq_xlate = aplic_irq_xlate;
+ register_intc_ops(&aplic_ops);
+
return 0;
}
diff --git a/xen/arch/riscv/include/asm/aplic.h b/xen/arch/riscv/include/asm/aplic.h
new file mode 100644
index 0000000000..fef5f90a61
--- /dev/null
+++ b/xen/arch/riscv/include/asm/aplic.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/include/asm/aplic.h
+ *
+ * RISC-V Advanced Platform-Level Interrupt Controller support
+ *
+ * Copyright (c) Microchip.
+ */
+
+#ifndef ASM__RISCV__APLIC_H
+#define ASM__RISCV__APLIC_H
+
+#include <xen/types.h>
+
+#include <asm/imsic.h>
+
+#define APLIC_DOMAINCFG_IE BIT(8, UL)
+#define APLIC_DOMAINCFG_DM BIT(2, UL)
+
+struct aplic_regs {
+ uint32_t domaincfg;
+ uint32_t sourcecfg[1023];
+ uint8_t _reserved1[3008];
+
+ uint32_t mmsiaddrcfg;
+ uint32_t mmsiaddrcfgh;
+ uint32_t smsiaddrcfg;
+ uint32_t smsiaddrcfgh;
+ uint8_t _reserved2[48];
+
+ uint32_t setip[32];
+ uint8_t _reserved3[92];
+
+ uint32_t setipnum;
+ uint8_t _reserved4[32];
+
+ uint32_t in_clrip[32];
+ uint8_t _reserved5[92];
+
+ uint32_t clripnum;
+ uint8_t _reserved6[32];
+
+ uint32_t setie[32];
+ uint8_t _reserved7[92];
+
+ uint32_t setienum;
+ uint8_t _reserved8[32];
+
+ uint32_t clrie[32];
+ uint8_t _reserved9[92];
+
+ uint32_t clrienum;
+ uint8_t _reserved10[32];
+
+ uint32_t setipnum_le;
+ uint32_t setipnum_be;
+ uint8_t _reserved11[4088];
+
+ uint32_t genmsi;
+ uint32_t target[1023];
+};
+
+#endif /* ASM__RISCV__APLIC_H */
diff --git a/xen/arch/riscv/include/asm/intc.h b/xen/arch/riscv/include/asm/intc.h
index 860737f965..f3bbd281fa 100644
--- a/xen/arch/riscv/include/asm/intc.h
+++ b/xen/arch/riscv/include/asm/intc.h
@@ -17,6 +17,9 @@ struct irq_desc;
struct intc_info {
enum intc_version hw_version;
const struct dt_device_node *node;
+
+ /* number of irqs */
+ unsigned int num_irqs;
};
struct intc_hw_operations {
--
2.49.0
* [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (8 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 09/14] xen/riscv: aplic_init() implementation Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 15:32 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations Oleksii Kurochko
` (3 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
Introduce intc_init() to initialize the interrupt controller using the
registered hardware ops.
Also add intc_route_irq_to_xen() to route IRQs to Xen, with support for
setting IRQ type and priority via new internal helpers intc_set_irq_type()
and intc_set_irq_priority().
Call intc_init() to do basic initialization steps for APLIC and IMSIC.
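The dispatch pattern described above can be sketched in isolation as follows. This is only an illustration, not the Xen code: every name ending in _sketch and the demo_* controller are hypothetical stand-ins, and the real helpers additionally assert on desc->lock and IRQ_DISABLED.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Xen's struct irq_desc. */
struct irq_desc_sketch {
    unsigned int irq;
    unsigned int type;
    unsigned int priority;
};

/* Hypothetical, simplified stand-in for struct intc_hw_operations. */
struct intc_ops_sketch {
    int (*init)(void);
    void (*set_irq_type)(struct irq_desc_sketch *desc, unsigned int type);
    void (*set_irq_priority)(struct irq_desc_sketch *desc, unsigned int prio);
};

/* Set once by the controller driver during pre-initialization. */
static const struct intc_ops_sketch *registered_ops;

static void register_ops_sketch(const struct intc_ops_sketch *ops)
{
    registered_ops = ops;
}

/* The init callback is mandatory; a non-zero result means failure. */
static int init_sketch(void)
{
    assert(registered_ops && registered_ops->init);
    return registered_ops->init();
}

/* Type and priority callbacks are optional, so check them before calling. */
static void route_irq_sketch(struct irq_desc_sketch *desc,
                             unsigned int type, unsigned int prio)
{
    assert(registered_ops);

    if ( registered_ops->set_irq_type )
        registered_ops->set_irq_type(desc, type);

    if ( registered_ops->set_irq_priority )
        registered_ops->set_irq_priority(desc, prio);
}

/* A demo controller that implements only init and set_irq_type. */
static int demo_init(void)
{
    return 0;
}

static void demo_set_type(struct irq_desc_sketch *desc, unsigned int type)
{
    desc->type = type;
}

static const struct intc_ops_sketch demo_ops = {
    .init = demo_init,
    .set_irq_type = demo_set_type,
    /* .set_irq_priority is intentionally left NULL. */
};
```

With demo_ops registered, routing an IRQ updates the type through the callback and silently skips the missing priority callback, which mirrors the NULL checks kept in intc_set_irq_type() and intc_set_irq_priority().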
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Drop ASSERT(intc_hw_ops) in intc_init().
- Drop ASSERT(intc_hw_ops) in intc_set_irq_type() and
intc_set_irq_priority(), as intc_init() is called first, so if it
doesn't crash then intc_hw_ops is registered.
---
Changes in V2:
- This patch was part of "xen/riscv: Introduce intc_hw_operations abstraction"
and was split out so that patch "xen/riscv: initialize interrupt controller"
could be merged into the current patch (where the intc_init() call is actually introduced).
- Add checks that the callbacks aren't NULL in intc_set_irq_priority()
and intc_set_irq_type().
- add num_irqs member to struct intc_info as it is used now in
intc_route_irq_to_xen().
- Add ASSERT(spin_is_locked(&desc->lock)) to intc_set_irq_priority() in
case this function is called outside intc_route_irq_to_xen().
---
xen/arch/riscv/include/asm/intc.h | 4 +++
xen/arch/riscv/intc.c | 41 +++++++++++++++++++++++++++++++
xen/arch/riscv/setup.c | 2 ++
3 files changed, 47 insertions(+)
diff --git a/xen/arch/riscv/include/asm/intc.h b/xen/arch/riscv/include/asm/intc.h
index f3bbd281fa..1a88505518 100644
--- a/xen/arch/riscv/include/asm/intc.h
+++ b/xen/arch/riscv/include/asm/intc.h
@@ -41,4 +41,8 @@ void intc_preinit(void);
void register_intc_ops(const struct intc_hw_operations *ops);
+void intc_init(void);
+
+void intc_route_irq_to_xen(struct irq_desc *desc, unsigned int priority);
+
#endif /* ASM__RISCV__INTERRUPT_CONTOLLER_H */
diff --git a/xen/arch/riscv/intc.c b/xen/arch/riscv/intc.c
index 1ecd651bf3..f2823267a9 100644
--- a/xen/arch/riscv/intc.c
+++ b/xen/arch/riscv/intc.c
@@ -1,9 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <xen/acpi.h>
+#include <xen/bug.h>
#include <xen/device_tree.h>
#include <xen/init.h>
+#include <xen/irq.h>
#include <xen/lib.h>
+#include <xen/spinlock.h>
#include <asm/intc.h>
@@ -21,3 +24,41 @@ void __init intc_preinit(void)
else
panic("ACPI interrupt controller preinit() isn't implemented\n");
}
+
+void __init intc_init(void)
+{
+ if ( intc_hw_ops->init() )
+ panic("Failed to initialize the interrupt controller drivers\n");
+}
+
+/* desc->irq needs to be disabled before calling this function */
+static void intc_set_irq_type(struct irq_desc *desc, unsigned int type)
+{
+ ASSERT(desc->status & IRQ_DISABLED);
+ ASSERT(spin_is_locked(&desc->lock));
+ ASSERT(type != IRQ_TYPE_INVALID);
+
+ if ( intc_hw_ops->set_irq_type )
+ intc_hw_ops->set_irq_type(desc, type);
+}
+
+static void intc_set_irq_priority(struct irq_desc *desc, unsigned int priority)
+{
+ ASSERT(spin_is_locked(&desc->lock));
+
+ if ( intc_hw_ops->set_irq_priority )
+ intc_hw_ops->set_irq_priority(desc, priority);
+}
+
+void intc_route_irq_to_xen(struct irq_desc *desc, unsigned int priority)
+{
+ ASSERT(desc->status & IRQ_DISABLED);
+ ASSERT(spin_is_locked(&desc->lock));
+ /* Can't route interrupts that don't exist */
+ ASSERT(intc_hw_ops && desc->irq < intc_hw_ops->info->num_irqs);
+
+ desc->handler = intc_hw_ops->host_irq_type;
+
+ intc_set_irq_type(desc, desc->arch.type);
+ intc_set_irq_priority(desc, priority);
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 8bcd19218d..0e7398159c 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -134,6 +134,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
intc_preinit();
+ intc_init();
+
printk("All set up\n");
machine_halt();
--
2.49.0
* [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (9 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-22 15:55 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 12/14] xen/riscv: add external interrupt handling for hypervisor mode Oleksii Kurochko
` (2 subsequent siblings)
13 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
Introduce an interrupt controller descriptor for the host APLIC to describe
the low-level hardware. It includes implementation of the following functions:
- aplic_irq_startup()
- aplic_irq_enable()
- aplic_irq_disable()
- aplic_set_irq_affinity()
As APLIC is used in MSI mode, interrupts must be enabled/disabled not
only in the APLIC but also in the IMSIC. For that purpose,
imsic_irq_{enable,disable}() are introduced and called from
aplic_irq_{enable,disable}().
For the purpose of aplic_set_irq_affinity() aplic_get_cpu_from_mask() is
introduced to get hart id.
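As a rough illustration of the IMSIC side of enabling an interrupt, the sketch below computes which eie register selector and which bit correspond to a single interrupt identity, following the arithmetic used by imsic_local_eix_update() in this patch. The names eie_slot, eie_slot_for() and the SKETCH_* constants are hypothetical; the values 0xC0 and 32 mirror IMSIC_EIE0 and IMSIC_EIPx_BITS from the patch, and xlen is passed explicitly instead of using __riscv_xlen so the sketch builds on any host.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EIE0      0xC0u /* mirrors IMSIC_EIE0 from the patch */
#define SKETCH_EIPX_BITS 32u   /* mirrors IMSIC_EIPx_BITS from the patch */

/* Indirect register selector plus the bit mask to set or clear in it. */
struct eie_slot {
    unsigned int isel;
    uint64_t mask;
};

/*
 * Identity 0 is never valid. Identity i maps to bit (i % xlen) of the
 * register covering it; with xlen == 64 only even-numbered eie registers
 * are used, hence the scaling by xlen / 32.
 */
static struct eie_slot eie_slot_for(unsigned int id, unsigned int xlen)
{
    struct eie_slot s;

    assert(id != 0);
    assert(xlen == 32 || xlen == 64);

    s.isel = SKETCH_EIE0 + (id / xlen) * (xlen / SKETCH_EIPX_BITS);
    s.mask = (uint64_t)1 << (id % xlen);

    return s;
}
```

For example, with xlen == 64 identity 5 lands in selector 0xC0 at bit 5, while identity 70 lands in selector 0xC2 at bit 6. No "- 1" adjustment is applied, unlike the indexing of APLIC's sourcecfg[].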
Also, introduce additional interrupt controller h/w operations and
host_irq_type for APLIC:
- aplic_xen_irq_type
Patch is based on the code from [1].
[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/7390e2365828b83e27ead56b03114a56e3699dd5
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Update the comment above the lock member of struct aplic_priv and struct imsic_config.
- Use spin_{un}lock() in aplic_irq_{enable,disable}() as it is IRQ-safe.
Also drop local variable 'flags'.
- Add ASSERT(spin_is_locked(&desc->lock)) to aplic_set_irq_affinity() and
aplic_set_irq_type().
- Use an initializer instead of spin_lock_init() for aplic.lock.
- Drop "(l)" in the comment in imsic_irq_enable() as it doesn't point to
anything.
- Use ASSERT(!local_irq_is_enabled()) + spin_lock() in imsic_irq_{enable,disable}().
- Use an initializer instead of spin_lock_init() for imsic_config.lock.
---
Changes in V2:
- Move imsic_ids_local_delivery() and connected to it parts to the current
patch to fix compilation issue. Also, add __init for
imsic_ids_local_delivery().
- Move introduction of aplic_set_irq_type() and aplic_set_irq_priority()
to patch [PATCH v1 12/14] xen/riscv: implement setup_irq() where they
really started to be used.
- Update the commit message.
- Drop is_used variable for imsic_cfg and use (aplic.regs->domaincfg & APLIC_DOMAINCFG_DM) instead.
- Use writel() to write to APLIC regs.
- Drop aplic_irq_shutdown() and use aplic_irq_disable explicitly.
- Drop local variable cpu in aplic_get_cpu_from_mask():
Use cpu_online_map instead of cpu_possible_map.
Rename possible_mask to mask.
- Code style fixes.
- Move spin_lock(&aplic.lock) down before write to the register in aplic_set_irq_affinity.
- Make aplic_host_irq_type const.
- imsic_local_eix_update() updates:
- move unsigned long isel, ireg; to inner loop.
- Drop unnecessary parentheses.
- Optimize inner loop of ireg's setting.
- Drop aplic_irq_ack() and aplic_host_irq_end() as they do nothing.
- Rename s/hwirq/irq.
- Add explanatory comment to imsic_irq_enable() about why IRQ isn't decremented
by 1, unlike for APLIC's sourcecfg.
- Use IMSIC_MMIO_PAGE_SHIFT instead of constant 12 in aplic_set_irq_affinity().
- s/aplic_host_irq_type/aplic_xen_irq_type
- Drop set/clear of IRQ_DISABLED bit in aplic_{enable,disable}() as guest will always
first request an interrupt and then only an interrupt will be enabled.
(for example, in Arm, the physical interrupts would be enabled when the
interrupt is initially routed. This could lead to problem because the guest
will usually boot with interrupt disabled.)
---
xen/arch/riscv/aplic-priv.h | 4 +
xen/arch/riscv/aplic.c | 113 +++++++++++++++++++++++++-
xen/arch/riscv/imsic.c | 122 ++++++++++++++++++++++++++++-
xen/arch/riscv/include/asm/aplic.h | 2 +
xen/arch/riscv/include/asm/imsic.h | 18 +++++
5 files changed, 257 insertions(+), 2 deletions(-)
diff --git a/xen/arch/riscv/aplic-priv.h b/xen/arch/riscv/aplic-priv.h
index e5f9f5fd90..cd7db2a9d2 100644
--- a/xen/arch/riscv/aplic-priv.h
+++ b/xen/arch/riscv/aplic-priv.h
@@ -14,6 +14,7 @@
#ifndef ASM__RISCV_PRIV_APLIC_H
#define ASM__RISCV_PRIV_APLIC_H
+#include <xen/spinlock.h>
#include <xen/types.h>
#include <asm/aplic.h>
@@ -27,6 +28,9 @@ struct aplic_priv {
/* registers */
volatile struct aplic_regs __iomem *regs;
+ /* lock to protect access to APLIC's registers */
+ spinlock_t lock;
+
/* imsic configuration */
const struct imsic_config *imsic_cfg;
};
diff --git a/xen/arch/riscv/aplic.c b/xen/arch/riscv/aplic.c
index 069d157723..f48937ccc6 100644
--- a/xen/arch/riscv/aplic.c
+++ b/xen/arch/riscv/aplic.c
@@ -15,6 +15,7 @@
#include <xen/irq.h>
#include <xen/mm.h>
#include <xen/sections.h>
+#include <xen/spinlock.h>
#include <xen/types.h>
#include <xen/vmap.h>
@@ -23,6 +24,7 @@
#include <asm/device.h>
#include <asm/imsic.h>
#include <asm/intc.h>
+#include <asm/io.h>
#include <asm/riscv_encoding.h>
#define APLIC_DEFAULT_PRIORITY 1
@@ -30,7 +32,9 @@
/* The maximum number of wired interrupt sources supported by APLIC domain. */
#define APLIC_MAX_NUM_WIRED_IRQ_SOURCES 1023
-static struct aplic_priv aplic;
+static struct aplic_priv aplic = {
+ .lock = SPIN_LOCK_UNLOCKED,
+};
static struct intc_info __ro_after_init aplic_info = {
.hw_version = INTC_APLIC,
@@ -111,9 +115,116 @@ static int __init cf_check aplic_init(void)
return 0;
}
+static void aplic_irq_enable(struct irq_desc *desc)
+{
+ /*
+ * TODO: Currently, APLIC is supported only with MSI interrupts.
+ * If APLIC without MSI interrupts is required in the future,
+ * this function will need to be updated accordingly.
+ */
+ ASSERT(readl(&aplic.regs->domaincfg) & APLIC_DOMAINCFG_DM);
+
+ ASSERT(spin_is_locked(&desc->lock));
+
+ spin_lock(&aplic.lock);
+
+ /* Enable interrupt in IMSIC */
+ imsic_irq_enable(desc->irq);
+
+ /* Enable interrupt in APLIC */
+ writel(desc->irq, &aplic.regs->setienum);
+
+ spin_unlock(&aplic.lock);
+}
+
+static void aplic_irq_disable(struct irq_desc *desc)
+{
+ /*
+ * TODO: Currently, APLIC is supported only with MSI interrupts.
+ * If APLIC without MSI interrupts is required in the future,
+ * this function will need to be updated accordingly.
+ */
+ ASSERT(readl(&aplic.regs->domaincfg) & APLIC_DOMAINCFG_DM);
+
+ ASSERT(spin_is_locked(&desc->lock));
+
+ spin_lock(&aplic.lock);
+
+ /* disable interrupt in APLIC */
+ writel(desc->irq, &aplic.regs->clrienum);
+
+ /* disable interrupt in IMSIC */
+ imsic_irq_disable(desc->irq);
+
+ spin_unlock(&aplic.lock);
+}
+
+static unsigned int aplic_irq_startup(struct irq_desc *desc)
+{
+ aplic_irq_enable(desc);
+
+ return 0;
+}
+
+static unsigned int aplic_get_cpu_from_mask(const cpumask_t *cpumask)
+{
+ cpumask_t mask;
+
+ cpumask_and(&mask, cpumask, &cpu_online_map);
+
+ return cpumask_any(&mask);
+}
+
+static void aplic_set_irq_affinity(struct irq_desc *desc, const cpumask_t *mask)
+{
+ unsigned int cpu;
+ uint64_t group_index, base_ppn;
+ uint32_t hhxw, lhxw, hhxs, value;
+ const struct imsic_config *imsic = aplic.imsic_cfg;
+
+ /*
+ * TODO: Currently, APLIC is supported only with MSI interrupts.
+ * If APLIC without MSI interrupts is required in the future,
+ * this function will need to be updated accordingly.
+ */
+ ASSERT(readl(&aplic.regs->domaincfg) & APLIC_DOMAINCFG_DM);
+
+ ASSERT(!cpumask_empty(mask));
+
+ ASSERT(spin_is_locked(&desc->lock));
+
+ cpu = cpuid_to_hartid(aplic_get_cpu_from_mask(mask));
+ hhxw = imsic->group_index_bits;
+ lhxw = imsic->hart_index_bits;
+ hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
+ base_ppn = imsic->msi[cpu].base_addr >> IMSIC_MMIO_PAGE_SHIFT;
+
+ /* Update hart and EIID in the target register */
+ group_index = (base_ppn >> (hhxs + IMSIC_MMIO_PAGE_SHIFT)) & (BIT(hhxw, UL) - 1);
+ value = desc->irq;
+ value |= cpu << APLIC_TARGET_HART_IDX_SHIFT;
+ value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT);
+
+ spin_lock(&aplic.lock);
+
+ writel(value, &aplic.regs->target[desc->irq - 1]);
+
+ spin_unlock(&aplic.lock);
+}
+
+static const hw_irq_controller aplic_xen_irq_type = {
+ .typename = "aplic",
+ .startup = aplic_irq_startup,
+ .shutdown = aplic_irq_disable,
+ .enable = aplic_irq_enable,
+ .disable = aplic_irq_disable,
+ .set_affinity = aplic_set_irq_affinity,
+};
+
static struct intc_hw_operations __ro_after_init aplic_ops = {
.info = &aplic_info,
.init = aplic_init,
+ .host_irq_type = &aplic_xen_irq_type,
};
static int cf_check aplic_irq_xlate(const uint32_t *intspec,
diff --git a/xen/arch/riscv/imsic.c b/xen/arch/riscv/imsic.c
index 9f8b492e97..2d139943c5 100644
--- a/xen/arch/riscv/imsic.c
+++ b/xen/arch/riscv/imsic.c
@@ -22,7 +22,124 @@
#include <asm/imsic.h>
-static struct imsic_config imsic_cfg;
+static struct imsic_config imsic_cfg = {
+ .lock = SPIN_LOCK_UNLOCKED,
+};
+
+#define IMSIC_DISABLE_EIDELIVERY 0
+#define IMSIC_ENABLE_EIDELIVERY 1
+#define IMSIC_DISABLE_EITHRESHOLD 1
+#define IMSIC_ENABLE_EITHRESHOLD 0
+
+#define imsic_csr_write(c, v) \
+do { \
+ csr_write(CSR_SISELECT, c); \
+ csr_write(CSR_SIREG, v); \
+} while (0)
+
+#define imsic_csr_set(c, v) \
+do { \
+ csr_write(CSR_SISELECT, c); \
+ csr_set(CSR_SIREG, v); \
+} while (0)
+
+#define imsic_csr_clear(c, v) \
+do { \
+ csr_write(CSR_SISELECT, c); \
+ csr_clear(CSR_SIREG, v); \
+} while (0)
+
+void __init imsic_ids_local_delivery(bool enable)
+{
+ if ( enable )
+ {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+ }
+ else
+ {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+ }
+}
+
+static void imsic_local_eix_update(unsigned long base_id, unsigned long num_id,
+ bool pend, bool val)
+{
+ unsigned long id = base_id, last_id = base_id + num_id;
+
+ while ( id < last_id )
+ {
+ unsigned long isel, ireg;
+ unsigned long start_id = id & (__riscv_xlen - 1);
+ unsigned long chunk = __riscv_xlen - start_id;
+ unsigned long count = (last_id - id < chunk) ? last_id - id : chunk;
+
+ isel = id / __riscv_xlen;
+ isel *= __riscv_xlen / IMSIC_EIPx_BITS;
+ isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
+
+ ireg = GENMASK(start_id + count - 1, start_id);
+
+ id += count;
+
+ if ( val )
+ imsic_csr_set(isel, ireg);
+ else
+ imsic_csr_clear(isel, ireg);
+ }
+}
+
+void imsic_irq_enable(unsigned int irq)
+{
+ /*
+ * The only caller of imsic_irq_enable() is aplic_irq_enable(), which
+ * already runs with IRQs disabled. Therefore, there's no need to use
+ * spin_lock_irqsave() in this function.
+ *
+ * This ASSERT is added as a safeguard: if imsic_irq_enable() is ever
+ * called from a context where IRQs are not disabled,
+ * spin_lock_irqsave() should be used instead of spin_lock().
+ */
+ ASSERT(!local_irq_is_enabled());
+
+ spin_lock(&imsic_cfg.lock);
+ /*
+ * There is no irq - 1 here (look at aplic_set_irq_type()) because:
+ * From the spec:
+ * When an interrupt file supports N distinct interrupt identities,
+ * valid identity numbers are between 1 and N inclusive. The identity
+ * numbers within this range are said to be implemented by the interrupt
+ * file; numbers outside this range are not implemented. The number zero
+ * is never a valid interrupt identity.
+ * ...
+ * Bit positions in a valid eiek register that don’t correspond to a
+ * supported interrupt identity (such as bit 0 of eie0) are read-only zeros.
+ *
+ * So in EIx registers interrupt i corresponds to bit i, in contrast with
+ * APLIC's sourcecfg[], whose indexing starts from 0.
+ */
+ imsic_local_eix_update(irq, 1, false, true);
+ spin_unlock(&imsic_cfg.lock);
+}
+
+void imsic_irq_disable(unsigned int irq)
+{
+ /*
+ * The only caller of imsic_irq_disable() is aplic_irq_disable(), which
+ * already runs with IRQs disabled. Therefore, there's no need to use
+ * spin_lock_irqsave() in this function.
+ *
+ * This ASSERT is added as a safeguard: if imsic_irq_disable() is ever
+ * called from a context where IRQs are not disabled,
+ * spin_lock_irqsave() should be used instead of spin_lock().
+ */
+ ASSERT(!local_irq_is_enabled());
+
+ spin_lock(&imsic_cfg.lock);
+ imsic_local_eix_update(irq, 1, false, false);
+ spin_unlock(&imsic_cfg.lock);
+}
/* Callers aren't expected to changed imsic_cfg so return const. */
const struct imsic_config *imsic_get_config(void)
@@ -342,6 +459,9 @@ int __init imsic_init(const struct dt_device_node *node)
goto imsic_init_err;
}
+ /* Enable local interrupt delivery */
+ imsic_ids_local_delivery(true);
+
return 0;
imsic_init_err:
diff --git a/xen/arch/riscv/include/asm/aplic.h b/xen/arch/riscv/include/asm/aplic.h
index fef5f90a61..a814a36a82 100644
--- a/xen/arch/riscv/include/asm/aplic.h
+++ b/xen/arch/riscv/include/asm/aplic.h
@@ -18,6 +18,8 @@
#define APLIC_DOMAINCFG_IE BIT(8, UL)
#define APLIC_DOMAINCFG_DM BIT(2, UL)
+#define APLIC_TARGET_HART_IDX_SHIFT 18
+
struct aplic_regs {
uint32_t domaincfg;
uint32_t sourcecfg[1023];
diff --git a/xen/arch/riscv/include/asm/imsic.h b/xen/arch/riscv/include/asm/imsic.h
index 0d17881884..a0eba55f99 100644
--- a/xen/arch/riscv/include/asm/imsic.h
+++ b/xen/arch/riscv/include/asm/imsic.h
@@ -11,6 +11,7 @@
#ifndef ASM__RISCV__IMSIC_H
#define ASM__RISCV__IMSIC_H
+#include <xen/spinlock.h>
#include <xen/types.h>
#define IMSIC_MMIO_PAGE_SHIFT 12
@@ -19,6 +20,15 @@
#define IMSIC_MIN_ID 63
#define IMSIC_MAX_ID 2047
+#define IMSIC_EIDELIVERY 0x70
+
+#define IMSIC_EITHRESHOLD 0x72
+
+#define IMSIC_EIP0 0x80
+#define IMSIC_EIPx_BITS 32
+
+#define IMSIC_EIE0 0xC0
+
struct imsic_msi {
paddr_t base_addr;
unsigned long offset;
@@ -55,6 +65,9 @@ struct imsic_config {
/* MSI */
struct imsic_msi *msi;
+
+ /* Lock to protect access to IMSIC's stuff */
+ spinlock_t lock;
};
struct dt_device_node;
@@ -62,4 +75,9 @@ int imsic_init(const struct dt_device_node *node);
const struct imsic_config *imsic_get_config(void);
+void imsic_irq_enable(unsigned int hwirq);
+void imsic_irq_disable(unsigned int hwirq);
+
+void imsic_ids_local_delivery(bool enable);
+
#endif /* ASM__RISCV__IMSIC_H */
--
2.49.0
* [PATCH v3 12/14] xen/riscv: add external interrupt handling for hypervisor mode
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (10 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 13/14] xen/riscv: implement setup_irq() Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 14/14] xen/riscv: add basic UART support Oleksii Kurochko
13 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
Implement functions necessary to have working external interrupts in
hypervisor mode. The following changes are done:
- Add a common function intc_handle_external_irqs() to call APLIC specific
function to handle an interrupt.
- Update do_trap() function to handle IRQ_S_EXT case; add the check to catch
case when cause of trap is an interrupt.
- Add handle_interrupt() member to intc_hw_operations structure.
- Enable local interrupt delivery for IMSIC by calling
imsic_ids_local_delivery() in imsic_init(); additionally introduce helper
imsic_csr_write() to update IMSIC_EITHRESHOLD and IMSIC_EIDELIVERY.
- Enable hypervisor external interrupts.
- Implement aplic_handle_interrupt() and use it to init ->handle_interrupt
member of intc_hw_operations for APLIC.
- Add implementation of do_IRQ() to dispatch the interrupt.
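To make the dispatch step more concrete, here is a small sketch of how the value read from the stopei CSR can be decoded into an interrupt identity before it is handed to do_IRQ(). It assumes the AIA layout of *topei (interrupt identity in bits 26:16, priority in bits 10:0) and that TOPI_IID_SHIFT used in the patch equals 16; the SKETCH_* macros and helper names are hypothetical, and no CSR is actually accessed here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of stopei: identity in bits 26:16, priority in 10:0. */
#define SKETCH_TOPEI_ID_SHIFT  16
#define SKETCH_TOPEI_ID_MASK   0x7ffu
#define SKETCH_TOPEI_PRIO_MASK 0x7ffu

/* Build a stopei-like value the way an IMSIC would report identity 'id'. */
static uint32_t make_topei(unsigned int id)
{
    assert(id <= SKETCH_TOPEI_ID_MASK);

    /* With IMSIC the reported priority number equals the identity. */
    return ((uint32_t)id << SKETCH_TOPEI_ID_SHIFT) | id;
}

/* Extract the interrupt identity; 0 means nothing is pending. */
static unsigned int topei_identity(uint32_t topei)
{
    return (topei >> SKETCH_TOPEI_ID_SHIFT) & SKETCH_TOPEI_ID_MASK;
}

/* Extract the priority field from the low bits. */
static unsigned int topei_priority(uint32_t topei)
{
    return topei & SKETCH_TOPEI_PRIO_MASK;
}
```

In the patch the read is done with csr_swap(CSR_STOPEI, 0), which also claims the interrupt; the sketch covers only the decoding of the returned value.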
The current patch is based on the code from [1].
[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/7390e2365828b83e27ead56b03114a56e3699dd5
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V3:
- Add ASSERT(spin_is_locked(&desc->lock)) to aplic_set_irq_type().
- Fix code style for switch () in aplic_set_irq_type().
- Drop fallthrough between "case IRQ_TYPE_NONE: case IRQ_TYPE_INVALID:" as there
is no other statements in "case IRQ_TYPE_NONE".
- Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in V2:
- use BIT() macros instead of 1UL << bit_num in aplic.c.
- Drop passing of a cause to aplic_handle_interrupt() function. And update
prototype of handle_interrupt() callback.
- Drop ASSERT() in intc_handle_external_irqs() as it is useless.
- Code style fixes.
- Drop cause argument for intc_handle_external_irqs().
- Update the commit message: drop words that imsic_ids_local_delivery() is
implemented in this patch, it is only called.
- Add cf_check for aplic_handle_interrupt(), aplic_set_irq_type().
- Move forward declarations in asm/intc.h up.
- Use plain C operators instead of {clear,set}_bit() for desc->status as it
is always used under spinlock().
- use writel() for writing to APLIC's sourcecfg in aplic_set_irq_type().
---
xen/arch/riscv/aplic.c | 71 ++++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/aplic.h | 7 +++
xen/arch/riscv/include/asm/imsic.h | 1 +
xen/arch/riscv/include/asm/intc.h | 6 +++
xen/arch/riscv/include/asm/irq.h | 6 ++-
xen/arch/riscv/intc.c | 5 +++
xen/arch/riscv/irq.c | 47 ++++++++++++++++++++
xen/arch/riscv/traps.c | 19 ++++++++
8 files changed, 161 insertions(+), 1 deletion(-)
diff --git a/xen/arch/riscv/aplic.c b/xen/arch/riscv/aplic.c
index f48937ccc6..5415471680 100644
--- a/xen/arch/riscv/aplic.c
+++ b/xen/arch/riscv/aplic.c
@@ -9,6 +9,7 @@
* Copyright (c) 2024-2025 Vates
*/
+#include <xen/const.h>
#include <xen/device_tree.h>
#include <xen/errno.h>
#include <xen/init.h>
@@ -212,6 +213,71 @@ static void aplic_set_irq_affinity(struct irq_desc *desc, const cpumask_t *mask)
spin_unlock(&aplic.lock);
}
+static void cf_check aplic_handle_interrupt(struct cpu_user_regs *regs)
+{
+ /* disable to avoid more external interrupts */
+ csr_clear(CSR_SIE, BIT(IRQ_S_EXT, UL));
+
+ /* clear the pending bit */
+ csr_clear(CSR_SIP, BIT(IRQ_S_EXT, UL));
+
+ /* dispatch the interrupt */
+ do_IRQ(regs, csr_swap(CSR_STOPEI, 0) >> TOPI_IID_SHIFT);
+
+ /* enable external interrupts */
+ csr_set(CSR_SIE, BIT(IRQ_S_EXT, UL));
+}
+
+static void cf_check aplic_set_irq_type(struct irq_desc *desc, unsigned int type)
+{
+ /*
+ * Interrupt 0 isn't possible based on the spec:
+ * Each of an APLIC’s interrupt sources has a fixed unique identity number in the range 1 to N,
+ * where N is the total number of sources at the APLIC. The number zero is not a valid interrupt
+ * identity number at an APLIC. The maximum number of interrupt sources an APLIC may support
+ * is 1023.
+ *
+ * Thereby interrupt 1 corresponds to sourcecfg[0],
+ * interrupt 2 -> sourcecfg[1] and so on.
+ *
+ * And that is the reason why we need -1.
+ */
+ unsigned int irq_bit = desc->irq - 1;
+
+ ASSERT(spin_is_locked(&desc->lock));
+
+ spin_lock(&aplic.lock);
+
+ switch ( type )
+ {
+ case IRQ_TYPE_EDGE_RISING:
+ writel(APLIC_SOURCECFG_SM_EDGE_RISE, &aplic.regs->sourcecfg[irq_bit]);
+ break;
+
+ case IRQ_TYPE_EDGE_FALLING:
+ writel(APLIC_SOURCECFG_SM_EDGE_FALL, &aplic.regs->sourcecfg[irq_bit]);
+ break;
+
+ case IRQ_TYPE_LEVEL_HIGH:
+ writel(APLIC_SOURCECFG_SM_LEVEL_HIGH, &aplic.regs->sourcecfg[irq_bit]);
+ break;
+
+ case IRQ_TYPE_LEVEL_LOW:
+ writel(APLIC_SOURCECFG_SM_LEVEL_LOW, &aplic.regs->sourcecfg[irq_bit]);
+ break;
+
+ case IRQ_TYPE_NONE:
+ case IRQ_TYPE_INVALID:
+ writel(APLIC_SOURCECFG_SM_INACTIVE, &aplic.regs->sourcecfg[irq_bit]);
+ break;
+
+ default:
+ panic("%s: APLIC doesn't support IRQ type: 0x%x?\n", __func__, type);
+ }
+
+ spin_unlock(&aplic.lock);
+}
+
static const hw_irq_controller aplic_xen_irq_type = {
.typename = "aplic",
.startup = aplic_irq_startup,
@@ -225,6 +291,8 @@ static struct intc_hw_operations __ro_after_init aplic_ops = {
.info = &aplic_info,
.init = aplic_init,
.host_irq_type = &aplic_xen_irq_type,
+ .handle_interrupt = aplic_handle_interrupt,
+ .set_irq_type = aplic_set_irq_type,
};
static int cf_check aplic_irq_xlate(const uint32_t *intspec,
@@ -264,6 +332,9 @@ static int __init aplic_preinit(struct dt_device_node *node, const void *dat)
register_intc_ops(&aplic_ops);
+ /* Enable supervisor external interrupt */
+ csr_set(CSR_SIE, BIT(IRQ_S_EXT, UL));
+
return 0;
}
diff --git a/xen/arch/riscv/include/asm/aplic.h b/xen/arch/riscv/include/asm/aplic.h
index a814a36a82..ef5b1d3e85 100644
--- a/xen/arch/riscv/include/asm/aplic.h
+++ b/xen/arch/riscv/include/asm/aplic.h
@@ -18,6 +18,13 @@
#define APLIC_DOMAINCFG_IE BIT(8, UL)
#define APLIC_DOMAINCFG_DM BIT(2, UL)
+#define APLIC_SOURCECFG_SM_INACTIVE 0x0
+#define APLIC_SOURCECFG_SM_DETACH 0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
+
#define APLIC_TARGET_HART_IDX_SHIFT 18
struct aplic_regs {
diff --git a/xen/arch/riscv/include/asm/imsic.h b/xen/arch/riscv/include/asm/imsic.h
index a0eba55f99..4973016cd8 100644
--- a/xen/arch/riscv/include/asm/imsic.h
+++ b/xen/arch/riscv/include/asm/imsic.h
@@ -12,6 +12,7 @@
#define ASM__RISCV__IMSIC_H
#include <xen/spinlock.h>
+#include <xen/stdbool.h>
#include <xen/types.h>
#define IMSIC_MMIO_PAGE_SHIFT 12
diff --git a/xen/arch/riscv/include/asm/intc.h b/xen/arch/riscv/include/asm/intc.h
index 1a88505518..b11b26addd 100644
--- a/xen/arch/riscv/include/asm/intc.h
+++ b/xen/arch/riscv/include/asm/intc.h
@@ -12,6 +12,7 @@ enum intc_version {
INTC_APLIC,
};
+struct cpu_user_regs;
struct irq_desc;
struct intc_info {
@@ -35,6 +36,9 @@ struct intc_hw_operations {
void (*set_irq_type)(struct irq_desc *desc, unsigned int type);
/* Set IRQ priority */
void (*set_irq_priority)(struct irq_desc *desc, unsigned int priority);
+
+ /* handle external interrupt */
+ void (*handle_interrupt)(struct cpu_user_regs *regs);
};
void intc_preinit(void);
@@ -45,4 +49,6 @@ void intc_init(void);
void intc_route_irq_to_xen(struct irq_desc *desc, unsigned int priority);
+void intc_handle_external_irqs(struct cpu_user_regs *regs);
+
#endif /* ASM__RISCV__INTERRUPT_CONTOLLER_H */
diff --git a/xen/arch/riscv/include/asm/irq.h b/xen/arch/riscv/include/asm/irq.h
index 84c3c2904d..94151eb083 100644
--- a/xen/arch/riscv/include/asm/irq.h
+++ b/xen/arch/riscv/include/asm/irq.h
@@ -33,16 +33,20 @@ struct arch_irq_desc {
unsigned int type;
};
+struct cpu_user_regs;
+struct dt_device_node;
+
static inline void arch_move_irqs(struct vcpu *v)
{
BUG_ON("unimplemented");
}
-struct dt_device_node;
int platform_get_irq(const struct dt_device_node *device, int index);
void init_IRQ(void);
+void do_IRQ(struct cpu_user_regs *regs, unsigned int irq);
+
#endif /* ASM__RISCV__IRQ_H */
/*
diff --git a/xen/arch/riscv/intc.c b/xen/arch/riscv/intc.c
index f2823267a9..ea317aea5a 100644
--- a/xen/arch/riscv/intc.c
+++ b/xen/arch/riscv/intc.c
@@ -50,6 +50,11 @@ static void intc_set_irq_priority(struct irq_desc *desc, unsigned int priority)
intc_hw_ops->set_irq_priority(desc, priority);
}
+void intc_handle_external_irqs(struct cpu_user_regs *regs)
+{
+ intc_hw_ops->handle_interrupt(regs);
+}
+
void intc_route_irq_to_xen(struct irq_desc *desc, unsigned int priority)
{
ASSERT(desc->status & IRQ_DISABLED);
diff --git a/xen/arch/riscv/irq.c b/xen/arch/riscv/irq.c
index 669ef3ae9e..466f1b4ba9 100644
--- a/xen/arch/riscv/irq.c
+++ b/xen/arch/riscv/irq.c
@@ -11,6 +11,10 @@
#include <xen/errno.h>
#include <xen/init.h>
#include <xen/irq.h>
+#include <xen/spinlock.h>
+
+#include <asm/hardirq.h>
+#include <asm/intc.h>
static irq_desc_t irq_desc[NR_IRQS];
@@ -90,3 +94,46 @@ void __init init_IRQ(void)
if ( init_irq_data() < 0 )
panic("initialization of IRQ data failed\n");
}
+
+/* Dispatch an interrupt */
+void do_IRQ(struct cpu_user_regs *regs, unsigned int irq)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+ struct irqaction *action;
+
+ irq_enter();
+
+ spin_lock(&desc->lock);
+
+ if ( desc->handler->ack )
+ desc->handler->ack(desc);
+
+ if ( desc->status & IRQ_DISABLED )
+ goto out;
+
+ desc->status |= IRQ_INPROGRESS;
+
+ action = desc->action;
+
+ spin_unlock_irq(&desc->lock);
+
+#ifndef CONFIG_IRQ_HAS_MULTIPLE_ACTION
+ action->handler(irq, action->dev_id);
+#else
+ do {
+ action->handler(irq, action->dev_id);
+ action = action->next;
+ } while ( action );
+#endif /* CONFIG_IRQ_HAS_MULTIPLE_ACTION */
+
+ spin_lock_irq(&desc->lock);
+
+ desc->status &= ~IRQ_INPROGRESS;
+
+ out:
+ if ( desc->handler->end )
+ desc->handler->end(desc);
+
+ spin_unlock(&desc->lock);
+ irq_exit();
+}
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index ea3638a54f..f061004d83 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -11,6 +11,7 @@
#include <xen/nospec.h>
#include <xen/sched.h>
+#include <asm/intc.h>
#include <asm/processor.h>
#include <asm/riscv_encoding.h>
#include <asm/traps.h>
@@ -128,6 +129,24 @@ void do_trap(struct cpu_user_regs *cpu_regs)
}
fallthrough;
default:
+ if ( cause & CAUSE_IRQ_FLAG )
+ {
+ /* Handle interrupt */
+ unsigned long icause = cause & ~CAUSE_IRQ_FLAG;
+
+ switch ( icause )
+ {
+ case IRQ_S_EXT:
+ intc_handle_external_irqs(cpu_regs);
+ break;
+
+ default:
+ break;
+ }
+
+ break;
+ }
+
do_unexpected_trap(cpu_regs);
break;
}
--
2.49.0
* [PATCH v3 13/14] xen/riscv: implement setup_irq()
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (11 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 12/14] xen/riscv: add external interrupt handling for hypervisor mode Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 14/14] xen/riscv: add basic UART support Oleksii Kurochko
13 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
Romain Caritey
Introduce support for IRQ setup on RISC-V by implementing setup_irq() and
_setup_irq(), adapted and extended from an initial implementation by [1].
_setup_irq() does the following:
- Sets up an IRQ action.
- Validates that shared IRQs have non-NULL `dev_id` and are only used when
existing handlers allow sharing.
- Uses smp_wmb() to enforce memory ordering after assigning desc->action
to ensure visibility before enabling the IRQ.
- Supports multi-action setups via CONFIG_IRQ_HAS_MULTIPLE_ACTION.
setup_irq() does the following:
- Converts IRQ number to descriptor and acquires its lock.
- Rejects registration if the IRQ is already assigned to a guest domain,
printing an error.
- Delegates the core setup to _setup_irq().
- On first-time setup, disables the IRQ, routes it to Xen using
intc_route_irq_to_xen(), sets default CPU affinity (current CPU),
calls the handler’s startup routine, and finally enables the IRQ.
irq_set_affinity() invokes set_affinity() callback from the IRQ handler
if present.
Define IRQ_NO_PRIORITY as the default priority used when routing IRQs to Xen.
[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/7390e2365828b83e27ead56b03114a56e3699dd5
Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
- Nothing changed. Only rebase.
---
Changes in V2:
- Added implementation of aplic_set_irq_type() as it is going to be used in
this commit. And also, update the implementation of it. Make default case
of switch to do panic().
- Move all forward declaration up in asm/irq.h.
- s/__setup_irq/_setup_irq.
- Code style fixes.
- Update commit message.
- use smp_wmb() instead of smp_mb() in _setup_irq().
- Drop irq_set_affinity().
- Use plain C operators instead of {clear,set}_bit() for desc->status as it
is always used under spinlock().
- Drop set_bit(_IRQ_DISABLED, &desc->status) in setup_irq() as in the case
when the IRQ is set up for the first time, desc->status should already be set
to IRQ_DISABLED in init_one_irq_desc().
----
xen/arch/riscv/include/asm/irq.h | 2 +
xen/arch/riscv/irq.c | 84 ++++++++++++++++++++++++++++++++
2 files changed, 86 insertions(+)
diff --git a/xen/arch/riscv/include/asm/irq.h b/xen/arch/riscv/include/asm/irq.h
index 94151eb083..f633636dc3 100644
--- a/xen/arch/riscv/include/asm/irq.h
+++ b/xen/arch/riscv/include/asm/irq.h
@@ -17,6 +17,8 @@
*/
#define NR_IRQS 1024
+#define IRQ_NO_PRIORITY 0
+
/* TODO */
#define nr_irqs 0U
#define nr_static_irqs 0
diff --git a/xen/arch/riscv/irq.c b/xen/arch/riscv/irq.c
index 466f1b4ba9..25d3295002 100644
--- a/xen/arch/riscv/irq.c
+++ b/xen/arch/riscv/irq.c
@@ -7,6 +7,7 @@
*/
#include <xen/bug.h>
+#include <xen/cpumask.h>
#include <xen/device_tree.h>
#include <xen/errno.h>
#include <xen/init.h>
@@ -63,6 +64,89 @@ int platform_get_irq(const struct dt_device_node *device, int index)
return dt_irq.irq;
}
+static int _setup_irq(struct irq_desc *desc, unsigned int irqflags,
+ struct irqaction *new)
+{
+ bool shared = irqflags & IRQF_SHARED;
+
+ ASSERT(new != NULL);
+
+ /*
+ * Sanity checks:
+ * - if the IRQ is marked as shared
+ * - dev_id is not NULL when IRQF_SHARED is set
+ */
+ if ( desc->action != NULL && (!(desc->status & IRQF_SHARED) || !shared) )
+ return -EINVAL;
+ if ( shared && new->dev_id == NULL )
+ return -EINVAL;
+
+ if ( shared )
+ desc->status |= IRQF_SHARED;
+
+#ifdef CONFIG_IRQ_HAS_MULTIPLE_ACTION
+ new->next = desc->action;
+#endif
+
+ desc->action = new;
+ smp_wmb();
+
+ return 0;
+}
+
+int setup_irq(unsigned int irq, unsigned int irqflags, struct irqaction *new)
+{
+ int rc;
+ unsigned long flags;
+ struct irq_desc *desc = irq_to_desc(irq);
+ bool disabled;
+
+ spin_lock_irqsave(&desc->lock, flags);
+
+ disabled = (desc->action == NULL);
+
+ if ( desc->status & IRQ_GUEST )
+ {
+ spin_unlock_irqrestore(&desc->lock, flags);
+ /*
+ * TODO: would be nice to have functionality to print which domain owns
+ * an IRQ.
+ */
+ printk(XENLOG_ERR "ERROR: IRQ %u is already in use by a domain\n", irq);
+ return -EBUSY;
+ }
+
+ rc = _setup_irq(desc, irqflags, new);
+ if ( rc )
+ goto err;
+
+ /* First time the IRQ is setup */
+ if ( disabled )
+ {
+ /* Route interrupt to xen */
+ intc_route_irq_to_xen(desc, IRQ_NO_PRIORITY);
+
+ /*
+ * We don't care for now which CPU will receive the
+ * interrupt.
+ *
+ * TODO: Handle case where IRQ is setup on different CPU than
+ * the targeted CPU and the priority.
+ */
+ desc->handler->set_affinity(desc, cpumask_of(smp_processor_id()));
+
+ desc->handler->startup(desc);
+
+ /* Enable irq */
+ desc->status &= ~IRQ_DISABLED;
+ }
+
+ err:
+ spin_unlock_irqrestore(&desc->lock, flags);
+
+ return rc;
+}
+
int arch_init_one_irq_desc(struct irq_desc *desc)
{
desc->arch.type = IRQ_TYPE_INVALID;
--
2.49.0
* [PATCH v3 14/14] xen/riscv: add basic UART support
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
` (12 preceding siblings ...)
2025-05-21 16:03 ` [PATCH v3 13/14] xen/riscv: implement setup_irq() Oleksii Kurochko
@ 2025-05-21 16:03 ` Oleksii Kurochko
13 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-21 16:03 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Update Kconfig to select GENERIC_UART_INIT for basic UART init ( find a dt node
and call device specific device_init() ).
Drop `default n if RISCV` statement for config HAS_NS16550 as now ns16550 is
ready to be compiled and used by RISC-V. Also, make the config user selectable
for everyone except X86.
Initialize a minimal amount of stuff to have UART and Xen console:
- Initialize uart by calling uart_init().
- Initialize Xen console by calling console_init_{pre,post}irq().
- Initialize timer and its internal lists which are used by
init_timer() which is called by ns16550_init_postirq(); otherwise
"Unhandled exception: Store/AMO Page Fault" occurs.
- Enable local interrupts to receive input from the UART.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
- Drop inclusion of <xen/percpu.h> as nothing in setup.c requires it.
- Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in v2:
- Drop #include <xen/keyhandler.h> in setup.c, isn't needed anymore.
- Drop call of percpu_init_areas() as it was needed when I used polling
mode for UART, for this case percpu is used to receive serial port info:
struct serial_port *port = this_cpu(poll_port);
So percpu isn't really needed at the current development state.
- Make HAS_NS16550 user selectable for everyone, except X86.
- Update the commit message.
---
xen/arch/riscv/Kconfig | 1 +
xen/arch/riscv/setup.c | 12 ++++++++++++
xen/drivers/char/Kconfig | 3 +--
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index 62c5b7ba34..96bef90751 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -2,6 +2,7 @@ config RISCV
def_bool y
select FUNCTION_ALIGNMENT_16B
select GENERIC_BUG_FRAME
+ select GENERIC_UART_INIT
select HAS_DEVICE_TREE
select HAS_PMAP
select HAS_UBSAN
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 0e7398159c..a17096bf02 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -4,12 +4,15 @@
#include <xen/bug.h>
#include <xen/bootfdt.h>
#include <xen/compile.h>
+#include <xen/console.h>
#include <xen/device_tree.h>
#include <xen/init.h>
#include <xen/irq.h>
#include <xen/mm.h>
+#include <xen/serial.h>
#include <xen/shutdown.h>
#include <xen/smp.h>
+#include <xen/timer.h>
#include <xen/vmap.h>
#include <xen/xvmalloc.h>
@@ -134,8 +137,17 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
intc_preinit();
+ uart_init();
+ console_init_preirq();
+
intc_init();
+ timer_init();
+
+ local_irq_enable();
+
+ console_init_postirq();
+
printk("All set up\n");
machine_halt();
diff --git a/xen/drivers/char/Kconfig b/xen/drivers/char/Kconfig
index e6e12bb413..8e49a52c73 100644
--- a/xen/drivers/char/Kconfig
+++ b/xen/drivers/char/Kconfig
@@ -2,8 +2,7 @@ config GENERIC_UART_INIT
bool
config HAS_NS16550
- bool "NS16550 UART driver" if ARM
- default n if RISCV
+ bool "NS16550 UART driver" if !X86
default y
help
This selects the 16550-series UART support. For most systems, say Y.
--
2.49.0
* Re: [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu()
2025-05-21 16:03 ` [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu() Oleksii Kurochko
@ 2025-05-22 7:22 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:22 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> Initialize cpu_{possible, online}_map by using smp_prepare_boot_cpu().
>
> Drop DEFINE_PER_CPU(unsigned int, cpu_id) from stubs.c as this variable isn't
> expected to be used in RISC-V at all.
>
> Move declaration of cpu_{possible,online}_map from stubs.c to smpboot.c
> as smpboot.c is now introduced.
> Other definitions are kept in stubs.c as they are not initialized and not
> needed at the moment.
>
> Drop cpu_present_map as it is enough to have cpu_possible_map. Also, ask
> linker to provide symbol for cpu_present_map as common code references it.
>
> Move call of set_processor_id(0) to smp_prepare_boot_cpu().
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory
2025-05-21 16:03 ` [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory Oleksii Kurochko
@ 2025-05-22 7:26 ` Jan Beulich
2025-05-30 14:20 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:26 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Doug Goldstein, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Alistair Francis, Bob Eshleman, Connor Davis, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> The Svpbmt extension is necessary for changing the memory type of a page.
> The memory type is a combination of attributes that indicate the
> cacheability, idempotency, and ordering properties for accesses to that page.
>
> As a part of the patch the following is introduced:
> - Svpbmt memory type definitions: PTE_PBMT_{NOCACHE,IO}.
> - PAGE_HYPERVISOR_{NOCACHE,WC}.
> - RISCV_ISA_EXT_svpbmt and add a check in runtime that Svpbmt is
> supported by platform.
> - Update riscv/booting.txt with information about Svpbmt.
> - Update logic of pt_update_entry() to take into account PBMT bits.
>
> Use 'unsigned long' for pte_attr_t as the PBMT bits are 61 and 62 and don't
> fit into 'unsigned int'. Also, update function prototypes which use
> 'unsigned int' for flags/attributes.
>
> Enable Svpbmt for testing in QEMU as Svpbmt is now mandatory for
> Xen to work.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr()
2025-05-21 16:03 ` [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr() Oleksii Kurochko
@ 2025-05-22 7:33 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:33 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> Introduce ioremap_attr() as a shared helper to implement architecture-specific
> ioremap variants:
> - ioremap_cache()
> - ioremap_wc()
>
> These functions use __vmap() internally and apply appropriate memory attributes
> for RISC-V.
>
> These functions are not implemented as static inline functions or macros, as
> that would require including asm/page.h in asm/io.h, which would lead to
> compilation issues.
>
> Also, remove the unused ioremap_wt() macro from asm/io.h, as write-through
> mappings are not expected to be used on RISC-V.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 04/14] xen/riscv: introduce init_IRQ()
2025-05-21 16:03 ` [PATCH v3 04/14] xen/riscv: introduce init_IRQ() Oleksii Kurochko
@ 2025-05-22 7:38 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:38 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> Implement init_IRQ() to initialize various IRQs.
>
> Currently, this function initializes the irq_desc[] array,
> which stores IRQ descriptors containing various information
> about each IRQ, such as the type of hardware handling, whether
> the IRQ is disabled, etc.
> The initialization is basic at this point and includes setting
> IRQ_TYPE_INVALID as the IRQ type and assigning the IRQ number (which
> is simply the index into the irq_desc[] array) to
> desc->irq.
>
> Additionally, the function init_irq_data() is introduced to
> initialize the IRQ descriptors for all IRQs in the system.
>
> Reuse defines of IRQ_TYPE_* from asm-generic/irq-dt.h.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
> - Add an explanatory comment about NR_IRQS definitions.
> - Init desc->irq and desc->action before call of init_one_irq_desc().
> - Drop "desc->action = NULL" as irq_desc[] is zero-initialized.
Just to mention: This is odd to read, as it partially invalidates the
earlier bullet point.
> - Update the commit message: drop mention of NULLing of desc->action.
Again only for the future: Adjusting the description to match changes
being made is the expected thing (and hence doesn't need mentioning
separately, in the common case at least).
Jan
* Re: [PATCH v3 05/14] xen/riscv: introduce platform_get_irq()
2025-05-21 16:03 ` [PATCH v3 05/14] xen/riscv: introduce platform_get_irq() Oleksii Kurochko
@ 2025-05-22 7:41 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:41 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> platform_get_irq() receives information about a device's IRQ (type
> and IRQ number) from the device tree node and uses this information
> to update the IRQ descriptor in the irq_desc[] array.
>
> Introduce dt_irq_xlate and initialize with aplic_irq_xlate() as
> it is used by dt_device_get_irq() which is called by
> platform_get_irq().
>
> Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation
2025-05-21 16:03 ` [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation Oleksii Kurochko
@ 2025-05-22 7:50 ` Jan Beulich
2025-05-26 10:46 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 7:50 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/smpboot.c
> +++ b/xen/arch/riscv/smpboot.c
> @@ -1,5 +1,8 @@
> #include <xen/cpumask.h>
> +#include <xen/device_tree.h>
> +#include <xen/errno.h>
> #include <xen/init.h>
> +#include <xen/types.h>
> #include <xen/sections.h>
Nit: The latter insertion wants to move one line down. Yet then - isn't
xen/stdint.h sufficient here?
> @@ -14,3 +17,69 @@ void __init smp_prepare_boot_cpu(void)
> cpumask_set_cpu(0, &cpu_possible_map);
> cpumask_set_cpu(0, &cpu_online_map);
> }
> +
> +/**
> + * dt_get_hartid - Get the hartid from a CPU device node
> + *
> + * @cpun: CPU number(logical index) for which device node is required
> + *
> + * Return: The hartid for the CPU node or ~0UL if not found.
> + */
> +static unsigned long dt_get_hartid(const struct dt_device_node *cpun)
> +{
> + const __be32 *cell;
> + unsigned int ac;
> + uint32_t len;
> +
> + ac = dt_n_addr_cells(cpun);
> + cell = dt_get_property(cpun, "reg", &len);
> + if ( !cell || !ac || ((sizeof(*cell) * ac) > len) )
Does DT make any guarantees for this multiplication to not overflow?
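For illustration, a minimal sketch of a form of this check that cannot wrap
regardless of integer widths: divide the property length by the cell size
rather than multiplying. The helper name reg_covers_cells() is hypothetical
and not part of the patch. (When size_t is 64 bits wide, the product of a
4-byte cell size and a 32-bit cell count cannot wrap in the first place, so
this matters mainly for 32-bit builds.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns true if a property
 * of 'len' bytes holds at least 'ac' cells of 'cell_size' bytes each.
 * Dividing instead of multiplying means no intermediate value can wrap.
 */
static bool reg_covers_cells(uint32_t len, unsigned int ac, size_t cell_size)
{
    /* A zero cell size would be a caller bug, not a DT problem. */
    assert(cell_size != 0);

    /* Zero address cells means there is nothing to read. */
    if ( !ac )
        return false;

    return ac <= len / cell_size;
}
```

With such a helper the condition in dt_get_hartid() would read
"!cell || !reg_covers_cells(len, ac, sizeof(*cell))".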
> + return ~0UL;
> +
> + return dt_read_number(cell, ac);
> +}
> +
> +/*
> + * Returns the hartid of the given device tree node, or -ENODEV if the node
> + * isn't an enabled and valid RISC-V hart node.
> + */
> +int dt_processor_hartid(const struct dt_device_node *node,
> + unsigned long *hartid)
> +{
> + const char *isa;
> + int ret;
> +
> + if ( !dt_device_is_compatible(node, "riscv") )
> + {
> + printk("Found incompatible CPU\n");
> + return -ENODEV;
> + }
> +
> + *hartid = dt_get_hartid(node);
> + if ( *hartid == ~0UL )
> + {
> + printk("Found CPU without CPU ID\n");
> + return -ENODATA;
> + }
> +
> + if ( !dt_device_is_available(node))
> + {
> + printk("CPU with hartid=%lu is not available\n", *hartid);
Considering that hart ID assignment is outside of our control, would we
perhaps better (uniformly) log such using %#lx?
> + return -ENODEV;
> + }
> +
> + if ( (ret = dt_property_read_string(node, "riscv,isa", &isa)) )
> + {
> + printk("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hartid);
> + return ret;
> + }
> +
> + if ( isa[0] != 'r' || isa[1] != 'v' )
> + {
> + printk("CPU with hartid=%lu has an invalid ISA of \"%s\"\n", *hartid,
> + isa);
> + return -EINVAL;
As before -EINVAL is appropriate when input arguments have wrong values.
Here, however, you found an unexpected value in something the platform
has presented to you. While not entirely appropriate either, maybe
-ENODEV again (if nothing better can be found)?
Jan
* Re: [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops.
2025-05-21 16:03 ` [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops Oleksii Kurochko
@ 2025-05-22 13:49 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 13:49 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> Introduce the intc_hw_operations structure to encapsulate interrupt
> controller-specific data and operations. This structure includes:
> - A pointer to interrupt controller information (`intc_info`)
> - Callbacks to initialize the controller and set IRQ type/priority
> - A reference to an interrupt controller descriptor (`host_irq_type`)
> - number of interrupt controller irqs.
>
> Add function register_intc_ops() to register the above-mentioned structure.
>
> Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-21 16:03 ` [PATCH v3 08/14] xen/riscv: imsic_init() implementation Oleksii Kurochko
@ 2025-05-22 14:46 ` Jan Beulich
2025-05-26 18:44 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 14:46 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/imsic.c
> @@ -0,0 +1,354 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/imsic.c
> + *
> + * RISC-V Incoming MSI Controller support
> + *
> + * (c) Microchip Technology Inc.
> + * (c) Vates
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/const.h>
> +#include <xen/cpumask.h>
> +#include <xen/device_tree.h>
> +#include <xen/errno.h>
> +#include <xen/init.h>
> +#include <xen/macros.h>
> +#include <xen/smp.h>
> +#include <xen/spinlock.h>
> +#include <xen/xmalloc.h>
> +
> +#include <asm/imsic.h>
> +
> +static struct imsic_config imsic_cfg;
> +
> +/* Callers aren't expected to changed imsic_cfg so return const. */
> +const struct imsic_config *imsic_get_config(void)
> +{
> + return &imsic_cfg;
> +}
Minor remark regarding the comment: Consider replacing "expected" by "supposed"
or "intended"?
> +/*
> + * Parses IMSIC DT node.
> + *
> + * Returns 0 if initialization is successful, a negative value on failure,
> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
> + * which should be ignored by the hypervisor.
> + */
> +static int imsic_parse_node(const struct dt_device_node *node,
> + unsigned int *nr_parent_irqs)
> +{
> + int rc;
> + unsigned int tmp;
> + paddr_t base_addr;
> + uint32_t *irq_range;
> +
> + *nr_parent_irqs = dt_number_of_irq(node);
> + if ( !*nr_parent_irqs )
> + panic("%s: irq_num can be 0. Check %s node\n", __func__,
> + dt_node_full_name(node));
DYM "can't be"?
> + irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
> + if ( !irq_range )
> + panic("%s: irq_range[] allocation failed\n", __func__);
> +
> + if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
> + irq_range, *nr_parent_irqs * 2)) )
> + panic("%s: unable to find interrupt-extended in %s node: %d\n",
> + __func__, dt_node_full_name(node), rc);
> +
> + if ( irq_range[1] == IRQ_M_EXT )
> + {
> + /* Machine mode imsic node, ignore it. */
> + rc = IRQ_M_EXT;
> + goto cleanup;
> + }
Wouldn't this better be done ...
> + /* Check that interrupts-extended property is well-formed. */
> + for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
> + {
> + if ( irq_range[i + 1] != irq_range[1] )
> + panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
> + }
... after this consistency check?
Also %u please when you log unsigned values.
> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
> + &imsic_cfg.guest_index_bits) )
> + imsic_cfg.guest_index_bits = 0;
> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> + if ( tmp < imsic_cfg.guest_index_bits )
> + {
> + printk(XENLOG_ERR "%s: guest index bits too big\n",
> + dt_node_name(node));
> + rc = -ENOENT;
> + goto cleanup;
> + }
> +
> + /* Find number of HART index bits */
> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
> + &imsic_cfg.hart_index_bits) )
> + {
> + /* Assume default value */
> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
> + imsic_cfg.hart_index_bits++;
Since fls() returns a 1-based bit number, isn't it rather that in the
exact-power-of-2 case you'd need to subtract 1?
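To make the off-by-one concrete, here is a small standalone sketch. It
assumes fls() follows the usual convention (1-based index of the most
significant set bit, 0 for an input of 0); fls_model(), hart_bits_patch()
and hart_bits_min() are illustrative names, not functions from the patch.

```c
#include <assert.h>

/* Model of fls(): 1-based index of the highest set bit, 0 if x == 0. */
static unsigned int fls_model(unsigned int x)
{
    unsigned int r = 0;

    while ( x )
    {
        r++;
        x >>= 1;
    }

    return r;
}

/* The default computation as written in the patch. */
static unsigned int hart_bits_patch(unsigned int nr)
{
    unsigned int bits = fls_model(nr);

    /* With a 1-based fls(), (1 << bits) always exceeds nr, so this never fires. */
    if ( (1ULL << bits) < nr )
        bits++;

    return bits;
}

/* Smallest number of bits able to index nr harts, i.e. ceil(log2(nr)). */
static unsigned int hart_bits_min(unsigned int nr)
{
    assert(nr > 0);

    return fls_model(nr - 1);
}
```

In this model, 4 parent IRQs yield 3 bits with the patch's computation
where 2 would do, while for 5 parent IRQs both variants give 3.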
> + }
> + tmp -= imsic_cfg.guest_index_bits;
> + if ( tmp < imsic_cfg.hart_index_bits )
> + {
> + printk(XENLOG_ERR "%s: HART index bits too big\n",
> + dt_node_name(node));
> + rc = -ENOENT;
> + goto cleanup;
> + }
> +
> + /* Find number of group index bits */
> + if ( !dt_property_read_u32(node, "riscv,group-index-bits",
> + &imsic_cfg.group_index_bits) )
> + imsic_cfg.group_index_bits = 0;
> + tmp -= imsic_cfg.hart_index_bits;
> + if ( tmp < imsic_cfg.group_index_bits )
> + {
> + printk(XENLOG_ERR "%s: group index bits too big\n",
> + dt_node_name(node));
> + rc = -ENOENT;
> + goto cleanup;
> + }
> +
> + /* Find first bit position of group index */
> + tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
> + if ( !dt_property_read_u32(node, "riscv,group-index-shift",
> + &imsic_cfg.group_index_shift) )
> + imsic_cfg.group_index_shift = tmp;
> + if ( imsic_cfg.group_index_shift < tmp )
> + {
> + printk(XENLOG_ERR "%s: group index shift too small\n",
> + dt_node_name(node));
> + rc = -ENOENT;
> + goto cleanup;
> + }
> + tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
> + if ( tmp >= BITS_PER_LONG )
> + {
> + printk(XENLOG_ERR "%s: group index shift too big\n",
> + dt_node_name(node));
> + rc = -EINVAL;
> + goto cleanup;
> + }
> +
> + /* Find number of interrupt identities */
> + if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
> + {
> + printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
> + node->name);
> + rc = -ENOENT;
> + goto cleanup;
> + }
> +
> + if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
> + (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
> + ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )
Now that you've explained to me what the deal is with these constants: Isn't
the 1st of the three checks redundant with the last one?
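A quick way to convince oneself: any n with (n & IMSIC_MIN_ID) ==
IMSIC_MIN_ID has all of IMSIC_MIN_ID's bits set and hence is at least
IMSIC_MIN_ID. The sketch below checks this exhaustively over a small range.
The constant values (63 and 2047) are assumptions, as the definitions are
not shown in this mail; the argument itself does not depend on them. All
function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values; the real definitions live in asm/imsic.h. */
#define IMSIC_MIN_ID 63U
#define IMSIC_MAX_ID 2047U

/* Validity test with all three checks, as in the patch. */
static bool nr_ids_valid_three(unsigned int n)
{
    return !((n < IMSIC_MIN_ID) ||
             (n > IMSIC_MAX_ID) ||
             ((n & IMSIC_MIN_ID) != IMSIC_MIN_ID));
}

/* Validity test with the lower-bound check dropped. */
static bool nr_ids_valid_two(unsigned int n)
{
    return !((n > IMSIC_MAX_ID) ||
             ((n & IMSIC_MIN_ID) != IMSIC_MIN_ID));
}

/*
 * Asserts that both predicates agree for every n in [0, limit] and
 * returns how many values are accepted.
 */
static unsigned int count_valid_nr_ids(unsigned int limit)
{
    unsigned int count = 0;

    for ( unsigned int n = 0; n <= limit; n++ )
    {
        assert(nr_ids_valid_three(n) == nr_ids_valid_two(n));
        count += nr_ids_valid_two(n);
    }

    return count;
}
```

Under the assumed constants the accepted values are 63, 127, ..., 2047,
i.e. every value of the form 64 * k + 63 up to the maximum.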
> + {
> + printk(XENLOG_ERR "%s: invalid number of interrupt identities\n",
> + node->name);
> + rc = -EINVAL;
> + goto cleanup;
> + }
> +
> + /* Compute base address */
> + imsic_cfg.nr_mmios = 0;
> + rc = dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL);
> + if ( rc )
> + {
> + printk(XENLOG_ERR "%s: first MMIO resource not found: %d\n",
> + dt_node_name(node), rc);
> + goto cleanup;
> + }
> +
> + imsic_cfg.base_addr = base_addr;
> + imsic_cfg.base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
> + imsic_cfg.hart_index_bits +
> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
> + imsic_cfg.base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
> + imsic_cfg.group_index_shift);
> +
> + /* Find number of MMIO register sets */
> + do {
> + imsic_cfg.nr_mmios++;
> + } while ( !dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL) );
> +
> + cleanup:
> + xfree(irq_range);
Afaict you could free this array way earlier. That would then simplify quite
a few of the error paths, I think.
> +/*
> + * Initialize the imsic_cfg structure based on the IMSIC DT node.
> + *
> + * Returns 0 if initialization is successful, a negative value on failure,
> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
> + * which should be ignored by the hypervisor.
> + */
> +int __init imsic_init(const struct dt_device_node *node)
> +{
> + int rc;
> + unsigned long reloff, hartid;
> + unsigned int nr_parent_irqs, index, nr_handlers = 0;
> + paddr_t base_addr;
> + unsigned int nr_mmios;
> +
> + /* Parse IMSIC node */
> + rc = imsic_parse_node(node, &nr_parent_irqs);
> + /*
> + * If machine mode imsic node => ignore it.
> + * If rc < 0 => parsing of IMSIC DT node failed.
> + */
> + if ( (rc == IRQ_M_EXT) || rc )
> + return rc;
The former of the checks is redundant with the latter. Did you perhaps mean
"rc < 0" for that one?
> + nr_mmios = imsic_cfg.nr_mmios;
> +
> + /* Allocate MMIO resource array */
> + imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
How large can this and ...
> + if ( !imsic_cfg.mmios )
> + {
> + rc = -ENOMEM;
> + goto imsic_init_err;
> + }
> +
> + imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
... this array grow (in principle)? I think you're aware that in principle
new code is expected to use xvmalloc() and friends unless there are specific
reasons speaking against that.
> + if ( !imsic_cfg.msi )
> + {
> + rc = -ENOMEM;
> + goto imsic_init_err;
> + }
> +
> + /* Check MMIO register sets */
> + for ( unsigned int i = 0; i < nr_mmios; i++ )
> + {
> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
> + {
> + rc = -ENOMEM;
> + goto imsic_init_err;
> + }
> +
> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
> + &imsic_cfg.mmios[i].size);
> + if ( rc )
> + {
> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
> + node->name, i);
> + goto imsic_init_err;
> + }
> +
> + base_addr = imsic_cfg.mmios[i].base_addr;
> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
> + imsic_cfg.hart_index_bits +
> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
> + imsic_cfg.group_index_shift);
> + if ( base_addr != imsic_cfg.base_addr )
> + {
> + rc = -EINVAL;
> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
> + node->name, i);
> + goto imsic_init_err;
> + }
Maybe just for my own understanding: There's no similar check for the
sizes to match / be consistent wanted / needed?
> + }
> +
> + /* Configure handlers for target CPUs */
> + for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
> + {
> + unsigned long xen_cpuid;
> +
> + rc = imsic_get_parent_hartid(node, i, &hartid);
> + if ( rc )
> + {
> + printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
> + node->name, i);
> + continue;
> + }
> +
> + xen_cpuid = hartid_to_cpuid(hartid);
I'm probably biased by "cpuid" having different meaning on x86, but: To
be consistent with variable names elsewhere, couldn't this variable simply
be named "cpu"? With the other item named "hartid" there's no ambiguity
there anymore.
> + if ( xen_cpuid >= num_possible_cpus() )
> + {
> + printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
> + node->name, hartid, i);
The message continues to be ambiguous (to me as a non-RISC-V person at
least): You log a hart ID, while you say "cpu ID". Also, as I think I
said elsewhere already, the hart ID may better be logged using %#lx.
> + continue;
> + }
> +
> + /* Find MMIO location of MSI page */
> + reloff = i * BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ;
> + for ( index = 0; index < nr_mmios; index++ )
> + {
> + if ( reloff < imsic_cfg.mmios[index].size )
> + break;
> +
> + /*
> + * MMIO region size may not be aligned to
> + * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> + * if holes are present.
> + */
> + reloff -= ROUNDUP(imsic_cfg.mmios[index].size,
> + BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ);
Nit: Indentation once again.
> + }
> +
> + if ( index == nr_mmios )
> + {
> + printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
> + node->name, i);
> + continue;
> + }
> +
> + if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )
If this is the crucial thing to check, ...
> + {
> + printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
> + node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
> + imsic_cfg.msi[xen_cpuid].offset = 0;
> + imsic_cfg.msi[xen_cpuid].base_addr = 0;
> + continue;
> + }
> +
> + cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
> +
> + imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
> + imsic_cfg.msi[xen_cpuid].offset = reloff;
... why is it that the two parts are stored separately? If their sum needs to
be page-aligned, I'd kind of expect it's only ever the sum which is used?
Also is it really PAGE_SHIFT or rather IMSIC_MMIO_PAGE_SHIFT that needs
checking against?
Further please pay attention to line length limits - there are at least two
violations around my earlier comment here.
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/imsic.h
> @@ -0,0 +1,65 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/include/asm/imsic.h
> + *
> + * RISC-V Incoming MSI Controller support
> + *
> + * (c) Microchip Technology Inc.
> + */
> +
> +#ifndef ASM__RISCV__IMSIC_H
> +#define ASM__RISCV__IMSIC_H
Please update according to the most recent naming rules change (all it takes
may be to shrink the double underscores).
> +#include <xen/types.h>
> +
> +#define IMSIC_MMIO_PAGE_SHIFT 12
> +#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
> +
> +#define IMSIC_MIN_ID 63
> +#define IMSIC_MAX_ID 2047
> +
> +struct imsic_msi {
> + paddr_t base_addr;
> + unsigned long offset;
> +};
> +
> +struct imsic_mmios {
> + paddr_t base_addr;
> + unsigned long size;
> + cpumask_var_t cpus;
> +};
> +
> +struct imsic_config {
> + /* Base address */
> + paddr_t base_addr;
> +
> + /* Bits representing Guest index, HART index, and Group index */
> + unsigned int guest_index_bits;
> + unsigned int hart_index_bits;
> + unsigned int group_index_bits;
> + unsigned int group_index_shift;
> +
> + /* IMSIC phandle */
> + unsigned int phandle;
> +
> + /* Number of parent irq */
> + unsigned int nr_parent_irqs;
> +
> + /* Number off interrupt identities */
> + unsigned int nr_ids;
> +
> + /* MMIOs */
> + unsigned int nr_mmios;
> + struct imsic_mmios *mmios;
Are the contents of this and ...
> + /* MSI */
> + struct imsic_msi *msi;
... this array ever changing post-init? If not, the pointers here may want
to be pointer-to-const (requiring local variables in the function populating
the field).
> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
> return pcpu_info[cpuid].hart_id;
> }
>
> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
> +{
> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
> + {
> + if ( hartid == cpuid_to_hartid(cpuid) )
> + return cpuid;
> + }
> +
> + /* hartid isn't valid for some reason */
> + return NR_CPUS;
> +}
Considering the values being returned, why's the function's return type
"unsigned long"?
Why the use of ARRAY_SIZE() in the loop header? You don't use pcpu_info[]
in the loop body.
Finally - on big systems this is going to be pretty inefficient a lookup.
This may want to at least have a TODO comment attached.
Jan
* Re: [PATCH v3 09/14] xen/riscv: aplic_init() implementation
2025-05-21 16:03 ` [PATCH v3 09/14] xen/riscv: aplic_init() implementation Oleksii Kurochko
@ 2025-05-22 15:26 ` Jan Beulich
2025-05-27 14:48 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 15:26 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/aplic-priv.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/aplic-priv.h
> + *
> + * Private part of aplic.h header.
> + *
> + * RISC-V Advanced Platform-Level Interrupt Controller support
> + *
> + * Copyright (c) Microchip.
> + * Copyright (c) Vates.
> + */
> +
> +#ifndef ASM__RISCV_PRIV_APLIC_H
> +#define ASM__RISCV_PRIV_APLIC_H
Nit: This one conforms to neither prior nor current rules.
> --- a/xen/arch/riscv/aplic.c
> +++ b/xen/arch/riscv/aplic.c
> @@ -9,19 +9,113 @@
> * Copyright (c) 2024-2025 Vates
> */
>
> +#include <xen/device_tree.h>
> #include <xen/errno.h>
> #include <xen/init.h>
> #include <xen/irq.h>
> +#include <xen/mm.h>
> #include <xen/sections.h>
> #include <xen/types.h>
> +#include <xen/vmap.h>
> +
> +#include "aplic-priv.h"
>
> #include <asm/device.h>
> +#include <asm/imsic.h>
> #include <asm/intc.h>
> +#include <asm/riscv_encoding.h>
> +
> +#define APLIC_DEFAULT_PRIORITY 1
> +
> +/* The maximum number of wired interrupt sources supported by APLIC domain. */
> +#define APLIC_MAX_NUM_WIRED_IRQ_SOURCES 1023
Wait - what's "wired" here? There's only MSI you said elsewhere?
Further - how's this 1023 related to any of the other uses of that number?
Is this by chance ARRAY_SIZE(aplic.regs->sourcecfg)? If so, it wants
expressing like that, to allow making the connection.
> +static struct aplic_priv aplic;
>
> static struct intc_info __ro_after_init aplic_info = {
> .hw_version = INTC_APLIC,
> };
>
> +static void __init aplic_init_hw_interrupts(void)
> +{
> + unsigned int i;
> +
> + /* Disable all interrupts */
> + for ( i = 0; i < ARRAY_SIZE(aplic.regs->clrie); i++)
> + writel(~0U, &aplic.regs->clrie[i]);
> +
> + /* Set interrupt type and default priority for all interrupts */
> + for ( i = 0; i < aplic_info.num_irqs; i++ )
> + {
> + writel(0, &aplic.regs->sourcecfg[i]);
> + /*
> + * Low bits of target register contains Interrupt Priority bits which
> + * can't be zero according to AIA spec.
> + * Thereby they are initialized to APLIC_DEFAULT_PRIORITY.
> + */
> + writel(APLIC_DEFAULT_PRIORITY, &aplic.regs->target[i]);
> + }
> +
> + writel(APLIC_DOMAINCFG_IE | APLIC_DOMAINCFG_DM, &aplic.regs->domaincfg);
> +}
> +
> +static int __init cf_check aplic_init(void)
> +{
> + dt_phandle imsic_phandle;
> + const __be32 *prop;
> + uint64_t size, paddr;
> + const struct dt_device_node *imsic_node;
> + const struct dt_device_node *node = aplic_info.node;
> + int rc;
> +
> + /* Check for associated imsic node */
> + if ( !dt_property_read_u32(node, "msi-parent", &imsic_phandle) )
> + panic("%s: IDC mode not supported\n", node->full_name);
> +
> + imsic_node = dt_find_node_by_phandle(imsic_phandle);
> + if ( !imsic_node )
> + panic("%s: unable to find IMSIC node\n", node->full_name);
> +
> + rc = imsic_init(imsic_node);
> + if ( rc == IRQ_M_EXT )
> + /* Machine mode imsic node, ignore this aplic node */
> + return 0;
> + else if ( rc )
As before: No "else" please when the earlier if() ends in an unconditional
control flow change.
> + panic("%s: Failded to initialize IMSIC\n", node->full_name);
> +
> + /* Find out number of interrupt sources */
> + if ( !dt_property_read_u32(node, "riscv,num-sources",
> + &aplic_info.num_irqs) )
> + panic("%s: failed to get number of interrupt sources\n",
> + node->full_name);
> +
> + if ( aplic_info.num_irqs > APLIC_MAX_NUM_WIRED_IRQ_SOURCES )
> + panic("%s: too big number of riscv,num-source: %u\n",
> + __func__, aplic_info.num_irqs);
Is it actually necessary to panic() in this case? Can't you just lower
.num_irqs instead (rendering higher IRQs, if any, non-functional)?
> + prop = dt_get_property(node, "reg", NULL);
> + dt_get_range(&prop, node, &paddr, &size);
> + if ( !paddr )
> + panic("%s: first MMIO resource not found\n", node->full_name);
> +
> + aplic.paddr_start = paddr;
> + aplic.size = size;
> +
> + aplic.regs = ioremap(paddr, size);
Doesn't size need to match certain constraints? If too low, you may
need to panic(), while if too high you may not need to map the entire
range?
Does paddr perhaps also need to match certain constraints, like having
the low so many bits clear?
> +static struct intc_hw_operations __ro_after_init aplic_ops = {
> + .info = &aplic_info,
> + .init = aplic_init,
> +};
Why's this __ro_after_init and not simply const? I can't spot any write
to it.
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/aplic.h
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/asm/include/aplic.h
> + *
> + * RISC-V Advanced Platform-Level Interrupt Controller support
> + *
> + * Copyright (c) Microchip.
> + */
> +
> +#ifndef ASM__RISCV__APLIC_H
> +#define ASM__RISCV__APLIC_H
Wants updating again.
> +#include <xen/types.h>
> +
> +#include <asm/imsic.h>
> +
> +#define APLIC_DOMAINCFG_IE BIT(8, UL)
> +#define APLIC_DOMAINCFG_DM BIT(2, UL)
Why UL when ...
> +struct aplic_regs {
> + uint32_t domaincfg;
... this is just 32 bits wide?
Jan
* Re: [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers
2025-05-21 16:03 ` [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers Oleksii Kurochko
@ 2025-05-22 15:32 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 15:32 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> Introduce intc_init() to initialize the interrupt controller using the
> registered hardware ops.
> Also add intc_route_irq_to_xen() to route IRQs to Xen, with support for
> setting IRQ type and priority via new internal helpers intc_set_irq_type()
> and intc_set_irq_priority().
>
> Call intc_init() to do basic initialization steps for APLIC and IMSIC.
>
> Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations
2025-05-21 16:03 ` [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations Oleksii Kurochko
@ 2025-05-22 15:55 ` Jan Beulich
2025-05-28 11:00 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-05-22 15:55 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> +static void aplic_set_irq_affinity(struct irq_desc *desc, const cpumask_t *mask)
> +{
> + unsigned int cpu;
> + uint64_t group_index, base_ppn;
> + uint32_t hhxw, lhxw ,hhxs, value;
Nit: Comma vs blank placement.
> + const struct imsic_config *imsic = aplic.imsic_cfg;
> +
> + /*
> + * TODO: Currently, APLIC is supported only with MSI interrupts.
> + * If APLIC without MSI interrupts is required in the future,
> + * this function will need to be updated accordingly.
> + */
> + ASSERT(readl(&aplic.regs->domaincfg) & APLIC_DOMAINCFG_DM);
> +
> + ASSERT(!cpumask_empty(mask));
> +
> + ASSERT(spin_is_locked(&desc->lock));
> +
> + cpu = cpuid_to_hartid(aplic_get_cpu_from_mask(mask));
> + hhxw = imsic->group_index_bits;
> + lhxw = imsic->hart_index_bits;
> + hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
> + base_ppn = imsic->msi[cpu].base_addr >> IMSIC_MMIO_PAGE_SHIFT;
> +
> + /* Update hart and EEID in the target register */
> + group_index = (base_ppn >> (hhxs + IMSIC_MMIO_PAGE_SHIFT)) & (BIT(hhxw, UL) - 1);
Nit: Line length.
I'm also puzzled by the various uses of IMSIC_MMIO_PAGE_SHIFT. Why do you
subtract double the value when calculating hhxs, just to add the value
back in here? There's no other use of the variable afaics.
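A standalone sketch (not Xen code) of why the two adjustments cancel out. The
function names are invented for this example, and it assumes
group_index_shift >= 2 * IMSIC_MMIO_PAGE_SHIFT (which imsic_parse_node()
enforces) as well as shift and bit counts below 64.

```c
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12

/* Group index computed the way the posted patch does it. */
static uint64_t group_index_as_posted(uint64_t base_addr,
                                      unsigned int group_index_bits,
                                      unsigned int group_index_shift)
{
    unsigned int hhxs = group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
    uint64_t base_ppn = base_addr >> IMSIC_MMIO_PAGE_SHIFT;

    return (base_ppn >> (hhxs + IMSIC_MMIO_PAGE_SHIFT)) &
           ((UINT64_C(1) << group_index_bits) - 1);
}

/* Same value, taken straight from the address. */
static uint64_t group_index_direct(uint64_t base_addr,
                                   unsigned int group_index_bits,
                                   unsigned int group_index_shift)
{
    return (base_addr >> group_index_shift) &
           ((UINT64_C(1) << group_index_bits) - 1);
}
```

The total shift applied to the address is 12 + (shift - 24) + 12, i.e. just
group_index_shift, so hhxs is only worth keeping if it is needed elsewhere.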
> + value = desc->irq;
> + value |= cpu << APLIC_TARGET_HART_IDX_SHIFT;
> + value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT) ;
Nit: Stray blank.
> + spin_lock(&aplic.lock);
> +
> + writel(value, &aplic.regs->target[desc->irq - 1]);
> +
> + spin_unlock(&aplic.lock);
> +}
> +
> +static const hw_irq_controller aplic_xen_irq_type = {
> + .typename = "aplic",
> + .startup = aplic_irq_startup,
> + .shutdown = aplic_irq_disable,
> + .enable = aplic_irq_enable,
> + .disable = aplic_irq_disable,
> + .set_affinity = aplic_set_irq_affinity,
As indicated before, for functions you use as hooks you want to save
yourself (or someone else) future work by marking them cf_check right
away.
> --- a/xen/arch/riscv/imsic.c
> +++ b/xen/arch/riscv/imsic.c
> @@ -22,7 +22,124 @@
>
> #include <asm/imsic.h>
>
> -static struct imsic_config imsic_cfg;
> +static struct imsic_config imsic_cfg = {
> + .lock = SPIN_LOCK_UNLOCKED,
> +};
> +
> +#define IMSIC_DISABLE_EIDELIVERY 0
> +#define IMSIC_ENABLE_EIDELIVERY 1
> +#define IMSIC_DISABLE_EITHRESHOLD 1
> +#define IMSIC_ENABLE_EITHRESHOLD 0
> +
> +#define imsic_csr_write(c, v) \
> +do { \
> + csr_write(CSR_SISELECT, c); \
> + csr_write(CSR_SIREG, v); \
> +} while (0)
> +
> +#define imsic_csr_set(c, v) \
> +do { \
> + csr_write(CSR_SISELECT, c); \
> + csr_set(CSR_SIREG, v); \
> +} while (0)
> +
> +#define imsic_csr_clear(c, v) \
> +do { \
> + csr_write(CSR_SISELECT, c); \
> + csr_clear(CSR_SIREG, v); \
> +} while (0)
> +
> +void __init imsic_ids_local_delivery(bool enable)
> +{
> + if ( enable )
> + {
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> + }
> + else
> + {
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> + }
> +}
> +
> +static void imsic_local_eix_update(unsigned long base_id, unsigned long num_id,
> + bool pend, bool val)
> +{
> + unsigned long id = base_id, last_id = base_id + num_id;
> +
> + while ( id < last_id )
> + {
> + unsigned long isel, ireg;
> + unsigned long start_id = id & (__riscv_xlen - 1);
> + unsigned long chunk = __riscv_xlen - start_id;
> + unsigned long count = (last_id - id < chunk) ? last_id - id : chunk;
Any reason you open-code min() here?
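A standalone model (not Xen code) of the chunk walk with the minimum spelled
as a macro. XLEN stands in for __riscv_xlen and is assumed to be 64; MIN() is
a plain macro standing in for Xen's type-checked min(); genmask64() and
chunk_mask() are names invented for this sketch. chunk_mask() returns the bit
mask that the idx-th loop iteration would write, or 0 if the walk finishes
earlier.

```c
#include <stdint.h>

#define XLEN 64 /* stand-in for __riscv_xlen, assumed 64 in this sketch */

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Bits lo..hi (inclusive) set; caller guarantees lo <= hi < XLEN. */
static uint64_t genmask64(unsigned int hi, unsigned int lo)
{
    return (~UINT64_C(0) >> (XLEN - 1 - hi)) & (~UINT64_C(0) << lo);
}

/* Mask used by the idx-th iteration over [base_id, base_id + num_id). */
static uint64_t chunk_mask(unsigned long base_id, unsigned long num_id,
                           unsigned int idx)
{
    unsigned long id = base_id, last_id = base_id + num_id;

    while ( id < last_id )
    {
        unsigned long start = id & (XLEN - 1);
        /* Stop at the end of the range or of the register, whichever is first. */
        unsigned long count = MIN(last_id - id, XLEN - start);

        if ( idx-- == 0 )
            return genmask64(start + count - 1, start);

        id += count;
    }

    return 0;
}
```

For example, ten identities starting at 60 are split into bits 60-63 of one
register and bits 0-5 of the next.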
> + isel = id / __riscv_xlen;
> + isel *= __riscv_xlen / IMSIC_EIPx_BITS;
> + isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> + ireg = GENMASK(start_id + count - 1, start_id);
> +
> + id += count;
> +
> + if ( val )
> + imsic_csr_set(isel, ireg);
> + else
> + imsic_csr_clear(isel, ireg);
> + }
> +}
> +
> +void imsic_irq_enable(unsigned int irq)
> +{
> + /*
> + * The only caller of imsic_irq_enable() is aplic_irq_enable(), which
> + * already runs with IRQs disabled. Therefore, there's no need to use
> + * spin_lock_irqsave() in this function.
> + *
> + * This ASSERT is added as a safeguard: if imsic_irq_enable() is ever
> + * called from a context where IRQs are not disabled,
> + * spin_lock_irqsave() should be used instead of spin_lock().
> + */
> + ASSERT(!local_irq_is_enabled());
> +
> + spin_lock(&imsic_cfg.lock);
> + /*
> + * There is no irq - 1 here (look at aplic_set_irq_type()) because:
> + * From the spec:
> + * When an interrupt file supports distinct interrupt identities,
> + * valid identity numbers are between 1 and inclusive. The identity
> + * numbers within this range are said to be implemented by the interrupt
> + * file; numbers outside this range are not implemented. The number zero
> + * is never a valid interrupt identity.
> + * ...
> + * Bit positions in a valid eiek register that don’t correspond to a
> + * supported interrupt identity (such as bit 0 of eie0) are read-only zeros.
> + *
> + * So in EIx registers interrupt i corresponds to bit i in comparison wiht
> + * APLIC's sourcecfg which starts from 0.
> + */
> + imsic_local_eix_update(irq, 1, false, true);
> + spin_unlock(&imsic_cfg.lock);
> +}
> +
> +void imsic_irq_disable(unsigned int irq)
> +{
> + /*
> + * The only caller of imsic_irq_disable() is aplic_irq_enable(), which
s/enable/disable/ ?
Jan
* Re: [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation
2025-05-22 7:50 ` Jan Beulich
@ 2025-05-26 10:46 ` Oleksii Kurochko
2025-05-26 12:56 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-26 10:46 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 5/22/25 9:50 AM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/smpboot.c
>> +++ b/xen/arch/riscv/smpboot.c
>> @@ -1,5 +1,8 @@
>> #include <xen/cpumask.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/errno.h>
>> #include <xen/init.h>
>> +#include <xen/types.h>
>> #include <xen/sections.h>
> Nit: The latter insertion wants to move one line down. Yet then - isn't
> xen/stdint.h sufficient here?
__be32 used in dt_get_hartid() is defined in xen/types.h.
>
>> @@ -14,3 +17,69 @@ void __init smp_prepare_boot_cpu(void)
>> cpumask_set_cpu(0, &cpu_possible_map);
>> cpumask_set_cpu(0, &cpu_online_map);
>> }
>> +
>> +/**
>> + * dt_get_hartid - Get the hartid from a CPU device node
>> + *
>> + * @cpun: CPU number(logical index) for which device node is required
>> + *
>> + * Return: The hartid for the CPU node or ~0UL if not found.
>> + */
>> +static unsigned long dt_get_hartid(const struct dt_device_node *cpun)
>> +{
>> + const __be32 *cell;
>> + unsigned int ac;
>> + uint32_t len;
>> +
>> + ac = dt_n_addr_cells(cpun);
>> + cell = dt_get_property(cpun, "reg", &len);
>> + if ( !cell || !ac || ((sizeof(*cell) * ac) > len) )
> Does DT make any guarantees for this multiplication to not overflow?
I haven't checked whether DTC verifies such things during compilation, but
considering that the ac value is a uint32_t (according to the DT spec),
overflow could really happen.
I will add the following to check an overflow:
if ( ac > ((sizeof(size_t) * BIT_PER_BYTE) / sizeof(*cell)) )
{
printk("%s: overflow detected\n", __func__);
return ~0UL;
}
>
>> + return ~0UL;
>> +
>> + return dt_read_number(cell, ac);
>> +}
>> +
>> +/*
>> + * Returns the hartid of the given device tree node, or -ENODEV if the node
>> + * isn't an enabled and valid RISC-V hart node.
>> + */
>> +int dt_processor_hartid(const struct dt_device_node *node,
>> + unsigned long *hartid)
>> +{
>> + const char *isa;
>> + int ret;
>> +
>> + if ( !dt_device_is_compatible(node, "riscv") )
>> + {
>> + printk("Found incompatible CPU\n");
>> + return -ENODEV;
>> + }
>> +
>> + *hartid = dt_get_hartid(node);
>> + if ( *hartid == ~0UL )
>> + {
>> + printk("Found CPU without CPU ID\n");
>> + return -ENODATA;
>> + }
>> +
>> + if ( !dt_device_is_available(node))
>> + {
>> + printk("CPU with hartid=%lu is not available\n", *hartid);
> Considering that hart ID assignment is outside of our control, would we
> perhaps better (uniformly) log such using %#lx?
It makes sense. DTC also uses hex numbers when it generates a dts from a dtb,
so it could help to find a failing node faster.
>
>> + return -ENODEV;
>> + }
>> +
>> + if ( (ret = dt_property_read_string(node, "riscv,isa", &isa)) )
>> + {
>> + printk("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hartid);
>> + return ret;
>> + }
>> +
>> + if ( isa[0] != 'r' || isa[1] != 'v' )
>> + {
>> + printk("CPU with hartid=%lu has an invalid ISA of \"%s\"\n", *hartid,
>> + isa);
>> + return -EINVAL;
> As before -EINVAL is appropriate when input arguments have wrong values.
> Here, however, you found an unexpected value in something the platform
> has presented to you. While not entirely appropriate either, maybe
> -ENODEV again (if nothing better can be found)?
I don't see a better candidate.
I wonder whether some reserved range exists for user-defined errors.
~ Oleksii
* Re: [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation
2025-05-26 10:46 ` Oleksii Kurochko
@ 2025-05-26 12:56 ` Oleksii Kurochko
0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-26 12:56 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, xen-devel
On 5/26/25 12:46 PM, Oleksii Kurochko wrote:
>
>
> On 5/22/25 9:50 AM, Jan Beulich wrote:
>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/smpboot.c
>>> +++ b/xen/arch/riscv/smpboot.c
>>> @@ -1,5 +1,8 @@
>>> #include <xen/cpumask.h>
>>> +#include <xen/device_tree.h>
>>> +#include <xen/errno.h>
>>> #include <xen/init.h>
>>> +#include <xen/types.h>
>>> #include <xen/sections.h>
>> Nit: The latter insertion wants to move one line down. Yet then - isn't
>> xen/stdint.h sufficient here?
> __be32 used in dt_get_hartid() is defined in xen/types.h.
>
>>> @@ -14,3 +17,69 @@ void __init smp_prepare_boot_cpu(void)
>>> cpumask_set_cpu(0, &cpu_possible_map);
>>> cpumask_set_cpu(0, &cpu_online_map);
>>> }
>>> +
>>> +/**
>>> + * dt_get_hartid - Get the hartid from a CPU device node
>>> + *
>>> + * @cpun: CPU number(logical index) for which device node is required
>>> + *
>>> + * Return: The hartid for the CPU node or ~0UL if not found.
>>> + */
>>> +static unsigned long dt_get_hartid(const struct dt_device_node *cpun)
>>> +{
>>> + const __be32 *cell;
>>> + unsigned int ac;
>>> + uint32_t len;
>>> +
>>> + ac = dt_n_addr_cells(cpun);
>>> + cell = dt_get_property(cpun, "reg", &len);
>>> + if ( !cell || !ac || ((sizeof(*cell) * ac) > len) )
>> Does DT make any guarantees for this multiplication to not overflow?
> I haven't checked whether DTC verifies such things during compilation, but
> considering that the ac value is a uint32_t (according to the DT spec),
> overflow could really happen.
> I will add the following to check an overflow:
> if ( ac > ((sizeof(size_t) * BIT_PER_BYTE) / sizeof(*cell)) )
> {
> printk("%s: overflow detected\n", __func__);
> return ~0UL;
> }
Oops, I meant the maximum value of size_t here instead of
(sizeof(size_t) * BIT_PER_BYTE); I lost the "power of 2 minus 1" part.
Probably SIZE_T_MAX or something similar exists; I have to check.
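For reference, a standalone sketch (not Xen code) of one way to phrase the
length check so that nothing can wrap: divide the property length by the cell
size instead of multiplying the cell count. Both helper names are invented
for this example. The second helper models the multiplication in 32-bit
arithmetic only to show the hazard; with a 64-bit size_t the product of a
4-byte cell size and a 32-bit count cannot wrap in the first place.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if "ac" cells of "cell_size" bytes fit into "len" bytes. */
static bool cells_fit(uint32_t ac, size_t cell_size, uint32_t len)
{
    /* Division cannot overflow, unlike cell_size * ac. */
    return ac != 0 && ac <= len / cell_size;
}

/* Same test done by multiplying in 32 bits: the product may wrap. */
static bool cells_fit_mul32(uint32_t ac, uint32_t cell_size, uint32_t len)
{
    return ac != 0 && (uint32_t)(cell_size * ac) <= len;
}
```

With ac = 0x40000001 and 4-byte cells the 32-bit product wraps to 4, so the
multiplying variant wrongly accepts an 8-byte property while the dividing
variant rejects it.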
~ Oleksii
>
>>> + return ~0UL;
>>> +
>>> + return dt_read_number(cell, ac);
>>> +}
>>> +
>>> +/*
>>> + * Returns the hartid of the given device tree node, or -ENODEV if the node
>>> + * isn't an enabled and valid RISC-V hart node.
>>> + */
>>> +int dt_processor_hartid(const struct dt_device_node *node,
>>> + unsigned long *hartid)
>>> +{
>>> + const char *isa;
>>> + int ret;
>>> +
>>> + if ( !dt_device_is_compatible(node, "riscv") )
>>> + {
>>> + printk("Found incompatible CPU\n");
>>> + return -ENODEV;
>>> + }
>>> +
>>> + *hartid = dt_get_hartid(node);
>>> + if ( *hartid == ~0UL )
>>> + {
>>> + printk("Found CPU without CPU ID\n");
>>> + return -ENODATA;
>>> + }
>>> +
>>> + if ( !dt_device_is_available(node))
>>> + {
>>> + printk("CPU with hartid=%lu is not available\n", *hartid);
>> Considering that hart ID assignment is outside of our control, would we
>> perhaps better (uniformly) log such using %#lx?
> It makes sense. DTC also uses hex numbers when it generates a dts from a dtb,
> so it could help to find a failing node faster.
>
>>> + return -ENODEV;
>>> + }
>>> +
>>> + if ( (ret = dt_property_read_string(node, "riscv,isa", &isa)) )
>>> + {
>>> + printk("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hartid);
>>> + return ret;
>>> + }
>>> +
>>> + if ( isa[0] != 'r' || isa[1] != 'v' )
>>> + {
>>> + printk("CPU with hartid=%lu has an invalid ISA of \"%s\"\n", *hartid,
>>> + isa);
>>> + return -EINVAL;
>> As before -EINVAL is appropriate when input arguments have wrong values.
>> Here, however, you found an unexpected value in something the platform
>> has presented to you. While not entirely appropriate either, maybe
>> -ENODEV again (if nothing better can be found)?
> I don't see a better candidate.
>
> I wonder whether some reserved range exists for user-defined errors.
>
> ~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-22 14:46 ` Jan Beulich
@ 2025-05-26 18:44 ` Oleksii Kurochko
2025-05-27 11:30 ` Oleksii Kurochko
2025-06-02 10:21 ` Jan Beulich
0 siblings, 2 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-26 18:44 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 5/22/25 4:46 PM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> --- /dev/null
>> +++ b/xen/arch/riscv/imsic.c
>> @@ -0,0 +1,354 @@
>> +/* SPDX-License-Identifier: MIT */
>> +
>> +/*
>> + * xen/arch/riscv/imsic.c
>> + *
>> + * RISC-V Incoming MSI Controller support
>> + *
>> + * (c) Microchip Technology Inc.
>> + * (c) Vates
>> + */
>> +
>> +#include <xen/bitops.h>
>> +#include <xen/const.h>
>> +#include <xen/cpumask.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/errno.h>
>> +#include <xen/init.h>
>> +#include <xen/macros.h>
>> +#include <xen/smp.h>
>> +#include <xen/spinlock.h>
>> +#include <xen/xmalloc.h>
>> +
>> +#include <asm/imsic.h>
>> +
>> +static struct imsic_config imsic_cfg;
>> +
>> + irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
>> + if ( !irq_range )
>> + panic("%s: irq_range[] allocation failed\n", __func__);
>> +
>> + if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
>> + irq_range, *nr_parent_irqs * 2)) )
>> + panic("%s: unable to find interrupt-extended in %s node: %d\n",
>> + __func__, dt_node_full_name(node), rc);
>> +
>> + if ( irq_range[1] == IRQ_M_EXT )
>> + {
>> + /* Machine mode imsic node, ignore it. */
>> + rc = IRQ_M_EXT;
>> + goto cleanup;
>> + }
> Wouldn't this better be done ...
>
>> + /* Check that interrupts-extended property is well-formed. */
>> + for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
>> + {
>> + if ( irq_range[i + 1] != irq_range[1] )
>> + panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
>> + }
> ... after this consistency check?
My intention was: if the first irq_range isn't IRQ_M_EXT, then there's no
point in parsing the full interrupts-extended property.
However, on the other hand, it might be important to ensure that the
interrupts-extended property is properly formed.
So yes, it makes sense to move the check above, before the for loop.
> Also %u please when you log unsigned values.
>
>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>> + &imsic_cfg.guest_index_bits) )
>> + imsic_cfg.guest_index_bits = 0;
>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>> + if ( tmp < imsic_cfg.guest_index_bits )
>> + {
>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>> + dt_node_name(node));
>> + rc = -ENOENT;
>> + goto cleanup;
>> + }
>> +
>> + /* Find number of HART index bits */
>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>> + &imsic_cfg.hart_index_bits) )
>> + {
>> + /* Assume default value */
>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>> + imsic_cfg.hart_index_bits++;
> Since fls() returns a 1-based bit number, isn't it rather that in the
> exact-power-of-2 case you'd need to subtract 1?
Agree, in this case, -1 should be taken into account.
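A standalone sketch (not Xen code) of the default computation. fls_sketch()
is a portable stand-in for fls() with the same 1-based convention, and the
other two helper names are invented for this example. hart_bits_min() is just
one possible way to get the smallest width, not necessarily what the next
version of the patch will do; it assumes n >= 1.

```c
/* 1-based index of the most significant set bit; 0 for x == 0. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int r = 0;

    while ( x )
    {
        r++;
        x >>= 1;
    }

    return r;
}

/* Default as computed in the posted patch. */
static unsigned int hart_bits_as_posted(unsigned int n)
{
    unsigned int bits = fls_sketch(n);

    /* 2^fls(n) is always greater than n, so this never triggers. */
    if ( (1UL << bits) < n )
        bits++;

    return bits;
}

/* Smallest number of bits b such that 2^b >= n. */
static unsigned int hart_bits_min(unsigned int n)
{
    return fls_sketch(n - 1);
}
```

For 4 parent IRQs the posted computation yields 3 bits where 2 are enough;
for 3 parent IRQs both give 2.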
>> + }
>> + tmp -= imsic_cfg.guest_index_bits;
>> + if ( tmp < imsic_cfg.hart_index_bits )
>> + {
>> + printk(XENLOG_ERR "%s: HART index bits too big\n",
>> + dt_node_name(node));
>> + rc = -ENOENT;
>> + goto cleanup;
>> + }
>> +
>> + /* Find number of group index bits */
>> + if ( !dt_property_read_u32(node, "riscv,group-index-bits",
>> + &imsic_cfg.group_index_bits) )
>> + imsic_cfg.group_index_bits = 0;
>> + tmp -= imsic_cfg.hart_index_bits;
>> + if ( tmp < imsic_cfg.group_index_bits )
>> + {
>> + printk(XENLOG_ERR "%s: group index bits too big\n",
>> + dt_node_name(node));
>> + rc = -ENOENT;
>> + goto cleanup;
>> + }
>> +
>> + /* Find first bit position of group index */
>> + tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
>> + if ( !dt_property_read_u32(node, "riscv,group-index-shift",
>> + &imsic_cfg.group_index_shift) )
>> + imsic_cfg.group_index_shift = tmp;
>> + if ( imsic_cfg.group_index_shift < tmp )
>> + {
>> + printk(XENLOG_ERR "%s: group index shift too small\n",
>> + dt_node_name(node));
>> + rc = -ENOENT;
>> + goto cleanup;
>> + }
>> + tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
>> + if ( tmp >= BITS_PER_LONG )
>> + {
>> + printk(XENLOG_ERR "%s: group index shift too big\n",
>> + dt_node_name(node));
>> + rc = -EINVAL;
>> + goto cleanup;
>> + }
>> +
>> + /* Find number of interrupt identities */
>> + if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
>> + {
>> + printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
>> + node->name);
>> + rc = -ENOENT;
>> + goto cleanup;
>> + }
>> +
>> + if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
>> + (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
>> + ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )
> Now that you've explained to me what the deal is with these constants: Isn't
> the 1st of the three checks redundant with the last one?
Agree, one of them could be dropped.
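The redundancy can be checked with a small standalone sketch. The constants are copied from the quoted asm/imsic.h; the two helper names are made up for illustration and are not part of the patch. Since IMSIC_MIN_ID is 63, requiring the low six bits to be set already implies nr_ids >= 63:

```c
#include <stdbool.h>

/* Values taken from the quoted asm/imsic.h. */
#define IMSIC_MIN_ID 63
#define IMSIC_MAX_ID 2047

/* The three-part rejection test as quoted from the patch. */
static bool nr_ids_invalid_full(unsigned int nr_ids)
{
    return (nr_ids < IMSIC_MIN_ID) ||
           (nr_ids > IMSIC_MAX_ID) ||
           ((nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID);
}

/*
 * Reduced test (hypothetical helper): having the low six bits all set
 * already implies nr_ids >= 63, so the lower-bound comparison is dropped.
 */
static bool nr_ids_invalid_reduced(unsigned int nr_ids)
{
    return (nr_ids > IMSIC_MAX_ID) ||
           ((nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID);
}
```

Both helpers should agree for every input, so either the first or the last comparison could go; dropping the first keeps the "all low bits set" requirement explicit.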
>> +/*
>> + * Initialize the imsic_cfg structure based on the IMSIC DT node.
>> + *
>> + * Returns 0 if initialization is successful, a negative value on failure,
>> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
>> + * which should be ignored by the hypervisor.
>> + */
>> +int __init imsic_init(const struct dt_device_node *node)
>> +{
>> + int rc;
>> + unsigned long reloff, hartid;
>> + unsigned int nr_parent_irqs, index, nr_handlers = 0;
>> + paddr_t base_addr;
>> + unsigned int nr_mmios;
>> +
>> + /* Parse IMSIC node */
>> + rc = imsic_parse_node(node, &nr_parent_irqs);
>> + /*
>> + * If machine mode imsic node => ignore it.
>> + * If rc < 0 => parsing of IMSIC DT node failed.
>> + */
>> + if ( (rc == IRQ_M_EXT) || rc )
>> + return rc;
> The former of the checks is redundant with the latter. Did you perhaps mean
> "rc < 0" for that one?
Yes, the comment is correct, but I made a mistake in the code.
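A minimal sketch of the condition the comment describes, assuming IRQ_M_EXT is 11 (the machine external interrupt number in the RISC-V privileged spec). The helper name is hypothetical and only serves to isolate the check:

```c
#include <stdbool.h>

/* Assumed stand-in; the real definition lives in Xen's RISC-V headers. */
#define IRQ_M_EXT 11

/* True when imsic_init() should hand rc straight back to its caller. */
static bool imsic_parse_should_bail(int rc)
{
    /* Machine-mode IMSIC node (to be ignored), or a negative error code. */
    return (rc == IRQ_M_EXT) || (rc < 0);
}
```

With "|| rc" instead of "|| (rc < 0)", the first comparison can never change the outcome, which is the redundancy pointed out in the review.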
>
>> + nr_mmios = imsic_cfg.nr_mmios;
>> +
>> + /* Allocate MMIO resource array */
>> + imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
> How large can this and ...
>
>> + if ( !imsic_cfg.mmios )
>> + {
>> + rc = -ENOMEM;
>> + goto imsic_init_err;
>> + }
>> +
>> + imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
> ... this array grow (in principle)?
Roughly speaking, this is the number of processors. The highest processor count
I have seen on the market was 32, but I last checked over a year ago.
> I think you're aware that in principle
> new code is expected to use xvmalloc() and friends unless there are specific
> reasons speaking against that.
Oh, missed 'v'...
>
>> + if ( !imsic_cfg.msi )
>> + {
>> + rc = -ENOMEM;
>> + goto imsic_init_err;
>> + }
>> +
>> + /* Check MMIO register sets */
>> + for ( unsigned int i = 0; i < nr_mmios; i++ )
>> + {
>> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>> + {
>> + rc = -ENOMEM;
>> + goto imsic_init_err;
>> + }
>> +
>> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>> + &imsic_cfg.mmios[i].size);
>> + if ( rc )
>> + {
>> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>> + node->name, i);
>> + goto imsic_init_err;
>> + }
>> +
>> + base_addr = imsic_cfg.mmios[i].base_addr;
>> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>> + imsic_cfg.hart_index_bits +
>> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>> + imsic_cfg.group_index_shift);
>> + if ( base_addr != imsic_cfg.base_addr )
>> + {
>> + rc = -EINVAL;
>> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>> + node->name, i);
>> + goto imsic_init_err;
>> + }
> Maybe just for my own understanding: There's no similar check for the
> sizes to match / be consistent wanted / needed?
If you are speaking about imsic_cfg.mmios[i].size, then IMO it depends fully on what
the h/w provides.
So I don't know what the possible range for imsic_cfg.mmios[i].size is.
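For reference, the base-address consistency check in the quoted hunk boils down to the masking below. This is only a sketch: struct imsic_index_layout and imsic_common_base() are hypothetical names, the struct is a trimmed-down stand-in for the index fields of struct imsic_config, and the shifts are assumed to stay below 64.

```c
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12

/* Hypothetical stand-in for the index fields of struct imsic_config. */
struct imsic_index_layout {
    unsigned int guest_index_bits;
    unsigned int hart_index_bits;
    unsigned int group_index_bits;
    unsigned int group_index_shift;
};

/* Clear the guest/hart index bits and the group index field from addr. */
static uint64_t imsic_common_base(const struct imsic_index_layout *l,
                                  uint64_t addr)
{
    unsigned int low_bits = l->guest_index_bits + l->hart_index_bits +
                            IMSIC_MMIO_PAGE_SHIFT;

    addr &= ~((UINT64_C(1) << low_bits) - 1);
    addr &= ~(((UINT64_C(1) << l->group_index_bits) - 1) <<
              l->group_index_shift);

    return addr;
}
```

Every regset address has to reduce to the same value as imsic_cfg.base_addr; nothing comparable is derived from the regset sizes, which is what the question above is about.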
>> + }
>> +
>> + /* Configure handlers for target CPUs */
>> + for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
>> + {
>> + unsigned long xen_cpuid;
>> +
>> + rc = imsic_get_parent_hartid(node, i, &hartid);
>> + if ( rc )
>> + {
>> + printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
>> + node->name, i);
>> + continue;
>> + }
>> +
>> + xen_cpuid = hartid_to_cpuid(hartid);
> I'm probably biased by "cpuid" having different meaning on x86, but: To
> be consistent with variable names elsewhere, couldn't this variable simply
> be named "cpu"? With the other item named "hartid" there's no ambiguity
> there anymore.
Sure, I will use "cpu"/"xen_cpu" instead of xen_cpuid.
>
>> + if ( xen_cpuid >= num_possible_cpus() )
>> + {
>> + printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
>> + node->name, hartid, i);
> The message continues to be ambiguous (to me as a non-RISC-V person at
> least): You log a hart ID, while you say "cpu ID". Also, as I think I
> said elsewhere already, the hart ID may better be logged using %#lx.
I will correct the message.
>> + }
>> +
>> + if ( index == nr_mmios )
>> + {
>> + printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
>> + node->name, i);
>> + continue;
>> + }
>> +
>> + if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )
> If this is the crucial thing to check, ...
>
>> + {
>> + printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
>> + node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
>> + imsic_cfg.msi[xen_cpuid].offset = 0;
>> + imsic_cfg.msi[xen_cpuid].base_addr = 0;
>> + continue;
>> + }
>> +
>> + cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
>> +
>> + imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
>> + imsic_cfg.msi[xen_cpuid].offset = reloff;
> ... why is it that the two parts are stored separately? If their sum needs to
> be page-aligned, I'd kind of expect it's only ever the sum which is used?
Because only base_addr is used in APLIC's target register; which interrupt
file to use is selected by the hart/guest index:
    static void vaplic_update_target(const struct imsic_config *imsic,
                                     int guest_id, unsigned long hart_id,
                                     uint32_t *value)
    {
        unsigned long group_index;
        uint32_t hhxw = imsic->group_index_bits;
        uint32_t lhxw = imsic->hart_index_bits;
        uint32_t hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
        unsigned long base_ppn = imsic->msi[hart_id].base_addr >> IMSIC_MMIO_PAGE_SHIFT;

        group_index = (base_ppn >> (hhxs + 12)) & (BIT(hhxw, UL) - 1);

        *value &= APLIC_TARGET_EIID_MASK;
        *value |= guest_id << APLIC_TARGET_GUEST_IDX_SHIFT;
        *value |= hart_id << APLIC_TARGET_HART_IDX_SHIFT;
        *value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT);
    }
> Also is it really PAGE_SHIFT or rather IMSIC_MMIO_PAGE_SHIFT that needs
> checking against?
I think IMSIC_MMIO_PAGE_SHIFT would be more correct.
>
> Further please pay attention to line length limits - there are at least two
> violations around my earlier comment here.
>
>> --- /dev/null
>> +++ b/xen/arch/riscv/include/asm/imsic.h
>> @@ -0,0 +1,65 @@
>> +/* SPDX-License-Identifier: MIT */
>> +
>> +/*
>> + * xen/arch/riscv/include/asm/imsic.h
>> + *
>> + * RISC-V Incoming MSI Controller support
>> + *
>> + * (c) Microchip Technology Inc.
>> + */
>> +
>> +#ifndef ASM__RISCV__IMSIC_H
>> +#define ASM__RISCV__IMSIC_H
> Please update according to the most recent naming rules change (all it takes
> may be to shrink the double underscores).
>
>> +#include <xen/types.h>
>> +
>> +#define IMSIC_MMIO_PAGE_SHIFT 12
>> +#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
>> +
>> +#define IMSIC_MIN_ID 63
>> +#define IMSIC_MAX_ID 2047
>> +
>> +struct imsic_msi {
>> + paddr_t base_addr;
>> + unsigned long offset;
>> +};
>> +
>> +struct imsic_mmios {
>> + paddr_t base_addr;
>> + unsigned long size;
>> + cpumask_var_t cpus;
>> +};
>> +
>> +struct imsic_config {
>> + /* Base address */
>> + paddr_t base_addr;
>> +
>> + /* Bits representing Guest index, HART index, and Group index */
>> + unsigned int guest_index_bits;
>> + unsigned int hart_index_bits;
>> + unsigned int group_index_bits;
>> + unsigned int group_index_shift;
>> +
>> + /* IMSIC phandle */
>> + unsigned int phandle;
>> +
>> + /* Number of parent irq */
>> + unsigned int nr_parent_irqs;
>> +
>> + /* Number of interrupt identities */
>> + unsigned int nr_ids;
>> +
>> + /* MMIOs */
>> + unsigned int nr_mmios;
>> + struct imsic_mmios *mmios;
> Are the contents of this and ...
>
>> + /* MSI */
>> + struct imsic_msi *msi;
> ... this array ever changing post-init? If not, the pointers here may want
> to be pointer-to-const (requiring local variables in the function populating
> the field).
No, they are initialized once.
Moreover, I think we can drop *mmios and nr_mmios from this struct, as they are used only
in imsic_init() and so could be allocated and freed there.
Only *msi is used in the function (vaplic_update_target) mentioned above.
>
>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>> return pcpu_info[cpuid].hart_id;
>> }
>>
>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>> +{
>> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>> + {
>> + if ( hartid == cpuid_to_hartid(cpuid) )
>> + return cpuid;
>> + }
>> +
>> + /* hartid isn't valid for some reason */
>> + return NR_CPUS;
>> +}
> Considering the values being returned, why's the function's return type
> "unsigned long"?
The mhartid register is MXLEN bits wide, so theoretically hart IDs can range from 0 to
2^MXLEN - 1, and Xen's CPU IDs could span the same range. MXLEN is 32 for RV32
and 64 for RV64.
>
> Why the use of ARRAY_SIZE() in the loop header? You don't use pcpu_info[]
> in the loop body.
I will change it to NR_CPUS. I thought it would be more generic in case pcpu_info is
ever sized with something other than NR_CPUS, but I don't remember why I thought
that would be better.
>
> Finally - on big systems this is going to be pretty inefficient a lookup.
> This may want to at least have a TODO comment attached.
Sure, I will add.
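Putting the three review points together (return type, loop bound, TODO), the helper might end up looking roughly like the sketch below. It is self-contained for illustration only: NR_CPUS is given an arbitrary value and struct pcpu_info is a trimmed-down, hypothetical stand-in for Xen's real per-CPU data.

```c
#define NR_CPUS 8 /* arbitrary value for the sketch; Xen takes it from Kconfig */

/* Hypothetical, trimmed-down stand-in for Xen's per-CPU info. */
struct pcpu_info {
    unsigned long hart_id;
};

static struct pcpu_info pcpu_info[NR_CPUS];

static inline unsigned long cpuid_to_hartid(unsigned int cpuid)
{
    return pcpu_info[cpuid].hart_id;
}

/*
 * TODO: this is a linear search; a reverse map would scale better on
 * systems with many harts.
 */
static inline unsigned int hartid_to_cpuid(unsigned long hartid)
{
    for ( unsigned int cpuid = 0; cpuid < NR_CPUS; cpuid++ )
    {
        if ( hartid == cpuid_to_hartid(cpuid) )
            return cpuid;
    }

    /* hartid isn't valid for some reason */
    return NR_CPUS;
}
```

The result never exceeds NR_CPUS, so unsigned int is wide enough for the return value even though hart IDs themselves stay unsigned long.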
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-26 18:44 ` Oleksii Kurochko
@ 2025-05-27 11:30 ` Oleksii Kurochko
2025-06-02 10:22 ` Jan Beulich
2025-06-02 10:21 ` Jan Beulich
1 sibling, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-27 11:30 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>> + &imsic_cfg.guest_index_bits) )
>>> + imsic_cfg.guest_index_bits = 0;
>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>> + {
>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>> + dt_node_name(node));
>>> + rc = -ENOENT;
>>> + goto cleanup;
>>> + }
>>> +
>>> + /* Find number of HART index bits */
>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>> + &imsic_cfg.hart_index_bits) )
>>> + {
>>> + /* Assume default value */
>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>> + imsic_cfg.hart_index_bits++;
>> Since fls() returns a 1-based bit number, isn't it rather that in the
>> exact-power-of-2 case you'd need to subtract 1?
> Agree, in this case, -1 should be taken into account.
Hmm, it seems that since fls() returns a 1-based bit number, there
is no need for the check:
(2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
checking whether *nr_parent_irqs is a power of two, and then just leave the
check (2).
And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
I'm not mistaken. And if so, then it probably makes
sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment on why this
case is so special.
Does it make sense?
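A standalone sketch of variant (1), to make the arithmetic concrete. fls_sketch() is a local stand-in written for this example and assumed to match Xen's fls() semantics (1-based index of the highest set bit, 0 for an input of 0); index_bits_for() is a hypothetical helper name.

```c
/* 1-based index of the most significant set bit; 0 when x == 0. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int pos = 0;

    while ( x )
    {
        pos++;
        x >>= 1;
    }

    return pos;
}

/* Number of index bits needed to tell nr (>= 1) items apart. */
static unsigned int index_bits_for(unsigned int nr)
{
    return fls_sketch(nr - 1);
}
```

Under these assumptions 4 parents need 2 bits and 5 parents need 3, so exact powers of two no longer get an extra bit. For nr == 1 the result is 0 bits; whether that case needs special handling is left open here.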
~ Oleksii
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v3 09/14] xen/riscv: aplic_init() implementation
2025-05-22 15:26 ` Jan Beulich
@ 2025-05-27 14:48 ` Oleksii Kurochko
0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-27 14:48 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 5/22/25 5:26 PM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/aplic.c
>> +++ b/xen/arch/riscv/aplic.c
>> @@ -9,19 +9,113 @@
>> * Copyright (c) 2024-2025 Vates
>> */
>>
>> +#include <xen/device_tree.h>
>> #include <xen/errno.h>
>> #include <xen/init.h>
>> #include <xen/irq.h>
>> +#include <xen/mm.h>
>> #include <xen/sections.h>
>> #include <xen/types.h>
>> +#include <xen/vmap.h>
>> +
>> +#include "aplic-priv.h"
>>
>> #include <asm/device.h>
>> +#include <asm/imsic.h>
>> #include <asm/intc.h>
>> +#include <asm/riscv_encoding.h>
>> +
>> +#define APLIC_DEFAULT_PRIORITY 1
>> +
>> +/* The maximum number of wired interrupt sources supported by APLIC domain. */
>> +#define APLIC_MAX_NUM_WIRED_IRQ_SOURCES 1023
> Wait - what's "wired" here? There's only MSI you said elsewhere?
This is what the DT binding says:
  riscv,num-sources:
    $ref: /schemas/types.yaml#/definitions/uint32
    minimum: 1
    maximum: 1023
    description:
      Specifies the number of wired interrupt sources supported by this
      APLIC domain.
But 'wired' isn't really mentioned in the AIA spec:
  For each possible interrupt source i, register sourcecfg[i] controls the source
  mode for source i in this interrupt domain as well as any delegation of the source
  to a child domain.
So IMO it isn't tied to whether interrupts are wired or not. So ...
>
> Further - how's this 1023 related to any of the other uses of that number?
> Is this by chance ARRAY_SIZE(aplic.regs->sourcecfg)? If so, it wants
> expressing like that, to allow making the connection.
...ARRAY_SIZE(aplic.regs->sourcecfg) matches perfectly and will be used
instead of APLIC_MAX_NUM_WIRED_IRQ_SOURCES.
>> +static struct aplic_priv aplic;
>>
>> static struct intc_info __ro_after_init aplic_info = {
>> .hw_version = INTC_APLIC,
>> };
>>
>> +static void __init aplic_init_hw_interrupts(void)
>> +{
>> + unsigned int i;
>> +
>> + /* Disable all interrupts */
>> + for ( i = 0; i < ARRAY_SIZE(aplic.regs->clrie); i++)
>> + writel(~0U, &aplic.regs->clrie[i]);
>> +
>> + /* Set interrupt type and default priority for all interrupts */
>> + for ( i = 0; i < aplic_info.num_irqs; i++ )
>> + {
>> + writel(0, &aplic.regs->sourcecfg[i]);
>> + /*
>> + * Low bits of target register contains Interrupt Priority bits which
>> + * can't be zero according to AIA spec.
>> + * Thereby they are initialized to APLIC_DEFAULT_PRIORITY.
>> + */
>> + writel(APLIC_DEFAULT_PRIORITY, &aplic.regs->target[i]);
>> + }
>> +
>> + writel(APLIC_DOMAINCFG_IE | APLIC_DOMAINCFG_DM, &aplic.regs->domaincfg);
>> +}
>> +
>> +static int __init cf_check aplic_init(void)
>> +{
>> + dt_phandle imsic_phandle;
>> + const __be32 *prop;
>> + uint64_t size, paddr;
>> + const struct dt_device_node *imsic_node;
>> + const struct dt_device_node *node = aplic_info.node;
>> + int rc;
>> +
>> + /* Check for associated imsic node */
>> + if ( !dt_property_read_u32(node, "msi-parent", &imsic_phandle) )
>> + panic("%s: IDC mode not supported\n", node->full_name);
>> +
>> + imsic_node = dt_find_node_by_phandle(imsic_phandle);
>> + if ( !imsic_node )
>> + panic("%s: unable to find IMSIC node\n", node->full_name);
>> +
>> + rc = imsic_init(imsic_node);
>> + if ( rc == IRQ_M_EXT )
>> + /* Machine mode imsic node, ignore this aplic node */
>> + return 0;
>> + else if ( rc )
> As before: No "else" please when the earlier if() ends in an unconditional
> control flow change.
>> + panic("%s: Failed to initialize IMSIC\n", node->full_name);
>> +
>> + /* Find out number of interrupt sources */
>> + if ( !dt_property_read_u32(node, "riscv,num-sources",
>> + &aplic_info.num_irqs) )
>> + panic("%s: failed to get number of interrupt sources\n",
>> + node->full_name);
>> +
>> + if ( aplic_info.num_irqs > APLIC_MAX_NUM_WIRED_IRQ_SOURCES )
>> + panic("%s: too big number of riscv,num-source: %u\n",
>> + __func__, aplic_info.num_irqs);
> Is it actually necessary to panic() in this case? Can't you just lower
> .num_irqs instead (rendering higher IRQs,if any, non-functional)?
Good point. I think we can just use aplic_info.num_irqs = ARRAY_SIZE(aplic.regs->sourcecfg);
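A minimal sketch of such clamping, assuming sourcecfg[] has 1023 entries (the value of the quoted APLIC_MAX_NUM_WIRED_IRQ_SOURCES). struct aplic_regs_sketch and aplic_clamp_num_irqs() are hypothetical names used only for this illustration; the real register layout has many more fields.

```c
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down register layout; only sourcecfg[] matters. */
struct aplic_regs_sketch {
    uint32_t domaincfg;
    uint32_t sourcecfg[1023];
};

/* Cap a DT-provided source count at what sourcecfg[] can address. */
static unsigned int aplic_clamp_num_irqs(unsigned int num_irqs)
{
    /* sizeof does not evaluate its operand, so nothing is dereferenced. */
    const unsigned int max =
        ARRAY_SIZE(((struct aplic_regs_sketch *)0)->sourcecfg);

    return num_irqs > max ? max : num_irqs;
}
```

Sources beyond the cap would simply be unusable, which is the behaviour suggested in the review as an alternative to panic().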
>
>> + prop = dt_get_property(node, "reg", NULL);
>> + dt_get_range(&prop, node, &paddr, &size);
>> + if ( !paddr )
>> + panic("%s: first MMIO resource not found\n", node->full_name);
>> +
>> + aplic.paddr_start = paddr;
>> + aplic.size = size;
>> +
>> + aplic.regs = ioremap(paddr, size);
> Doesn't size need to match certain constraints? If too low, you may
> need to panic(), while if too high you may not need to map the entire
> range?
>
> Does paddr perhaps also need to match certain constraints, like having
> the low so many bits clear?
Good question. According to the AIA spec:
  4.5. Memory-mapped control region for an interrupt domain
  For each interrupt domain that an APLIC supports, there is a dedicated memory-mapped control
  region for managing interrupts in that domain. This control region is a multiple of 4 KiB in size and
  aligned to a 4-KiB address boundary. The smallest valid control region is 16 KiB. An interrupt domain’s
  control region is populated by a set of 32-bit registers. The first 16 KiB contains the registers listed in
  Table 6
The best I can do is:
1. Check that the size is a multiple of 4 KiB and is not less than 16 KiB. But there is nothing I can do
about the upper bound.
2. Regarding paddr, the best I can do is to check that it is 4-KiB aligned.
I added the following:
    if ( !IS_ALIGNED(paddr, KB(4)) )
        panic("%s: paddr of memory-mapped control region should be 4 KiB "
              "aligned: %#lx\n", __func__, paddr);
    if ( !IS_ALIGNED(size, KB(4)) || (size < KB(16)) )
        panic("%s: memory-mapped control region should be a multiple of 4 KiB "
              "in size and the smallest valid control region is 16 KiB: %#lx\n",
              __func__, size);
Would that be enough?
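The two constraints from points 1 and 2 can also be written as a standalone predicate, which makes them easy to exercise outside of Xen. This is only a sketch: KB() and IS_ALIGNED() are local re-definitions for the example, and aplic_region_ok() is a hypothetical name. Either size condition on its own is enough to reject the region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the Xen macros of the same name. */
#define KB(x) ((uint64_t)(x) << 10)
#define IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)

/* Does (paddr, size) look like a valid APLIC control region? */
static bool aplic_region_ok(uint64_t paddr, uint64_t size)
{
    /* The region must start on a 4-KiB boundary. */
    if ( !IS_ALIGNED(paddr, KB(4)) )
        return false;

    /* The size must be a multiple of 4 KiB and at least 16 KiB. */
    if ( !IS_ALIGNED(size, KB(4)) || (size < KB(16)) )
        return false;

    return true;
}
```

An 8-KiB region is a useful test input: it is 4-KiB aligned in size yet still too small, so it is only rejected if the two size conditions are combined with a logical OR.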
>> +static struct intc_hw_operations __ro_after_init aplic_ops = {
>> + .info = &aplic_info,
>> + .init = aplic_init,
>> +};
> Why's this __ro_after_init and not simply const? I can't spot any write
> to it.
It could be const. I added __ro_after_init because I had misunderstood its correct usage.
I will switch back to const instead of __ro_after_init.
>
>> --- /dev/null
>> +++ b/xen/arch/riscv/include/asm/aplic.h
>> @@ -0,0 +1,64 @@
>> +/* SPDX-License-Identifier: MIT */
>> +
>> +/*
>> + * xen/arch/riscv/asm/include/aplic.h
>> + *
>> + * RISC-V Advanced Platform-Level Interrupt Controller support
>> + *
>> + * Copyright (c) Microchip.
>> + */
>> +
>> +#ifndef ASM__RISCV__APLIC_H
>> +#define ASM__RISCV__APLIC_H
> Wants updating again.
>
>> +#include <xen/types.h>
>> +
>> +#include <asm/imsic.h>
>> +
>> +#define APLIC_DOMAINCFG_IE BIT(8, UL)
>> +#define APLIC_DOMAINCFG_DM BIT(2, UL)
> Why UL when ...
>
>> +struct aplic_regs {
>> + uint32_t domaincfg;
> ... this is just 32 bits wide?
Agreed, BIT(x, U) would be more correct.
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations
2025-05-22 15:55 ` Jan Beulich
@ 2025-05-28 11:00 ` Oleksii Kurochko
0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-28 11:00 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 5/22/25 5:55 PM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> +static void aplic_set_irq_affinity(struct irq_desc *desc, const cpumask_t *mask)
>> +{
>> + unsigned int cpu;
>> + uint64_t group_index, base_ppn;
>> + uint32_t hhxw, lhxw ,hhxs, value;
> Nit: Comma vs blank placement.
>
>> + const struct imsic_config *imsic = aplic.imsic_cfg;
>> +
>> + /*
>> + * TODO: Currently, APLIC is supported only with MSI interrupts.
>> + * If APLIC without MSI interrupts is required in the future,
>> + * this function will need to be updated accordingly.
>> + */
>> + ASSERT(readl(&aplic.regs->domaincfg) & APLIC_DOMAINCFG_DM);
>> +
>> + ASSERT(!cpumask_empty(mask));
>> +
>> + ASSERT(spin_is_locked(&desc->lock));
>> +
>> + cpu = cpuid_to_hartid(aplic_get_cpu_from_mask(mask));
>> + hhxw = imsic->group_index_bits;
>> + lhxw = imsic->hart_index_bits;
>> + hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
>> + base_ppn = imsic->msi[cpu].base_addr >> IMSIC_MMIO_PAGE_SHIFT;
>> +
>> + /* Update hart and EEID in the target register */
>> + group_index = (base_ppn >> (hhxs + IMSIC_MMIO_PAGE_SHIFT)) & (BIT(hhxw, UL) - 1);
> Nit: Line length.
>
> I'm also puzzled by the various uses of IMSIC_MMIO_PAGE_SHIFT. Why do you
> subtract double the value when calculating hhxs, just to add the value
> back in here? There's no other use of the variable afaics.
To follow the AIA spec:
  MSI_address = ((Base_PPN | (group_index << (HHXS + 12)) | (hart_index << LHXS) | guest_index) << 12)
Expressed in the terms of Section 3.6: HHXW = j, LHXW = k, HHXS = E - 24, LHXS = D - 12, Base PPN = B >> 12.
In this specific case it would be possible to calculate hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT
and then drop "+ IMSIC_MMIO_PAGE_SHIFT" in (hhxs + IMSIC_MMIO_PAGE_SHIFT), but then it would be harder to
match the code against the formula in the AIA spec.
Also, hhxs may one day be used somewhere else, for example to verify the target base PPN as the Linux
kernel does:
    tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;

    /* Compute target HART Base PPN */
    tbppn = tppn;
    tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
    tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
    tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
    WARN_ON(tbppn != mc->base_ppn);

    /* Compute target group and hart indexes */
    group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
                  APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
    ...
For now, I can add a comment above hhxs (and/or group_index) saying that it is calculated this way, rather
than being simplified to "hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT", in order to follow the
formula from the AIA spec.
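To illustrate how the spec formula and the group_index extraction relate, here is a small self-contained sketch. The function and parameter names are made up for the example; only the arithmetic is taken from the formula quoted above and from the group_index line in the patch.

```c
#include <stdint.h>

/* MSI_address per the quoted AIA formula; parameter names are illustrative. */
static uint64_t aia_msi_address(uint64_t base_ppn, uint64_t group_index,
                                uint64_t hart_index, uint64_t guest_index,
                                unsigned int hhxs, unsigned int lhxs)
{
    return (base_ppn |
            (group_index << (hhxs + 12)) |
            (hart_index << lhxs) |
            guest_index) << 12;
}

/* Recover the group index from a PPN, mirroring the patch's group_index. */
static uint64_t aia_group_index(uint64_t ppn, unsigned int hhxs,
                                unsigned int hhxw)
{
    return (ppn >> (hhxs + 12)) & ((UINT64_C(1) << hhxw) - 1);
}
```

For example, with a group index shift of 24 in the address (HHXS = 0) and two guest index bits (LHXS = 2), group 1 / hart 3 / guest 1 on top of base PPN 0x28000 gives address 0x2900D000, and extracting the group index from PPN 0x2900D with HHXW = 1 yields 1 again.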
~ Oleksii
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory
2025-05-22 7:26 ` Jan Beulich
@ 2025-05-30 14:20 ` Oleksii Kurochko
0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-05-30 14:20 UTC (permalink / raw)
To: Jan Beulich
Cc: Doug Goldstein, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Alistair Francis, Bob Eshleman, Connor Davis, xen-devel
On 5/22/25 9:26 AM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> The Svpbmt extension is necessary for changing the memory type of a page, which is
>> a combination of attributes that indicate the cacheability, idempotency,
>> and ordering properties for access to that page.
>>
>> As a part of the patch the following is introduced:
>> - Svpbmt memory type definitions: PTE_PBMT_{NOCACHE,IO}.
>> - PAGE_HYPERVISOR_{NOCACHE,WC}.
>> - RISCV_ISA_EXT_svpbmt and add a check in runtime that Svpbmt is
>> supported by platform.
>> - Update riscv/booting.txt with information about Svpbmt.
>> - Update logic of pt_update_entry() to take into account PBMT bits.
>>
>> Use 'unsigned long' for pte_attr_t as PBMT bits are 61 and 62 and they don't
>> fit into 'unsigned int'. Also, update function prototypes which use
>> 'unsigned int' for flags/attributes.
>>
>> Enable Svpbmt for testing in QEMU as Svpbmt is now mandatory for
>> Xen to work.
>>
>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich<jbeulich@suse.com>
>
Thanks.
I just noticed a minor typo (PTE_PMBT_{IO,NOCACHE} -> PTE_PBMT_{IO,NOCACHE}) in the changes:
xen$ git diff
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index 81b91b63d8..4cb0179648 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -45,8 +45,8 @@
* 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
* 11 - Rsvd Reserved for future standard use
*/
-#define PTE_PMBT_NOCACHE BIT(61, UL)
-#define PTE_PMBT_IO BIT(62, UL)
+#define PTE_PBMT_NOCACHE BIT(61, UL)
+#define PTE_PBMT_IO BIT(62, UL)
#define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
#define PTE_TABLE (PTE_VALID)
@@ -59,12 +59,12 @@
/*
* PAGE_HYPERVISOR_NOCACHE is used for ioremap().
*
- * Both PTE_PMBT_IO and PTE_PMBT_NOCACHE are non-cacheable, but the difference
+ * Both PTE_PBMT_IO and PTE_PBMT_NOCACHE are non-cacheable, but the difference
* is that IO is non-idempotent and strongly ordered, which makes it a good
* candidate for mapping IOMEM.
*/
-#define PAGE_HYPERVISOR_NOCACHE (PAGE_HYPERVISOR_RW | PTE_PMBT_IO)
-#define PAGE_HYPERVISOR_WC (PAGE_HYPERVISOR_RW | PTE_PMBT_NOCACHE)
+#define PAGE_HYPERVISOR_NOCACHE (PAGE_HYPERVISOR_RW | PTE_PBMT_IO)
+#define PAGE_HYPERVISOR_WC (PAGE_HYPERVISOR_RW | PTE_PBMT_NOCACHE)
/*
* The PTE format does not contain the following bits within itself;
@@ -77,7 +77,7 @@
#define PTE_ACCESS_MASK (PTE_READABLE | PTE_WRITABLE | PTE_EXECUTABLE)
-#define PTE_PBMT_MASK (PTE_PMBT_NOCACHE | PTE_PMBT_IO)
+#define PTE_PBMT_MASK (PTE_PBMT_NOCACHE | PTE_PBMT_IO)
/* Calculate the offsets into the pagetables for a given VA */
#define pt_linear_offset(lvl, va) ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
I will send this fix as part of v4 (if it can't
be done during commit).
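As a side note on why the attribute type had to be widened: the PBMT bits sit at positions 61 and 62, well beyond what a 32-bit 'unsigned int' can hold. The sketch below uses the bit positions from the diff above; pte_set_pbmt() and fits_unsigned_int() are hypothetical helpers written for this illustration, and a 32-bit unsigned int is assumed.

```c
#include <stdint.h>

/* Bit positions as in the diff above, spelled with fixed-width types. */
#define PTE_PBMT_NOCACHE (UINT64_C(1) << 61)
#define PTE_PBMT_IO      (UINT64_C(1) << 62)
#define PTE_PBMT_MASK    (PTE_PBMT_NOCACHE | PTE_PBMT_IO)

/* Replace the PBMT field of a PTE value, leaving all other bits untouched. */
static uint64_t pte_set_pbmt(uint64_t pte, uint64_t pbmt)
{
    return (pte & ~PTE_PBMT_MASK) | (pbmt & PTE_PBMT_MASK);
}

/* Does the attribute value survive a round trip through 'unsigned int'? */
static int fits_unsigned_int(uint64_t attr)
{
    return attr == (uint64_t)(unsigned int)attr;
}
```

Passing PTE_PBMT_IO through an 'unsigned int' parameter would silently drop the bit, which is why the function prototypes taking flags/attributes were updated as well.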
~ Oleksii
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-26 18:44 ` Oleksii Kurochko
2025-05-27 11:30 ` Oleksii Kurochko
@ 2025-06-02 10:21 ` Jan Beulich
2025-06-04 15:36 ` Oleksii Kurochko
1 sibling, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-02 10:21 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 26.05.2025 20:44, Oleksii Kurochko wrote:
> On 5/22/25 4:46 PM, Jan Beulich wrote:
>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>> + /* Allocate MMIO resource array */
>>> + imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
>> How large can this and ...
>>
>>> + if ( !imsic_cfg.mmios )
>>> + {
>>> + rc = -ENOMEM;
>>> + goto imsic_init_err;
>>> + }
>>> +
>>> + imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
>> ... this array grow (in principle)?
>
> Roughly speaking, this is the number of processors. The highest processor count
> I have seen on the market was 32, but I last checked over a year ago.
Unless there's an architectural limit, I don't think it's a good idea to
take as reference what's available at present. But yes, ...
>> I think you're aware that in principle
>> new code is expected to use xvmalloc() and friends unless there are specific
>> reasons speaking against that.
>
> Oh, missed 'v'...
... adding the missing 'v' will take care of my concern. Provided of
course this isn't running too early for vmalloc() to be usable just yet.
>>> + if ( !imsic_cfg.msi )
>>> + {
>>> + rc = -ENOMEM;
>>> + goto imsic_init_err;
>>> + }
>>> +
>>> + /* Check MMIO register sets */
>>> + for ( unsigned int i = 0; i < nr_mmios; i++ )
>>> + {
>>> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>> + {
>>> + rc = -ENOMEM;
>>> + goto imsic_init_err;
>>> + }
>>> +
>>> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>> + &imsic_cfg.mmios[i].size);
>>> + if ( rc )
>>> + {
>>> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>> + node->name, i);
>>> + goto imsic_init_err;
>>> + }
>>> +
>>> + base_addr = imsic_cfg.mmios[i].base_addr;
>>> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>> + imsic_cfg.hart_index_bits +
>>> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>> + imsic_cfg.group_index_shift);
>>> + if ( base_addr != imsic_cfg.base_addr )
>>> + {
>>> + rc = -EINVAL;
>>> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>> + node->name, i);
>>> + goto imsic_init_err;
>>> + }
>> Maybe just for my own understanding: There's no similar check for the
>> sizes to match / be consistent wanted / needed?
>
> If you are speaking about imsic_cfg.mmios[i].size, then IMO it depends fully on what
> the h/w provides.
> So I don't know what the possible range for imsic_cfg.mmios[i].size is.
Well, all I can say is that it feels odd that you sanity check base_addr
but permit effectively any size.
>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>> return pcpu_info[cpuid].hart_id;
>>> }
>>>
>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>> +{
>>> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>> + {
>>> + if ( hartid == cpuid_to_hartid(cpuid) )
>>> + return cpuid;
>>> + }
>>> +
>>> + /* hartid isn't valid for some reason */
>>> + return NR_CPUS;
>>> +}
>> Considering the values being returned, why's the function's return type
>> "unsigned long"?
>
> The mhartid register is MXLEN bits wide, so theoretically hart IDs can range from 0 to
> 2^MXLEN - 1, and Xen's CPU IDs could span the same range. MXLEN is 32 for RV32
> and 64 for RV64.
Yet the return value here is always constrained by NR_CPUS, isn't it?
Jan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-05-27 11:30 ` Oleksii Kurochko
@ 2025-06-02 10:22 ` Jan Beulich
2025-06-04 13:42 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-02 10:22 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 27.05.2025 13:30, Oleksii Kurochko wrote:
>
> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>> + &imsic_cfg.guest_index_bits) )
>>>> + imsic_cfg.guest_index_bits = 0;
>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>> + {
>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>> + dt_node_name(node));
>>>> + rc = -ENOENT;
>>>> + goto cleanup;
>>>> + }
>>>> +
>>>> + /* Find number of HART index bits */
>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>> + &imsic_cfg.hart_index_bits) )
>>>> + {
>>>> + /* Assume default value */
>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>> + imsic_cfg.hart_index_bits++;
>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>> exact-power-of-2 case you'd need to subtract 1?
>> Agree, in this case, -1 should be taken into account.
>
> Hmm, it seems that since fls() returns a 1-based bit number, there
> is no need for the check:
> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>
> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
> checking whether *nr_parent_irqs is a power of two, and then just leave the
> check (2).
> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
> I'm not mistaken. And if so, then it probably makes
> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment on why this
> case is so special.
>
> Does it make sense?
Can't easily tell; I'd like to see the resulting code instead of the textual
description.
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-02 10:22 ` Jan Beulich
@ 2025-06-04 13:42 ` Oleksii Kurochko
2025-06-04 15:03 ` Jan Beulich
2025-06-04 15:05 ` Jan Beulich
0 siblings, 2 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-04 13:42 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
[-- Attachment #1: Type: text/plain, Size: 2382 bytes --]
On 6/2/25 12:22 PM, Jan Beulich wrote:
> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>> + &imsic_cfg.guest_index_bits) )
>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>> + {
>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>> + dt_node_name(node));
>>>>> + rc = -ENOENT;
>>>>> + goto cleanup;
>>>>> + }
>>>>> +
>>>>> + /* Find number of HART index bits */
>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>> + &imsic_cfg.hart_index_bits) )
>>>>> + {
>>>>> + /* Assume default value */
>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>> + imsic_cfg.hart_index_bits++;
>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>> exact-power-of-2 case you'd need to subtract 1?
>>> Agree, in this case, -1 should be taken into account.
>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>> is not need for the check:
>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>
>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>> check (2).
>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>> case is so special.
>>
>> Does it make sense?
> Can't easily tell; I'd like to see the resulting code instead of the textual
> description.
Here is the code:
/* Find number of HART index bits */
if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
&imsic_cfg.hart_index_bits) )
{
/* Assume default value */
imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
(*nr_parent_irqs == 1);
}
It seems like it covers all the cases.
~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-04 13:42 ` Oleksii Kurochko
@ 2025-06-04 15:03 ` Jan Beulich
2025-06-04 15:05 ` Jan Beulich
1 sibling, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-06-04 15:03 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 04.06.2025 15:42, Oleksii Kurochko wrote:
>
> On 6/2/25 12:22 PM, Jan Beulich wrote:
>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>> + &imsic_cfg.guest_index_bits) )
>>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>>> + {
>>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>> + dt_node_name(node));
>>>>>> + rc = -ENOENT;
>>>>>> + goto cleanup;
>>>>>> + }
>>>>>> +
>>>>>> + /* Find number of HART index bits */
>>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>> + &imsic_cfg.hart_index_bits) )
>>>>>> + {
>>>>>> + /* Assume default value */
>>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>> + imsic_cfg.hart_index_bits++;
>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>> Agree, in this case, -1 should be taken into account.
>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>> is not need for the check:
>>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>
>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>> check (2).
>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>> case is so special.
>>>
>>> Does it make sense?
>> Can't easily tell; I'd like to see the resulting code instead of the textual
>> description.
>
> Here is the code:
> /* Find number of HART index bits */
> if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
> &imsic_cfg.hart_index_bits) )
> {
> /* Assume default value */
> imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
> (*nr_parent_irqs == 1);
> }
>
> It seems like it covers all the cases.
*nr_parent_irqs imsic_cfg.hart_index_bits
1 0
2 2 (1 + 1)
3 2
4 2
5 3
6 3
IOW why the special casing of *nr_parent_irqs == 1?
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-04 13:42 ` Oleksii Kurochko
2025-06-04 15:03 ` Jan Beulich
@ 2025-06-04 15:05 ` Jan Beulich
2025-06-04 15:41 ` Oleksii Kurochko
1 sibling, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-04 15:05 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 04.06.2025 15:42, Oleksii Kurochko wrote:
>
> On 6/2/25 12:22 PM, Jan Beulich wrote:
>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>> + &imsic_cfg.guest_index_bits) )
>>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>>> + {
>>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>> + dt_node_name(node));
>>>>>> + rc = -ENOENT;
>>>>>> + goto cleanup;
>>>>>> + }
>>>>>> +
>>>>>> + /* Find number of HART index bits */
>>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>> + &imsic_cfg.hart_index_bits) )
>>>>>> + {
>>>>>> + /* Assume default value */
>>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>> + imsic_cfg.hart_index_bits++;
>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>> Agree, in this case, -1 should be taken into account.
>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>> is not need for the check:
>>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>
>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>> check (2).
>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>> case is so special.
>>>
>>> Does it make sense?
>> Can't easily tell; I'd like to see the resulting code instead of the textual
>> description.
>
> Here is the code:
> /* Find number of HART index bits */
> if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
> &imsic_cfg.hart_index_bits) )
> {
> /* Assume default value */
> imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
> (*nr_parent_irqs == 1);
> }
>
> It seems like it covers all the cases.
*nr_parent_irqs imsic_cfg.hart_index_bits
1 1 (0 + 1)
2 1
3 2
4 2
5 3
6 3
IOW why the special casing of *nr_parent_irqs == 1?
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-02 10:21 ` Jan Beulich
@ 2025-06-04 15:36 ` Oleksii Kurochko
2025-06-05 6:50 ` Jan Beulich
0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-04 15:36 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
[-- Attachment #1: Type: text/plain, Size: 6947 bytes --]
On 6/2/25 12:21 PM, Jan Beulich wrote:
> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>> + /* Allocate MMIO resource array */
>>>> + imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
>>> How large can this and ...
>>>
>>>> + if ( !imsic_cfg.mmios )
>>>> + {
>>>> + rc = -ENOMEM;
>>>> + goto imsic_init_err;
>>>> + }
>>>> +
>>>> + imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
>>> ... this array grow (in principle)?
>> Roughly speaking, this is the number of processors. The highests amount of processors
>> on the market I saw it was 32. But it was over a year ago when I last checked this.
> Unless there's an architectural limit, I don't think it's a good idea to
> take as reference what's available at present. But yes, ...
This (32) is not an architectural limit.
I assume that if mhartid accepts a range from 0 to 2^64-1 for RV64, then the
*theoretical* limit for the number of CPUs is 2^64. I can't find in the RISC-V spec
whether that really is the limit or not.
But the AIA (interrupt controller) specification states explicitly that the limit
is 16,384:
1.2 Limits
In its current version, the RISC-V Advanced Interrupt Architecture can support RISC-V symmetric
multiprocessing (SMP) systems with up to 16,384 harts. If the harts are 64-bit (RV64) and implement
the hypervisor extension, and if all features of the Advanced Interrupt Architecture are fully
implemented as well, then for each physical hart there may be up to 63 active virtual harts and
potentially thousands of additional idle (swapped-out) virtual harts, where each virtual hart has
direct control of one or more physical devices.
16,384 is also the maximum for nr_parent_irqs from the DTS point of view.
>
>>> I think you're aware that in principle
>>> new code is expected to use xvmalloc() and friends unless there are specific
>>> reasons speaking against that.
>> Oh, missed 'v'...
> ... adding the missing 'v' will take care of my concern. Provided of
> course this isn't running to early for vmalloc() to be usable just yet.
>
>>>> + if ( !imsic_cfg.msi )
>>>> + {
>>>> + rc = -ENOMEM;
>>>> + goto imsic_init_err;
>>>> + }
>>>> +
>>>> + /* Check MMIO register sets */
>>>> + for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>> + {
>>>> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>> + {
>>>> + rc = -ENOMEM;
>>>> + goto imsic_init_err;
>>>> + }
>>>> +
>>>> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>> + &imsic_cfg.mmios[i].size);
>>>> + if ( rc )
>>>> + {
>>>> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>> + node->name, i);
>>>> + goto imsic_init_err;
>>>> + }
>>>> +
>>>> + base_addr = imsic_cfg.mmios[i].base_addr;
>>>> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>> + imsic_cfg.hart_index_bits +
>>>> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>> + imsic_cfg.group_index_shift);
>>>> + if ( base_addr != imsic_cfg.base_addr )
>>>> + {
>>>> + rc = -EINVAL;
>>>> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>> + node->name, i);
>>>> + goto imsic_init_err;
>>>> + }
>>> Maybe just for my own understanding: There's no similar check for the
>>> sizes to match / be consistent wanted / needed?
>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>> provide, IMO.
>> So I don't what is possible range for imsic_cfg.mmios[i].size.
> Well, all I can say is that's it feels odd that you sanity check base_addr
> but permit effectively any size.
Okay, I think I have two ideas for how to check the size:
1. Based on the guest bits from the IMSIC's DT node. QEMU calculates the size as:
for (socket = 0; socket < socket_count; socket++) {
imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
s->soc[socket].num_harts;
...
where:
#define IMSIC_MMIO_PAGE_SHIFT 12
#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
#define IMSIC_HART_NUM_GUESTS(__guest_bits) \
(1U << (__guest_bits))
#define IMSIC_HART_SIZE(__guest_bits) \
(IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
2. Just take the theoretical maximum for an S-mode IMSIC node:
16,384 * (1 (S-mode interrupt file) + 63 (max guest interrupt files)) * 4 KiB
Where,
16,384 - maximum possible number of harts according to the AIA spec
64 - maximum number of interrupt files per hart for an S-mode IMSIC node:
1 S-mode interrupt file + 63 guest interrupt files.
4 KiB - maximum size of one interrupt file.
Which option is preferred?
The specification doesn’t seem to mention (or I couldn’t find) that all platforms
must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
better to use the approach described in option 2.
On the other hand, I don't think a platform should be considered correct if it
provides slightly more than needed but still less than the theoretical maximum.
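To make the two options concrete, here is a minimal standalone sketch (not Xen code). The helper names (imsic_hart_size(), imsic_expected_size(), imsic_theoretical_max_size(), imsic_size_ok_exact(), imsic_size_ok_max()) and the AIA_MAX_HARTS / IMSIC_MAX_FILES_PER_HART constants are hypothetical; only IMSIC_MMIO_PAGE_SHIFT and the per-hart size formula are taken from the QEMU macros quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT    12
#define IMSIC_MMIO_PAGE_SZ       (UINT64_C(1) << IMSIC_MMIO_PAGE_SHIFT)

/* Hypothetical names for the limits discussed above. */
#define AIA_MAX_HARTS            UINT64_C(16384) /* AIA spec, section 1.2 */
#define IMSIC_MAX_FILES_PER_HART UINT64_C(64)    /* 1 S-mode + 63 guest files */

/* Option 1: per-hart size, as in QEMU's IMSIC_HART_SIZE(). */
static uint64_t imsic_hart_size(unsigned int guest_bits)
{
    assert(guest_bits < 64 - IMSIC_MMIO_PAGE_SHIFT);
    return (UINT64_C(1) << guest_bits) * IMSIC_MMIO_PAGE_SZ;
}

/* Option 1: size of a regset covering nr_harts harts. */
static uint64_t imsic_expected_size(unsigned int guest_bits,
                                    unsigned int nr_harts)
{
    return imsic_hart_size(guest_bits) * nr_harts;
}

/* Option 2: theoretical maximum, 16,384 * 64 * 4 KiB. */
static uint64_t imsic_theoretical_max_size(void)
{
    return AIA_MAX_HARTS * IMSIC_MAX_FILES_PER_HART * IMSIC_MMIO_PAGE_SZ;
}

/* Option 1 as a check: the size must match the computed value exactly. */
static int imsic_size_ok_exact(uint64_t size, unsigned int guest_bits,
                               unsigned int nr_harts)
{
    return size == imsic_expected_size(guest_bits, nr_harts);
}

/* Option 2 as a check: any non-zero size up to the maximum is accepted. */
static int imsic_size_ok_max(uint64_t size)
{
    return size != 0 && size <= imsic_theoretical_max_size();
}
```

The theoretical maximum works out to 2^14 * 2^6 * 2^12 = 2^32 bytes (4 GiB), so the option 2 check would accept a regset that is far larger than what the configured harts and guest bits actually need.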
>
>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>> return pcpu_info[cpuid].hart_id;
>>>> }
>>>>
>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>> +{
>>>> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>> + {
>>>> + if ( hartid == cpuid_to_hartid(cpuid) )
>>>> + return cpuid;
>>>> + }
>>>> +
>>>> + /* hartid isn't valid for some reason */
>>>> + return NR_CPUS;
>>>> +}
>>> Considering the values being returned, why's the function's return type
>>> "unsigned long"?
>> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
>> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
>> and MXLEN is 64 for RV64.
> Yet the return value here is always constrained by NR_CPUS, isn't it?
I am not 100% sure that I get your point, but I put NR_CPUS here because of:
/*
* tp points to one of these per cpu.
*
* hart_id would be valid (no matter which value) if its
* processor_id field is valid (less than NR_CPUS).
*/
struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
.processor_id = NR_CPUS,
}};
As an option we could use ULONG_MAX. Would it be better?
~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-04 15:05 ` Jan Beulich
@ 2025-06-04 15:41 ` Oleksii Kurochko
2025-06-05 6:52 ` Jan Beulich
0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-04 15:41 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
[-- Attachment #1: Type: text/plain, Size: 3149 bytes --]
On 6/4/25 5:05 PM, Jan Beulich wrote:
> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>> + &imsic_cfg.guest_index_bits) )
>>>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>> + {
>>>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>> + dt_node_name(node));
>>>>>>> + rc = -ENOENT;
>>>>>>> + goto cleanup;
>>>>>>> + }
>>>>>>> +
>>>>>>> + /* Find number of HART index bits */
>>>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>> + &imsic_cfg.hart_index_bits) )
>>>>>>> + {
>>>>>>> + /* Assume default value */
>>>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>> + imsic_cfg.hart_index_bits++;
>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>> Agree, in this case, -1 should be taken into account.
>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>> is not need for the check:
>>>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>
>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>> check (2).
>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>> case is so special.
>>>>
>>>> Does it make sense?
>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>> description.
>> Here is the code:
>> /* Find number of HART index bits */
>> if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>> &imsic_cfg.hart_index_bits) )
>> {
>> /* Assume default value */
>> imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>> (*nr_parent_irqs == 1);
>> }
>>
>> It seems like it covers all the cases.
> *nr_parent_irqs imsic_cfg.hart_index_bits
> 1 1 (0 + 1)
> 2 1
> 3 2
> 4 2
> 5 3
> 6 3
>
> IOW why the special casing of *nr_parent_irqs == 1?
If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
#define arch_fls(x) ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
1 hart then at least 1 bit will be needed to point to it.
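As a rough illustration of the two variants, here is a minimal standalone sketch. fls_like() mirrors the arch_fls() macro quoted above and relies on the GCC/Clang builtin __builtin_clz(); the function names are hypothetical and the input is assumed to be at least 1.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_INT ((unsigned int)(sizeof(unsigned int) * CHAR_BIT))

/* Mirrors the quoted arch_fls(): 1-based index of the top set bit, 0 for 0. */
static unsigned int fls_like(unsigned int x)
{
    return x ? BITS_PER_INT - (unsigned int)__builtin_clz(x) : 0;
}

/* Variant with the extra "+ (n == 1)" term. */
static unsigned int bits_with_special_case(unsigned int n)
{
    assert(n >= 1);
    return fls_like(n - 1) + (n == 1);
}

/* Variant without the special case. */
static unsigned int bits_without_special_case(unsigned int n)
{
    assert(n >= 1);
    return fls_like(n - 1);
}
```

The two variants differ only for n == 1, where the first yields 1 and the second yields 0; for every n >= 2 the extra term is zero.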
~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-04 15:36 ` Oleksii Kurochko
@ 2025-06-05 6:50 ` Jan Beulich
2025-06-05 9:13 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-05 6:50 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 04.06.2025 17:36, Oleksii Kurochko wrote:
> On 6/2/25 12:21 PM, Jan Beulich wrote:
>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>> + /* Check MMIO register sets */
>>>>> + for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>> + {
>>>>> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>> + {
>>>>> + rc = -ENOMEM;
>>>>> + goto imsic_init_err;
>>>>> + }
>>>>> +
>>>>> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>> + &imsic_cfg.mmios[i].size);
>>>>> + if ( rc )
>>>>> + {
>>>>> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>> + node->name, i);
>>>>> + goto imsic_init_err;
>>>>> + }
>>>>> +
>>>>> + base_addr = imsic_cfg.mmios[i].base_addr;
>>>>> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>> + imsic_cfg.hart_index_bits +
>>>>> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>> + imsic_cfg.group_index_shift);
>>>>> + if ( base_addr != imsic_cfg.base_addr )
>>>>> + {
>>>>> + rc = -EINVAL;
>>>>> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>> + node->name, i);
>>>>> + goto imsic_init_err;
>>>>> + }
>>>> Maybe just for my own understanding: There's no similar check for the
>>>> sizes to match / be consistent wanted / needed?
>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>> provide, IMO.
>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>> Well, all I can say is that's it feels odd that you sanity check base_addr
>> but permit effectively any size.
>
> Okay, I think I have two ideas how to check the size:
> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
> for (socket = 0; socket < socket_count; socket++) {
> imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
> imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
> s->soc[socket].num_harts;
> ...
> where:
> #define IMSIC_MMIO_PAGE_SHIFT 12
> #define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
>
> #define IMSIC_HART_NUM_GUESTS(__guest_bits) \
> (1U << (__guest_bits))
> #define IMSIC_HART_SIZE(__guest_bits) \
> (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>
> 2. Just take a theoretical maximum for S-mode IMSIC's node:
> 16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
> Where,
> 16,384 - maximum possible amount of harts according to AIA spec
> 64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
> 1 - S interupt file + 63 guest interrupt files.
> 4 Kib - a maximum size of one interrupt file.
>
> Which option is preferred?
I would have said 2, if your outline used "actual" rather than "maximum" values.
> The specification doesn’t seem to mention (or I couldn’t find) that all platforms
> must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
> better to use the approach described in option 2.
>
> On the other hand, I don't think a platform should be considered correct if it
> provides slightly more than needed but still less than the theoretical maximum.
>
>>
>>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>>> return pcpu_info[cpuid].hart_id;
>>>>> }
>>>>>
>>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>>> +{
>>>>> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>>> + {
>>>>> + if ( hartid == cpuid_to_hartid(cpuid) )
>>>>> + return cpuid;
>>>>> + }
>>>>> +
>>>>> + /* hartid isn't valid for some reason */
>>>>> + return NR_CPUS;
>>>>> +}
>>>> Considering the values being returned, why's the function's return type
>>>> "unsigned long"?
>>> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
>>> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
>>> and MXLEN is 64 for RV64.
>> Yet the return value here is always constrained by NR_CPUS, isn't it?
>
> I am not 100% sure that I get your point,
NR_CPUS is guaranteed to fit in an unsigned int. Furthermore, variables / parameters
holding Xen-internal CPU numbers are also unsigned int everywhere else.
> but I put NR_CPUS here because of:
> /*
> * tp points to one of these per cpu.
> *
> * hart_id would be valid (no matter which value) if its
> * processor_id field is valid (less than NR_CPUS).
> */
> struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
> .processor_id = NR_CPUS,
> }};
>
> As an option we could use ULONG_MAX. Would it be better?
No. NR_CPUS is the appropriate value to use here, imo.
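A minimal standalone sketch of what that could look like, with the return type narrowed to unsigned int and NR_CPUS kept as the "not found" value. This is not the Xen source: the tiny NR_CPUS, the table contents, and the processor_id validity check inside the loop are assumptions made for illustration (the quoted patch compares hart IDs only).

```c
#include <assert.h>

#define NR_CPUS 4U /* deliberately tiny for the sketch */

struct pcpu_info {
    unsigned int processor_id; /* valid iff less than NR_CPUS */
    unsigned long hart_id;
};

/* Hypothetical contents: two CPUs brought up, two slots unused. */
static struct pcpu_info pcpu_info[NR_CPUS] = {
    { .processor_id = 0, .hart_id = 7 },
    { .processor_id = 1, .hart_id = 3 },
    { .processor_id = NR_CPUS },
    { .processor_id = NR_CPUS },
};

static unsigned long cpuid_to_hartid(unsigned int cpuid)
{
    assert(cpuid < NR_CPUS);
    return pcpu_info[cpuid].hart_id;
}

/* Returns the Xen CPU number for hartid, or NR_CPUS if there is none. */
static unsigned int hartid_to_cpuid(unsigned long hartid)
{
    for ( unsigned int cpuid = 0; cpuid < NR_CPUS; cpuid++ )
    {
        /* Assumption of this sketch: skip slots that were never set up. */
        if ( pcpu_info[cpuid].processor_id >= NR_CPUS )
            continue;

        if ( hartid == cpuid_to_hartid(cpuid) )
            return cpuid;
    }

    /* hartid isn't valid for some reason */
    return NR_CPUS;
}
```

The hart ID parameter stays unsigned long because hart IDs are MXLEN wide, while the result is a Xen-internal CPU number and therefore never exceeds NR_CPUS.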
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-04 15:41 ` Oleksii Kurochko
@ 2025-06-05 6:52 ` Jan Beulich
2025-06-05 8:35 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-05 6:52 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 04.06.2025 17:41, Oleksii Kurochko wrote:
>
> On 6/4/25 5:05 PM, Jan Beulich wrote:
>> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>>> + &imsic_cfg.guest_index_bits) )
>>>>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>>> + {
>>>>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>>> + dt_node_name(node));
>>>>>>>> + rc = -ENOENT;
>>>>>>>> + goto cleanup;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + /* Find number of HART index bits */
>>>>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>>> + &imsic_cfg.hart_index_bits) )
>>>>>>>> + {
>>>>>>>> + /* Assume default value */
>>>>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>>> + imsic_cfg.hart_index_bits++;
>>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>>> Agree, in this case, -1 should be taken into account.
>>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>>> is not need for the check:
>>>>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>
>>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>>> check (2).
>>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>>> case is so special.
>>>>>
>>>>> Does it make sense?
>>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>>> description.
>>> Here is the code:
>>> /* Find number of HART index bits */
>>> if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>> &imsic_cfg.hart_index_bits) )
>>> {
>>> /* Assume default value */
>>> imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>>> (*nr_parent_irqs == 1);
>>> }
>>>
>>> It seems like it covers all the cases.
>> *nr_parent_irqs imsic_cfg.hart_index_bits
>> 1 1 (0 + 1)
>> 2 1
>> 3 2
>> 4 2
>> 5 3
>> 6 3
>>
>> IOW why the special casing of *nr_parent_irqs == 1?
>
> If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
> we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
> #define arch_fls(x) ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
> and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
> 1 hart then at least 1 bit will be needed to point to it.
No, why? To pick 1 out of 1 you need no bits at all to represent.
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-05 6:52 ` Jan Beulich
@ 2025-06-05 8:35 ` Oleksii Kurochko
0 siblings, 0 replies; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 8:35 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
[-- Attachment #1: Type: text/plain, Size: 3705 bytes --]
On 6/5/25 8:52 AM, Jan Beulich wrote:
> On 04.06.2025 17:41, Oleksii Kurochko wrote:
>> On 6/4/25 5:05 PM, Jan Beulich wrote:
>>> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>>>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>>>> + if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>>>> + &imsic_cfg.guest_index_bits) )
>>>>>>>>> + imsic_cfg.guest_index_bits = 0;
>>>>>>>>> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>>>> + if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>>>> + {
>>>>>>>>> + printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>>>> + dt_node_name(node));
>>>>>>>>> + rc = -ENOENT;
>>>>>>>>> + goto cleanup;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + /* Find number of HART index bits */
>>>>>>>>> + if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>>>> + &imsic_cfg.hart_index_bits) )
>>>>>>>>> + {
>>>>>>>>> + /* Assume default value */
>>>>>>>>> + imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>>>> + if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>>>> + imsic_cfg.hart_index_bits++;
>>>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>>>> Agree, in this case, -1 should be taken into account.
>>>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>>>> is not need for the check:
>>>>>> (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>
>>>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>>>> check (2).
>>>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>>>> case is so special.
>>>>>>
>>>>>> Does it make sense?
>>>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>>>> description.
>>>> Here is the code:
>>>> /* Find number of HART index bits */
>>>> if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>> &imsic_cfg.hart_index_bits) )
>>>> {
>>>> /* Assume default value */
>>>> imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>>>> (*nr_parent_irqs == 1);
>>>> }
>>>>
>>>> It seems like it covers all the cases.
>>> *nr_parent_irqs imsic_cfg.hart_index_bits
>>> 1 1 (0 + 1)
>>> 2 1
>>> 3 2
>>> 4 2
>>> 5 3
>>> 6 3
>>>
>>> IOW why the special casing of *nr_parent_irqs == 1?
>> If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
>> we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
>> #define arch_fls(x) ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>> and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
>> 1 hart then at least 1 bit will be needed to point to it.
> No, why? To pick 1 out of 1 you need no bits at all to represent.
It seems you are right. I thought that the DT binding requires it to be at least 1,
but it can be zero.
Then it is okay to just initialize hart_index_bits without the extra " + (*nr_parent_irqs == 1)":
imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1);
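For reference, here is a minimal standalone sketch comparing this default with the one from the original v3 hunk. fls_sketch() is a portable stand-in for Xen's fls(), and all function names are hypothetical; is_minimal_width() checks that a width is the smallest one whose 2^bits covers n hart indices.

```c
#include <assert.h>

/* Portable stand-in for fls(): 1-based index of the top set bit, 0 for 0. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int pos = 0;

    while ( x )
    {
        pos++;
        x >>= 1;
    }

    return pos;
}

/* Default from the original v3 hunk: fls(n), bumped if BIT(bits) < n. */
static unsigned int hart_index_bits_v3(unsigned int nr_parent_irqs)
{
    unsigned int bits = fls_sketch(nr_parent_irqs);

    if ( (1ULL << bits) < nr_parent_irqs )
        bits++;

    return bits;
}

/* Default settled on above: fls(n - 1), for n >= 1. */
static unsigned int hart_index_bits_default(unsigned int nr_parent_irqs)
{
    assert(nr_parent_irqs >= 1);
    return fls_sketch(nr_parent_irqs - 1);
}

/* True if bits is the smallest width with 2^bits >= n. */
static int is_minimal_width(unsigned int bits, unsigned int n)
{
    return (1ULL << bits) >= n &&
           (bits == 0 || (1ULL << (bits - 1)) < n);
}
```

With these definitions, hart_index_bits_default() yields 0, 1, 2, 2, 3, 3 for 1 to 6 parent IRQs, whereas the v3 variant yields one bit more than needed for every input (for example 3 for 4 harts) because fls() is 1-based.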
~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-05 6:50 ` Jan Beulich
@ 2025-06-05 9:13 ` Oleksii Kurochko
2025-06-05 9:42 ` Jan Beulich
0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 9:13 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 6/5/25 8:50 AM, Jan Beulich wrote:
> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>>> + /* Check MMIO register sets */
>>>>>> + for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>>> + {
>>>>>> + if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>>> + {
>>>>>> + rc = -ENOMEM;
>>>>>> + goto imsic_init_err;
>>>>>> + }
>>>>>> +
>>>>>> + rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>>> + &imsic_cfg.mmios[i].size);
>>>>>> + if ( rc )
>>>>>> + {
>>>>>> + printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>>> + node->name, i);
>>>>>> + goto imsic_init_err;
>>>>>> + }
>>>>>> +
>>>>>> + base_addr = imsic_cfg.mmios[i].base_addr;
>>>>>> + base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>>> + imsic_cfg.hart_index_bits +
>>>>>> + IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>>> + base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>>> + imsic_cfg.group_index_shift);
>>>>>> + if ( base_addr != imsic_cfg.base_addr )
>>>>>> + {
>>>>>> + rc = -EINVAL;
>>>>>> + printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>>> + node->name, i);
>>>>>> + goto imsic_init_err;
>>>>>> + }
>>>>> Maybe just for my own understanding: There's no similar check for the
>>>>> sizes to match / be consistent wanted / needed?
>>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on what h/w
>>>> will provide, IMO.
>>>> So I don't know what the possible range for imsic_cfg.mmios[i].size is.
>>> Well, all I can say is that it feels odd that you sanity check base_addr
>>> but permit effectively any size.
>> Okay, I think I have two ideas how to check the size:
>> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>> for (socket = 0; socket < socket_count; socket++) {
>> imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>> imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>> s->soc[socket].num_harts;
>> ...
>> where:
>> #define IMSIC_MMIO_PAGE_SHIFT 12
>> #define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
>>
>> #define IMSIC_HART_NUM_GUESTS(__guest_bits) \
>> (1U << (__guest_bits))
>> #define IMSIC_HART_SIZE(__guest_bits) \
>> (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>>
>> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>> 16,384 * 64 (1 S-mode interrupt file + 63 max guest interrupt files) * 4 KiB
>> Where,
>> 16,384 - maximum possible number of harts according to the AIA spec
>> 64 - maximum possible number of interrupt files for an S-mode IMSIC node:
>> 1 S-mode interrupt file + 63 guest interrupt files.
>> 4 KiB - maximum size of one interrupt file.
>>
>> Which option is preferred?
> I would have said 2, if your outline used "actual" rather than "maximum" values.
In option 2 the maximum possible values are used. If we want to use "actual" values then
option 1 should be used. For now I have implemented it so that the S-mode IMSIC node
must contain at least 1 S-mode interrupt file plus a number of guest interrupt files
(based on imsic_cfg.guest_index_bits):
+#define IMSIC_HART_SIZE(guest_bits_) (BIT(guest_bits_, U) * IMSIC_MMIO_PAGE_SZ)
+
#define IMSIC_DISABLE_EIDELIVERY 0
#define IMSIC_ENABLE_EIDELIVERY 1
#define IMSIC_DISABLE_EITHRESHOLD 1
@@ -359,6 +356,10 @@ int __init imsic_init(const struct dt_device_node *node)
/* Check MMIO register sets */
for ( unsigned int i = 0; i < nr_mmios; i++ )
{
+ unsigned int guest_bits = imsic_cfg.guest_index_bits;
+ unsigned long expected_size =
+ IMSIC_HART_SIZE(guest_bits) * nr_parent_irqs;
+
rc = dt_device_get_address(node, i, &mmios[i].base_addr,
&mmios[i].size);
if ( rc )
@@ -369,7 +370,7 @@ int __init imsic_init(const struct dt_device_node *node)
}
base_addr = mmios[i].base_addr;
- base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+ base_addr &= ~(BIT(guest_bits +
imsic_cfg.hart_index_bits +
IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
@@ -381,6 +382,14 @@ int __init imsic_init(const struct dt_device_node *node)
node->name, i);
goto imsic_init_err;
}
+
+ if ( mmios[i].size != expected_size )
+ {
+ rc = -EINVAL;
+ printk(XENLOG_ERR "%s: IMSIC MMIO size is incorrect %ld, "
+ "expected:%ld\n", node->name, mmios[i].size, expected_size);
+ goto imsic_init_err;
+ }
}
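To make the arithmetic behind that check concrete, here is a minimal standalone
sketch of the same comparison outside of Xen. The helper names
imsic_hart_size() and imsic_regset_size_ok() are hypothetical and exist only
for this illustration; the page shift of 12 is taken from the QEMU defines
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12
#define IMSIC_MMIO_PAGE_SZ    (UINT64_C(1) << IMSIC_MMIO_PAGE_SHIFT)

/* Bytes of MMIO space one hart occupies: 2^guest_bits interrupt-file pages. */
static uint64_t imsic_hart_size(unsigned int guest_bits)
{
    return (UINT64_C(1) << guest_bits) * IMSIC_MMIO_PAGE_SZ;
}

/*
 * Hypothetical helper: true when a regset of 'size' bytes covers exactly
 * nr_harts harts, i.e. the same equality the hunk above tests.
 */
static bool imsic_regset_size_ok(uint64_t size, unsigned int guest_bits,
                                 unsigned int nr_harts)
{
    return size == imsic_hart_size(guest_bits) * nr_harts;
}
```

For example, 4 harts with guest_bits = 1 give 2 * 4 KiB * 4 = 0x8000 bytes, so
a regset of 0x4000 bytes would be rejected by this exact-match form.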
>
>> The specification doesn’t seem to mention (or I couldn’t find) that all platforms
>> must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
>> better to use the approach described in option 2.
>>
>> On the other hand, I don't think a platform should be considered correct if it
>> provides slightly more than needed but still less than the theoretical maximum.
>>
>>>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>>>> return pcpu_info[cpuid].hart_id;
>>>>>> }
>>>>>>
>>>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>>>> +{
>>>>>> + for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>>>> + {
>>>>>> + if ( hartid == cpuid_to_hartid(cpuid) )
>>>>>> + return cpuid;
>>>>>> + }
>>>>>> +
>>>>>> + /* hartid isn't valid for some reason */
>>>>>> + return NR_CPUS;
>>>>>> +}
>>>>> Considering the values being returned, why's the function's return type
>>>>> "unsigned long"?
>>>> mhartid register has MXLEN length, so theoretically hart IDs could range from 0 to
>>>> 2^MXLEN - 1 and we could have the same range of Xen's CPUIDs. MXLEN is 32 for RV32
>>>> and 64 for RV64.
>>> Yet the return value here is always constrained by NR_CPUS, isn't it?
>> I am not 100% sure that I get your point,
> NR_CPUS is guaranteed to fit in an unsigned int. Furthermore variables / parameters
> holding Xen-internal CPU numbers also are unsigned int everywhere else.
Okay, then I agree, hartid_to_cpuid() should return unsigned int. I'll update it accordingly.
Thanks.
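For reference, a minimal standalone sketch of how the helper could look with
the narrower return type. NR_CPUS = 4 and the table contents are made-up values
for illustration only. The extra processor_id validity check is an assumption
of this sketch (it follows the pcpu_info comment quoted below, and keeps
zero-initialised hart_id fields of unused slots from matching hart 0); it is
not part of the posted patch.

```c
#define NR_CPUS 4 /* assumed small value, for illustration only */

struct pcpu_info {
    unsigned int processor_id;
    unsigned long hart_id;
};

/* Mock table: two valid entries, the rest marked invalid via NR_CPUS. */
static struct pcpu_info pcpu_info[NR_CPUS] = {
    { .processor_id = 0, .hart_id = 7 },
    { .processor_id = 1, .hart_id = 3 },
    { .processor_id = NR_CPUS },
    { .processor_id = NR_CPUS },
};

/* Xen-internal CPU numbers fit in unsigned int; hart IDs stay unsigned long. */
static unsigned int hartid_to_cpuid(unsigned long hartid)
{
    for ( unsigned int cpuid = 0; cpuid < NR_CPUS; cpuid++ )
    {
        /* Sketch-only assumption: skip slots without a valid processor_id. */
        if ( pcpu_info[cpuid].processor_id < NR_CPUS &&
             pcpu_info[cpuid].hart_id == hartid )
            return cpuid;
    }

    /* hartid isn't known */
    return NR_CPUS;
}
```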
~ Oleksii
>
>> but I put NR_CPUS here because of:
>> /*
>> * tp points to one of these per cpu.
>> *
>> * hart_id would be valid (no matter which value) if its
>> * processor_id field is valid (less than NR_CPUS).
>> */
>> struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
>> .processor_id = NR_CPUS,
>> }};
>>
>> As an option we could use ULONG_MAX. Would it be better?
> No. NR_CPUS is the appropriate value to use here, imo.
>
> Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-05 9:13 ` Oleksii Kurochko
@ 2025-06-05 9:42 ` Jan Beulich
2025-06-05 10:15 ` Oleksii Kurochko
0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2025-06-05 9:42 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 05.06.2025 11:13, Oleksii Kurochko wrote:
>
> On 6/5/25 8:50 AM, Jan Beulich wrote:
>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> I would have said 2, if your outline used "actual" rather than "maximum" values.
>
> In option 2 maximum possible values are used. If we want to use "actual" values then
> the option 1 should be used.
Actually, I was wrong to ask for "actual" values uniformly. It's only the hart count where
I meant to ask for that. For interrupts we should allow the maximum possible unless
we already know their count.
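One possible reading of this, as a minimal standalone sketch: bound the regset
size by the actual hart count times the maximum number of interrupt files per
hart. The helper names are hypothetical, the figures 64 (1 S-mode file + 63
guest files) and 4 KiB are the ones from the outline quoted above, and the
non-zero requirement is an extra assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SZ       UINT64_C(4096) /* 4 KiB per interrupt file */
#define IMSIC_MAX_FILES_PER_HART 64U            /* 1 S-mode + 63 guest files */

/* Largest regset size for nr_harts harts, each with all 64 interrupt files. */
static uint64_t imsic_max_regset_size(unsigned int nr_harts)
{
    return (uint64_t)nr_harts * IMSIC_MAX_FILES_PER_HART * IMSIC_MMIO_PAGE_SZ;
}

/*
 * Hypothetical plausibility check: actual hart count, maximum file count.
 * Rejecting a zero size is an assumption made only in this sketch.
 */
static bool imsic_regset_size_plausible(uint64_t size, unsigned int nr_harts)
{
    return size != 0 && size <= imsic_max_regset_size(nr_harts);
}
```

With 4 harts the bound is 4 * 64 * 4 KiB = 1 MiB, so a 16 KiB regset passes
while a 2 MiB one does not.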
Jan
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-05 9:42 ` Jan Beulich
@ 2025-06-05 10:15 ` Oleksii Kurochko
2025-06-05 10:17 ` Jan Beulich
0 siblings, 1 reply; 47+ messages in thread
From: Oleksii Kurochko @ 2025-06-05 10:15 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 6/5/25 11:42 AM, Jan Beulich wrote:
> On 05.06.2025 11:13, Oleksii Kurochko wrote:
>> On 6/5/25 8:50 AM, Jan Beulich wrote:
>>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>> I would have said 2, if your outline used "actual" rather than "maximum" values.
>> In option 2 maximum possible values are used. If we want to use "actual" values then
>> the option 1 should be used.
> Actually, I was wrong to ask for "actual" values uniformly. It's only the hart count where
> I meant to ask for that. For interrupts we should allow the maximum possible unless
> we already know their count.
Do you mean 'interrupt file' here? If yes, then the number of them shouldn't be bigger
than 1 + BIT(guest_bits).
~ Oleksii
* Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
2025-06-05 10:15 ` Oleksii Kurochko
@ 2025-06-05 10:17 ` Jan Beulich
0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2025-06-05 10:17 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 05.06.2025 12:15, Oleksii Kurochko wrote:
>
> On 6/5/25 11:42 AM, Jan Beulich wrote:
>> On 05.06.2025 11:13, Oleksii Kurochko wrote:
>>> On 6/5/25 8:50 AM, Jan Beulich wrote:
>>>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>> I would have said 2, if your outline used "actual" rather than "maximum" values.
>>> In option 2 maximum possible values are used. If we want to use "actual" values then
>>> the option 1 should be used.
>> Actually, I was wrong to ask for "actual" values uniformly. It's only the hart count where
>> I meant to ask for that. For interrupts we should allow the maximum possible unless
>> we already know their count.
>
> Do you mean 'interrupt file' here?
Yes, I do. Sorry for getting the terminology wrong.
Jan
> If yes, then the number of them shouldn't be bigger
> than 1 + BIT(guest_bits).
>
> ~ Oleksii
>
end of thread, other threads:[~2025-06-05 10:17 UTC | newest]
Thread overview: 47+ messages
2025-05-21 16:03 [PATCH v3 00/14] riscv: introduce basic UART support and interrupts for hypervisor mode Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 01/14] xen/riscv: introduce smp_prepare_boot_cpu() Oleksii Kurochko
2025-05-22 7:22 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 02/14] xen/riscv: introduce support of Svpbmt extension and make it mandatory Oleksii Kurochko
2025-05-22 7:26 ` Jan Beulich
2025-05-30 14:20 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 03/14] xen/riscv: add ioremap_*() variants using ioremap_attr() Oleksii Kurochko
2025-05-22 7:33 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 04/14] xen/riscv: introduce init_IRQ() Oleksii Kurochko
2025-05-22 7:38 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 05/14] xen/riscv: introduce platform_get_irq() Oleksii Kurochko
2025-05-22 7:41 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 06/14] xen/riscv: dt_processor_hartid() implementation Oleksii Kurochko
2025-05-22 7:50 ` Jan Beulich
2025-05-26 10:46 ` Oleksii Kurochko
2025-05-26 12:56 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 07/14] xen/riscv: introduce register_intc_ops() and intc_hw_ops Oleksii Kurochko
2025-05-22 13:49 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 08/14] xen/riscv: imsic_init() implementation Oleksii Kurochko
2025-05-22 14:46 ` Jan Beulich
2025-05-26 18:44 ` Oleksii Kurochko
2025-05-27 11:30 ` Oleksii Kurochko
2025-06-02 10:22 ` Jan Beulich
2025-06-04 13:42 ` Oleksii Kurochko
2025-06-04 15:03 ` Jan Beulich
2025-06-04 15:05 ` Jan Beulich
2025-06-04 15:41 ` Oleksii Kurochko
2025-06-05 6:52 ` Jan Beulich
2025-06-05 8:35 ` Oleksii Kurochko
2025-06-02 10:21 ` Jan Beulich
2025-06-04 15:36 ` Oleksii Kurochko
2025-06-05 6:50 ` Jan Beulich
2025-06-05 9:13 ` Oleksii Kurochko
2025-06-05 9:42 ` Jan Beulich
2025-06-05 10:15 ` Oleksii Kurochko
2025-06-05 10:17 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 09/14] xen/riscv: aplic_init() implementation Oleksii Kurochko
2025-05-22 15:26 ` Jan Beulich
2025-05-27 14:48 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 10/14] xen/riscv: introduce intc_init() and helpers Oleksii Kurochko
2025-05-22 15:32 ` Jan Beulich
2025-05-21 16:03 ` [PATCH v3 11/14] xen/riscv: implementation of aplic and imsic operations Oleksii Kurochko
2025-05-22 15:55 ` Jan Beulich
2025-05-28 11:00 ` Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 12/14] xen/riscv: add external interrupt handling for hypervisor mode Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 13/14] xen/riscv: implement setup_irq() Oleksii Kurochko
2025-05-21 16:03 ` [PATCH v3 14/14] xen/riscv: add basic UART support Oleksii Kurochko