All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/16] xen/riscv: introduce vtimer related things
@ 2026-02-09 16:52 Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
                   ` (15 more replies)
  0 siblings, 16 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

This patch series introduces the components necessary to implement a virtual
timer (vtimer).

Since the SSTC extension is not supported by Xen, an emulated (SBI-based)
timer is required. To address this, a virtual timer built on Xen’s timer
infrastructure is introduced, with save/restore support and SBI-based
programming.

To provide full guest software–based timer support, the following components
are also introduced:
- arch_vcpu_{create,destroy}() to initialize the virtual timer and other
  vCPU-related state not directly tied to timer functionality. As part of this
  work, struct arch_vcpu is introduced to describe the internal state of a
  virtual CPU, along with vcpu_csr_init() to initialize the relevant CSR state.
- Support functions required by the virtual timer, including:
  - vcpu_kick(), and a stub implementation of smp_send_event_check_mask()
    (since SMP is not yet supported in Xen), which is used by vcpu_kick().
  - Support for guest timer programming via interception of the SBI legacy
    SET_TIMER call from guest.
  - Implement reprogram_timer() using introduced sbi_set_timer().
  - Initial lockless tracking of pending vCPU interrupts using atomic bitmaps.
- Handling of hypervisor timer interrupts and dispatch into Xen’s generic timer
  softirq.

CI tests: https://gitlab.com/xen-project/people/olkur/xen/-/pipelines/2314656143

---
Changes in v3:
 - Squash patch "xen/riscv: introduce struct arch_vcpu" into other
   patches of the patch series.
 - Merged to staging:
   - xen/riscv: implement stub for smp_send_event_check_mask()
 - Address other comments from ML.
---
Changes in v2:
 - Add consumer part of tracking of pending vCPU interrupts.
 - Split patch "xen/riscv: init tasklet subsystem" to two.
 - Patches were acked:
   - xen/riscv: introduce vcpu_kick() implementation
   - xen/riscv: implement SBI legacy SET_TIMER support for guests
 - All other changes are patch-specific. Please check them.
---

Oleksii Kurochko (16):
  xen/riscv: implement arch_vcpu_{create,destroy}()
  xen/riscv: avoid reading hstateen0 when Smstateen is not implemented
  xen/riscv: detect and store supported hypervisor CSR bits at boot
  xen/riscv: implement vcpu_csr_init()
  xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  xen/riscv: introduce tracking of pending vCPU interrupts, part 2
  xen/time: move ticks<->ns helpers to common code
  xen/riscv: introduce basic vtimer infrastructure for guests
  xen/riscv: introduce vcpu_kick() implementation
  xen/riscv: add vtimer context switch helpers
  xen/riscv: implement SBI legacy SET_TIMER support for guests
  xen/riscv: introduce sbi_set_timer()
  xen/riscv: implement reprogram_timer() via SBI
  xen/riscv: handle hypervisor timer interrupts
  xen/riscv: init tasklet subsystem
  xen/riscv: implement sync_vcpu_execstate()

 xen/arch/arm/include/asm/time.h             |   3 -
 xen/arch/arm/time.c                         |  11 -
 xen/arch/arm/vtimer.c                       |   2 +-
 xen/arch/riscv/Makefile                     |   2 +
 xen/arch/riscv/cpufeature.c                 |   1 +
 xen/arch/riscv/domain.c                     | 257 ++++++++++++++++++++
 xen/arch/riscv/include/asm/Makefile         |   1 -
 xen/arch/riscv/include/asm/config.h         |   3 +-
 xen/arch/riscv/include/asm/cpufeature.h     |   1 +
 xen/arch/riscv/include/asm/current.h        |   8 +
 xen/arch/riscv/include/asm/domain.h         |  58 +++++
 xen/arch/riscv/include/asm/perfc_defn.h     |   3 +
 xen/arch/riscv/include/asm/riscv_encoding.h |   2 +
 xen/arch/riscv/include/asm/sbi.h            |  18 ++
 xen/arch/riscv/include/asm/setup.h          |   9 +
 xen/arch/riscv/include/asm/time.h           |   5 -
 xen/arch/riscv/include/asm/vtimer.h         |  23 ++
 xen/arch/riscv/sbi.c                        |  40 ++-
 xen/arch/riscv/setup.c                      |  29 +++
 xen/arch/riscv/stubs.c                      |  30 ---
 xen/arch/riscv/time.c                       |  44 ++++
 xen/arch/riscv/traps.c                      |  32 ++-
 xen/arch/riscv/vsbi/legacy-extension.c      |   6 +
 xen/arch/riscv/vtimer.c                     |  86 +++++++
 xen/include/xen/time.h                      |  11 +
 25 files changed, 631 insertions(+), 54 deletions(-)
 create mode 100644 xen/arch/riscv/domain.c
 create mode 100644 xen/arch/riscv/include/asm/perfc_defn.h
 create mode 100644 xen/arch/riscv/include/asm/vtimer.h
 create mode 100644 xen/arch/riscv/vtimer.c

-- 
2.52.0



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}()
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-10 16:24   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented Oleksii Kurochko
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce architecture-specific functions to create and destroy VCPUs.
Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
timer and interrupt controller are not yet implemented.

Add calle-saved registers used to preserve Xen’s own execution context
when switching between vCPU stacks.
It is going to be used in the following way (pseudocode):
  context_switch(prev_vcpu, next_vcpu):
    ...

    /* Switch from previous stack to the next stack. */
    __context_switch(prev_vcpu, next_vcpu);

    ...
    schedule_tail(prev_vcpu):
        Save and restore vCPU's CSRs.
The Xen-saved context allows __context_switch() to switch execution
from the previous vCPU’s stack to the next vCPU’s stack and later resume
execution on the original stack when switching back.

During vCPU creation, the Xen-saved context is going to be initialized
with:
  - SP pointing to the newly allocated vCPU stack
  - RA pointing to a helper that performs final vCPU setup before
    transferring control to the guest
After the first execution of __context_switch(), RA naturally points to
the instruction following the call site, and the remaining callee-saved
registers contain the Xen register state at the time of the switch.

As part of this change, add continue_new_vcpu(), which will be used after
the first context_switch() of a new vCPU. Since this functionality is not
yet implemented, continue_new_vcpu() is currently provided as a stub.
The prev argument is going to be set by RISC-V ABI (prev will be stored in
a0) when __context_swtich() will be introduced and called from
context_switch().

Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
other architectures) for allocating the vCPU stack.

Introduce struct cpu_info to store per-vCPU state that lives at the top
of the vCPU's stack.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Move declaration of xen_saved_context structure and cpu_info structure
   here as they are going to be used in this patch.
 - Drop separate zero-ing of arch.cpu_info as a memory for it is allocated
   by vzalloc().
 - Correct calculation of stack pointer in arch_vcpu_destroy() function.
---
Changes in v2:
 - Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
 - Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
   allocate vCPU's stack.
 - Drop printk() inside arch_vcpu_create() to not have potential big noise
   in console as it could be that an amount of vCPUs is pretty big.
 - Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
   happens with a usage of vzalloc() now.
 - Drop stack field as it is enough to have only cpu_info as stack pointer
   could be calculated based on cpu_info.
 - Drop cast when v.arch.cpu_info is inialized as it is not necessary
        to have it.
 - Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
---
 xen/arch/riscv/Makefile              |  1 +
 xen/arch/riscv/domain.c              | 58 ++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/config.h  |  3 +-
 xen/arch/riscv/include/asm/current.h |  6 +++
 xen/arch/riscv/include/asm/domain.h  | 24 ++++++++++++
 xen/arch/riscv/stubs.c               | 10 -----
 6 files changed, 91 insertions(+), 11 deletions(-)
 create mode 100644 xen/arch/riscv/domain.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 0df139b27423..868514c25006 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -1,5 +1,6 @@
 obj-y += aplic.o
 obj-y += cpufeature.o
+obj-y += domain.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-y += entry.o
 obj-y += imsic.o
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
new file mode 100644
index 000000000000..d035b105c2cc
--- /dev/null
+++ b/xen/arch/riscv/domain.c
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/vmap.h>
+
+static void continue_new_vcpu(struct vcpu *prev)
+{
+    BUG_ON("unimplemented\n");
+}
+
+static void __init __maybe_unused build_assertions(void)
+{
+    /*
+     * Enforce the requirement documented in struct cpu_info that
+     * guest_cpu_user_regs must be the first field.
+     */
+    BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+    int rc = 0;
+    void *stack = vzalloc(STACK_SIZE);
+
+    if ( !stack )
+        return -ENOMEM;
+
+    v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
+
+    v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
+    v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+
+    /* Idle VCPUs don't need the rest of this setup */
+    if ( is_idle_vcpu(v) )
+        return rc;
+
+    /*
+     * As the vtimer and interrupt controller (IC) are not yet implemented,
+     * return an error.
+     *
+     * TODO: Drop this once the vtimer and IC are implemented.
+     */
+    rc = -EOPNOTSUPP;
+    goto fail;
+
+    return rc;
+
+ fail:
+    arch_vcpu_destroy(v);
+    return rc;
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+    vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
+}
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 1e08d3bf78be..86a95df018b5 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -143,7 +143,8 @@
 
 #define SMP_CACHE_BYTES (1 << 6)
 
-#define STACK_SIZE PAGE_SIZE
+#define STACK_ORDER 3
+#define STACK_SIZE (PAGE_SIZE << STACK_ORDER)
 
 #define IDENT_AREA_SIZE 64
 
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index 0c3ea70c2ec8..58c9f1506b7c 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -21,6 +21,12 @@ struct pcpu_info {
 /* tp points to one of these */
 extern struct pcpu_info pcpu_info[NR_CPUS];
 
+/* Per-VCPU state that lives at the top of the stack */
+struct cpu_info {
+    /* This should be the first member. */
+    struct cpu_user_regs guest_cpu_user_regs;
+};
+
 #define set_processor_id(id)    do { \
     tp->processor_id = (id);         \
 } while (0)
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 316e7c6c8448..f78f145258d6 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -24,6 +24,30 @@ struct arch_vcpu_io {
 
 struct arch_vcpu {
     struct vcpu_vmid vmid;
+
+    /*
+     * Callee saved registers for Xen's state used to switch from
+     * prev's stack to the next's stack during context switch.
+     */
+    struct
+    {
+        register_t s0;
+        register_t s1;
+        register_t s2;
+        register_t s3;
+        register_t s4;
+        register_t s5;
+        register_t s6;
+        register_t s7;
+        register_t s8;
+        register_t s9;
+        register_t s10;
+        register_t s11;
+        register_t sp;
+        register_t ra;
+    } xen_saved_context;
+
+    struct cpu_info *cpu_info;
 };
 
 struct paging_domain {
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index acbfde79b5a7..c5784a436574 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -116,16 +116,6 @@ void dump_pageframe_info(struct domain *d)
     BUG_ON("unimplemented");
 }
 
-int arch_vcpu_create(struct vcpu *v)
-{
-    BUG_ON("unimplemented");
-}
-
-void arch_vcpu_destroy(struct vcpu *v)
-{
-    BUG_ON("unimplemented");
-}
-
 void vcpu_switch_to_aarch64_mode(struct vcpu *v)
 {
     BUG_ON("unimplemented");
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11  7:22   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot Oleksii Kurochko
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

If the Smstateen extension is not implemented, the hstateen0 CSR is
considered non-existent. Any attempt to access it will raise an
illegal-instruction exception.

Guard the hstateen0 dump with a runtime check for Smstateen support to
avoid triggering traps when dumping CSRs on systems that do not
implement this extension.

Fixes: 3babc8d2e546 ("xen/riscv: dump GPRs and CSRs on unexpected traps")
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
 - New patch
---
 xen/arch/riscv/cpufeature.c             | 1 +
 xen/arch/riscv/include/asm/cpufeature.h | 1 +
 xen/arch/riscv/traps.c                  | 8 +++++++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
index 02b68aeaa49f..03e27b037be0 100644
--- a/xen/arch/riscv/cpufeature.c
+++ b/xen/arch/riscv/cpufeature.c
@@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
     RISCV_ISA_EXT_DATA(zbb),
     RISCV_ISA_EXT_DATA(zbs),
     RISCV_ISA_EXT_DATA(smaia),
+    RISCV_ISA_EXT_DATA(smstateen),
     RISCV_ISA_EXT_DATA(ssaia),
     RISCV_ISA_EXT_DATA(svade),
     RISCV_ISA_EXT_DATA(svpbmt),
diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
index b69616038888..ef02a3e26d2c 100644
--- a/xen/arch/riscv/include/asm/cpufeature.h
+++ b/xen/arch/riscv/include/asm/cpufeature.h
@@ -36,6 +36,7 @@ enum riscv_isa_ext_id {
     RISCV_ISA_EXT_zbb,
     RISCV_ISA_EXT_zbs,
     RISCV_ISA_EXT_smaia,
+    RISCV_ISA_EXT_smstateen,
     RISCV_ISA_EXT_ssaia,
     RISCV_ISA_EXT_svade,
     RISCV_ISA_EXT_svpbmt,
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index 34920f4e5693..c81a4f79a0d2 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -11,6 +11,7 @@
 #include <xen/nospec.h>
 #include <xen/sched.h>
 
+#include <asm/cpufeature.h>
 #include <asm/intc.h>
 #include <asm/processor.h>
 #include <asm/riscv_encoding.h>
@@ -144,7 +145,12 @@ static void dump_csrs(const char *ctx)
       (v & HSTATUS_SPV)  ? " SPV"  : "",
       (v & HSTATUS_GVA)  ? " GVA"  : "");
     X(hgatp, CSR_HGATP, "\n");
-    X(hstateen0, CSR_HSTATEEN0, "\n");
+
+    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
+    {
+        X(hstateen0, CSR_HSTATEEN0, "\n");
+    }
+
     X(stvec, CSR_STVEC, " "); X(vstvec, CSR_VSTVEC, "\n");
     X(sepc, CSR_SEPC, " "); X(vsepc, CSR_VSEPC, "\n");
     X(stval, CSR_STVAL, " "); X(vstval, CSR_VSTVAL, "\n");
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11  7:49   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Some hypervisor CSRs expose optional functionality and may not implement
all architectural bits. Writing unsupported bits can either be ignored
or raise an exception depending on the platform.

Detect the set of writable bits for selected hypervisor CSRs at boot and
store the resulting masks for later use. This allows safely programming
these CSRs during vCPU context switching and avoids relying on hardcoded
architectural assumptions.

Note that csr_set() is used instead of csr_write() to write all ones to
the mask, as the CSRRS instruction, according to the RISC-V specification,
sets only those bits that are writable:
    Any bit that is high in rs1 will cause the corresponding bit to be set
    in the CSR, if that CSR bit is writable.
In contrast, the CSRRW instruction does not take CSR bit writability into
account, which could lead to unintended side effects when writing all ones
to a CSR.

Masks are calculated at the moment only for hdeleg, henvcfg, hideleg,
hstateen0 registers as only them are going to be used in the follow up
patch.

If the Smstateen extension is not implemented, hstateen0 cannot be read
because the register is considered non-existent. Instructions that attempt
to access a CSR that is not implemented or not visible in the current mode
are reserved and will raise an illegal-instruction exception.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
 - New patch.
---
 xen/arch/riscv/include/asm/setup.h |  9 +++++++++
 xen/arch/riscv/setup.c             | 26 ++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/xen/arch/riscv/include/asm/setup.h b/xen/arch/riscv/include/asm/setup.h
index c9d69cdf5166..d54f6a2d1d29 100644
--- a/xen/arch/riscv/include/asm/setup.h
+++ b/xen/arch/riscv/include/asm/setup.h
@@ -5,6 +5,15 @@
 
 #include <xen/types.h>
 
+struct csr_masks {
+    register_t hedeleg;
+    register_t henvcfg;
+    register_t hideleg;
+    register_t hstateen0;
+};
+
+extern struct csr_masks csr_masks;
+
 #define max_init_domid (0)
 
 void setup_mm(void);
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 9b4835960d20..010489f0b713 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -32,6 +32,8 @@
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
     __aligned(STACK_SIZE);
 
+struct csr_masks __ro_after_init csr_masks;
+
 /**
  * copy_from_paddr - copy data from a physical address
  * @dst: destination virtual address
@@ -70,6 +72,28 @@ static void * __init relocate_fdt(paddr_t dtb_paddr, size_t dtb_size)
     return fdt;
 }
 
+void __init init_csr_masks(void)
+{
+    register_t old;
+
+#define X(csr, field) \
+        old = csr_read(CSR_##csr); \
+        csr_set(CSR_##csr, ULONG_MAX); \
+        csr_masks.field = csr_read(CSR_##csr); \
+        csr_write(CSR_##csr, old)
+
+    X(HEDELEG, hedeleg);
+    X(HENVCFG, henvcfg);
+    X(HIDELEG, hideleg);
+
+    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
+    {
+        X(HSTATEEN0, hstateen0);
+    }
+
+#undef X
+}
+
 void __init noreturn start_xen(unsigned long bootcpu_id,
                                paddr_t dtb_addr)
 {
@@ -137,6 +161,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
 
     riscv_fill_hwcap();
 
+    init_csr_masks();
+
     preinit_xen_time();
 
     intc_preinit();
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init()
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (2 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11  9:44   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce vcpu_csr_init() to initialise hypervisor CSRs that control
vCPU execution and virtualization behaviour before the vCPU is first
scheduled.
The function configures trap and interrupt delegation to VS-mode by
setting the appropriate bits in the hedeleg and hideleg registers,
initializes hstatus so that execution enters VS-mode when control is
passed to the guest, and restricts guest access to hardware performance
counters by initializing hcounteren, as unrestricted access would
require additional handling in Xen.
When the Smstateen and SSAIA extensions are available, access to AIA
CSRs and IMSIC guest interrupt files is enabled by setting the
corresponding bits in hstateen0, avoiding unnecessary traps into Xen
(note that SVSLCT(Supervisor Virtual Select) name is used intead of
CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
on it).
If the Svpbmt extension is supported, the PBMTE bit is set in
henvcfg to allow its use for VS-stage address translation. Guest
access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
hstateen0, as a guest may need to control certain characteristics of
the U-mode (VU-mode when V=1) execution environment.

For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
hstateen0), to the written value a correspondent mask is applied to
avoid divergence between the software state and the actual CSR
contents.

As hstatus is not part of struct arch_vcpu (it already resides in
struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
a uniform way to access hstatus and other guest CPU user registers.

This establishes a consistent and well-defined initial CSR state for
vCPUs prior to their first context switch.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Add hypervisor register used to initalize vCPU state.
 - Apply masks introduced before instead of csr_write()/csr_read() pattern.
---
Changes in v2:
 - As hstatus isn't a part of arch_vcpu structure (as it is already a part of
   cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
   hstatus and other CPU user regs.
 - Sort hideleg bit setting by value. Drop a stray blank.
 - Drop | when the first initialization of hcounteren and hennvcfg happen.
 - Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
   instead of open-coding it.
 - Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
   of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
   reason some bits of hedeleg and hideleg are r/o.
   The similar patter is used for hstateen0 as some of the bits could be r/o.
 - Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
   SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
 - Drop local variables hstatus, hideleg and hedeleg as they aren't used
   anymore.
---
 xen/arch/riscv/domain.c                     | 68 +++++++++++++++++++++
 xen/arch/riscv/include/asm/current.h        |  2 +
 xen/arch/riscv/include/asm/domain.h         |  6 ++
 xen/arch/riscv/include/asm/riscv_encoding.h |  2 +
 4 files changed, 78 insertions(+)

diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index d035b105c2cc..af9586a4eb0d 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,72 @@
 #include <xen/sched.h>
 #include <xen/vmap.h>
 
+#include <asm/cpufeature.h>
+#include <asm/csr.h>
+#include <asm/riscv_encoding.h>
+#include <asm/setup.h>
+
+#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
+                         BIT(CAUSE_FETCH_ACCESS, U) | \
+                         BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
+                         BIT(CAUSE_BREAKPOINT, U) | \
+                         BIT(CAUSE_MISALIGNED_LOAD, U) | \
+                         BIT(CAUSE_LOAD_ACCESS, U) | \
+                         BIT(CAUSE_MISALIGNED_STORE, U) | \
+                         BIT(CAUSE_STORE_ACCESS, U) | \
+                         BIT(CAUSE_USER_ECALL, U) | \
+                         BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
+                         BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
+                         BIT(CAUSE_STORE_PAGE_FAULT, U))
+
+#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
+
+static void vcpu_csr_init(struct vcpu *v)
+{
+    register_t hstateen0;
+
+    v->arch.hedeleg = HEDELEG_DEFAULT & csr_masks.hedeleg;
+
+    vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
+
+    v->arch.hideleg = HIDELEG_DEFAULT & csr_masks.hideleg;
+
+    /*
+     * VS should access only the time counter directly.
+     * Everything else should trap.
+     */
+    v->arch.hcounteren = HCOUNTEREN_TM;
+
+    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
+        v->arch.henvcfg = ENVCFG_PBMTE & csr_masks.henvcfg;
+
+    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
+    {
+         if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
+            /*
+             * If the hypervisor extension is implemented, the same three
+             * bitsare defined also in hypervisor CSR hstateen0 but concern
+             * only the state potentially accessible to a virtual machine
+             * executing in privilege modes VS and VU:
+             *      bit 60 CSRs siselect and sireg (really vsiselect and
+             *             vsireg)
+             *      bit 59 CSRs siph and sieh (RV32 only) and stopi (really
+             *             vsiph, vsieh, and vstopi)
+             *      bit 58 all state of IMSIC guest interrupt files, including
+             *             CSR stopei (really vstopei)
+             * If one of these bits is zero in hstateen0, and the same bit is
+             * one in mstateen0, then an attempt to access the corresponding
+             * state from VS or VU-mode raises a virtual instruction exception.
+             */
+            hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
+
+        /* Allow guest to access CSR_ENVCFG */
+        hstateen0 |= SMSTATEEN0_HSENVCFG;
+
+        v->arch.hstateen0 = hstateen0 & csr_masks.hstateen0;
+    }
+}
+
 static void continue_new_vcpu(struct vcpu *prev)
 {
     BUG_ON("unimplemented\n");
@@ -32,6 +98,8 @@ int arch_vcpu_create(struct vcpu *v)
     v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
     v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
 
+    vcpu_csr_init(v);
+
     /* Idle VCPUs don't need the rest of this setup */
     if ( is_idle_vcpu(v) )
         return rc;
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index 58c9f1506b7c..5fbee8182caa 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -48,6 +48,8 @@ DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
 #define get_cpu_current(cpu)  per_cpu(curr_vcpu, cpu)
 
 #define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+#define vcpu_guest_cpu_user_regs(vcpu) \
+    (&(vcpu)->arch.cpu_info->guest_cpu_user_regs)
 
 #define switch_stack_and_jump(stack, fn) do {               \
     asm volatile (                                          \
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index f78f145258d6..6bb06a50c6ab 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -48,6 +48,12 @@ struct arch_vcpu {
     } xen_saved_context;
 
     struct cpu_info *cpu_info;
+
+    register_t hcounteren;
+    register_t hedeleg;
+    register_t henvcfg;
+    register_t hideleg;
+    register_t hstateen0;
 };
 
 struct paging_domain {
diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
index 1f7e612366f8..dd15731a86fa 100644
--- a/xen/arch/riscv/include/asm/riscv_encoding.h
+++ b/xen/arch/riscv/include/asm/riscv_encoding.h
@@ -228,6 +228,8 @@
 #define ENVCFG_CBIE_INV			_UL(0x3)
 #define ENVCFG_FIOM			_UL(0x1)
 
+#define HCOUNTEREN_TM BIT(1, U)
+
 /* ===== User-level CSRs ===== */
 
 /* User Trap Setup (N-extension) */
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (3 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 14:16   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Based on Linux kernel v6.16.0.
Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
we want to guarantee that if a bit in irqs_pending_mask is obversable
that the correspondent bit in irqs_pending is observable too.

Add lockless tracking of pending vCPU interrupts using atomic bitops.
Two bitmaps are introduced:
 - irqs_pending — interrupts currently pending for the vCPU
 - irqs_pending_mask — bits that have changed in irqs_pending

The design follows a multi-producer, single-consumer model, where the
consumer is the vCPU itself. Producers may set bits in
irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
performed only by the consumer via xchg_acquire(). The consumer must not
write to irqs_pending and must not act on bits that are not set in the
mask. Otherwise, extra synchronization should be provided.

On RISC-V interrupts are not injected via guest registers, so pending
interrupts must be recorded in irqs_pending (using the new
vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
HVIP before returning control to the guest. The consumer side is
implemented in a follow-up patch.

A barrier between updating irqs_pending and setting the corresponding
mask bit in vcpu_set_interrupt()/vcpu_unset_interrupt() guarantees
that if the consumer observes a mask bit set, the corresponding pending
bit is also visible. This prevents missed interrupts during the flush.

It is possible a guest could have pending bit not result in the hardware
register without to be marked pending in irq_pending bitmap as:
  According to the RISC-V ISA specification:
    Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
    interrupt-enable  bits for VS-level software interrupts. VSSIP in hip
    is an alias (writable) of the same bit in hvip.
  Additionally:
    When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
    zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
    hie.VSSIE.
This means the guest may modify vsip.SSIP, which implicitly updates
hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
as according to the RISC-V spec:
  These conditions for an interrupt trap to occur must be evaluated in a
  bounded   amount of time from when an interrupt becomes, or ceases to be,
  pending in sip,  and must also be evaluated immediately following the
  execution of an SRET  instruction or an explicit write to a CSR on which
  these interrupt trap conditions expressly depend (including sip, sie and
  sstatus).
What means that IRQ_VS_SOFT must be synchronized separately, what is done
in vcpu_sync_interrupts(). Note, also, that IRQ_PMU_OVF would want to be
synced for the similar reason as IRQ_VS_SOFT, but isn't sync-ed now as
PMU isn't supported now.

For the remaining VS-level interrupt types (IRQ_VS_TIMER and
IRQ_VS_EXT), the specification states they cannot be modified by the guest
and are read-only:
  Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable
  bits for VS-level external interrupts. VSEIP is read-only in hip, and is
  the logical-OR of these interrupt sources:
    • bit VSEIP of hvip;
    • the bit of hgeip selected by hstatus.VGEIN; and
    • any other platform-specific external interrupt signal directed to
      VS-level.
  Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable
  bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the
  logical-OR of hvip.VSTIP and any other platform-specific timer interrupt
  signal directed to VS-level.
Thus, for these interrupt types, it is sufficient to use vcpu_set_interrupt()
and vcpu_unset_interrupt(), and flush them during the call of
vcpu_flush_interrupts() (which is introduced in follow up patch).

vcpu_sync_interrupts(), which is called just before entering the VM,
slightly bends the rule that the irqs_pending bit must be written
first, followed by updating the corresponding bit in irqs_pending_mask.
However, it still respects the core guarantee that the producer never
clears the mask and only writes to irqs_pending if it is the one that
flipped the corresponding mask bit from 0 to 1.
Moreover, since the consumer won't run concurrently because
vcpu_sync_interrupts() and the consumer path are going to be invoked
equentially immediately before VM entry, it is safe to slightly relax
this ordering rule in vcpu_sync_interrupts().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Use smp_wb() instead of smp_mb__before_atomic().
 - Add explanation of the change above in the commit message.
 - Move vcpu_sync_interrupts() here to producers side.
 - Introduce check_for_pcpu_work() to be clear from where vcpu_sync_interrupts()
   is called.
---
Changes in V2:
 - Move the patch before an introduction of vtimer.
 - Drop bitmap_zero() of irqs_pending and irqs_pending_mask bitmaps as
   vcpu structure starts out all zeros.
 - Drop const for irq argument of vcpu_{un}set_interrupt().
 - Drop check "irq < IRQ_LOCAL_MAX" in vcpu_{un}set_interrupt() as it
   could lead to overrun of irqs_pending and irqs_pending_mask bitmaps.
 - Drop IRQ_LOCAL_MAX as there is no usage for it now.
---
 xen/arch/riscv/domain.c             | 70 +++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/domain.h | 24 ++++++++++
 xen/arch/riscv/traps.c              |  8 ++++
 3 files changed, 102 insertions(+)

diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index af9586a4eb0d..4513f778cdc4 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,7 @@
 #include <xen/sched.h>
 #include <xen/vmap.h>
 
+#include <asm/bitops.h>
 #include <asm/cpufeature.h>
 #include <asm/csr.h>
 #include <asm/riscv_encoding.h>
@@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
 {
     vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
 }
+
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
+{
+    /*
+     * We only allow VS-mode software, timer, and external
+     * interrupts when irq is one of the local interrupts
+     * defined by RISC-V privilege specification.
+     */
+    if ( irq != IRQ_VS_SOFT &&
+         irq != IRQ_VS_TIMER &&
+         irq != IRQ_VS_EXT )
+        return -EINVAL;
+
+    set_bit(irq, v->arch.irqs_pending);
+    smp_wmb();
+    set_bit(irq, v->arch.irqs_pending_mask);
+
+    vcpu_kick(v);
+
+    return 0;
+}
+
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
+{
+    /*
+     * We only allow VS-mode software, timer, external
+     * interrupts when irq is one of the local interrupts
+     * defined by RISC-V privilege specification.
+     */
+    if ( irq != IRQ_VS_SOFT &&
+         irq != IRQ_VS_TIMER &&
+         irq != IRQ_VS_EXT )
+        return -EINVAL;
+
+    clear_bit(irq, v->arch.irqs_pending);
+    smp_wmb();
+    set_bit(irq, v->arch.irqs_pending_mask);
+
+    return 0;
+}
+
+void vcpu_sync_interrupts(struct vcpu *v)
+{
+    unsigned long hvip;
+
+    /* Read current HVIP and VSIE CSRs */
+    v->arch.vsie = csr_read(CSR_VSIE);
+
+    /* Sync-up HVIP.VSSIP bit changes done by Guest */
+    hvip = csr_read(CSR_HVIP);
+    if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) &&
+         !test_and_set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending_mask) )
+    {
+        if ( hvip & BIT(IRQ_VS_SOFT, UL) )
+            set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+        else
+            clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+    }
+
+    /*
+     * Sync-up AIA high interrupts.
+     *
+     * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
+     * now.
+     */
+#ifdef CONFIG_RISCV_32
+#   error "Update v->arch.vsieh"
+#endif
+}
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 6bb06a50c6ab..8d9432ec5a8b 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -54,6 +54,25 @@ struct arch_vcpu {
     register_t henvcfg;
     register_t hideleg;
     register_t hstateen0;
+    register_t hvip;
+
+    register_t vsie;
+
+    /*
+     * VCPU interrupts
+     *
+     * We have a lockless approach for tracking pending VCPU interrupts
+     * implemented using atomic bitops. The irqs_pending bitmap represent
+     * pending interrupts whereas irqs_pending_mask represent bits changed
+     * in irqs_pending. Our approach is modeled around multiple producer
+     * and single consumer problem where the consumer is the VCPU itself.
+     *
+     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
+     * on RV32 host.
+     */
+#define RISCV_VCPU_NR_IRQS 64
+    DECLARE_BITMAP(irqs_pending, RISCV_VCPU_NR_IRQS);
+    DECLARE_BITMAP(irqs_pending_mask, RISCV_VCPU_NR_IRQS);
 };
 
 struct paging_domain {
@@ -92,6 +111,11 @@ static inline void update_guest_memory_policy(struct vcpu *v,
 
 static inline void arch_vcpu_block(struct vcpu *v) {}
 
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq);
+
+void vcpu_sync_interrupts(struct vcpu *v);
+
 #endif /* ASM__RISCV__DOMAIN_H */
 
 /*
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index c81a4f79a0d2..82e1dc59cdea 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -169,6 +169,11 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
     die();
 }
 
+static void check_for_pcpu_work(void)
+{
+    vcpu_sync_interrupts(current);
+}
+
 void do_trap(struct cpu_user_regs *cpu_regs)
 {
     register_t pc = cpu_regs->sepc;
@@ -222,6 +227,9 @@ void do_trap(struct cpu_user_regs *cpu_regs)
         do_unexpected_trap(cpu_regs);
         break;
     }
+
+    if ( cpu_regs->hstatus & HSTATUS_SPV )
+        check_for_pcpu_work();
 }
 
 void vcpu_show_execution_state(struct vcpu *v)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (4 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 14:26   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 07/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

This patch is based on Linux kernel 6.16.0.

Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
interrupt tracking introduced in part 1 (for producers). According, to the
design only one consumer is possible, and it is vCPU itself.
vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
to the lack of functionality) before the hypervisor returns control to the
guest.

Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
release semantics). The consumer must not write to irqs_pending and must not
act on bits that are not set in the mask. Otherwise, extra synchronization
should be provided.
The worst thing which could happen with such approach is that a new pending
bit will be set to irqs_pending bitmap during update of hvip variable in
vcpu_flush_interrupt() but it isn't problem as the new pending bit won't
be lost and just be proceded during the next flush.

As AIA specs introduced hviph register which would want to be updated when
guest related AIA code vcpu_update_hvip() is introduced instead of just
open-code it in vcpu_flush_interrupts().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Update the error message in case of RV32 from "hviph" to v->arch.hviph.
 - Make const argument of vcpu_update_hvip.
 - Move local variables mask and val inside if() in vcpu_flush_interrupts().
 - Call vcpu_flush_interrupts() in check_pcpu_work().
 - Move vcpu_update_hvip() inside if() in vcpu_flush_interrupts().
---
Changes in v2:
 - New patch.
---
 xen/arch/riscv/domain.c             | 33 +++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/domain.h |  1 +
 xen/arch/riscv/traps.c              |  2 ++
 3 files changed, 36 insertions(+)

diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 4513f778cdc4..67437912605a 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -194,3 +194,36 @@ void vcpu_sync_interrupts(struct vcpu *v)
 #   error "Update v->arch.vsieh"
 #endif
 }
+
+static void vcpu_update_hvip(const struct vcpu *v)
+{
+    csr_write(CSR_HVIP, v->arch.hvip);
+}
+
+void vcpu_flush_interrupts(struct vcpu *v)
+{
+    register_t *hvip = &v->arch.hvip;
+
+    if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
+    {
+        unsigned long mask, val;
+
+        mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
+        val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
+
+        *hvip &= ~mask;
+        *hvip |= val;
+
+        /*
+         * Flush AIA high interrupts.
+         *
+         * It is necessary to do only for CONFIG_RISCV_32 which isn't
+         * supported now.
+         */
+#ifdef CONFIG_RISCV_32
+        #   error "Update v->arch.hviph"
+#endif
+
+        vcpu_update_hvip(v);
+    }
+}
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 8d9432ec5a8b..de5aecb862b5 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -115,6 +115,7 @@ int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
 int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq);
 
 void vcpu_sync_interrupts(struct vcpu *v);
+void vcpu_flush_interrupts(struct vcpu *v);
 
 #endif /* ASM__RISCV__DOMAIN_H */
 
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index 82e1dc59cdea..676a2da55811 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -172,6 +172,8 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
 static void check_for_pcpu_work(void)
 {
     vcpu_sync_interrupts(current);
+
+    vcpu_flush_interrupts(current);
 }
 
 void do_trap(struct cpu_user_regs *cpu_regs)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 07/16] xen/time: move ticks<->ns helpers to common code
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (5 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Andrew Cooper, Anthony PERARD, Jan Beulich, Roger Pau Monné,
	Alistair Francis, Connor Davis

ticks_to_ns() and ns_to_ticks() are not architecture-specific, so provide a
common implementation that is more resilient to overflow than the historical
Arm version. This is not a practical issue for Arm, as the latest ARM ARM
that timer frequency should be fixed at 1 GHz and older platforms used much
lower rates, which is shy of 32-bit overflow. As the helpers are declared
as static inline, they should not affect x86, which does not use them.

On Arm, these helpers were historically implemented as out-of-line functions
because the counter frequency was originally defined as static and unavailable
to headers [1]. Later changes [2] removed this restriction, but the helpers
remained unchanged. Now they can be implemented as static inline without any
issues.

Centralising the helpers avoids duplication and removes subtle differences
between architectures while keeping the implementation simple.

Drop redundant <asm/time.h> includes where <xen/time.h> already pulls it in.

No functional change is intended.

[1] ddee56dc2994 arm: driver for the generic timer for ARMv7
[2] 096578b4e489 xen: move XEN_SYSCTL_physinfo, XEN_SYSCTL_numainfo and
                      XEN_SYSCTL_topologyinfo to common code

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - Add Reviewed-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in v2:
 - Move ns_to_ticks() and ticks_to_ns() to common code.
---
 xen/arch/arm/include/asm/time.h   |  3 ---
 xen/arch/arm/time.c               | 11 -----------
 xen/arch/arm/vtimer.c             |  2 +-
 xen/arch/riscv/include/asm/time.h |  5 -----
 xen/arch/riscv/time.c             |  1 +
 xen/include/xen/time.h            | 11 +++++++++++
 6 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/include/asm/time.h b/xen/arch/arm/include/asm/time.h
index 49ad8c1a6d47..c194dbb9f52d 100644
--- a/xen/arch/arm/include/asm/time.h
+++ b/xen/arch/arm/include/asm/time.h
@@ -101,9 +101,6 @@ extern void init_timer_interrupt(void);
 /* Counter value at boot time */
 extern uint64_t boot_count;
 
-extern s_time_t ticks_to_ns(uint64_t ticks);
-extern uint64_t ns_to_ticks(s_time_t ns);
-
 void preinit_xen_time(void);
 
 void force_update_vcpu_system_time(struct vcpu *v);
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index cc3fcf47b66a..a12912a106a0 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -27,7 +27,6 @@
 #include <asm/cpufeature.h>
 #include <asm/platform.h>
 #include <asm/system.h>
-#include <asm/time.h>
 #include <asm/vgic.h>
 
 uint64_t __read_mostly boot_count;
@@ -47,16 +46,6 @@ unsigned int timer_get_irq(enum timer_ppi ppi)
     return timer_irq[ppi];
 }
 
-/*static inline*/ s_time_t ticks_to_ns(uint64_t ticks)
-{
-    return muldiv64(ticks, SECONDS(1), 1000 * cpu_khz);
-}
-
-/*static inline*/ uint64_t ns_to_ticks(s_time_t ns)
-{
-    return muldiv64(ns, 1000 * cpu_khz, SECONDS(1));
-}
-
 static __initdata struct dt_device_node *timer;
 
 #ifdef CONFIG_ACPI
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index d2124b175521..2e85ff2b6e62 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -12,13 +12,13 @@
 #include <xen/lib.h>
 #include <xen/perfc.h>
 #include <xen/sched.h>
+#include <xen/time.h>
 #include <xen/timer.h>
 
 #include <asm/cpregs.h>
 #include <asm/div64.h>
 #include <asm/irq.h>
 #include <asm/regs.h>
-#include <asm/time.h>
 #include <asm/vgic.h>
 #include <asm/vreg.h>
 #include <asm/vtimer.h>
diff --git a/xen/arch/riscv/include/asm/time.h b/xen/arch/riscv/include/asm/time.h
index 1e7801e2ea0e..be3875b9984e 100644
--- a/xen/arch/riscv/include/asm/time.h
+++ b/xen/arch/riscv/include/asm/time.h
@@ -24,11 +24,6 @@ static inline cycles_t get_cycles(void)
     return csr_read(CSR_TIME);
 }
 
-static inline s_time_t ticks_to_ns(uint64_t ticks)
-{
-    return muldiv64(ticks, MILLISECS(1), cpu_khz);
-}
-
 void preinit_xen_time(void);
 
 #endif /* ASM__RISCV__TIME_H */
diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
index e962f8518d78..2c7af0a5d63b 100644
--- a/xen/arch/riscv/time.c
+++ b/xen/arch/riscv/time.c
@@ -4,6 +4,7 @@
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/sections.h>
+#include <xen/time.h>
 #include <xen/types.h>
 
 unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
diff --git a/xen/include/xen/time.h b/xen/include/xen/time.h
index fe0d7a578a99..2185dd26a439 100644
--- a/xen/include/xen/time.h
+++ b/xen/include/xen/time.h
@@ -8,6 +8,7 @@
 #ifndef __XEN_TIME_H__
 #define __XEN_TIME_H__
 
+#include <xen/muldiv64.h>
 #include <xen/types.h>
 #include <public/xen.h>
 
@@ -75,6 +76,16 @@ extern void send_timer_event(struct vcpu *v);
 
 void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds);
 
+static inline s_time_t ticks_to_ns(uint64_t ticks)
+{
+    return muldiv64(ticks, MILLISECS(1), cpu_khz);
+}
+
+static inline uint64_t ns_to_ticks(s_time_t ns)
+{
+    return muldiv64(ns, cpu_khz, MILLISECS(1));
+}
+
 #include <asm/time.h>
 
 #endif /* __XEN_TIME_H__ */
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (6 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 07/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 14:51   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Lay the groundwork for guest timer support by introducing a per-vCPU
virtual timer backed by Xen’s common timer infrastructure.

The virtual timer is programmed in response to the guest SBI
sbi_set_timer() call and injects a virtual supervisor timer interrupt
into the vCPU when it expires.

While a dedicated struct vtimer is not strictly required at present,
it is expected to become necessary once SSTC support is introduced.
In particular, it will need to carry additional state such as whether
SSTC is enabled, the next compare value (e.g. for the VSTIMECMP CSR)
to be saved and restored across context switches, and time delta state
(e.g. HTIMEDELTA) required for use cases such as migration. Introducing
struct vtimer now avoids a later refactoring.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - use one container_of() to get vcpu instead of two container_of()s.
---
Changes in v2:
 - Drop domain_vtimer_init() as it does nothing.
 - Drop "struct vcpu *v;" from struct vtimer as it could be taken
   from arch_vcpu using container_of().
 - Drop vtimer_initialized, use t->status == TIMER_STATUS_invalid
   instead to understand if timer was or wasn't initialized.
 - Drop inclusion of xen/domain.h as xen/sched.h already includes it.
 - s/ xen/time.h/ xen.timer.h in vtimer.c.
 - Drop ULL in if-conidtion in vtimer_set_timer() as with the cast
   it isn't necessary to have suffix ULL.
 - Add migrate timer to vtimer_set_timer() to be sure that vtimer
   will occur on pCPU it was ran, so the signalling to that vCPU
   will (commonly) be cheaper.
 - Check if the timeout has already expired and just inject the event
   in vtimer_vtimer_set_timer().
 - Drop const for ticks argument of vtimer_set_timer().
 - Merge two patches to one:
   - xen/riscv: introduce vtimer
   - xen/riscv: introduce vtimer_set_timer() and vtimer_expired()
---
 xen/arch/riscv/Makefile             |  1 +
 xen/arch/riscv/domain.c             |  8 +++-
 xen/arch/riscv/include/asm/domain.h |  3 ++
 xen/arch/riscv/include/asm/vtimer.h | 20 ++++++++
 xen/arch/riscv/vtimer.c             | 71 +++++++++++++++++++++++++++++
 5 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/riscv/include/asm/vtimer.h
 create mode 100644 xen/arch/riscv/vtimer.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 868514c25006..7439d029cc45 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -22,6 +22,7 @@ obj-y += traps.o
 obj-y += vmid.o
 obj-y += vm_event.o
 obj-y += vsbi/
+obj-y += vtimer.o
 
 $(TARGET): $(TARGET)-syms
 	$(OBJCOPY) -O binary -S $< $@
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 67437912605a..3f4b062b6ce8 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -10,6 +10,7 @@
 #include <asm/csr.h>
 #include <asm/riscv_encoding.h>
 #include <asm/setup.h>
+#include <asm/vtimer.h>
 
 #define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
                          BIT(CAUSE_FETCH_ACCESS, U) | \
@@ -105,11 +106,14 @@ int arch_vcpu_create(struct vcpu *v)
     if ( is_idle_vcpu(v) )
         return rc;
 
+    if ( (rc = vcpu_vtimer_init(v)) )
+        goto fail;
+
     /*
-     * As the vtimer and interrupt controller (IC) are not yet implemented,
+     * As interrupt controller (IC) is not yet implemented,
      * return an error.
      *
-     * TODO: Drop this once the vtimer and IC are implemented.
+     * TODO: Drop this once IC is implemented.
      */
     rc = -EOPNOTSUPP;
     goto fail;
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index de5aecb862b5..b5a8a9f711ac 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -8,6 +8,7 @@
 #include <public/hvm/params.h>
 
 #include <asm/p2m.h>
+#include <asm/vtimer.h>
 
 struct vcpu_vmid {
     uint64_t generation;
@@ -49,6 +50,8 @@ struct arch_vcpu {
 
     struct cpu_info *cpu_info;
 
+    struct vtimer vtimer;
+
     register_t hcounteren;
     register_t hedeleg;
     register_t henvcfg;
diff --git a/xen/arch/riscv/include/asm/vtimer.h b/xen/arch/riscv/include/asm/vtimer.h
new file mode 100644
index 000000000000..0d1555511755
--- /dev/null
+++ b/xen/arch/riscv/include/asm/vtimer.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * (c) 2023-2024 Vates
+ */
+
+#ifndef ASM__RISCV__VTIMER_H
+#define ASM__RISCV__VTIMER_H
+
+#include <xen/timer.h>
+
+struct vtimer {
+    struct timer timer;
+};
+
+int vcpu_vtimer_init(struct vcpu *v);
+void vcpu_timer_destroy(struct vcpu *v);
+
+void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
+
+#endif /* ASM__RISCV__VTIMER_H */
diff --git a/xen/arch/riscv/vtimer.c b/xen/arch/riscv/vtimer.c
new file mode 100644
index 000000000000..32d142bcdfcd
--- /dev/null
+++ b/xen/arch/riscv/vtimer.c
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/sched.h>
+#include <xen/timer.h>
+
+#include <asm/vtimer.h>
+
+static void vtimer_expired(void *data)
+{
+    struct vtimer *t = data;
+    struct vcpu *v = container_of(t, struct vcpu, arch.vtimer);
+
+    vcpu_set_interrupt(v, IRQ_VS_TIMER);
+}
+
+int vcpu_vtimer_init(struct vcpu *v)
+{
+    struct vtimer *t = &v->arch.vtimer;
+
+    init_timer(&t->timer, vtimer_expired, t, v->processor);
+
+    return 0;
+}
+
+void vcpu_timer_destroy(struct vcpu *v)
+{
+    struct vtimer *t = &v->arch.vtimer;
+
+    if ( t->timer.status == TIMER_STATUS_invalid )
+        return;
+
+    kill_timer(&v->arch.vtimer.timer);
+}
+
+void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
+{
+    struct vcpu *v = container_of(t, struct vcpu, arch.vtimer);
+    s_time_t expires = ticks_to_ns(ticks - boot_clock_cycles);
+
+    vcpu_unset_interrupt(v, IRQ_VS_TIMER);
+
+    /*
+     * According to the RISC-V sbi spec:
+     *   If the supervisor wishes to clear the timer interrupt without
+     *   scheduling the next timer event, it can either request a timer
+     *   interrupt infinitely far into the future (i.e., (uint64_t)-1),
+     *   or it can instead mask the timer interrupt by clearing sie.STIE CSR
+     *   bit.
+     */
+    if ( ticks == ((uint64_t)~0) )
+    {
+        stop_timer(&t->timer);
+
+        return;
+    }
+
+    if ( expires < NOW() )
+    {
+        /*
+         * Simplify the logic if the timeout has already expired and just
+         * inject the event.
+         */
+        stop_timer(&t->timer);
+        vcpu_set_interrupt(v, IRQ_VS_TIMER);
+
+        return;
+    }
+
+    migrate_timer(&t->timer, smp_processor_id());
+    set_timer(&t->timer, expires);
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 09/16] xen/riscv: introduce vcpu_kick() implementation
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (7 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Add a RISC-V implementation of vcpu_kick(), which unblocks the target
vCPU and sends an event check IPI if the vCPU was running on another
processor. This mirrors the behavior of Arm and enables proper vCPU
wakeup handling on RISC-V.

Remove the stub implementation from stubs.c, as it is now provided by
arch/riscv/domain.c.

Since vcpu_kick() calls perfc_incr(vcpu_kick), add perfcounter for
vcpu_kick to handle the case when CONFIG_PERF_COUNTERS=y. Although
CONFIG_PERF_COUNTERS is not enabled by default, it can be enabled,
for example, by randconfig what will lead to CI build issues.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - Add asm/perfc_defn.h to provide vcpu_kick perfcoounter to cover
   the case when CONFIG_PERF_COUNTERS=y.
---
Changes in v2:
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
 xen/arch/riscv/domain.c                 | 14 ++++++++++++++
 xen/arch/riscv/include/asm/Makefile     |  1 -
 xen/arch/riscv/include/asm/perfc_defn.h |  3 +++
 xen/arch/riscv/stubs.c                  |  5 -----
 4 files changed, 17 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/riscv/include/asm/perfc_defn.h

diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 3f4b062b6ce8..30a966f53c1d 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -1,8 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/cpumask.h>
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/sched.h>
+#include <xen/smp.h>
 #include <xen/vmap.h>
 
 #include <asm/bitops.h>
@@ -231,3 +233,15 @@ void vcpu_flush_interrupts(struct vcpu *v)
         vcpu_update_hvip(v);
     }
 }
+
+void vcpu_kick(struct vcpu *v)
+{
+    bool running = v->is_running;
+
+    vcpu_unblock(v);
+    if ( running && v != current )
+    {
+        perfc_incr(vcpu_kick);
+        smp_send_event_check_mask(cpumask_of(v->processor));
+    }
+}
diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
index 3824f31c395c..86c56251d5d7 100644
--- a/xen/arch/riscv/include/asm/Makefile
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -7,7 +7,6 @@ generic-y += hypercall.h
 generic-y += iocap.h
 generic-y += irq-dt.h
 generic-y += percpu.h
-generic-y += perfc_defn.h
 generic-y += random.h
 generic-y += softirq.h
 generic-y += vm_event.h
diff --git a/xen/arch/riscv/include/asm/perfc_defn.h b/xen/arch/riscv/include/asm/perfc_defn.h
new file mode 100644
index 000000000000..8a4b945df662
--- /dev/null
+++ b/xen/arch/riscv/include/asm/perfc_defn.h
@@ -0,0 +1,3 @@
+/* This file is intended to be included multiple times. */
+
+PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index c5784a436574..1f0add97b361 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -208,11 +208,6 @@ void vcpu_block_unless_event_pending(struct vcpu *v)
     BUG_ON("unimplemented");
 }
 
-void vcpu_kick(struct vcpu *v)
-{
-    BUG_ON("unimplemented");
-}
-
 unsigned long
 hypercall_create_continuation(unsigned int op, const char *format, ...)
 {
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 10/16] xen/riscv: add vtimer context switch helpers
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (8 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests Oleksii Kurochko
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce vtimer_ctxt_switch_from() and vtimer_ctxt_switch_to() to handle
virtual timer state across vCPU context switches.

At present, vtimer_ctxt_switch_from() is a no-op because the RISC-V SSTC
extension, which provides a virtualization-aware timer, is not yet
supported. Xen therefore relies the virtual (SBI-based) timer.

The virtual timer uses Xen's internal timer infrastructure and must be
associated with the pCPU on which the vCPU is currently running so that
timer events can be delivered efficiently. As a result, vtimer_ctxt_switch_to()
migrates the timer to the target pCPU when a vCPU is scheduled in.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - s/vtimer_ctx_switch_to/vtimer_ctxt_switch_to
 - s/vtimer_ctx_switch_from/vtimer_ctxt_switch_from
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in v2:
 - Align the parameters names for  vtimer_ctx_switch_from() and vtimer_ctx_switch_to() in
   declarations to match the ones in the defintions to make Misra happy.
 - s/vtimer_save/vtimer_ctx_switch_from.
 - s/vtimer_restore/vtimer_ctx_switch_to.
 - Update the commit message.
---
 xen/arch/riscv/include/asm/vtimer.h |  3 +++
 xen/arch/riscv/vtimer.c             | 15 +++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/xen/arch/riscv/include/asm/vtimer.h b/xen/arch/riscv/include/asm/vtimer.h
index 0d1555511755..c70b0226515e 100644
--- a/xen/arch/riscv/include/asm/vtimer.h
+++ b/xen/arch/riscv/include/asm/vtimer.h
@@ -17,4 +17,7 @@ void vcpu_timer_destroy(struct vcpu *v);
 
 void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
 
+void vtimer_ctxt_switch_from(struct vcpu *p);
+void vtimer_ctxt_switch_to(struct vcpu *n);
+
 #endif /* ASM__RISCV__VTIMER_H */
diff --git a/xen/arch/riscv/vtimer.c b/xen/arch/riscv/vtimer.c
index 32d142bcdfcd..afd8a53a7387 100644
--- a/xen/arch/riscv/vtimer.c
+++ b/xen/arch/riscv/vtimer.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/bug.h>
 #include <xen/sched.h>
 #include <xen/timer.h>
 
@@ -69,3 +70,17 @@ void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
     migrate_timer(&t->timer, smp_processor_id());
     set_timer(&t->timer, expires);
 }
+
+void vtimer_ctxt_switch_from(struct vcpu *p)
+{
+    ASSERT(!is_idle_vcpu(p));
+
+    /* Nothing to do at the moment as SSTC isn't supported now. */
+}
+
+void vtimer_ctxt_switch_to(struct vcpu *n)
+{
+    ASSERT(!is_idle_vcpu(n));
+
+    migrate_timer(&n->arch.vtimer.timer, n->processor);
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (9 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Add handling of the SBI_EXT_0_1_SET_TIMER function ID to the legacy
extension ecall handler. The handler now programs the vCPU’s virtual
timer via vtimer_set_timer() and returns SBI_SUCCESS.

This enables guests using the legacy SBI timer interface to schedule
timer events correctly.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - Nothing changed. Only rebase.
---
Changes in v2:
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
 xen/arch/riscv/vsbi/legacy-extension.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/riscv/vsbi/legacy-extension.c b/xen/arch/riscv/vsbi/legacy-extension.c
index 2e8df191c295..090c23440cea 100644
--- a/xen/arch/riscv/vsbi/legacy-extension.c
+++ b/xen/arch/riscv/vsbi/legacy-extension.c
@@ -7,6 +7,7 @@
 
 #include <asm/processor.h>
 #include <asm/vsbi.h>
+#include <asm/vtimer.h>
 
 static void vsbi_print_char(char c)
 {
@@ -44,6 +45,11 @@ static int vsbi_legacy_ecall_handler(unsigned long eid, unsigned long fid,
         ret = SBI_ERR_NOT_SUPPORTED;
         break;
 
+    case SBI_EXT_0_1_SET_TIMER:
+        vtimer_set_timer(&current->arch.vtimer, regs->a0);
+        regs->a0 = SBI_SUCCESS;
+        break;
+
     default:
         /*
          * TODO: domain_crash() is acceptable here while things are still under
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer()
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (10 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 15:03   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce a function pointer for sbi_set_timer(), since different OpenSBI
versions may implement the TIME extension with different extension IDs
and/or function IDs.

If the TIME extension is not available, fall back to the legacy timer
mechanism. This is useful when Xen runs as a guest under another Xen,
because the TIME extension is not currently virtualised and therefore
will not appear as available.
Despite of the fact that sbi_set_timer_v01 is introduced and used as
fall back, SBI v0.1 still isn't fully supported (with the current SBI
calls usage, sbi_rfence_v01 should be introduced too), so panic()
in sbi_init() isn't removed.

The sbi_set_timer() pointer will be used by reprogram_timer() to program
Xen’s physical timer as without SSTC extension there is no any other
option except SBI call to do that as only M-timer is available for us.

Use dprintk() for all the cases to print that a speicifc SBI extension
is available as it isn't really necessary in case of release builds.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Init sbi_set_timer with sbi_set_timer_v01 as fallback value.
 - Sort SBI IDs in the same way as SBI EXT IDs are declared.
 - Add __ro_after_init for sbi_set_timer variable.
 - use dprintk instead of printk to print information if SBI ext is available.
---
Changes in v2:
 - Move up defintion of SBI_EXT_TIME_SET_TIMER and use the same padding as
   defintions around it.
 - Add an extra comment about stime_value granuality above declaration of
   sbi_set_timer function pointer.
 - Refactor implemetation of sbi_set_timer_v02().
 - Provide fallback for sbi_set_timer_v01().
 - Update the commit message.
---
 xen/arch/riscv/include/asm/sbi.h | 18 ++++++++++++++
 xen/arch/riscv/sbi.c             | 40 +++++++++++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/xen/arch/riscv/include/asm/sbi.h b/xen/arch/riscv/include/asm/sbi.h
index 79f7ff5c5501..a237f8b1393c 100644
--- a/xen/arch/riscv/include/asm/sbi.h
+++ b/xen/arch/riscv/include/asm/sbi.h
@@ -29,6 +29,7 @@
 
 #define SBI_EXT_BASE                    0x10
 #define SBI_EXT_RFENCE                  0x52464E43
+#define SBI_EXT_TIME                    0x54494D45
 
 /* SBI function IDs for BASE extension */
 #define SBI_EXT_BASE_GET_SPEC_VERSION   0x0
@@ -48,6 +49,9 @@
 #define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA       0x5
 #define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID  0x6
 
+/* SBI function IDs for TIME extension */
+#define SBI_EXT_TIME_SET_TIMER          0x0
+
 #define SBI_SPEC_VERSION_MAJOR_MASK     0x7f000000
 #define SBI_SPEC_VERSION_MINOR_MASK     0x00ffffff
 
@@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
 int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
                                 size_t size, unsigned long vmid);
 
+/*
+ * Programs the clock for next event after stime_value time. This function also
+ * clears the pending timer interrupt bit.
+ * If the supervisor wishes to clear the timer interrupt without scheduling the
+ * next timer event, it can either request a timer interrupt infinitely far
+ * into the future (i.e., (uint64_t)-1), or it can instead mask the timer
+ * interrupt by clearing sie.STIE CSR bit.
+ * The stime_value parameter represents absolute time measured in ticks.
+ *
+ * This SBI call returns 0 upon success or an implementation specific negative
+ * error code.
+ */
+extern int (* __ro_after_init sbi_set_timer)(uint64_t stime_value);
+
 /*
  * Initialize SBI library
  *
diff --git a/xen/arch/riscv/sbi.c b/xen/arch/riscv/sbi.c
index 425dce44c679..b4a7ae6940c1 100644
--- a/xen/arch/riscv/sbi.c
+++ b/xen/arch/riscv/sbi.c
@@ -249,6 +249,38 @@ static int (* __ro_after_init sbi_rfence)(unsigned long fid,
                                           unsigned long arg4,
                                           unsigned long arg5);
 
+static int cf_check sbi_set_timer_v02(uint64_t stime_value)
+{
+    struct sbiret ret;
+
+    ret = sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value,
+#ifdef CONFIG_RISCV_32
+                    stime_value >> 32,
+#else
+                    0,
+#endif
+                    0, 0, 0, 0);
+
+    return sbi_err_map_xen_errno(ret.error);
+}
+
+static int cf_check sbi_set_timer_v01(uint64_t stime_value)
+{
+    struct sbiret ret;
+
+    ret = sbi_ecall(SBI_EXT_0_1_SET_TIMER, 0, stime_value,
+#ifdef CONFIG_RISCV_32
+                    stime_value >> 32,
+#else
+                    0,
+#endif
+                    0, 0, 0, 0);
+
+    return sbi_err_map_xen_errno(ret.error);
+}
+
+int (* __ro_after_init sbi_set_timer)(uint64_t stime_value) = sbi_set_timer_v01;
+
 int sbi_remote_sfence_vma(const cpumask_t *cpu_mask, vaddr_t start,
                           size_t size)
 {
@@ -324,7 +356,13 @@ int __init sbi_init(void)
         if ( sbi_probe_extension(SBI_EXT_RFENCE) > 0 )
         {
             sbi_rfence = sbi_rfence_v02;
-            printk("SBI v0.2 RFENCE extension detected\n");
+            dprintk(XENLOG_INFO, "SBI v0.2 RFENCE extension detected\n");
+        }
+
+        if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
+        {
+            sbi_set_timer = sbi_set_timer_v02;
+            dprintk(XENLOG_INFO, "SBI v0.2 TIME extension detected\n");
         }
     }
     else
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (11 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 15:12   ` Jan Beulich
  2026-02-09 16:52 ` [PATCH v3 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Implement reprogram_timer() on RISC-V using the standard SBI timer call.

The privileged architecture only defines machine-mode timer interrupts
(using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
is optional and is not supported on the boards available to me, so the
only viable approach today is to program the timer through SBI.

reprogram_timer() enables/disables the supervisor timer interrupt and
programs the next timer deadline using sbi_set_timer(). If the SBI call
fails, the code panics, because sbi_set_timer() is expected to return
either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
the latest ones). The SBI spec does not define a standard negative error
code for this call, and without SSTC there is no alternative method to
program the timer, so the SBI timer call must be available.

reprogram_timer() currently returns int for compatibility with the
existing prototype. While it might be cleaner to return bool, keeping the
existing signature avoids premature changes in case sbi_set_timer() ever
needs to return other values (based on which we could try to avoid
panic-ing) in the future.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v3:
 - Correct the comments in reprogram_timer().
 - Move enablement of timer interrupt after sbi_set_timer() to avoid
   potentially receiving a timer interrupt between these 2 operations.
---
Changes in v2:
 - Add TODO comment above sbi_set_timer() call.
 - Update the commit message.
---
 xen/arch/riscv/stubs.c |  5 -----
 xen/arch/riscv/time.c  | 43 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 1f0add97b361..cb7546558b8e 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -21,11 +21,6 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
 /* time.c */
 
-int reprogram_timer(s_time_t timeout)
-{
-    BUG_ON("unimplemented");
-}
-
 void send_timer_event(struct vcpu *v)
 {
     BUG_ON("unimplemented");
diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
index 2c7af0a5d63b..7efa76fdbcb1 100644
--- a/xen/arch/riscv/time.c
+++ b/xen/arch/riscv/time.c
@@ -7,6 +7,9 @@
 #include <xen/time.h>
 #include <xen/types.h>
 
+#include <asm/csr.h>
+#include <asm/sbi.h>
+
 unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
 uint64_t __ro_after_init boot_clock_cycles;
 
@@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
     cpu_khz = rate / 1000;
 }
 
+int reprogram_timer(s_time_t timeout)
+{
+    uint64_t deadline, now;
+    int rc;
+
+    if ( timeout == 0 )
+    {
+        /* Disable timer interrupt */
+        csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+        return 1;
+    }
+
+    deadline = ns_to_ticks(timeout) + boot_clock_cycles;
+    now = get_cycles();
+    if ( deadline <= now )
+        return 0;
+
+    /*
+     * TODO: When the SSTC extension is supported, it would be preferable to
+     *       use the supervisor timer registers directly here for better
+     *       performance, since an SBI call and mode switch would no longer
+     *       be required.
+     *
+     *       This would also reduce reliance on a specific SBI implementation.
+     *       For example, it is not ideal to panic() if sbi_set_timer() returns
+     *       a non-zero value. Currently it can return 0 or -ENOSUPP, and
+     *       without SSTC we still need an implementation because only the
+     *       M-mode timer is available, and it can only be programmed in
+     *       M-mode.
+     */
+    if ( (rc = sbi_set_timer(deadline)) )
+        panic("%s: timer wasn't set because: %d\n", __func__, rc);
+
+    /* Enable timer interrupt */
+    csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+    return 1;
+}
+
 void __init preinit_xen_time(void)
 {
     if ( acpi_disabled )
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 14/16] xen/riscv: handle hypervisor timer interrupts
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (12 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

Introduce timer_interrupt() to process IRQ_S_TIMER interrupts.
The handler disables further timer interrupts by clearing
SIE.STIE and raises TIMER_SOFTIRQ so the generic timer subsystem
can perform its processing.

Update do_trap() to dispatch IRQ_S_TIMER to this new handler.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in v2:
 - Drop cause argument of timer_interrupt() as it isn't used inside
   the function and anyway it is pretty clear what is the cause inside
   timer_interrupt().
---
 xen/arch/riscv/traps.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index 676a2da55811..e8d9ca902d9c 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -10,6 +10,7 @@
 #include <xen/lib.h>
 #include <xen/nospec.h>
 #include <xen/sched.h>
+#include <xen/softirq.h>
 
 #include <asm/cpufeature.h>
 #include <asm/intc.h>
@@ -176,6 +177,15 @@ static void check_for_pcpu_work(void)
     vcpu_flush_interrupts(current);
 }
 
+static void timer_interrupt(void)
+{
+    /* Disable the timer to avoid more interrupts */
+    csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+    /* Signal the generic timer code to do its work */
+    raise_softirq(TIMER_SOFTIRQ);
+}
+
 void do_trap(struct cpu_user_regs *cpu_regs)
 {
     register_t pc = cpu_regs->sepc;
@@ -217,6 +227,10 @@ void do_trap(struct cpu_user_regs *cpu_regs)
                 intc_handle_external_irqs(cpu_regs);
                 break;
 
+            case IRQ_S_TIMER:
+                timer_interrupt();
+                break;
+
             default:
                 intr_handled = false;
                 break;
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 15/16] xen/riscv: init tasklet subsystem
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (13 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-09 16:52 ` [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
  15 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

As the tasklet subsystem is now initialized, it is necessary to implement
sync_local_execstate(), since it is invoked when something calls
tasklet_softirq_action(), which is registered in tasklet_subsys_init().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v3:
 - add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
Changes in v2:
 - Update the commit message.
 - Move implementation of sync_vcpu_execstate() to separate commit
   as it doesn't connect to tasklet subsystem.
---
 xen/arch/riscv/setup.c |  3 +++
 xen/arch/riscv/stubs.c | 10 ----------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 010489f0b713..0cea1435ff32 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -12,6 +12,7 @@
 #include <xen/serial.h>
 #include <xen/shutdown.h>
 #include <xen/smp.h>
+#include <xen/tasklet.h>
 #include <xen/timer.h>
 #include <xen/vmap.h>
 #include <xen/xvmalloc.h>
@@ -157,6 +158,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
         panic("Booting using ACPI isn't supported\n");
     }
 
+    tasklet_subsys_init();
+
     init_IRQ();
 
     riscv_fill_hwcap();
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index cb7546558b8e..26434166acc6 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -91,16 +91,6 @@ void continue_running(struct vcpu *same)
     BUG_ON("unimplemented");
 }
 
-void sync_local_execstate(void)
-{
-    BUG_ON("unimplemented");
-}
-
-void sync_vcpu_execstate(struct vcpu *v)
-{
-    BUG_ON("unimplemented");
-}
-
 void startup_cpu_idle_loop(void)
 {
     BUG_ON("unimplemented");
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate()
  2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
                   ` (14 preceding siblings ...)
  2026-02-09 16:52 ` [PATCH v3 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
@ 2026-02-09 16:52 ` Oleksii Kurochko
  2026-02-11 15:16   ` Jan Beulich
  15 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 16:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Romain Caritey, Oleksii Kurochko, Alistair Francis, Connor Davis,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

The scheduler may call this function to force synchronization of given
vCPU's state. RISC-V does not support lazy context switching, so nothing
is done in sync_vcpu_execstate() and sync_local_execstate().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
This patch is dependent on patch from ML:
  [PATCH] sched: move vCPU exec state barriers
---
Changes in v3:
 - Align sync_vcpu_execstate() with patch:
     [PATCH] sched: move vCPU exec state barriers
---
Changes in v2:
 - New patch.
---
 xen/arch/riscv/domain.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 30a966f53c1d..ecb4ef8d0c89 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -245,3 +245,13 @@ void vcpu_kick(struct vcpu *v)
         smp_send_event_check_mask(cpumask_of(v->processor));
     }
 }
+
+void sync_local_execstate(void)
+{
+    /* Nothing to do -- no lazy switching */
+}
+
+void sync_vcpu_execstate(struct vcpu *v)
+{
+    /* Nothing to do -- no lazy switching */
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}()
  2026-02-09 16:52 ` [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
@ 2026-02-10 16:24   ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-10 16:24 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> Introduce architecture-specific functions to create and destroy VCPUs.
> Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
> timer and interrupt controller are not yet implemented.
> 
> Add calle-saved registers used to preserve Xen’s own execution context
> when switching between vCPU stacks.

"Add" is lacking context here: You don't add those to arch_vcpu_create(),
which is the context left from the earlier paragraph.

> It is going to be used in the following way (pseudocode):
>   context_switch(prev_vcpu, next_vcpu):
>     ...
> 
>     /* Switch from previous stack to the next stack. */
>     __context_switch(prev_vcpu, next_vcpu);
> 
>     ...
>     schedule_tail(prev_vcpu):
>         Save and restore vCPU's CSRs.
> The Xen-saved context allows __context_switch() to switch execution
> from the previous vCPU’s stack to the next vCPU’s stack and later resume
> execution on the original stack when switching back.
> 
> During vCPU creation, the Xen-saved context is going to be initialized
> with:
>   - SP pointing to the newly allocated vCPU stack
>   - RA pointing to a helper that performs final vCPU setup before
>     transferring control to the guest
> After the first execution of __context_switch(), RA naturally points to
> the instruction following the call site, and the remaining callee-saved
> registers contain the Xen register state at the time of the switch.

RA doesn't "naturally" point anywhere until you actually implement more
pieces. Please, again, can descriptions be written such that they make
sense at the point where the patch being described applies?

> --- /dev/null
> +++ b/xen/arch/riscv/domain.c
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +#include <xen/sched.h>
> +#include <xen/vmap.h>
> +
> +static void continue_new_vcpu(struct vcpu *prev)
> +{
> +    BUG_ON("unimplemented\n");
> +}
> +
> +static void __init __maybe_unused build_assertions(void)
> +{
> +    /*
> +     * Enforce the requirement documented in struct cpu_info that
> +     * guest_cpu_user_regs must be the first field.
> +     */
> +    BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
> +}
> +
> +int arch_vcpu_create(struct vcpu *v)
> +{
> +    int rc = 0;
> +    void *stack = vzalloc(STACK_SIZE);

Much like you use void * here, ...

> +    if ( !stack )
> +        return -ENOMEM;
> +
> +    v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
> +
> +    v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
> +    v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
> +
> +    /* Idle VCPUs don't need the rest of this setup */
> +    if ( is_idle_vcpu(v) )
> +        return rc;
> +
> +    /*
> +     * As the vtimer and interrupt controller (IC) are not yet implemented,
> +     * return an error.
> +     *
> +     * TODO: Drop this once the vtimer and IC are implemented.
> +     */
> +    rc = -EOPNOTSUPP;
> +    goto fail;
> +
> +    return rc;
> +
> + fail:
> +    arch_vcpu_destroy(v);
> +    return rc;
> +}
> +
> +void arch_vcpu_destroy(struct vcpu *v)
> +{
> +    vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);

... you probably want to do so here as well. And btw, this can be shortened:

    vfree((void *)&v->arch.cpu_info[1] - STACK_SIZE);

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented
  2026-02-09 16:52 ` [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented Oleksii Kurochko
@ 2026-02-11  7:22   ` Jan Beulich
  2026-02-11  8:45     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11  7:22 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> If the Smstateen extension is not implemented, the hstateen0 CSR is
> considered non-existent. Any attempt to access it will raise an
> illegal-instruction exception.
> 
> Guard the hstateen0 dump with a runtime check for Smstateen support to
> avoid triggering traps when dumping CSRs on systems that do not
> implement this extension.
> 
> Fixes: 3babc8d2e546 ("xen/riscv: dump GPRs and CSRs on unexpected traps")
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

It is, aiui, independent of patch 1 and hence can go in right away.

> @@ -144,7 +145,12 @@ static void dump_csrs(const char *ctx)
>        (v & HSTATUS_SPV)  ? " SPV"  : "",
>        (v & HSTATUS_GVA)  ? " GVA"  : "");
>      X(hgatp, CSR_HGATP, "\n");
> -    X(hstateen0, CSR_HSTATEEN0, "\n");
> +
> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
> +    {
> +        X(hstateen0, CSR_HSTATEEN0, "\n");
> +    }

I was going to ask for the braces to be dropped, but I notice they are
required as long as X() isn't properly adjusted. This is why even for
local use macros we should take a little more care when introducing
them, so they can be used without having to pay too close attention to
their actual implementation.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot
  2026-02-09 16:52 ` [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot Oleksii Kurochko
@ 2026-02-11  7:49   ` Jan Beulich
  2026-02-11  9:47     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11  7:49 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> Some hypervisor CSRs expose optional functionality and may not implement
> all architectural bits. Writing unsupported bits can either be ignored
> or raise an exception depending on the platform.
> 
> Detect the set of writable bits for selected hypervisor CSRs at boot and
> store the resulting masks for later use. This allows safely programming
> these CSRs during vCPU context switching and avoids relying on hardcoded
> architectural assumptions.
> 
> Note that csr_set() is used instead of csr_write() to write all ones to
> the mask, as the CSRRS instruction, according to the RISC-V specification,
> sets only those bits that are writable:
>     Any bit that is high in rs1 will cause the corresponding bit to be set
>     in the CSR, if that CSR bit is writable.
> In contrast, the CSRRW instruction does not take CSR bit writability into
> account, which could lead to unintended side effects when writing all ones
> to a CSR.

Hmm, I wonder in how far the wording there is precise. In a subsequent
paragraph there is:

"For both CSRRS and CSRRC, if rs1=x0, then the instruction will not write
 to the CSR at all, and so shall not cause any of the side effects that
 might otherwise occur on a CSR write, nor raise illegal-instruction
 exceptions on accesses to read-only CSRs."

To me, a read-only CSR is a CSR with all bits read-only. With this
interpretation, the two statements conflict with one another. Is this
interpretation ruled out somewhere?

> Masks are calculated at the moment only for hdeleg, henvcfg, hideleg,

Nit: First one is hedeleg.

> hstateen0 registers as only them are going to be used in the follow up
> patch.
> 
> If the Smstateen extension is not implemented, hstateen0 cannot be read
> because the register is considered non-existent. Instructions that attempt
> to access a CSR that is not implemented or not visible in the current mode
> are reserved and will raise an illegal-instruction exception.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V3:
>  - New patch.
> 
> --- a/xen/arch/riscv/setup.c
> +++ b/xen/arch/riscv/setup.c
> @@ -32,6 +32,8 @@
>  unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>      __aligned(STACK_SIZE);
>  
> +struct csr_masks __ro_after_init csr_masks;

setup.c would be nice to only have __init functions and __initdata data.
Really up to now that's the case, and I wonder why the makefile doesn't
leverage this by using setup.init.o in place of setup.o. This variable
would likely better live elsewhere anyway, imo: Somewhere it's actually
(going to be) used.

> @@ -70,6 +72,28 @@ static void * __init relocate_fdt(paddr_t dtb_paddr, size_t dtb_size)
>      return fdt;
>  }
>  
> +void __init init_csr_masks(void)
> +{
> +    register_t old;
> +
> +#define X(csr, field) \
> +        old = csr_read(CSR_##csr); \
> +        csr_set(CSR_##csr, ULONG_MAX); \
> +        csr_masks.field = csr_read(CSR_##csr); \
> +        csr_write(CSR_##csr, old)

See my remark on the earlier patch regarding locally used macros. You
shouldn't ...

> +    X(HEDELEG, hedeleg);
> +    X(HENVCFG, henvcfg);
> +    X(HIDELEG, hideleg);
> +
> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
> +    {
> +        X(HSTATEEN0, hstateen0);
> +    }

... be required to put braces here. (Then I'd further recommend to make "old"
local to the macro's scope.)

I'm also inclined to recommend to avoid an inflation of X() macros. Give
each such macro a somewhat sensible (yet still short) name. This way you'll
avoid Misra rule 5.4 ("Macro identifiers shall be distinct") concerns, in
combination with rule 20.5 ("#undef should not be used"). Note that we
didn't accept the latter rule, hence why I'm only saying "concerns", not
"violations".

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented
  2026-02-11  7:22   ` Jan Beulich
@ 2026-02-11  8:45     ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11  8:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 8:22 AM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> If the Smstateen extension is not implemented, the hstateen0 CSR is
>> considered non-existent. Any attempt to access it will raise an
>> illegal-instruction exception.
>>
>> Guard the hstateen0 dump with a runtime check for Smstateen support to
>> avoid triggering traps when dumping CSRs on systems that do not
>> implement this extension.
>>
>> Fixes: 3babc8d2e546 ("xen/riscv: dump GPRs and CSRs on unexpected traps")
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

>
> It is, aiui, independent of patch 1 and hence can go in right away.

Yes, it is independent.

>
>> @@ -144,7 +145,12 @@ static void dump_csrs(const char *ctx)
>>         (v & HSTATUS_SPV)  ? " SPV"  : "",
>>         (v & HSTATUS_GVA)  ? " GVA"  : "");
>>       X(hgatp, CSR_HGATP, "\n");
>> -    X(hstateen0, CSR_HSTATEEN0, "\n");
>> +
>> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
>> +    {
>> +        X(hstateen0, CSR_HSTATEEN0, "\n");
>> +    }
> I was going to ask for the braces to be dropped, but I notice they are
> required as long as X() isn't properly adjusted. This is why even for
> local use macros we should take a little more care when introducing
> them, so they can be used without having to pay too close attention to
> their actual implementation.

It would be better to have do {...} while(0) in the definition of X macros.

I will add do {...} while(0) to one of the next patches where I have the same
case.

I don't mind to update X() in dump_csrs() in the next patch version to:
#define X(name, csr, fmt, ...) do { \
     v = csr_read(csr); \
     printk("%-10s: %016lx" fmt, #name, v, ##__VA_ARGS__); \
while(0)

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init()
  2026-02-09 16:52 ` [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
@ 2026-02-11  9:44   ` Jan Beulich
  2026-02-11  9:53     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11  9:44 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -5,6 +5,72 @@
>  #include <xen/sched.h>
>  #include <xen/vmap.h>
>  
> +#include <asm/cpufeature.h>
> +#include <asm/csr.h>
> +#include <asm/riscv_encoding.h>
> +#include <asm/setup.h>
> +
> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
> +                         BIT(CAUSE_FETCH_ACCESS, U) | \
> +                         BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
> +                         BIT(CAUSE_BREAKPOINT, U) | \
> +                         BIT(CAUSE_MISALIGNED_LOAD, U) | \
> +                         BIT(CAUSE_LOAD_ACCESS, U) | \
> +                         BIT(CAUSE_MISALIGNED_STORE, U) | \
> +                         BIT(CAUSE_STORE_ACCESS, U) | \
> +                         BIT(CAUSE_USER_ECALL, U) | \
> +                         BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
> +                         BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
> +                         BIT(CAUSE_STORE_PAGE_FAULT, U))
> +
> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
> +
> +static void vcpu_csr_init(struct vcpu *v)
> +{
> +    register_t hstateen0;

There not being an initializer here, ...

> +    v->arch.hedeleg = HEDELEG_DEFAULT & csr_masks.hedeleg;
> +
> +    vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
> +
> +    v->arch.hideleg = HIDELEG_DEFAULT & csr_masks.hideleg;
> +
> +    /*
> +     * VS should access only the time counter directly.
> +     * Everything else should trap.
> +     */
> +    v->arch.hcounteren = HCOUNTEREN_TM;
> +
> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
> +        v->arch.henvcfg = ENVCFG_PBMTE & csr_masks.henvcfg;
> +
> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
> +    {
> +         if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))

(Nit: Style.)

> +            /*
> +             * If the hypervisor extension is implemented, the same three
> +             * bitsare defined also in hypervisor CSR hstateen0 but concern

(Nit: "bits are")

> +             * only the state potentially accessible to a virtual machine
> +             * executing in privilege modes VS and VU:
> +             *      bit 60 CSRs siselect and sireg (really vsiselect and
> +             *             vsireg)
> +             *      bit 59 CSRs siph and sieh (RV32 only) and stopi (really
> +             *             vsiph, vsieh, and vstopi)
> +             *      bit 58 all state of IMSIC guest interrupt files, including
> +             *             CSR stopei (really vstopei)
> +             * If one of these bits is zero in hstateen0, and the same bit is
> +             * one in mstateen0, then an attempt to access the corresponding
> +             * state from VS or VU-mode raises a virtual instruction exception.
> +             */
> +            hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
> +
> +        /* Allow guest to access CSR_ENVCFG */
> +        hstateen0 |= SMSTATEEN0_HSENVCFG;

... doesn't the compiler complain about the use of a possibly uninitialized
variable? The variable also wants to move to the more narrow scope.

> @@ -32,6 +98,8 @@ int arch_vcpu_create(struct vcpu *v)
>      v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
>      v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>  
> +    vcpu_csr_init(v);
> +
>      /* Idle VCPUs don't need the rest of this setup */
>      if ( is_idle_vcpu(v) )
>          return rc;

Do idle vCPU-s really need to have vcpu_csr_init() called for them?

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot
  2026-02-11  7:49   ` Jan Beulich
@ 2026-02-11  9:47     ` Oleksii Kurochko
  2026-02-11  9:50       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11  9:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 8:49 AM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> Some hypervisor CSRs expose optional functionality and may not implement
>> all architectural bits. Writing unsupported bits can either be ignored
>> or raise an exception depending on the platform.
>>
>> Detect the set of writable bits for selected hypervisor CSRs at boot and
>> store the resulting masks for later use. This allows safely programming
>> these CSRs during vCPU context switching and avoids relying on hardcoded
>> architectural assumptions.
>>
>> Note that csr_set() is used instead of csr_write() to write all ones to
>> the mask, as the CSRRS instruction, according to the RISC-V specification,
>> sets only those bits that are writable:
>>      Any bit that is high in rs1 will cause the corresponding bit to be set
>>      in the CSR, if that CSR bit is writable.
>> In contrast, the CSRRW instruction does not take CSR bit writability into
>> account, which could lead to unintended side effects when writing all ones
>> to a CSR.
> Hmm, I wonder in how far the wording there is precise. In a subsequent
> paragraph there is:
>
> "For both CSRRS and CSRRC, if rs1=x0, then the instruction will not write
>   to the CSR at all, and so shall not cause any of the side effects that
>   might otherwise occur on a CSR write, nor raise illegal-instruction
>   exceptions on accesses to read-only CSRs."
>
> To me, a read-only CSR is a CSR with all bits read-only. With this
> interpretation, the two statements conflict with one another. Is this
> interpretation ruled out somewhere?

Good question. Actually by read-only CSRs RISC-V spec means that a CSR is
read-only by its design:
   The standard RISC-V ISA sets aside a 12-bit encoding space (csr[11:0])
   for up to 4,096 CSRs. By convention, the upper 4 bits of the CSR address
   (csr[11:8]) are used to encode the read and write accessibility of the
   CSRs according to privilege level as shown in Table 3. The top two bits
   (csr[11:10]) indicate whether the register is read/write (00,01, or 10)
   or read-only (11).
But logically it seems like if CSR is read-only then technically all CSR
bits are read-only as anyway an exception will occur if CSR is read-only.

So CSRRW* can't be used for such read-only CSRs as write always happen
and it will lead to an exception.

I can add in the commit message that the quote about CSRRS considers only
not read-only CSRs if it makes sense as if CSR is read-only then we won't
calculate a mask for it.

>
>> Masks are calculated at the moment only for hdeleg, henvcfg, hideleg,
> Nit: First one is hedeleg.
>
>> hstateen0 registers as only them are going to be used in the follow up
>> patch.
>>
>> If the Smstateen extension is not implemented, hstateen0 cannot be read
>> because the register is considered non-existent. Instructions that attempt
>> to access a CSR that is not implemented or not visible in the current mode
>> are reserved and will raise an illegal-instruction exception.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in V3:
>>   - New patch.
>>
>> --- a/xen/arch/riscv/setup.c
>> +++ b/xen/arch/riscv/setup.c
>> @@ -32,6 +32,8 @@
>>   unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>       __aligned(STACK_SIZE);
>>   
>> +struct csr_masks __ro_after_init csr_masks;
> setup.c would be nice to only have __init functions and __initdata data.
> Really up to now that's the case, and I wonder why the makefile doesn't
> leverage this by using setup.init.o in place of setup.o. This variable
> would likely better live elsewhere anyway, imo: Somewhere it's actually
> (going to be) used.

I put it here as I wasn't able to find better place. If it is okay to have
it in domain.c then I'm okay to move this variable there.


>
>> @@ -70,6 +72,28 @@ static void * __init relocate_fdt(paddr_t dtb_paddr, size_t dtb_size)
>>       return fdt;
>>   }
>>   
>> +void __init init_csr_masks(void)
>> +{
>> +    register_t old;
>> +
>> +#define X(csr, field) \
>> +        old = csr_read(CSR_##csr); \
>> +        csr_set(CSR_##csr, ULONG_MAX); \
>> +        csr_masks.field = csr_read(CSR_##csr); \
>> +        csr_write(CSR_##csr, old)
> See my remark on the earlier patch regarding locally used macros. You
> shouldn't ...
>
>> +    X(HEDELEG, hedeleg);
>> +    X(HENVCFG, henvcfg);
>> +    X(HIDELEG, hideleg);
>> +
>> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
>> +    {
>> +        X(HSTATEEN0, hstateen0);
>> +    }
> ... be required to put braces here. (Then I'd further recommend to make "old"
> local to the macro's scope.)
>
> I'm also inclined to recommend to avoid an inflation of X() macros. Give
> each such macro a somewhat sensible (yet still short) name. This way you'll
> avoid Misra rule 5.4 ("Macro identifiers shall be distinct") concerns, in
> combination with rule 20.5 ("#undef should not be used"). Note that we
> didn't accept the latter rule, hence why I'm only saying "concerns", not
> "violations".

Thanks for explanation MISRA stuff.

I will rename X() here to INIT_CSR_MASK() and add do {...} while(0) to deal
with if()'s brackets.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot
  2026-02-11  9:47     ` Oleksii Kurochko
@ 2026-02-11  9:50       ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-11  9:50 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 11.02.2026 10:47, Oleksii Kurochko wrote:
> On 2/11/26 8:49 AM, Jan Beulich wrote:
>> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/setup.c
>>> +++ b/xen/arch/riscv/setup.c
>>> @@ -32,6 +32,8 @@
>>>   unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>>       __aligned(STACK_SIZE);
>>>   
>>> +struct csr_masks __ro_after_init csr_masks;
>> setup.c would be nice to only have __init functions and __initdata data.
>> Really up to now that's the case, and I wonder why the makefile doesn't
>> leverage this by using setup.init.o in place of setup.o. This variable
>> would likely better live elsewhere anyway, imo: Somewhere it's actually
>> (going to be) used.
> 
> I put it here as I wasn't able to find better place. If it is okay to have
> it in domain.c then I'm okay to move this variable there.

If that's where it's going to be (mainly) used (as the next patch suggests),
I see nothing speaking against you doing so.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init()
  2026-02-11  9:44   ` Jan Beulich
@ 2026-02-11  9:53     ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11  9:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 10:44 AM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -5,6 +5,72 @@
>>   #include <xen/sched.h>
>>   #include <xen/vmap.h>
>>   
>> +#include <asm/cpufeature.h>
>> +#include <asm/csr.h>
>> +#include <asm/riscv_encoding.h>
>> +#include <asm/setup.h>
>> +
>> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
>> +                         BIT(CAUSE_FETCH_ACCESS, U) | \
>> +                         BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
>> +                         BIT(CAUSE_BREAKPOINT, U) | \
>> +                         BIT(CAUSE_MISALIGNED_LOAD, U) | \
>> +                         BIT(CAUSE_LOAD_ACCESS, U) | \
>> +                         BIT(CAUSE_MISALIGNED_STORE, U) | \
>> +                         BIT(CAUSE_STORE_ACCESS, U) | \
>> +                         BIT(CAUSE_USER_ECALL, U) | \
>> +                         BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
>> +                         BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
>> +                         BIT(CAUSE_STORE_PAGE_FAULT, U))
>> +
>> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
>> +
>> +static void vcpu_csr_init(struct vcpu *v)
>> +{
>> +    register_t hstateen0;
> There not being an initializer here, ...
>
>> +    v->arch.hedeleg = HEDELEG_DEFAULT & csr_masks.hedeleg;
>> +
>> +    vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>> +
>> +    v->arch.hideleg = HIDELEG_DEFAULT & csr_masks.hideleg;
>> +
>> +    /*
>> +     * VS should access only the time counter directly.
>> +     * Everything else should trap.
>> +     */
>> +    v->arch.hcounteren = HCOUNTEREN_TM;
>> +
>> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
>> +        v->arch.henvcfg = ENVCFG_PBMTE & csr_masks.henvcfg;
>> +
>> +    if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
>> +    {
>> +         if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
> (Nit: Style.)
>
>> +            /*
>> +             * If the hypervisor extension is implemented, the same three
>> +             * bitsare defined also in hypervisor CSR hstateen0 but concern
> (Nit: "bits are")
>
>> +             * only the state potentially accessible to a virtual machine
>> +             * executing in privilege modes VS and VU:
>> +             *      bit 60 CSRs siselect and sireg (really vsiselect and
>> +             *             vsireg)
>> +             *      bit 59 CSRs siph and sieh (RV32 only) and stopi (really
>> +             *             vsiph, vsieh, and vstopi)
>> +             *      bit 58 all state of IMSIC guest interrupt files, including
>> +             *             CSR stopei (really vstopei)
>> +             * If one of these bits is zero in hstateen0, and the same bit is
>> +             * one in mstateen0, then an attempt to access the corresponding
>> +             * state from VS or VU-mode raises a virtual instruction exception.
>> +             */
>> +            hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
>> +
>> +        /* Allow guest to access CSR_ENVCFG */
>> +        hstateen0 |= SMSTATEEN0_HSENVCFG;
> ... doesn't the compiler complain about the use of a possibly uninitialized
> variable? The variable also wants to move to the more narrow scope.

Hmm, for some reason it doesn't. Anyway, I agree that it would be better to move
it to a narrower scope, since `hstateen0` exists only when RISCV_ISA_EXT_smstateen
is supported.


>
>> @@ -32,6 +98,8 @@ int arch_vcpu_create(struct vcpu *v)
>>       v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
>>       v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>>   
>> +    vcpu_csr_init(v);
>> +
>>       /* Idle VCPUs don't need the rest of this setup */
>>       if ( is_idle_vcpu(v) )
>>           return rc;
> Do idle vCPU-s really need to have vcpu_csr_init() called for them?

Agree, there is no any sense. I will move the call of vcpu_csr_init() after
is_idle_vcpu() check.

Thanks.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-09 16:52 ` [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
@ 2026-02-11 14:16   ` Jan Beulich
  2026-02-11 14:53     ` Jan Beulich
  2026-02-12  9:37     ` Oleksii Kurochko
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 14:16 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> Based on Linux kernel v6.16.0.
> Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
> we want to guarantee that if a bit in irqs_pending_mask is obversable
> that the correspondent bit in irqs_pending is observable too.

And the counterpart of this barrier is the one encoded implicitly in
xchg()? Could do with making more explicit, e.g. by way of adding a
code comment there.

> Add lockless tracking of pending vCPU interrupts using atomic bitops.
> Two bitmaps are introduced:
>  - irqs_pending — interrupts currently pending for the vCPU
>  - irqs_pending_mask — bits that have changed in irqs_pending
> 
> The design follows a multi-producer, single-consumer model, where the
> consumer is the vCPU itself. Producers may set bits in
> irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
> performed only by the consumer via xchg_acquire().

What is xchg_acquire() in Xen? I can't spot anything like this.

> The consumer must not
> write to irqs_pending and must not act on bits that are not set in the
> mask. Otherwise, extra synchronization should be provided.
> 
> On RISC-V interrupts are not injected via guest registers, so pending
> interrupts must be recorded in irqs_pending (using the new
> vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
> HVIP before returning control to the guest. The consumer side is
> implemented in a follow-up patch.
> 
> A barrier between updating irqs_pending and setting the corresponding
> mask bit in vcpu_set_interrupt()/vcpu_unset_interrupt() guarantees
> that if the consumer observes a mask bit set, the corresponding pending
> bit is also visible. This prevents missed interrupts during the flush.
> 
> It is possible a guest could have pending bit not result in the hardware
> register without to be marked pending in irq_pending bitmap as:

Are there some words missing in this sentence? I find it hard to follow
the way it is.

>   According to the RISC-V ISA specification:
>     Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
>     interrupt-enable  bits for VS-level software interrupts. VSSIP in hip
>     is an alias (writable) of the same bit in hvip.
>   Additionally:
>     When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
>     zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
>     hie.VSSIE.
> This means the guest may modify vsip.SSIP, which implicitly updates
> hip.VSSIP and the bit being writable with 1 would also trigger an interrupt

s/writable/written/ ?

> as according to the RISC-V spec:
>   These conditions for an interrupt trap to occur must be evaluated in a
>   bounded   amount of time from when an interrupt becomes, or ceases to be,
>   pending in sip,  and must also be evaluated immediately following the
>   execution of an SRET  instruction or an explicit write to a CSR on which
>   these interrupt trap conditions expressly depend (including sip, sie and
>   sstatus).
> What means that IRQ_VS_SOFT must be synchronized separately, what is done
> in vcpu_sync_interrupts(). Note, also, that IRQ_PMU_OVF would want to be
> synced for the similar reason as IRQ_VS_SOFT, but isn't sync-ed now as
> PMU isn't supported now.
> 
> For the remaining VS-level interrupt types (IRQ_VS_TIMER and
> IRQ_VS_EXT), the specification states they cannot be modified by the guest
> and are read-only:
>   Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable
>   bits for VS-level external interrupts. VSEIP is read-only in hip, and is
>   the logical-OR of these interrupt sources:
>     • bit VSEIP of hvip;
>     • the bit of hgeip selected by hstatus.VGEIN; and
>     • any other platform-specific external interrupt signal directed to
>       VS-level.
>   Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable
>   bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the
>   logical-OR of hvip.VSTIP and any other platform-specific timer interrupt
>   signal directed to VS-level.

In what you quote there is something being said about bits being r/o in hip,
but I can't conclude the "cannot be modified by the guest" that is being said
ahead of the quotation.

> Thus, for these interrupt types, it is sufficient to use vcpu_set_interrupt()
> and vcpu_unset_interrupt(), and flush them during the call of
> vcpu_flush_interrupts() (which is introduced in follow up patch).
> 
> vcpu_sync_interrupts(), which is called just before entering the VM,
> slightly bends the rule that the irqs_pending bit must be written
> first, followed by updating the corresponding bit in irqs_pending_mask.
> However, it still respects the core guarantee that the producer never
> clears the mask and only writes to irqs_pending if it is the one that
> flipped the corresponding mask bit from 0 to 1.
> Moreover, since the consumer won't run concurrently because
> vcpu_sync_interrupts() and the consumer path are going to be invoked
> equentially immediately before VM entry, it is safe to slightly relax
> this ordering rule in vcpu_sync_interrupts().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v3:
>  - Use smp_wb() instead of smp_mb__before_atomic().
>  - Add explanation of the change above in the commit message.
>  - Move vcpu_sync_interrupts() here to producers side.
>  - Introduce check_for_pcpu_work() to be clear from where vcpu_sync_interrupts()
>    is called.
> ---
> Changes in V2:
>  - Move the patch before an introduction of vtimer.
>  - Drop bitmap_zero() of irqs_pending and irqs_pending_mask bitmaps as
>    vcpu structure starts out all zeros.
>  - Drop const for irq argument of vcpu_{un}set_interrupt().
>  - Drop check "irq < IRQ_LOCAL_MAX" in vcpu_{un}set_interrupt() as it
>    could lead to overrun of irqs_pending and irqs_pending_mask bitmaps.
>  - Drop IRQ_LOCAL_MAX as there is no usage for it now.
> ---
>  xen/arch/riscv/domain.c             | 70 +++++++++++++++++++++++++++++
>  xen/arch/riscv/include/asm/domain.h | 24 ++++++++++
>  xen/arch/riscv/traps.c              |  8 ++++
>  3 files changed, 102 insertions(+)
> 
> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
> index af9586a4eb0d..4513f778cdc4 100644
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -5,6 +5,7 @@
>  #include <xen/sched.h>
>  #include <xen/vmap.h>
>  
> +#include <asm/bitops.h>
>  #include <asm/cpufeature.h>
>  #include <asm/csr.h>
>  #include <asm/riscv_encoding.h>
> @@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
>  {
>      vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
>  }
> +
> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
> +{
> +    /*
> +     * We only allow VS-mode software, timer, and external
> +     * interrupts when irq is one of the local interrupts
> +     * defined by RISC-V privilege specification.
> +     */

What is "when irq is one ..." intended to be telling me? There's no ...

> +    if ( irq != IRQ_VS_SOFT &&
> +         irq != IRQ_VS_TIMER &&
> +         irq != IRQ_VS_EXT )
> +        return -EINVAL;

... corresponding code (anymore), afaict.

Further, who are the prospected callers of this function and its sibling
below? If they're all internal (i.e. not reachable via hypercalls or
emulation on behalf of the guest), this may want to be assertions.

> +    set_bit(irq, v->arch.irqs_pending);
> +    smp_wmb();
> +    set_bit(irq, v->arch.irqs_pending_mask);
> +
> +    vcpu_kick(v);

Shouldn't this be conditional upon the pending_mask bit going from 0 to
1?

> +void vcpu_sync_interrupts(struct vcpu *v)
> +{
> +    unsigned long hvip;
> +
> +    /* Read current HVIP and VSIE CSRs */
> +    v->arch.vsie = csr_read(CSR_VSIE);
> +
> +    /* Sync-up HVIP.VSSIP bit changes done by Guest */
> +    hvip = csr_read(CSR_HVIP);
> +    if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) &&

Nit: Parentheses please around an & expression when used in an &&
one.

> --- a/xen/arch/riscv/include/asm/domain.h
> +++ b/xen/arch/riscv/include/asm/domain.h
> @@ -54,6 +54,25 @@ struct arch_vcpu {
>      register_t henvcfg;
>      register_t hideleg;
>      register_t hstateen0;
> +    register_t hvip;
> +
> +    register_t vsie;
> +
> +    /*
> +     * VCPU interrupts
> +     *
> +     * We have a lockless approach for tracking pending VCPU interrupts
> +     * implemented using atomic bitops. The irqs_pending bitmap represent
> +     * pending interrupts whereas irqs_pending_mask represent bits changed
> +     * in irqs_pending. Our approach is modeled around multiple producer
> +     * and single consumer problem where the consumer is the VCPU itself.
> +     *
> +     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
> +     * on RV32 host.
> +     */
> +#define RISCV_VCPU_NR_IRQS 64

What is this 64 deriving from? IOW - can it be some kind of expression,
helping the reader?

> --- a/xen/arch/riscv/traps.c
> +++ b/xen/arch/riscv/traps.c
> @@ -169,6 +169,11 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
>      die();
>  }
>  
> +static void check_for_pcpu_work(void)
> +{
> +    vcpu_sync_interrupts(current);
> +}

Is this really a helpful wrapper? Or is there going to be more stuff in here
later on?

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
  2026-02-09 16:52 ` [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
@ 2026-02-11 14:26   ` Jan Beulich
  2026-02-11 15:59     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 14:26 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> This patch is based on Linux kernel 6.16.0.
> 
> Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
> interrupt tracking introduced in part 1 (for producers). According, to the
> design only one consumer is possible, and it is vCPU itself.
> vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
> to the lack of functionality) before the hypervisor returns control to the
> guest.
> 
> Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
> irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
> release semantics).

Where the release part isn't relevant here, aiui.

> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -194,3 +194,36 @@ void vcpu_sync_interrupts(struct vcpu *v)
>  #   error "Update v->arch.vsieh"
>  #endif
>  }
> +
> +static void vcpu_update_hvip(const struct vcpu *v)
> +{
> +    csr_write(CSR_HVIP, v->arch.hvip);
> +}
> +
> +void vcpu_flush_interrupts(struct vcpu *v)
> +{
> +    register_t *hvip = &v->arch.hvip;

Why not in the if()'s scope, when it's used only there?

> +    if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
> +    {
> +        unsigned long mask, val;
> +
> +        mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
> +        val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;

Make these the initializers of the variables?

> +        *hvip &= ~mask;
> +        *hvip |= val;
> +
> +        /*
> +         * Flush AIA high interrupts.
> +         *
> +         * It is necessary to do only for CONFIG_RISCV_32 which isn't
> +         * supported now.
> +         */
> +#ifdef CONFIG_RISCV_32
> +        #   error "Update v->arch.hviph"

Ehem. I really dislike having to give the same comment over and over again.
Please have the # of a pre-processor directive always in the first column.

Also, isn't this misplaced? The containing if() checks irqs_pending_mask[0],
yet aiui irqs_pending_mask[1] would be of interest for hviph?

> --- a/xen/arch/riscv/traps.c
> +++ b/xen/arch/riscv/traps.c
> @@ -172,6 +172,8 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
>  static void check_for_pcpu_work(void)
>  {
>      vcpu_sync_interrupts(current);
> +
> +    vcpu_flush_interrupts(current);
>  }

Ah, okay - here comes a 2nd call from this function. However, please latch
current into a local variable (already in the earlier patch perhaps); no
need to fetch it from per-CPU data twice (or yet more times, if further
stuff was going to appear here).

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests
  2026-02-09 16:52 ` [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
@ 2026-02-11 14:51   ` Jan Beulich
  2026-02-11 15:31     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 14:51 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> @@ -105,11 +106,14 @@ int arch_vcpu_create(struct vcpu *v)
>      if ( is_idle_vcpu(v) )
>          return rc;
>  
> +    if ( (rc = vcpu_vtimer_init(v)) )
> +        goto fail;
> +
>      /*
> -     * As the vtimer and interrupt controller (IC) are not yet implemented,
> +     * As interrupt controller (IC) is not yet implemented,
>       * return an error.
>       *
> -     * TODO: Drop this once the vtimer and IC are implemented.
> +     * TODO: Drop this once IC is implemented.
>       */
>      rc = -EOPNOTSUPP;
>      goto fail;

Shouldn't you then also call vcpu_vtimer_destroy() from arch_vcpu_destroy()?

> --- /dev/null
> +++ b/xen/arch/riscv/vtimer.c
> @@ -0,0 +1,71 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/sched.h>
> +#include <xen/timer.h>
> +
> +#include <asm/vtimer.h>
> +
> +static void vtimer_expired(void *data)
> +{
> +    struct vtimer *t = data;
> +    struct vcpu *v = container_of(t, struct vcpu, arch.vtimer);
> +
> +    vcpu_set_interrupt(v, IRQ_VS_TIMER);
> +}
> +
> +int vcpu_vtimer_init(struct vcpu *v)
> +{
> +    struct vtimer *t = &v->arch.vtimer;
> +
> +    init_timer(&t->timer, vtimer_expired, t, v->processor);
> +
> +    return 0;
> +}
> +
> +void vcpu_timer_destroy(struct vcpu *v)
> +{
> +    struct vtimer *t = &v->arch.vtimer;
> +
> +    if ( t->timer.status == TIMER_STATUS_invalid )
> +        return;
> +
> +    kill_timer(&v->arch.vtimer.timer);
> +}
> +
> +void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
> +{
> +    struct vcpu *v = container_of(t, struct vcpu, arch.vtimer);
> +    s_time_t expires = ticks_to_ns(ticks - boot_clock_cycles);
> +
> +    vcpu_unset_interrupt(v, IRQ_VS_TIMER);
> +
> +    /*
> +     * According to the RISC-V sbi spec:
> +     *   If the supervisor wishes to clear the timer interrupt without
> +     *   scheduling the next timer event, it can either request a timer
> +     *   interrupt infinitely far into the future (i.e., (uint64_t)-1),
> +     *   or it can instead mask the timer interrupt by clearing sie.STIE CSR
> +     *   bit.
> +     */
> +    if ( ticks == ((uint64_t)~0) )

Btw, such a cast doesn't further need parenthesizing.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-11 14:16   ` Jan Beulich
@ 2026-02-11 14:53     ` Jan Beulich
  2026-02-12  9:37     ` Oleksii Kurochko
  1 sibling, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 14:53 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 11.02.2026 15:16, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> @@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
>>  {
>>      vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
>>  }
>> +
>> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
>> +{
>> +    /*
>> +     * We only allow VS-mode software, timer, and external
>> +     * interrupts when irq is one of the local interrupts
>> +     * defined by RISC-V privilege specification.
>> +     */
> 
> What is "when irq is one ..." intended to be telling me? There's no ...
> 
>> +    if ( irq != IRQ_VS_SOFT &&
>> +         irq != IRQ_VS_TIMER &&
>> +         irq != IRQ_VS_EXT )
>> +        return -EINVAL;
> 
> ... corresponding code (anymore), afaict.
> 
> Further, who are the prospected callers of this function and its sibling
> below? If they're all internal (i.e. not reachable via hypercalls or
> emulation on behalf of the guest), this may want to be assertions.

Having seen a use in patch 08, I should clarify the "reachable" part here:
It's not the "reachable" alone, but whether the guest has control over the
"irq" value passed. For the example in patch 08 this isn't the case.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer()
  2026-02-09 16:52 ` [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
@ 2026-02-11 15:03   ` Jan Beulich
  2026-02-11 15:37     ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 15:03 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> Introduce a function pointer for sbi_set_timer(), since different OpenSBI
> versions may implement the TIME extension with different extension IDs
> and/or function IDs.
> 
> If the TIME extension is not available, fall back to the legacy timer
> mechanism. This is useful when Xen runs as a guest under another Xen,
> because the TIME extension is not currently virtualised and therefore
> will not appear as available.
> Despite of the fact that sbi_set_timer_v01 is introduced and used as
> fall back, SBI v0.1 still isn't fully supported (with the current SBI
> calls usage, sbi_rfence_v01 should be introduced too), so panic()
> in sbi_init() isn't removed.
> 
> The sbi_set_timer() pointer will be used by reprogram_timer() to program
> Xen’s physical timer as without SSTC extension there is no any other
> option except SBI call to do that as only M-timer is available for us.
> 
> Use dprintk() for all the cases to print that a speicifc SBI extension
> is available as it isn't really necessary in case of release builds.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with one further request:

> @@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
>  int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
>                                  size_t size, unsigned long vmid);
>  
> +/*
> + * Programs the clock for next event after stime_value time. This function also
> + * clears the pending timer interrupt bit.

"after stime_value time" reads as if this was a relative input, stime_value units
of time need to pass until the event. Iirc values passed are absolute, though.
Furthermore it was my understanding that the interrupt being raised is dependent
upon clock >= value, not clock > value (where the latter is what "after" means
when taken to apply to an absolute value).

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI
  2026-02-09 16:52 ` [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
@ 2026-02-11 15:12   ` Jan Beulich
  2026-02-11 15:14     ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 15:12 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> Implement reprogram_timer() on RISC-V using the standard SBI timer call.
> 
> The privileged architecture only defines machine-mode timer interrupts
> (using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
> be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
> is optional and is not supported on the boards available to me, so the
> only viable approach today is to program the timer through SBI.
> 
> reprogram_timer() enables/disables the supervisor timer interrupt and
> programs the next timer deadline using sbi_set_timer(). If the SBI call
> fails, the code panics, because sbi_set_timer() is expected to return
> either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
> the latest ones). The SBI spec does not define a standard negative error
> code for this call, and without SSTC there is no alternative method to
> program the timer, so the SBI timer call must be available.
> 
> reprogram_timer() currently returns int for compatibility with the
> existing prototype. While it might be cleaner to return bool, keeping the
> existing signature avoids premature changes in case sbi_set_timer() ever
> needs to return other values (based on which we could try to avoid
> panic-ing) in the future.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

> ---
> Changes in v3:
>  - Correct the comments in reprogram_timer().
>  - Move enablement of timer interrupt after sbi_set_timer() to avoid
>    potentially receiving a timer interrupt between these 2 operations.

I'd like to mention that this is of only hypothetical concern, at least for
the sole caller in common code: That's doing the call with IRQs off, so
only the bit in SIP could become set too early, while no IRQ would surface
before timer_softirq_action() turns IRQs on again. (This isn't to say that
it wasn't a good thing to adjust the order.)

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI
  2026-02-11 15:12   ` Jan Beulich
@ 2026-02-11 15:14     ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 15:14 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 11.02.2026 16:12, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> Implement reprogram_timer() on RISC-V using the standard SBI timer call.
>>
>> The privileged architecture only defines machine-mode timer interrupts
>> (using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
>> be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
>> is optional and is not supported on the boards available to me, so the
>> only viable approach today is to program the timer through SBI.
>>
>> reprogram_timer() enables/disables the supervisor timer interrupt and
>> programs the next timer deadline using sbi_set_timer(). If the SBI call
>> fails, the code panics, because sbi_set_timer() is expected to return
>> either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
>> the latest ones). The SBI spec does not define a standard negative error
>> code for this call, and without SSTC there is no alternative method to
>> program the timer, so the SBI timer call must be available.
>>
>> reprogram_timer() currently returns int for compatibility with the
>> existing prototype. While it might be cleaner to return bool, keeping the
>> existing signature avoids premature changes in case sbi_set_timer() ever
>> needs to return other values (based on which we could try to avoid
>> panic-ing) in the future.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
>> ---
>> Changes in v3:
>>  - Correct the comments in reprogram_timer().
>>  - Move enablement of timer interrupt after sbi_set_timer() to avoid
>>    potentially receiving a timer interrupt between these 2 operations.
> 
> I'd like to mention that this is of only hypothetical concern, at least for
> the sole caller in common code: That's doing the call with IRQs off, so
> only the bit in SIP could become set too early, while no IRQ would surface
> before timer_softirq_action() turns IRQs on again.

Actually, further to this: If IRQs were on, an IRQ could still surface
between the two operations, when the SIE bit was already sent upon entry
into the function (i.e. for example when a timeout is being moved earlier).

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate()
  2026-02-09 16:52 ` [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
@ 2026-02-11 15:16   ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2026-02-11 15:16 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 09.02.2026 17:52, Oleksii Kurochko wrote:
> The scheduler may call this function to force synchronization of given
> vCPU's state. RISC-V does not support lazy context switching, so nothing
> is done in sync_vcpu_execstate() and sync_local_execstate().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests
  2026-02-11 14:51   ` Jan Beulich
@ 2026-02-11 15:31     ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11 15:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 3:51 PM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> @@ -105,11 +106,14 @@ int arch_vcpu_create(struct vcpu *v)
>>       if ( is_idle_vcpu(v) )
>>           return rc;
>>   
>> +    if ( (rc = vcpu_vtimer_init(v)) )
>> +        goto fail;
>> +
>>       /*
>> -     * As the vtimer and interrupt controller (IC) are not yet implemented,
>> +     * As interrupt controller (IC) is not yet implemented,
>>        * return an error.
>>        *
>> -     * TODO: Drop this once the vtimer and IC are implemented.
>> +     * TODO: Drop this once IC is implemented.
>>        */
>>       rc = -EOPNOTSUPP;
>>       goto fail;
> Shouldn't you then also call vcpu_vtimer_destroy() from arch_vcpu_destroy()?

Yes, it should be.

Thanks.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer()
  2026-02-11 15:03   ` Jan Beulich
@ 2026-02-11 15:37     ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11 15:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 4:03 PM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> Introduce a function pointer for sbi_set_timer(), since different OpenSBI
>> versions may implement the TIME extension with different extension IDs
>> and/or function IDs.
>>
>> If the TIME extension is not available, fall back to the legacy timer
>> mechanism. This is useful when Xen runs as a guest under another Xen,
>> because the TIME extension is not currently virtualised and therefore
>> will not appear as available.
>> Despite of the fact that sbi_set_timer_v01 is introduced and used as
>> fall back, SBI v0.1 still isn't fully supported (with the current SBI
>> calls usage, sbi_rfence_v01 should be introduced too), so panic()
>> in sbi_init() isn't removed.
>>
>> The sbi_set_timer() pointer will be used by reprogram_timer() to program
>> Xen’s physical timer as without SSTC extension there is no any other
>> option except SBI call to do that as only M-timer is available for us.
>>
>> Use dprintk() for all the cases to print that a speicifc SBI extension
>> is available as it isn't really necessary in case of release builds.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> with one further request:
>
>> @@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
>>   int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
>>                                   size_t size, unsigned long vmid);
>>   
>> +/*
>> + * Programs the clock for next event after stime_value time. This function also
>> + * clears the pending timer interrupt bit.
> "after stime_value time" reads as if this was a relative input, stime_value units
> of time need to pass until the event. Iirc values passed are absolute, though.
> Furthermore it was my understanding that the interrupt being raised is dependent
> upon clock >= value, not clock > value (where the latter is what "after" means
> when taken to apply to an absolute value).

Interesting that sbi_set_timer() has different description for OpenSBI v0.1 and v0.2.
What you see in the comment it is what v0.1 tells, but v0.2 tells that:
   
   "Programs the clock for next event after stime_value time.*stime_value is in absolute time.* This function must clear the pending timer interrupt bit as well."

"stime_value is in absolute time" has been added in v0.2. I'll update the comment for
a clarity.

Thanks.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
  2026-02-11 14:26   ` Jan Beulich
@ 2026-02-11 15:59     ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-11 15:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 3:26 PM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> This patch is based on Linux kernel 6.16.0.
>>
>> Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
>> interrupt tracking introduced in part 1 (for producers). According, to the
>> design only one consumer is possible, and it is vCPU itself.
>> vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
>> to the lack of functionality) before the hypervisor returns control to the
>> guest.
>>
>> Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
>> irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
>> release semantics).
> Where the release part isn't relevant here, aiui.

Yes, only acquire part is relevant here. release mentioned here as xchg use .aqrl
prefix.


>
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -194,3 +194,36 @@ void vcpu_sync_interrupts(struct vcpu *v)
>>   #   error "Update v->arch.vsieh"
>>   #endif
>>   }
>> +
>> +static void vcpu_update_hvip(const struct vcpu *v)
>> +{
>> +    csr_write(CSR_HVIP, v->arch.hvip);
>> +}
>> +
>> +void vcpu_flush_interrupts(struct vcpu *v)
>> +{
>> +    register_t *hvip = &v->arch.hvip;
> Why not in the if()'s scope, when it's used only there?

Missed that during vcpu_update_hvip() inside if().

>
>> +    if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
>> +    {
>> +        unsigned long mask, val;
>> +
>> +        mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
>> +        val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
> Make these the initializers of the variables?

Good point. I will do that.

>
>> +        *hvip &= ~mask;
>> +        *hvip |= val;
>> +
>> +        /*
>> +         * Flush AIA high interrupts.
>> +         *
>> +         * It is necessary to do only for CONFIG_RISCV_32 which isn't
>> +         * supported now.
>> +         */
>> +#ifdef CONFIG_RISCV_32
>> +        #   error "Update v->arch.hviph"
> Ehem. I really dislike having to give the same comment over and over again.
> Please have the # of a pre-processor directive always in the first column.
>
> Also, isn't this misplaced? The containing if() checks irqs_pending_mask[0],
> yet aiui irqs_pending_mask[1] would be of interest for hviph?

Agree, it would be more correct to have outside if().

>
>> --- a/xen/arch/riscv/traps.c
>> +++ b/xen/arch/riscv/traps.c
>> @@ -172,6 +172,8 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
>>   static void check_for_pcpu_work(void)
>>   {
>>       vcpu_sync_interrupts(current);
>> +
>> +    vcpu_flush_interrupts(current);
>>   }
> Ah, okay - here comes a 2nd call from this function. However, please latch
> current into a local variable (already in the earlier patch perhaps); no
> need to fetch it from per-CPU data twice (or yet more times, if further
> stuff was going to appear here).

It makes sense. Ill do that.

Thanks.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-11 14:16   ` Jan Beulich
  2026-02-11 14:53     ` Jan Beulich
@ 2026-02-12  9:37     ` Oleksii Kurochko
  2026-02-12 10:24       ` Jan Beulich
  1 sibling, 1 reply; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-12  9:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/11/26 3:16 PM, Jan Beulich wrote:
> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>> Based on Linux kernel v6.16.0.
>> Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
>> we want to guarantee that if a bit in irqs_pending_mask is obversable
>> that the correspondent bit in irqs_pending is observable too.
> And the counterpart of this barrier is the one encoded implicitly in
> xchg()? Could do with making more explicit, e.g. by way of adding a
> code comment there.

I thought it would be clear from the paragraph below where xchng_acquire()
is mentioned. I'll update the comment to make it more explicit.

>> Add lockless tracking of pending vCPU interrupts using atomic bitops.
>> Two bitmaps are introduced:
>>   - irqs_pending — interrupts currently pending for the vCPU
>>   - irqs_pending_mask — bits that have changed in irqs_pending
>>
>> The design follows a multi-producer, single-consumer model, where the
>> consumer is the vCPU itself. Producers may set bits in
>> irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
>> performed only by the consumer via xchg_acquire().
> What is xchg_acquire() in Xen? I can't spot anything like this.

Rudiment from Linux. I'll change to xchng().

>> The consumer must not
>> write to irqs_pending and must not act on bits that are not set in the
>> mask. Otherwise, extra synchronization should be provided.
>>
>> On RISC-V interrupts are not injected via guest registers, so pending
>> interrupts must be recorded in irqs_pending (using the new
>> vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
>> HVIP before returning control to the guest. The consumer side is
>> implemented in a follow-up patch.
>>
>> A barrier between updating irqs_pending and setting the corresponding
>> mask bit in vcpu_set_interrupt()/vcpu_unset_interrupt() guarantees
>> that if the consumer observes a mask bit set, the corresponding pending
>> bit is also visible. This prevents missed interrupts during the flush.
>>
>> It is possible a guest could have pending bit not result in the hardware
>> register without to be marked pending in irq_pending bitmap as:
> Are there some words missing in this sentence? I find it hard to follow
> the way it is.

Agree, something is wrong with this sentence. I'll rephrase it to:
   It is  possible that a guest could have pending bit in the hardware register
   without being marked pending in irq_pending bitmap as:
   ...


>>    According to the RISC-V ISA specification:
>>      Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
>>      interrupt-enable  bits for VS-level software interrupts. VSSIP in hip
>>      is an alias (writable) of the same bit in hvip.
>>    Additionally:
>>      When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
>>      zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
>>      hie.VSSIE.
>> This means the guest may modify vsip.SSIP, which implicitly updates
>> hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
> s/writable/written/ ?
>
>> as according to the RISC-V spec:
>>    These conditions for an interrupt trap to occur must be evaluated in a
>>    bounded   amount of time from when an interrupt becomes, or ceases to be,
>>    pending in sip,  and must also be evaluated immediately following the
>>    execution of an SRET  instruction or an explicit write to a CSR on which
>>    these interrupt trap conditions expressly depend (including sip, sie and
>>    sstatus).
>> What means that IRQ_VS_SOFT must be synchronized separately, what is done
>> in vcpu_sync_interrupts(). Note, also, that IRQ_PMU_OVF would want to be
>> synced for the similar reason as IRQ_VS_SOFT, but isn't sync-ed now as
>> PMU isn't supported now.
>>
>> For the remaining VS-level interrupt types (IRQ_VS_TIMER and
>> IRQ_VS_EXT), the specification states they cannot be modified by the guest
>> and are read-only:
>>    Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable
>>    bits for VS-level external interrupts. VSEIP is read-only in hip, and is
>>    the logical-OR of these interrupt sources:
>>      • bit VSEIP of hvip;
>>      • the bit of hgeip selected by hstatus.VGEIN; and
>>      • any other platform-specific external interrupt signal directed to
>>        VS-level.
>>    Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable
>>    bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the
>>    logical-OR of hvip.VSTIP and any other platform-specific timer interrupt
>>    signal directed to VS-level.
> In what you quote there is something being said about bits being r/o in hip,
> but I can't conclude the "cannot be modified by the guest" that is being said
> ahead of the quotation.

I think it would be good then to add:
   When bit 10 of hideleg is zero, vsip.SEIP and vsie.SEIE are read-only zeros.
   Else, vsip.SEIP and vsie.SEIE are aliases of hip.VSEIP and hie.VSEIE.

   When bit 6 of hideleg is zero, vsip.STIP and vsie.STIE are read-only zeros.
   Else, vsip.STIP and vsie.STIE are aliases of hip.VSTIP and hie.VSTIE.
As they are aliases VS* counterparts can't be writable and also:
   Bits sip.SEIP and sie.SEIE are the interrupt-pending and interrupt-enable
   bits for supervisor-level external interrupts. If implemented, SEIP is
   read-only in sip, and is set and cleared by the execution environment,
   typically through a platform-specific interrupt controller.

   Bits sip.STIP and sie.STIE are the interrupt-pending and interrupt-enable
   bits for supervisor-level timer interrupts. If implemented, STIP is
   read-only in sip, and is set and cleared by the execution environment

as SIP = VSIP for guest then guest can't update these bits too.

I will update the comment with extra information.

>> Thus, for these interrupt types, it is sufficient to use vcpu_set_interrupt()
>> and vcpu_unset_interrupt(), and flush them during the call of
>> vcpu_flush_interrupts() (which is introduced in follow up patch).
>>
>> vcpu_sync_interrupts(), which is called just before entering the VM,
>> slightly bends the rule that the irqs_pending bit must be written
>> first, followed by updating the corresponding bit in irqs_pending_mask.
>> However, it still respects the core guarantee that the producer never
>> clears the mask and only writes to irqs_pending if it is the one that
>> flipped the corresponding mask bit from 0 to 1.
>> Moreover, since the consumer won't run concurrently because
>> vcpu_sync_interrupts() and the consumer path are going to be invoked
>> equentially immediately before VM entry, it is safe to slightly relax
>> this ordering rule in vcpu_sync_interrupts().
>>
>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
>> ---
>> Changes in v3:
>>   - Use smp_wb() instead of smp_mb__before_atomic().
>>   - Add explanation of the change above in the commit message.
>>   - Move vcpu_sync_interrupts() here to producers side.
>>   - Introduce check_for_pcpu_work() to be clear from where vcpu_sync_interrupts()
>>     is called.
>> ---
>> Changes in V2:
>>   - Move the patch before an introduction of vtimer.
>>   - Drop bitmap_zero() of irqs_pending and irqs_pending_mask bitmaps as
>>     vcpu structure starts out all zeros.
>>   - Drop const for irq argument of vcpu_{un}set_interrupt().
>>   - Drop check "irq < IRQ_LOCAL_MAX" in vcpu_{un}set_interrupt() as it
>>     could lead to overrun of irqs_pending and irqs_pending_mask bitmaps.
>>   - Drop IRQ_LOCAL_MAX as there is no usage for it now.
>> ---
>>   xen/arch/riscv/domain.c             | 70 +++++++++++++++++++++++++++++
>>   xen/arch/riscv/include/asm/domain.h | 24 ++++++++++
>>   xen/arch/riscv/traps.c              |  8 ++++
>>   3 files changed, 102 insertions(+)
>>
>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>> index af9586a4eb0d..4513f778cdc4 100644
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -5,6 +5,7 @@
>>   #include <xen/sched.h>
>>   #include <xen/vmap.h>
>>   
>> +#include <asm/bitops.h>
>>   #include <asm/cpufeature.h>
>>   #include <asm/csr.h>
>>   #include <asm/riscv_encoding.h>
>> @@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
>>   {
>>       vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
>>   }
>> +
>> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
>> +{
>> +    /*
>> +     * We only allow VS-mode software, timer, and external
>> +     * interrupts when irq is one of the local interrupts
>> +     * defined by RISC-V privilege specification.
>> +     */
> What is "when irq is one ..." intended to be telling me? There's no ...
>
>> +    if ( irq != IRQ_VS_SOFT &&
>> +         irq != IRQ_VS_TIMER &&
>> +         irq != IRQ_VS_EXT )
>> +        return -EINVAL;
> ... corresponding code (anymore), afaict.

That part should be removed, there is no any sense for it anymore.

> Further, who are the prospected callers of this function and its sibling
> below? If they're all internal (i.e. not reachable via hypercalls or
> emulation on behalf of the guest), this may want to be assertions.

Considering your further reply:
   Having seen a use in patch 08, I should clarify the "reachable" part here:
   It's not the "reachable" alone, but whether the guest has control over the
   "irq" value passed. For the example in patch 08 this isn't the case.

I think I did not fully understand the part about “the guest has control over
the ‘irq’ value passed”, but at the moment I do not have any plans for the guest
to pass the irq value directly. (Do you have any examples where it should be
needed?).
All the use cases I have in mind are similar to the vtimer case: intercepting
the SBI call and then, inside the handler, calling vcpu_(un)set_interrupt().

>> +    set_bit(irq, v->arch.irqs_pending);
>> +    smp_wmb();
>> +    set_bit(irq, v->arch.irqs_pending_mask);
>> +
>> +    vcpu_kick(v);
> Shouldn't this be conditional upon the pending_mask bit going from 0 to
> 1?

It makes sense to do. I will do that in the following way:
   if ( !test_and_set_bit(irq, v->arch.irqs_pending_mask) )
       vcpu_kick(v);

>
>> +void vcpu_sync_interrupts(struct vcpu *v)
>> +{
>> +    unsigned long hvip;
>> +
>> +    /* Read current HVIP and VSIE CSRs */
>> +    v->arch.vsie = csr_read(CSR_VSIE);
>> +
>> +    /* Sync-up HVIP.VSSIP bit changes done by Guest */
>> +    hvip = csr_read(CSR_HVIP);
>> +    if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) &&
> Nit: Parentheses please around an & expression when used in an &&
> one.
>
>> --- a/xen/arch/riscv/include/asm/domain.h
>> +++ b/xen/arch/riscv/include/asm/domain.h
>> @@ -54,6 +54,25 @@ struct arch_vcpu {
>>       register_t henvcfg;
>>       register_t hideleg;
>>       register_t hstateen0;
>> +    register_t hvip;
>> +
>> +    register_t vsie;
>> +
>> +    /*
>> +     * VCPU interrupts
>> +     *
>> +     * We have a lockless approach for tracking pending VCPU interrupts
>> +     * implemented using atomic bitops. The irqs_pending bitmap represent
>> +     * pending interrupts whereas irqs_pending_mask represent bits changed
>> +     * in irqs_pending. Our approach is modeled around multiple producer
>> +     * and single consumer problem where the consumer is the VCPU itself.
>> +     *
>> +     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
>> +     * on RV32 host.
>> +     */
>> +#define RISCV_VCPU_NR_IRQS 64
> What is this 64 deriving from? IOW - can it be some kind of expression,
> helping the reader?

Originally it derives from the width of mideleg, mie, mvien, mvip, and mip (and
their counterpars for other modes) what means that RV32 will have no more then
32 local interrutps, but considering that RISC-V AIA spec tells ...:

   Table 2.1 lists both the CSRs added for machine level and existing machine-level
   CSRs whose size is changed by the Advanced Interrupt Architecture. Existing CSRs
   mie, mip, and mideleg are widended to 64 bits to support a total of 64 interrupt
   causes.
   For RV32, the high-half CSRs listed in the table allow access to the upper 32
   bits of registers mideleg, mie, mvien, mvip, and mip. The Advanced Interrupt
   Architecture requires that these high-half CSRs exist for RV32, but the bits they
   access may all be merely read-only zeros.

... that for RV32 it was widened to 64, so 64 appears here. I haven't used some AIA
specific name for constant 64 as in case if AIA isn't used it is more then enough
to cover PLIC case, for example.

>> --- a/xen/arch/riscv/traps.c
>> +++ b/xen/arch/riscv/traps.c
>> @@ -169,6 +169,11 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
>>       die();
>>   }
>>   
>> +static void check_for_pcpu_work(void)
>> +{
>> +    vcpu_sync_interrupts(current);
>> +}
> Is this really a helpful wrapper? Or is there going to be more stuff in here
> later on?

Yes, as you noticed in the further patches vcpu_sync_interrupts() won't be the only one
stuff to be called in it.

Thanks.

~ Oleksii



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-12  9:37     ` Oleksii Kurochko
@ 2026-02-12 10:24       ` Jan Beulich
  2026-02-12 11:23         ` Oleksii Kurochko
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2026-02-12 10:24 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 12.02.2026 10:37, Oleksii Kurochko wrote:
> On 2/11/26 3:16 PM, Jan Beulich wrote:
>> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>>> Based on Linux kernel v6.16.0.
>>> Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
>>> we want to guarantee that if a bit in irqs_pending_mask is obversable
>>> that the correspondent bit in irqs_pending is observable too.
>> And the counterpart of this barrier is the one encoded implicitly in
>> xchg()? Could do with making more explicit, e.g. by way of adding a
>> code comment there.
> 
> I thought it would be clear from the paragraph below where xchng_acquire()
> is mentioned. I'll update the comment to make it more explicit.

I'm confused. The (bogus) mentioning of xchg_acquire() is in the patch
description, whereas I suggested a code comment.

>>> @@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
>>>   {
>>>       vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
>>>   }
>>> +
>>> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
>>> +{
>>> +    /*
>>> +     * We only allow VS-mode software, timer, and external
>>> +     * interrupts when irq is one of the local interrupts
>>> +     * defined by RISC-V privilege specification.
>>> +     */
>> What is "when irq is one ..." intended to be telling me? There's no ...
>>
>>> +    if ( irq != IRQ_VS_SOFT &&
>>> +         irq != IRQ_VS_TIMER &&
>>> +         irq != IRQ_VS_EXT )
>>> +        return -EINVAL;
>> ... corresponding code (anymore), afaict.
> 
> That part should be removed, there is no any sense for it anymore.
> 
>> Further, who are the prospected callers of this function and its sibling
>> below? If they're all internal (i.e. not reachable via hypercalls or
>> emulation on behalf of the guest), this may want to be assertions.
> 
> Considering your further reply:
>    Having seen a use in patch 08, I should clarify the "reachable" part here:
>    It's not the "reachable" alone, but whether the guest has control over the
>    "irq" value passed. For the example in patch 08 this isn't the case.
> 
> I think I did not fully understand the part about “the guest has control over
> the ‘irq’ value passed”, but at the moment I do not have any plans for the guest
> to pass the irq value directly. (Do you have any examples where it should be
> needed?).

No, I don't. This is all for you to sort out.

>>> --- a/xen/arch/riscv/include/asm/domain.h
>>> +++ b/xen/arch/riscv/include/asm/domain.h
>>> @@ -54,6 +54,25 @@ struct arch_vcpu {
>>>       register_t henvcfg;
>>>       register_t hideleg;
>>>       register_t hstateen0;
>>> +    register_t hvip;
>>> +
>>> +    register_t vsie;
>>> +
>>> +    /*
>>> +     * VCPU interrupts
>>> +     *
>>> +     * We have a lockless approach for tracking pending VCPU interrupts
>>> +     * implemented using atomic bitops. The irqs_pending bitmap represent
>>> +     * pending interrupts whereas irqs_pending_mask represent bits changed
>>> +     * in irqs_pending. Our approach is modeled around multiple producer
>>> +     * and single consumer problem where the consumer is the VCPU itself.
>>> +     *
>>> +     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
>>> +     * on RV32 host.
>>> +     */
>>> +#define RISCV_VCPU_NR_IRQS 64
>> What is this 64 deriving from? IOW - can it be some kind of expression,
>> helping the reader?
> 
> Originally it derives from the width of mideleg, mie, mvien, mvip, and mip (and
> their counterpars for other modes) what means that RV32 will have no more then
> 32 local interrutps, but considering that RISC-V AIA spec tells ...:
> 
>    Table 2.1 lists both the CSRs added for machine level and existing machine-level
>    CSRs whose size is changed by the Advanced Interrupt Architecture. Existing CSRs
>    mie, mip, and mideleg are widended to 64 bits to support a total of 64 interrupt
>    causes.
>    For RV32, the high-half CSRs listed in the table allow access to the upper 32
>    bits of registers mideleg, mie, mvien, mvip, and mip. The Advanced Interrupt
>    Architecture requires that these high-half CSRs exist for RV32, but the bits they
>    access may all be merely read-only zeros.
> 
> ... that for RV32 it was widened to 64, so 64 appears here. I haven't used some AIA
> specific name for constant 64 as in case if AIA isn't used it is more then enough
> to cover PLIC case, for example.

Thing is that with RV128 in mind I wonder whether 64 is correct, or whether it
e.g. wants to be max(BITS_PER_LONG, 64).

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
  2026-02-12 10:24       ` Jan Beulich
@ 2026-02-12 11:23         ` Oleksii Kurochko
  0 siblings, 0 replies; 39+ messages in thread
From: Oleksii Kurochko @ 2026-02-12 11:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Romain Caritey, Alistair Francis, Connor Davis, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel


On 2/12/26 11:24 AM, Jan Beulich wrote:
> On 12.02.2026 10:37, Oleksii Kurochko wrote:
>> On 2/11/26 3:16 PM, Jan Beulich wrote:
>>> On 09.02.2026 17:52, Oleksii Kurochko wrote:
>>>> Based on Linux kernel v6.16.0.
>>>> Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
>>>> we want to guarantee that if a bit in irqs_pending_mask is obversable
>>>> that the correspondent bit in irqs_pending is observable too.
>>> And the counterpart of this barrier is the one encoded implicitly in
>>> xchg()? Could do with making more explicit, e.g. by way of adding a
>>> code comment there.
>> I thought it would be clear from the paragraph below where xchng_acquire()
>> is mentioned. I'll update the comment to make it more explicit.
> I'm confused. The (bogus) mentioning of xchg_acquire() is in the patch
> description, whereas I suggested a code comment.

Oh, I see. I'll add the a code comment.


>
>>>> @@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
>>>>    {
>>>>        vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
>>>>    }
>>>> +
>>>> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
>>>> +{
>>>> +    /*
>>>> +     * We only allow VS-mode software, timer, and external
>>>> +     * interrupts when irq is one of the local interrupts
>>>> +     * defined by RISC-V privilege specification.
>>>> +     */
>>> What is "when irq is one ..." intended to be telling me? There's no ...
>>>
>>>> +    if ( irq != IRQ_VS_SOFT &&
>>>> +         irq != IRQ_VS_TIMER &&
>>>> +         irq != IRQ_VS_EXT )
>>>> +        return -EINVAL;
>>> ... corresponding code (anymore), afaict.
>> That part should be removed, there is no any sense for it anymore.
>>
>>> Further, who are the prospected callers of this function and its sibling
>>> below? If they're all internal (i.e. not reachable via hypercalls or
>>> emulation on behalf of the guest), this may want to be assertions.
>> Considering your further reply:
>>     Having seen a use in patch 08, I should clarify the "reachable" part here:
>>     It's not the "reachable" alone, but whether the guest has control over the
>>     "irq" value passed. For the example in patch 08 this isn't the case.
>>
>> I think I did not fully understand the part about “the guest has control over
>> the ‘irq’ value passed”, but at the moment I do not have any plans for the guest
>> to pass the irq value directly. (Do you have any examples where it should be
>> needed?).
> No, I don't. This is all for you to sort out.
>
>>>> --- a/xen/arch/riscv/include/asm/domain.h
>>>> +++ b/xen/arch/riscv/include/asm/domain.h
>>>> @@ -54,6 +54,25 @@ struct arch_vcpu {
>>>>        register_t henvcfg;
>>>>        register_t hideleg;
>>>>        register_t hstateen0;
>>>> +    register_t hvip;
>>>> +
>>>> +    register_t vsie;
>>>> +
>>>> +    /*
>>>> +     * VCPU interrupts
>>>> +     *
>>>> +     * We have a lockless approach for tracking pending VCPU interrupts
>>>> +     * implemented using atomic bitops. The irqs_pending bitmap represent
>>>> +     * pending interrupts whereas irqs_pending_mask represent bits changed
>>>> +     * in irqs_pending. Our approach is modeled around multiple producer
>>>> +     * and single consumer problem where the consumer is the VCPU itself.
>>>> +     *
>>>> +     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
>>>> +     * on RV32 host.
>>>> +     */
>>>> +#define RISCV_VCPU_NR_IRQS 64
>>> What is this 64 deriving from? IOW - can it be some kind of expression,
>>> helping the reader?
>> Originally it derives from the width of mideleg, mie, mvien, mvip, and mip (and
>> their counterpars for other modes) what means that RV32 will have no more then
>> 32 local interrutps, but considering that RISC-V AIA spec tells ...:
>>
>>     Table 2.1 lists both the CSRs added for machine level and existing machine-level
>>     CSRs whose size is changed by the Advanced Interrupt Architecture. Existing CSRs
>>     mie, mip, and mideleg are widended to 64 bits to support a total of 64 interrupt
>>     causes.
>>     For RV32, the high-half CSRs listed in the table allow access to the upper 32
>>     bits of registers mideleg, mie, mvien, mvip, and mip. The Advanced Interrupt
>>     Architecture requires that these high-half CSRs exist for RV32, but the bits they
>>     access may all be merely read-only zeros.
>>
>> ... that for RV32 it was widened to 64, so 64 appears here. I haven't used some AIA
>> specific name for constant 64 as in case if AIA isn't used it is more then enough
>> to cover PLIC case, for example.
> Thing is that with RV128 in mind I wonder whether 64 is correct, or whether it
> e.g. wants to be max(BITS_PER_LONG, 64).

If to look at registers which are used now in hypervisor [1][Table 3] and are connected
to the bitmask (or their counterparts mentioned in the quote [2][Table 1]) are
hard-coded to 64 and it doesn't dependent on architecture bit-ness. And this is true
for AIA.

But on the other hand in RISC-V privilege spec the length of hvip is defined as
HSXLEN which could be 128 in the case of RV128 and to not depend only on AIA spec
likely it would be better to use suggested by you way to define RISCV_VCPU_NR_IRQS.

[1] https://github.com/riscv/riscv-aia/blob/main/src/CSRs.adoc#hypervisor-and-vs-csrs
[2] https://github.com/riscv/riscv-aia/blob/main/src/CSRs.adoc#machine-level-csrs

Thanks.

~ Oleksii


>
> Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2026-02-12 11:23 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-09 16:52 [PATCH v3 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 01/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
2026-02-10 16:24   ` Jan Beulich
2026-02-09 16:52 ` [PATCH v3 02/16] xen/riscv: avoid reading hstateen0 when Smstateen is not implemented Oleksii Kurochko
2026-02-11  7:22   ` Jan Beulich
2026-02-11  8:45     ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 03/16] xen/riscv: detect and store supported hypervisor CSR bits at boot Oleksii Kurochko
2026-02-11  7:49   ` Jan Beulich
2026-02-11  9:47     ` Oleksii Kurochko
2026-02-11  9:50       ` Jan Beulich
2026-02-09 16:52 ` [PATCH v3 04/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
2026-02-11  9:44   ` Jan Beulich
2026-02-11  9:53     ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
2026-02-11 14:16   ` Jan Beulich
2026-02-11 14:53     ` Jan Beulich
2026-02-12  9:37     ` Oleksii Kurochko
2026-02-12 10:24       ` Jan Beulich
2026-02-12 11:23         ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 06/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
2026-02-11 14:26   ` Jan Beulich
2026-02-11 15:59     ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 07/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 08/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
2026-02-11 14:51   ` Jan Beulich
2026-02-11 15:31     ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
2026-02-11 15:03   ` Jan Beulich
2026-02-11 15:37     ` Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
2026-02-11 15:12   ` Jan Beulich
2026-02-11 15:14     ` Jan Beulich
2026-02-09 16:52 ` [PATCH v3 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
2026-02-09 16:52 ` [PATCH v3 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
2026-02-11 15:16   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.