* [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-26 11:41 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
` (14 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introdce struct arch_vcpu to hold RISC-V vCPU-specific state.
The structure contains:
- Guest-visible CSR state, used to save and restore vCPU execution
state across context switches. hstatus isn't added here as it is
already part of cpu_user_regs struct.
- Callee-saved registers used to preserve Xen’s own execution context
when switching between vCPU stacks.
It is going to be used in the following way (pseudocode):
context_switch(prev_vcpu, next_vcpu):
...
/* Switch from previous stack to the next stack. */
__context_switch(prev_vcpu, next_vcpu);
...
schedule_tail(prev_vcpu):
Save and restore vCPU's CSRs.
The Xen-saved context allows __context_switch() to switch execution
from the previous vCPU’s stack to the next vCPU’s stack and later resume
execution on the original stack when switching back.
During vCPU creation, the Xen-saved context is going to be initialized
with:
- SP pointing to the newly allocated vCPU stack
- RA pointing to a helper that performs final vCPU setup before
transferring control to the guest
After the first execution of __context_switch(), RA naturally points to
the instruction following the call site, and the remaining callee-saved
registers contain the Xen register state at the time of the switch.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Drop hstatus from struct arch_vcpu as it is stored in struct cpu_user_regs
which will be stored on top of vCPU's stack.
- Drop the comment above ra in xen_saved_context struct as it is potentially
misleading.
---
xen/arch/riscv/include/asm/domain.h | 57 ++++++++++++++++++++++++++++-
1 file changed, 55 insertions(+), 2 deletions(-)
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 316e7c6c8448..0d9b4c4b525e 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -22,9 +22,62 @@ struct hvm_domain
struct arch_vcpu_io {
};
-struct arch_vcpu {
+struct arch_vcpu
+{
struct vcpu_vmid vmid;
-};
+
+ /*
+ * Callee saved registers for Xen's state deep in the callframe used to
+ * switch from prev's stack to the next's stack during context switch.
+ */
+ struct
+ {
+ register_t s0;
+ register_t s1;
+ register_t s2;
+ register_t s3;
+ register_t s4;
+ register_t s5;
+ register_t s6;
+ register_t s7;
+ register_t s8;
+ register_t s9;
+ register_t s10;
+ register_t s11;
+ register_t sp;
+ register_t gp;
+ register_t ra;
+ } xen_saved_context;
+
+ /* CSRs */
+ register_t hedeleg;
+ register_t hideleg;
+ register_t hvip;
+ register_t hip;
+ register_t hie;
+ register_t hgeie;
+ register_t henvcfg;
+ register_t hcounteren;
+ register_t htimedelta;
+ register_t htval;
+ register_t htinst;
+ register_t hstateen0;
+#ifdef CONFIG_RISCV_32
+ register_t henvcfgh;
+ register_t htimedeltah;
+#endif
+
+ /* VCSRs */
+ register_t vsstatus;
+ register_t vsip;
+ register_t vsie;
+ register_t vstvec;
+ register_t vsscratch;
+ register_t vscause;
+ register_t vstval;
+ register_t vsatp;
+ register_t vsepc;
+} __cacheline_aligned;
struct paging_domain {
spinlock_t lock;
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu
2026-01-22 16:47 ` [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu Oleksii Kurochko
@ 2026-01-26 11:41 ` Jan Beulich
2026-01-26 12:30 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-26 11:41 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/domain.h
> +++ b/xen/arch/riscv/include/asm/domain.h
> @@ -22,9 +22,62 @@ struct hvm_domain
> struct arch_vcpu_io {
> };
>
> -struct arch_vcpu {
> +struct arch_vcpu
> +{
> struct vcpu_vmid vmid;
> -};
> +
> + /*
> + * Callee saved registers for Xen's state deep in the callframe used to
> + * switch from prev's stack to the next's stack during context switch.
> + */
What is "deep in the callframe" intended to convey? I'm in particular wondering
about ...
> + struct
> + {
> + register_t s0;
> + register_t s1;
> + register_t s2;
> + register_t s3;
> + register_t s4;
> + register_t s5;
> + register_t s6;
> + register_t s7;
> + register_t s8;
> + register_t s9;
> + register_t s10;
> + register_t s11;
> + register_t sp;
> + register_t gp;
> + register_t ra;
... sp and ra, which presumably don't live anywhere "deep"?
Also, what about tp? The 't' in there isn't the same as that in "t0", "t1", etc.
> + } xen_saved_context;
> +
> + /* CSRs */
> + register_t hedeleg;
> + register_t hideleg;
> + register_t hvip;
> + register_t hip;
> + register_t hie;
> + register_t hgeie;
> + register_t henvcfg;
> + register_t hcounteren;
> + register_t htimedelta;
> + register_t htval;
> + register_t htinst;
> + register_t hstateen0;
> +#ifdef CONFIG_RISCV_32
> + register_t henvcfgh;
> + register_t htimedeltah;
> +#endif
> +
> + /* VCSRs */
Why the V? These are normal CSRs dedicated to VS use, aren't they?
> + register_t vsstatus;
> + register_t vsip;
> + register_t vsie;
> + register_t vstvec;
> + register_t vsscratch;
> + register_t vscause;
> + register_t vstval;
> + register_t vsatp;
> + register_t vsepc;
> +} __cacheline_aligned;
I continue to question the presence of this attribute.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu
2026-01-26 11:41 ` Jan Beulich
@ 2026-01-26 12:30 ` Oleksii Kurochko
2026-01-26 12:53 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 12:30 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/26/26 12:41 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/include/asm/domain.h
>> +++ b/xen/arch/riscv/include/asm/domain.h
>> @@ -22,9 +22,62 @@ struct hvm_domain
>> struct arch_vcpu_io {
>> };
>>
>> -struct arch_vcpu {
>> +struct arch_vcpu
>> +{
>> struct vcpu_vmid vmid;
>> -};
>> +
>> + /*
>> + * Callee saved registers for Xen's state deep in the callframe used to
>> + * switch from prev's stack to the next's stack during context switch.
>> + */
> What is "deep in the callframe" intended to convey? I'm in particular wondering
> about ...
>
>> + struct
>> + {
>> + register_t s0;
>> + register_t s1;
>> + register_t s2;
>> + register_t s3;
>> + register_t s4;
>> + register_t s5;
>> + register_t s6;
>> + register_t s7;
>> + register_t s8;
>> + register_t s9;
>> + register_t s10;
>> + register_t s11;
>> + register_t sp;
>> + register_t gp;
>> + register_t ra;
> ... sp and ra, which presumably don't live anywhere "deep"?
context_switch() is invoked relatively deep in the call stack, so the stack
pointer in use when context_switch() executes can also be considered to be
deep in the call frame. The same applies to RA: after the first
__context_switch() call, RA will point to the next instruction within
context_switch().
I can update the comment and drop the wording about being “deep in the call
frame” to avoid confusion. In that case it would simply read:
+ /*
+ * Callee saved registers for Xen's state used to
+ * switch from prev's stack to the next's stack during context switch.
+ */
>
> Also, what about tp? The 't' in there isn't the same as that in "t0", "t1", etc.
tp stores pcpu_info[] and it isn't expected to be changed during (or between) function
calls.
In this structure we are dealing only with registers which should be saved according
to RISC-V ABI convention:
[1] https://riscv-non-isa.github.io/riscv-elf-psabi-doc/#_integer_register_convention
The exception is for RA (as it is also used to jump to continue_to_new_vcpu() when vcpu is scheduled
first time). During a review of the [1], I think that GP could be dropped as it shouldn't
be preserved across calls.
>
>> + } xen_saved_context;
>> +
>> + /* CSRs */
>> + register_t hedeleg;
>> + register_t hideleg;
>> + register_t hvip;
>> + register_t hip;
>> + register_t hie;
>> + register_t hgeie;
>> + register_t henvcfg;
>> + register_t hcounteren;
>> + register_t htimedelta;
>> + register_t htval;
>> + register_t htinst;
>> + register_t hstateen0;
>> +#ifdef CONFIG_RISCV_32
>> + register_t henvcfgh;
>> + register_t htimedeltah;
>> +#endif
>> +
>> + /* VCSRs */
> Why the V? These are normal CSRs dedicated to VS use, aren't they?
I meant VSCSRs, yes, it is normal CSRs dedicated to VS use.
I'll drop the comment as from the name it is clear that VS-mode CSRs.
>
>> + register_t vsstatus;
>> + register_t vsip;
>> + register_t vsie;
>> + register_t vstvec;
>> + register_t vsscratch;
>> + register_t vscause;
>> + register_t vstval;
>> + register_t vsatp;
>> + register_t vsepc;
>> +} __cacheline_aligned;
> I continue to question the presence of this attribute.
I will drop it until I could really measure that it boosts performance.
At the moment, it is just an assumption.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu
2026-01-26 12:30 ` Oleksii Kurochko
@ 2026-01-26 12:53 ` Jan Beulich
2026-01-26 14:22 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-26 12:53 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 26.01.2026 13:30, Oleksii Kurochko wrote:
> On 1/26/26 12:41 PM, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/domain.h
>>> +++ b/xen/arch/riscv/include/asm/domain.h
>>> @@ -22,9 +22,62 @@ struct hvm_domain
>>> struct arch_vcpu_io {
>>> };
>>>
>>> -struct arch_vcpu {
>>> +struct arch_vcpu
>>> +{
>>> struct vcpu_vmid vmid;
>>> -};
>>> +
>>> + /*
>>> + * Callee saved registers for Xen's state deep in the callframe used to
>>> + * switch from prev's stack to the next's stack during context switch.
>>> + */
>> What is "deep in the callframe" intended to convey? I'm in particular wondering
>> about ...
>>
>>> + struct
>>> + {
>>> + register_t s0;
>>> + register_t s1;
>>> + register_t s2;
>>> + register_t s3;
>>> + register_t s4;
>>> + register_t s5;
>>> + register_t s6;
>>> + register_t s7;
>>> + register_t s8;
>>> + register_t s9;
>>> + register_t s10;
>>> + register_t s11;
>>> + register_t sp;
>>> + register_t gp;
>>> + register_t ra;
>> ... sp and ra, which presumably don't live anywhere "deep"?
>
> context_switch() is invoked relatively deep in the call stack, so the stack
> pointer in use when context_switch() executes can also be considered to be
> deep in the call frame. The same applies to RA: after the first
> __context_switch() call, RA will point to the next instruction within
> context_switch().
While writing, did you maybe notice that "deep" can have two entirely distinct
meanings here? It could be "far from where the stack starts when we enter the
hypervisor" or "far from present top of stack".
> I can update the comment and drop the wording about being “deep in the call
> frame” to avoid confusion. In that case it would simply read:
>
> + /*
> + * Callee saved registers for Xen's state used to
> + * switch from prev's stack to the next's stack during context switch.
> + */
Yes please.
>> Also, what about tp? The 't' in there isn't the same as that in "t0", "t1", etc.
>
> tp stores pcpu_info[] and it isn't expected to be changed during (or between) function
> calls.
Oh, right, I forgot about that aspect. However, the more that you reference ...
> In this structure we are dealing only with registers which should be saved according
> to RISC-V ABI convention:
> [1] https://riscv-non-isa.github.io/riscv-elf-psabi-doc/#_integer_register_convention
> The exception is for RA (as it is also used to jump to continue_to_new_vcpu() when vcpu is scheduled
> first time). During a review of the [1], I think that GP could be dropped as it shouldn't
> be preserved across calls.
... this - why would gp then need saving? That ought to be stable across Xen as
well (or not be used at all)?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu
2026-01-26 12:53 ` Jan Beulich
@ 2026-01-26 14:22 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 14:22 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/26/26 1:53 PM, Jan Beulich wrote:
> On 26.01.2026 13:30, Oleksii Kurochko wrote:
>> On 1/26/26 12:41 PM, Jan Beulich wrote:
>>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>>> --- a/xen/arch/riscv/include/asm/domain.h
>>>> +++ b/xen/arch/riscv/include/asm/domain.h
>>>> @@ -22,9 +22,62 @@ struct hvm_domain
>>>> struct arch_vcpu_io {
>>>> };
>>>>
>>>> -struct arch_vcpu {
>>>> +struct arch_vcpu
>>>> +{
>>>> struct vcpu_vmid vmid;
>>>> -};
>>>> +
>>>> + /*
>>>> + * Callee saved registers for Xen's state deep in the callframe used to
>>>> + * switch from prev's stack to the next's stack during context switch.
>>>> + */
>>> What is "deep in the callframe" intended to convey? I'm in particular wondering
>>> about ...
>>>
>>>> + struct
>>>> + {
>>>> + register_t s0;
>>>> + register_t s1;
>>>> + register_t s2;
>>>> + register_t s3;
>>>> + register_t s4;
>>>> + register_t s5;
>>>> + register_t s6;
>>>> + register_t s7;
>>>> + register_t s8;
>>>> + register_t s9;
>>>> + register_t s10;
>>>> + register_t s11;
>>>> + register_t sp;
>>>> + register_t gp;
>>>> + register_t ra;
>>> ... sp and ra, which presumably don't live anywhere "deep"?
>> context_switch() is invoked relatively deep in the call stack, so the stack
>> pointer in use when context_switch() executes can also be considered to be
>> deep in the call frame. The same applies to RA: after the first
>> __context_switch() call, RA will point to the next instruction within
>> context_switch().
> While writing, did you maybe notice that "deep" can have two entirely distinct
> meanings here? It could be "far from where the stack starts when we enter the
> hypervisor" or "far from present top of stack".
Yeah, but at time when I was writing the commit I thought only about one meaning
"far from where the stack starts when we enter the hypervisor".
>
>> I can update the comment and drop the wording about being “deep in the call
>> frame” to avoid confusion. In that case it would simply read:
>>
>> + /*
>> + * Callee saved registers for Xen's state used to
>> + * switch from prev's stack to the next's stack during context switch.
>> + */
> Yes please.
>
>>> Also, what about tp? The 't' in there isn't the same as that in "t0", "t1", etc.
>> tp stores pcpu_info[] and it isn't expected to be changed during (or between) function
>> calls.
> Oh, right, I forgot about that aspect. However, the more that you reference ...
>
>> In this structure we are dealing only with registers which should be saved according
>> to RISC-V ABI convention:
>> [1] https://riscv-non-isa.github.io/riscv-elf-psabi-doc/#_integer_register_convention
>> The exception is for RA (as it is also used to jump to continue_to_new_vcpu() when vcpu is scheduled
>> first time). During a review of the [1], I think that GP could be dropped as it shouldn't
>> be preserved across calls.
> ... this - why would gp then need saving? That ought to be stable across Xen as
> well (or not be used at all)?
Totally agree, that why I mentioned in reply that it could (it would be better if
"must/should" were used) be dropped as it shouldn't be preserved across calls and
as you also notice that it ought to be stable across Xen as well (or not be used
at all).
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}()
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-23 11:30 ` Teddy Astie
2026-01-22 16:47 ` [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
` (13 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introduce architecture-specific functions to create and destroy VCPUs.
Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
timer and interrupt controller are not yet implemented.
As part of this change, add continue_new_vcpu(), which will be used after
the first context_switch() of a new vCPU. Since this functionality is not
yet implemented, continue_new_vcpu() is currently provided as a stub.
Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
other architectures) for allocating the vCPU stack.
Introduce struct cpu_info to store per-vCPU state that lives at the top
of the vCPU's stack.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
- Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
allocate vCPU's stack.
- Drop printk() inside arch_vcpu_create() to not have potential big noise
in console as it could be that an amount of vCPUs is pretty big.
- Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
happens with a usage of vzalloc() now.
- Drop stack field as it is enough to have only cpu_info as stack pointer
could be calculated based on cpu_info.
- Drop cast when v.arch.cpu_info is inialized as it is not necessary
to have it.
- Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/domain.c | 59 ++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/config.h | 3 +-
xen/arch/riscv/include/asm/current.h | 6 +++
xen/arch/riscv/include/asm/domain.h | 2 +
xen/arch/riscv/stubs.c | 10 -----
6 files changed, 70 insertions(+), 11 deletions(-)
create mode 100644 xen/arch/riscv/domain.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 87c1148b0010..8863d4b15605 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -1,5 +1,6 @@
obj-y += aplic.o
obj-y += cpufeature.o
+obj-y += domain.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-y += entry.o
obj-y += imsic.o
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
new file mode 100644
index 000000000000..9c546267881b
--- /dev/null
+++ b/xen/arch/riscv/domain.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/vmap.h>
+
+static void continue_new_vcpu(struct vcpu *prev)
+{
+ BUG_ON("unimplemented\n");
+}
+
+static void __init __maybe_unused build_assertions(void)
+{
+ /*
+ * Enforce the requirement documented in struct cpu_info that
+ * guest_cpu_user_regs must be the first field.
+ */
+ BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+ int rc = 0;
+ void *stack = vzalloc(STACK_SIZE);
+
+ if ( !stack )
+ return -ENOMEM;
+
+ v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
+ memset(v->arch.cpu_info, 0, sizeof(*v->arch.cpu_info));
+
+ v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
+ v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+
+ /* Idle VCPUs don't need the rest of this setup */
+ if ( is_idle_vcpu(v) )
+ return rc;
+
+ /*
+ * As the vtimer and interrupt controller (IC) are not yet implemented,
+ * return an error.
+ *
+ * TODO: Drop this once the vtimer and IC are implemented.
+ */
+ rc = -EOPNOTSUPP;
+ goto fail;
+
+ return rc;
+
+ fail:
+ arch_vcpu_destroy(v);
+ return rc;
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+ vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
+}
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 1e08d3bf78be..86a95df018b5 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -143,7 +143,8 @@
#define SMP_CACHE_BYTES (1 << 6)
-#define STACK_SIZE PAGE_SIZE
+#define STACK_ORDER 3
+#define STACK_SIZE (PAGE_SIZE << STACK_ORDER)
#define IDENT_AREA_SIZE 64
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index 0c3ea70c2ec8..58c9f1506b7c 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -21,6 +21,12 @@ struct pcpu_info {
/* tp points to one of these */
extern struct pcpu_info pcpu_info[NR_CPUS];
+/* Per-VCPU state that lives at the top of the stack */
+struct cpu_info {
+ /* This should be the first member. */
+ struct cpu_user_regs guest_cpu_user_regs;
+};
+
#define set_processor_id(id) do { \
tp->processor_id = (id); \
} while (0)
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index 0d9b4c4b525e..ec7786c76199 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -49,6 +49,8 @@ struct arch_vcpu
register_t ra;
} xen_saved_context;
+ struct cpu_info *cpu_info;
+
/* CSRs */
register_t hedeleg;
register_t hideleg;
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 29bdb65afbdf..9e30a9a3b50b 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -121,16 +121,6 @@ void dump_pageframe_info(struct domain *d)
BUG_ON("unimplemented");
}
-int arch_vcpu_create(struct vcpu *v)
-{
- BUG_ON("unimplemented");
-}
-
-void arch_vcpu_destroy(struct vcpu *v)
-{
- BUG_ON("unimplemented");
-}
-
void vcpu_switch_to_aarch64_mode(struct vcpu *v)
{
BUG_ON("unimplemented");
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}()
2026-01-22 16:47 ` [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
@ 2026-01-23 11:30 ` Teddy Astie
2026-01-26 9:00 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Teddy Astie @ 2026-01-23 11:30 UTC (permalink / raw)
To: Oleksii Kurochko, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
Hello,
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
> Introduce architecture-specific functions to create and destroy VCPUs.
> Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
> timer and interrupt controller are not yet implemented.
>
> As part of this change, add continue_new_vcpu(), which will be used after
> the first context_switch() of a new vCPU. Since this functionality is not
> yet implemented, continue_new_vcpu() is currently provided as a stub.
>
> Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
> other architectures) for allocating the vCPU stack.
>
> Introduce struct cpu_info to store per-vCPU state that lives at the top
> of the vCPU's stack.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v2:
> - Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
> - Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
> allocate vCPU's stack.
> - Drop printk() inside arch_vcpu_create() to not have potential big noise
> in console as it could be that an amount of vCPUs is pretty big.
> - Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
> happens with a usage of vzalloc() now.
> - Drop stack field as it is enough to have only cpu_info as stack pointer
> could be calculated based on cpu_info.
> - Drop cast when v.arch.cpu_info is inialized as it is not necessary
> to have it.
> - Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
> ---
> xen/arch/riscv/Makefile | 1 +
> xen/arch/riscv/domain.c | 59 ++++++++++++++++++++++++++++
> xen/arch/riscv/include/asm/config.h | 3 +-
> xen/arch/riscv/include/asm/current.h | 6 +++
> xen/arch/riscv/include/asm/domain.h | 2 +
> xen/arch/riscv/stubs.c | 10 -----
> 6 files changed, 70 insertions(+), 11 deletions(-)
> create mode 100644 xen/arch/riscv/domain.c
>
> diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
> index 87c1148b0010..8863d4b15605 100644
> --- a/xen/arch/riscv/Makefile
> +++ b/xen/arch/riscv/Makefile
> @@ -1,5 +1,6 @@
> obj-y += aplic.o
> obj-y += cpufeature.o
> +obj-y += domain.o
> obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
> obj-y += entry.o
> obj-y += imsic.o
> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
> new file mode 100644
> index 000000000000..9c546267881b
> --- /dev/null
> +++ b/xen/arch/riscv/domain.c
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +#include <xen/sched.h>
> +#include <xen/vmap.h>
> +
> +static void continue_new_vcpu(struct vcpu *prev)
> +{
> + BUG_ON("unimplemented\n");
> +}
> +
> +static void __init __maybe_unused build_assertions(void)
> +{
> + /*
> + * Enforce the requirement documented in struct cpu_info that
> + * guest_cpu_user_regs must be the first field.
> + */
> + BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
> +}
> +
> +int arch_vcpu_create(struct vcpu *v)
> +{
> + int rc = 0;
> + void *stack = vzalloc(STACK_SIZE);
> +
Are there alignment constraints for the stack ?
> + if ( !stack )
> + return -ENOMEM;
> +
> + v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
> + memset(v->arch.cpu_info, 0, sizeof(*v->arch.cpu_info));
> +
Given that vzalloc ensures that the memory is cleared, you don't need to
clear it another time.
> + v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
> + v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
> +
You probably also want to set the first parameter of continue_new_vcpu
(struct vcpu *prev), or otherwise I don't think we want the "prev"
parameter in continue_new_vcpu if it's always going to be 0.
IIRC continue_new_vcpu is only meant to bootstrap the new vCPU, not just
perform a context switch to it.
> + /* Idle VCPUs don't need the rest of this setup */
> + if ( is_idle_vcpu(v) )
> + return rc;
> +
> + /*
> + * As the vtimer and interrupt controller (IC) are not yet implemented,
> + * return an error.
> + *
> + * TODO: Drop this once the vtimer and IC are implemented.
> + */
> + rc = -EOPNOTSUPP;
> + goto fail;
> +
> + return rc;
> +
> + fail:
> + arch_vcpu_destroy(v);
> + return rc;
> +}
> +
> +void arch_vcpu_destroy(struct vcpu *v)
> +{
> + vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
> +}
That doesn't look correct, you want to vfree what was allocated as the
bottom of stack, i.e
v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE
Or alternatively introduce bottom of stack as a additional vcpu_arch field.
> diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
> index 1e08d3bf78be..86a95df018b5 100644
> --- a/xen/arch/riscv/include/asm/config.h
> +++ b/xen/arch/riscv/include/asm/config.h
> @@ -143,7 +143,8 @@
>
> #define SMP_CACHE_BYTES (1 << 6)
>
> -#define STACK_SIZE PAGE_SIZE
> +#define STACK_ORDER 3
> +#define STACK_SIZE (PAGE_SIZE << STACK_ORDER)
>
> #define IDENT_AREA_SIZE 64
>
> diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
> index 0c3ea70c2ec8..58c9f1506b7c 100644
> --- a/xen/arch/riscv/include/asm/current.h
> +++ b/xen/arch/riscv/include/asm/current.h
> @@ -21,6 +21,12 @@ struct pcpu_info {
> /* tp points to one of these */
> extern struct pcpu_info pcpu_info[NR_CPUS];
>
> +/* Per-VCPU state that lives at the top of the stack */
> +struct cpu_info {
> + /* This should be the first member. */
> + struct cpu_user_regs guest_cpu_user_regs;
> +};
> +
> #define set_processor_id(id) do { \
> tp->processor_id = (id); \
> } while (0)
> diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
> index 0d9b4c4b525e..ec7786c76199 100644
> --- a/xen/arch/riscv/include/asm/domain.h
> +++ b/xen/arch/riscv/include/asm/domain.h
> @@ -49,6 +49,8 @@ struct arch_vcpu
> register_t ra;
> } xen_saved_context;
>
> + struct cpu_info *cpu_info;
> +
> /* CSRs */
> register_t hedeleg;
> register_t hideleg;
> diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
> index 29bdb65afbdf..9e30a9a3b50b 100644
> --- a/xen/arch/riscv/stubs.c
> +++ b/xen/arch/riscv/stubs.c
> @@ -121,16 +121,6 @@ void dump_pageframe_info(struct domain *d)
> BUG_ON("unimplemented");
> }
>
> -int arch_vcpu_create(struct vcpu *v)
> -{
> - BUG_ON("unimplemented");
> -}
> -
> -void arch_vcpu_destroy(struct vcpu *v)
> -{
> - BUG_ON("unimplemented");
> -}
> -
> void vcpu_switch_to_aarch64_mode(struct vcpu *v)
> {
> BUG_ON("unimplemented");
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}()
2026-01-23 11:30 ` Teddy Astie
@ 2026-01-26 9:00 ` Oleksii Kurochko
2026-01-26 11:47 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 9:00 UTC (permalink / raw)
To: Teddy Astie, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
Hello Teddy,
On 1/23/26 12:30 PM, Teddy Astie wrote:
> Hello,
>
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> Introduce architecture-specific functions to create and destroy VCPUs.
>> Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
>> timer and interrupt controller are not yet implemented.
>>
>> As part of this change, add continue_new_vcpu(), which will be used after
>> the first context_switch() of a new vCPU. Since this functionality is not
>> yet implemented, continue_new_vcpu() is currently provided as a stub.
>>
>> Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
>> other architectures) for allocating the vCPU stack.
>>
>> Introduce struct cpu_info to store per-vCPU state that lives at the top
>> of the vCPU's stack.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in v2:
>> - Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
>> - Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
>> allocate vCPU's stack.
>> - Drop printk() inside arch_vcpu_create() to not have potential big noise
>> in console as it could be that an amount of vCPUs is pretty big.
>> - Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
>> happens with a usage of vzalloc() now.
>> - Drop stack field as it is enough to have only cpu_info as stack pointer
>> could be calculated based on cpu_info.
>> - Drop cast when v.arch.cpu_info is inialized as it is not necessary
>> to have it.
>> - Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
>> ---
>> xen/arch/riscv/Makefile | 1 +
>> xen/arch/riscv/domain.c | 59 ++++++++++++++++++++++++++++
>> xen/arch/riscv/include/asm/config.h | 3 +-
>> xen/arch/riscv/include/asm/current.h | 6 +++
>> xen/arch/riscv/include/asm/domain.h | 2 +
>> xen/arch/riscv/stubs.c | 10 -----
>> 6 files changed, 70 insertions(+), 11 deletions(-)
>> create mode 100644 xen/arch/riscv/domain.c
>>
>> diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
>> index 87c1148b0010..8863d4b15605 100644
>> --- a/xen/arch/riscv/Makefile
>> +++ b/xen/arch/riscv/Makefile
>> @@ -1,5 +1,6 @@
>> obj-y += aplic.o
>> obj-y += cpufeature.o
>> +obj-y += domain.o
>> obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
>> obj-y += entry.o
>> obj-y += imsic.o
>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>> new file mode 100644
>> index 000000000000..9c546267881b
>> --- /dev/null
>> +++ b/xen/arch/riscv/domain.c
>> @@ -0,0 +1,59 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#include <xen/init.h>
>> +#include <xen/mm.h>
>> +#include <xen/sched.h>
>> +#include <xen/vmap.h>
>> +
>> +static void continue_new_vcpu(struct vcpu *prev)
>> +{
>> + BUG_ON("unimplemented\n");
>> +}
>> +
>> +static void __init __maybe_unused build_assertions(void)
>> +{
>> + /*
>> + * Enforce the requirement documented in struct cpu_info that
>> + * guest_cpu_user_regs must be the first field.
>> + */
>> + BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
>> +}
>> +
>> +int arch_vcpu_create(struct vcpu *v)
>> +{
>> + int rc = 0;
>> + void *stack = vzalloc(STACK_SIZE);
>> +
> Are there alignment constraints for the stack ?
vzalloc() allocates page-aligned memory as far as I can see:
...
va = __vmap(mfn, 1, pages, 1, PAGE_HYPERVISOR, type);
...
where 1 -> means align and what will lead that in vm_alloc():
...
start =(start +align)&~(align -1); ...
>
>> + if ( !stack )
>> + return -ENOMEM;
>> +
>> + v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
>> + memset(v->arch.cpu_info, 0, sizeof(*v->arch.cpu_info));
>> +
> Given that vzalloc ensures that the memory is cleared, you don't need to
> clear it another time.
Oh, right, missed to drop memset.
>
>> + v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
>> + v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>> +
> You probably also want to set the first parameter of continue_new_vcpu
> (struct vcpu *prev), or otherwise I don't think we want the "prev"
> parameter in continue_new_vcpu if it's always going to be 0.
It will be set by RISC-V ABI (prev will be stored in a0) when __context_switch()
will be called in context_switch():
void context_switch(struct vcpu *prev, struct vcpu *next)
{
ASSERT(local_irq_is_enabled());
ASSERT(prev != next);
ASSERT(!vcpu_cpu_dirty(next));
local_irq_disable();
set_current(next);
prev = __context_switch(prev, next);
schedule_tail(prev);
}
First call of the __context_switch() will lead to jump to continue_new_vcpu()
function which will have a0=prev.
>
> IIRC continue_new_vcpu is only meant to bootstrap the new vCPU, not just
> perform a context switch to it.
>
>> + /* Idle VCPUs don't need the rest of this setup */
>> + if ( is_idle_vcpu(v) )
>> + return rc;
>> +
>> + /*
>> + * As the vtimer and interrupt controller (IC) are not yet implemented,
>> + * return an error.
>> + *
>> + * TODO: Drop this once the vtimer and IC are implemented.
>> + */
>> + rc = -EOPNOTSUPP;
>> + goto fail;
>> +
>> + return rc;
>> +
>> + fail:
>> + arch_vcpu_destroy(v);
>> + return rc;
>> +}
>> +
>> +void arch_vcpu_destroy(struct vcpu *v)
>> +{
>> + vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
>> +}
> That doesn't look correct, you want to vfree what was allocated as the
> bottom of stack, i.e
>
> v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE
Agree formula should be correct, now it points to the end of the stack
memory.
>
> Or alternatively introduce bottom of stack as a additional vcpu_arch field.
There is not too much sense to have the separate field for that.
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}()
2026-01-26 9:00 ` Oleksii Kurochko
@ 2026-01-26 11:47 ` Jan Beulich
2026-01-26 12:07 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-26 11:47 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, Teddy Astie, xen-devel
On 26.01.2026 10:00, Oleksii Kurochko wrote:
> On 1/23/26 12:30 PM, Teddy Astie wrote:
>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>> + v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
>>> + v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>>> +
>> You probably also want to set the first parameter of continue_new_vcpu
>> (struct vcpu *prev), or otherwise I don't think we want the "prev"
>> parameter in continue_new_vcpu if it's always going to be 0.
>
> It will be set by RISC-V ABI (prev will be stored in a0) when __context_switch()
> will be called in context_switch():
> void context_switch(struct vcpu *prev, struct vcpu *next)
> {
> ASSERT(local_irq_is_enabled());
> ASSERT(prev != next);
> ASSERT(!vcpu_cpu_dirty(next));
>
> local_irq_disable();
>
> set_current(next);
>
> prev = __context_switch(prev, next);
>
> schedule_tail(prev);
> }
> First call of the __context_switch() will lead to jump to continue_new_vcpu()
> function which will have a0=prev.
Problem being that none of this code exists yet, and hence it's rather hard to
see how things will eventually fit together.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}()
2026-01-26 11:47 ` Jan Beulich
@ 2026-01-26 12:07 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 12:07 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, Teddy Astie, xen-devel
On 1/26/26 12:47 PM, Jan Beulich wrote:
> On 26.01.2026 10:00, Oleksii Kurochko wrote:
>> On 1/23/26 12:30 PM, Teddy Astie wrote:
>>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>>> + v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
>>>> + v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>>>> +
>>> You probably also want to set the first parameter of continue_new_vcpu
>>> (struct vcpu *prev), or otherwise I don't think we want the "prev"
>>> parameter in continue_new_vcpu if it's always going to be 0.
>> It will be set by RISC-V ABI (prev will be stored in a0) when __context_switch()
>> will be called in context_switch():
>> void context_switch(struct vcpu *prev, struct vcpu *next)
>> {
>> ASSERT(local_irq_is_enabled());
>> ASSERT(prev != next);
>> ASSERT(!vcpu_cpu_dirty(next));
>>
>> local_irq_disable();
>>
>> set_current(next);
>>
>> prev = __context_switch(prev, next);
>>
>> schedule_tail(prev);
>> }
>> First call of the __context_switch() will lead to jump to continue_new_vcpu()
>> function which will have a0=prev.
> Problem being that none of this code exists yet, and hence it's rather hard to
> see how things will eventually fit together.
I am trying to avoid a large patch series, and this needs to be introduced
before we can implement context_switch(). At the moment, I'm trying to introduce
everything that I need for dom0less domain creation and context_switch() isn't
really needed at this stage and continue_to_new_vcpu() isn't really needed to be
implemented at this stage as well as context_switch().
The best I can do at the moment is either to write a better commit message
explaining how this will be used, or, specifically for continue_new_vcpu(), to
avoid introducing it for now and not initialise the ra register until the
remaining pieces are ready. But considering that I've already done in this way
then better commit message(s) should work.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 01/16] xen/riscv: introduce struct arch_vcpu Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 02/16] xen/riscv: implement arch_vcpu_{create,destroy}() Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-24 22:44 ` Teddy Astie
2026-01-22 16:47 ` [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
` (12 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
state for a vCPU before it is first scheduled.
The function configures trap and interrupt delegation to VS-mode by
setting the appropriate bits in the hedeleg and hideleg registers,
initializes hstatus so that execution enters VS-mode when control is
passed to the guest, and restricts guest access to hardware performance
counters by initializing hcounteren, as unrestricted access would
require additional handling in Xen.
When the Smstateen and SSAIA extensions are available, access to AIA
CSRs and IMSIC guest interrupt files is enabled by setting the
corresponding bits in hstateen0, avoiding unnecessary traps into Xen
(note that SVSLCT(Supervisor Virtual Select) name is used intead of
CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
on it).
If the Svpbmt extension is supported, the PBMTE bit is set in
henvcfg to allow its use for VS-stage address translation. Guest
access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
hstateen0, as a guest may need to control certain characteristics of
the U-mode (VU-mode when V=1) execution environment.
For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
hstateen0), the written value is re-read from hardware and cached in
vcpu->arch to avoid divergence between the software state and the actual
CSR contents.
As hstatus is not part of struct arch_vcpu (it already resides in
struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
a uniform way to access hstatus and other guest CPU user registers.
This establishes a consistent and well-defined initial CSR state for
vCPUs prior to their first context switch.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- As hstatus isn't a part of arch_vcpu structure (as it is already a part of
cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
hstatus and other CPU user regs.
- Sort hideleg bit setting by value. Drop a stray blank.
- Drop | when the first initialization of hcounteren and hennvcfg happen.
- Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
instead of open-coding it.
- Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
reason some bits of hedeleg and hideleg are r/o.
The similar patter is used for hstateen0 as some of the bits could be r/o.
- Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
- Drop local variables hstatus, hideleg and hedeleg as they aren't used
anymore.
---
xen/arch/riscv/cpufeature.c | 1 +
xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
xen/arch/riscv/include/asm/cpufeature.h | 1 +
xen/arch/riscv/include/asm/current.h | 2 +
xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
5 files changed, 79 insertions(+)
diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
index 02b68aeaa49f..03e27b037be0 100644
--- a/xen/arch/riscv/cpufeature.c
+++ b/xen/arch/riscv/cpufeature.c
@@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
RISCV_ISA_EXT_DATA(zbb),
RISCV_ISA_EXT_DATA(zbs),
RISCV_ISA_EXT_DATA(smaia),
+ RISCV_ISA_EXT_DATA(smstateen),
RISCV_ISA_EXT_DATA(ssaia),
RISCV_ISA_EXT_DATA(svade),
RISCV_ISA_EXT_DATA(svpbmt),
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 9c546267881b..3ae5fa3a8805 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,77 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/cpufeature.h>
+#include <asm/csr.h>
+#include <asm/riscv_encoding.h>
+
+#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
+ BIT(CAUSE_FETCH_ACCESS, U) | \
+ BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
+ BIT(CAUSE_BREAKPOINT, U) | \
+ BIT(CAUSE_MISALIGNED_LOAD, U) | \
+ BIT(CAUSE_LOAD_ACCESS, U) | \
+ BIT(CAUSE_MISALIGNED_STORE, U) | \
+ BIT(CAUSE_STORE_ACCESS, U) | \
+ BIT(CAUSE_USER_ECALL, U) | \
+ BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
+ BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
+ BIT(CAUSE_STORE_PAGE_FAULT, U))
+
+#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
+
+static void vcpu_csr_init(struct vcpu *v)
+{
+ register_t hstateen0;
+
+ csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
+ v->arch.hedeleg = csr_read(CSR_HEDELEG);
+
+ vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
+
+ csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
+ v->arch.hideleg = csr_read(CSR_HIDELEG);
+
+ /*
+ * VS should access only the time counter directly.
+ * Everything else should trap.
+ */
+ v->arch.hcounteren = HCOUNTEREN_TM;
+
+ if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
+ {
+ csr_write(CSR_HENVCFG, ENVCFG_PBMTE);
+ v->arch.henvcfg = csr_read(CSR_HENVCFG);
+ }
+
+ if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
+ {
+ if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
+ /*
+ * If the hypervisor extension is implemented, the same three
+ * bitsare defined also in hypervisor CSR hstateen0 but concern
+ * only the state potentially accessible to a virtual machine
+ * executing in privilege modes VS and VU:
+ * bit 60 CSRs siselect and sireg (really vsiselect and
+ * vsireg)
+ * bit 59 CSRs siph and sieh (RV32 only) and stopi (really
+ * vsiph, vsieh, and vstopi)
+ * bit 58 all state of IMSIC guest interrupt files, including
+ * CSR stopei (really vstopei)
+ * If one of these bits is zero in hstateen0, and the same bit is
+ * one in mstateen0, then an attempt to access the corresponding
+ * state from VS or VU-mode raises a virtual instruction exception.
+ */
+ hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
+
+ /* Allow guest to access CSR_ENVCFG */
+ hstateen0 |= SMSTATEEN0_HSENVCFG;
+
+ csr_write(CSR_HSTATEEN0, hstateen0);
+ v->arch.hstateen0 = csr_read(CSR_HSTATEEN0);
+ }
+}
+
static void continue_new_vcpu(struct vcpu *prev)
{
BUG_ON("unimplemented\n");
@@ -33,6 +104,8 @@ int arch_vcpu_create(struct vcpu *v)
v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+ vcpu_csr_init(v);
+
/* Idle VCPUs don't need the rest of this setup */
if ( is_idle_vcpu(v) )
return rc;
diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
index b69616038888..ef02a3e26d2c 100644
--- a/xen/arch/riscv/include/asm/cpufeature.h
+++ b/xen/arch/riscv/include/asm/cpufeature.h
@@ -36,6 +36,7 @@ enum riscv_isa_ext_id {
RISCV_ISA_EXT_zbb,
RISCV_ISA_EXT_zbs,
RISCV_ISA_EXT_smaia,
+ RISCV_ISA_EXT_smstateen,
RISCV_ISA_EXT_ssaia,
RISCV_ISA_EXT_svade,
RISCV_ISA_EXT_svpbmt,
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index 58c9f1506b7c..5fbee8182caa 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -48,6 +48,8 @@ DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
#define get_cpu_current(cpu) per_cpu(curr_vcpu, cpu)
#define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+#define vcpu_guest_cpu_user_regs(vcpu) \
+ (&(vcpu)->arch.cpu_info->guest_cpu_user_regs)
#define switch_stack_and_jump(stack, fn) do { \
asm volatile ( \
diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
index 1f7e612366f8..dd15731a86fa 100644
--- a/xen/arch/riscv/include/asm/riscv_encoding.h
+++ b/xen/arch/riscv/include/asm/riscv_encoding.h
@@ -228,6 +228,8 @@
#define ENVCFG_CBIE_INV _UL(0x3)
#define ENVCFG_FIOM _UL(0x1)
+#define HCOUNTEREN_TM BIT(1, U)
+
/* ===== User-level CSRs ===== */
/* User Trap Setup (N-extension) */
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-22 16:47 ` [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
@ 2026-01-24 22:44 ` Teddy Astie
2026-01-26 8:36 ` Jan Beulich
2026-01-26 9:39 ` Oleksii Kurochko
0 siblings, 2 replies; 65+ messages in thread
From: Teddy Astie @ 2026-01-24 22:44 UTC (permalink / raw)
To: Oleksii Kurochko, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
> Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
> state for a vCPU before it is first scheduled.
To me, "hypervisor-visible CSR state" sounds like some state of the
guest the hypervisor has view on. But to what I read, it's more
hypervisor state CSRs regarding what to intercept and various
virtualization behavior.
> The function configures trap and interrupt delegation to VS-mode by
> setting the appropriate bits in the hedeleg and hideleg registers,
> initializes hstatus so that execution enters VS-mode when control is
> passed to the guest, and restricts guest access to hardware performance
> counters by initializing hcounteren, as unrestricted access would
> require additional handling in Xen.
> When the Smstateen and SSAIA extensions are available, access to AIA
> CSRs and IMSIC guest interrupt files is enabled by setting the
> corresponding bits in hstateen0, avoiding unnecessary traps into Xen
> (note that SVSLCT(Supervisor Virtual Select) name is used intead of
> CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
> on it).
> If the Svpbmt extension is supported, the PBMTE bit is set in
> henvcfg to allow its use for VS-stage address translation. Guest
> access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
> hstateen0, as a guest may need to control certain characteristics of
> the U-mode (VU-mode when V=1) execution environment.
>
> For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
> hstateen0), the written value is re-read from hardware and cached in
> vcpu->arch to avoid divergence between the software state and the actual
> CSR contents.
>
AFAIU from RISC-V isa document, at least for some CSRs the read-only-0
bits are well-defined and means "can't be configured" due to not having
a meaning (e.g hedeleg defines as read-only "Environment call from
HS-mode" because guest can't be in a "actual" HS-mode).
> As hstatus is not part of struct arch_vcpu (it already resides in
> struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
> a uniform way to access hstatus and other guest CPU user registers.
>
> This establishes a consistent and well-defined initial CSR state for
> vCPUs prior to their first context switch.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v2:
> - As hstatus isn't a part of arch_vcpu structure (as it is already a part of
> cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
> hstatus and other CPU user regs.
Sounds like hstatus wants to be splitted from guest_cpu_user_regs (which
are supposed to be GPR ?).
> - Sort hideleg bit setting by value. Drop a stray blank.
> - Drop | when the first initialization of hcounteren and hennvcfg happen.
> - Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
> instead of open-coding it.
> - Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
> of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
> reason some bits of hedeleg and hideleg are r/o.
> The similar patter is used for hstateen0 as some of the bits could be r/o.
> - Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
> SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
> - Drop local variables hstatus, hideleg and hedeleg as they aren't used
> anymore.
> ---
> xen/arch/riscv/cpufeature.c | 1 +
> xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
> xen/arch/riscv/include/asm/cpufeature.h | 1 +
> xen/arch/riscv/include/asm/current.h | 2 +
> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> 5 files changed, 79 insertions(+)
>
> diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
> index 02b68aeaa49f..03e27b037be0 100644
> --- a/xen/arch/riscv/cpufeature.c
> +++ b/xen/arch/riscv/cpufeature.c
> @@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
> RISCV_ISA_EXT_DATA(zbb),
> RISCV_ISA_EXT_DATA(zbs),
> RISCV_ISA_EXT_DATA(smaia),
> + RISCV_ISA_EXT_DATA(smstateen),
> RISCV_ISA_EXT_DATA(ssaia),
> RISCV_ISA_EXT_DATA(svade),
> RISCV_ISA_EXT_DATA(svpbmt),
> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
> index 9c546267881b..3ae5fa3a8805 100644
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -5,6 +5,77 @@
> #include <xen/sched.h>
> #include <xen/vmap.h>
>
> +#include <asm/cpufeature.h>
> +#include <asm/csr.h>
> +#include <asm/riscv_encoding.h>
> +
> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
> + BIT(CAUSE_FETCH_ACCESS, U) | \
> + BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
> + BIT(CAUSE_BREAKPOINT, U) | \
> + BIT(CAUSE_MISALIGNED_LOAD, U) | \
> + BIT(CAUSE_LOAD_ACCESS, U) | \
> + BIT(CAUSE_MISALIGNED_STORE, U) | \
> + BIT(CAUSE_STORE_ACCESS, U) | \
> + BIT(CAUSE_USER_ECALL, U) | \
> + BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
> + BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
> + BIT(CAUSE_STORE_PAGE_FAULT, U))
> +
> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
> +
> +static void vcpu_csr_init(struct vcpu *v)
> +{
> + register_t hstateen0;
> +
> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
> +
> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
> +
> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
> + v->arch.hideleg = csr_read(CSR_HIDELEG);
> +
To me, that feels odd to set machine CSR here. Especially if (I guess)
that we would update them anyway during context switches.
I think it would be better to have :
- vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
touch machine CSR
- context switching logic -> loads vcpu-specific machine CSR from vcpu
structure
We would have to make a context switch to enter the vcpu for the first
time; but that's to be expected.
With my proposal, we would also want to move the vcpu_csr_init()
invocation to the place we initialize the vcpu_arch structure rather
than the first time we schedule inside that vcpu.
That would also allow to take account of per-domain configuration if we
need to (e.g feature flags).
> + /*
> + * VS should access only the time counter directly.
> + * Everything else should trap.
> + */
> + v->arch.hcounteren = HCOUNTEREN_TM;
> +
> + if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
> + {
> + csr_write(CSR_HENVCFG, ENVCFG_PBMTE);
> + v->arch.henvcfg = csr_read(CSR_HENVCFG);
> + }
> +
> + if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
> + {
> + if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
> + /*
> + * If the hypervisor extension is implemented, the same three
> + * bitsare defined also in hypervisor CSR hstateen0 but concern
> + * only the state potentially accessible to a virtual machine
> + * executing in privilege modes VS and VU:
> + * bit 60 CSRs siselect and sireg (really vsiselect and
> + * vsireg)
> + * bit 59 CSRs siph and sieh (RV32 only) and stopi (really
> + * vsiph, vsieh, and vstopi)
> + * bit 58 all state of IMSIC guest interrupt files, including
> + * CSR stopei (really vstopei)
> + * If one of these bits is zero in hstateen0, and the same bit is
> + * one in mstateen0, then an attempt to access the corresponding
> + * state from VS or VU-mode raises a virtual instruction exception.
> + */
> + hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
> +
> + /* Allow guest to access CSR_ENVCFG */
> + hstateen0 |= SMSTATEEN0_HSENVCFG;
> +
> + csr_write(CSR_HSTATEEN0, hstateen0);
> + v->arch.hstateen0 = csr_read(CSR_HSTATEEN0);
> + }
> +}
> +
> static void continue_new_vcpu(struct vcpu *prev)
> {
> BUG_ON("unimplemented\n");
> @@ -33,6 +104,8 @@ int arch_vcpu_create(struct vcpu *v)
> v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
> v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>
> + vcpu_csr_init(v);
> +
> /* Idle VCPUs don't need the rest of this setup */
> if ( is_idle_vcpu(v) )
> return rc;
> diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
> index b69616038888..ef02a3e26d2c 100644
> --- a/xen/arch/riscv/include/asm/cpufeature.h
> +++ b/xen/arch/riscv/include/asm/cpufeature.h
> @@ -36,6 +36,7 @@ enum riscv_isa_ext_id {
> RISCV_ISA_EXT_zbb,
> RISCV_ISA_EXT_zbs,
> RISCV_ISA_EXT_smaia,
> + RISCV_ISA_EXT_smstateen,
> RISCV_ISA_EXT_ssaia,
> RISCV_ISA_EXT_svade,
> RISCV_ISA_EXT_svpbmt,
> diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
> index 58c9f1506b7c..5fbee8182caa 100644
> --- a/xen/arch/riscv/include/asm/current.h
> +++ b/xen/arch/riscv/include/asm/current.h
> @@ -48,6 +48,8 @@ DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
> #define get_cpu_current(cpu) per_cpu(curr_vcpu, cpu)
>
> #define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
> +#define vcpu_guest_cpu_user_regs(vcpu) \
> + (&(vcpu)->arch.cpu_info->guest_cpu_user_regs)
> > #define switch_stack_and_jump(stack, fn) do { \
> asm volatile ( \
> diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
> index 1f7e612366f8..dd15731a86fa 100644
> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
> @@ -228,6 +228,8 @@
> #define ENVCFG_CBIE_INV _UL(0x3)
> #define ENVCFG_FIOM _UL(0x1)
>
> +#define HCOUNTEREN_TM BIT(1, U)
> +
> /* ===== User-level CSRs ===== */
>
> /* User Trap Setup (N-extension) */
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-24 22:44 ` Teddy Astie
@ 2026-01-26 8:36 ` Jan Beulich
2026-01-26 9:47 ` Oleksii Kurochko
2026-01-26 9:39 ` Oleksii Kurochko
1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-26 8:36 UTC (permalink / raw)
To: Teddy Astie, Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 24.01.2026 23:44, Teddy Astie wrote:
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> +static void vcpu_csr_init(struct vcpu *v)
>> +{
>> + register_t hstateen0;
>> +
>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>> +
>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>> +
>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>> +
>
> To me, that feels odd to set machine CSR here. Especially if (I guess)
> that we would update them anyway during context switches.
>
> I think it would be better to have :
> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
> touch machine CSR
> - context switching logic -> loads vcpu-specific machine CSR from vcpu
> structure
>
> We would have to make a context switch to enter the vcpu for the first
> time; but that's to be expected.
>
> With my proposal, we would also want to move the vcpu_csr_init()
> invocation to the place we initialize the vcpu_arch structure rather
> than the first time we schedule inside that vcpu.
Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
indeed it can't be cone like this. While it may work for domains created
during boot, these CSRs would be changed under the feet of a running vCPU
when done this way for domain creation later at runtime.
Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
bits in particular CSRs wants determining once during boot, and then that
mask wants using when setting up vCPU-s.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-26 8:36 ` Jan Beulich
@ 2026-01-26 9:47 ` Oleksii Kurochko
2026-01-26 9:54 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 9:47 UTC (permalink / raw)
To: Jan Beulich, Teddy Astie
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/26/26 9:36 AM, Jan Beulich wrote:
> On 24.01.2026 23:44, Teddy Astie wrote:
>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>> +static void vcpu_csr_init(struct vcpu *v)
>>> +{
>>> + register_t hstateen0;
>>> +
>>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>>> +
>>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>>> +
>>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>>> +
>> To me, that feels odd to set machine CSR here. Especially if (I guess)
>> that we would update them anyway during context switches.
>>
>> I think it would be better to have :
>> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
>> touch machine CSR
>> - context switching logic -> loads vcpu-specific machine CSR from vcpu
>> structure
>>
>> We would have to make a context switch to enter the vcpu for the first
>> time; but that's to be expected.
>>
>> With my proposal, we would also want to move the vcpu_csr_init()
>> invocation to the place we initialize the vcpu_arch structure rather
>> than the first time we schedule inside that vcpu.
> Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
> indeed it can't be cone like this. While it may work for domains created
> during boot, these CSRs would be changed under the feet of a running vCPU
> when done this way for domain creation later at runtime.
Why these CSRs would want to be changed when we have already running vCPU?
Even they will be changed what is an issue, they will be sync-ed during
context switch.
>
> Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
> bits in particular CSRs wants determining once during boot, and then that
> mask wants using when setting up vCPU-s.
Somewhere even before create_domUs() will be called?
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-26 9:47 ` Oleksii Kurochko
@ 2026-01-26 9:54 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-01-26 9:54 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel, Teddy Astie
On 26.01.2026 10:47, Oleksii Kurochko wrote:
>
> On 1/26/26 9:36 AM, Jan Beulich wrote:
>> On 24.01.2026 23:44, Teddy Astie wrote:
>>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>>> +static void vcpu_csr_init(struct vcpu *v)
>>>> +{
>>>> + register_t hstateen0;
>>>> +
>>>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>>>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>>>> +
>>>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>>>> +
>>>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>>>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>>>> +
>>> To me, that feels odd to set machine CSR here. Especially if (I guess)
>>> that we would update them anyway during context switches.
>>>
>>> I think it would be better to have :
>>> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
>>> touch machine CSR
>>> - context switching logic -> loads vcpu-specific machine CSR from vcpu
>>> structure
>>>
>>> We would have to make a context switch to enter the vcpu for the first
>>> time; but that's to be expected.
>>>
>>> With my proposal, we would also want to move the vcpu_csr_init()
>>> invocation to the place we initialize the vcpu_arch structure rather
>>> than the first time we schedule inside that vcpu.
>> Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
>> indeed it can't be cone like this. While it may work for domains created
>> during boot, these CSRs would be changed under the feet of a running vCPU
>> when done this way for domain creation later at runtime.
>
> Why these CSRs would want to be changed when we have already running vCPU?
>
> Even they will be changed what is an issue, they will be sync-ed during
> context switch.
Which is too late (plus on ctxt-switch-out I assume they'll be synced from
HW to internal structures). The problem here is that a vCPU, having invoked
a hypercall resulting in a new vCPU (for another domain) to be created, will
observe a change of some of the CSRs controlling its own behavior.
>> Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
>> bits in particular CSRs wants determining once during boot, and then that
>> mask wants using when setting up vCPU-s.
>
> Somewhere even before create_domUs() will be called?
Yes, before any domains (and in particular vCPU-s) are created. The idle
domain may be an exception here, and system domains (not having vCPU-s) may
also be unaffected and hence not depend on particular ordering.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init()
2026-01-24 22:44 ` Teddy Astie
2026-01-26 8:36 ` Jan Beulich
@ 2026-01-26 9:39 ` Oleksii Kurochko
1 sibling, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 9:39 UTC (permalink / raw)
To: Teddy Astie, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
On 1/24/26 11:44 PM, Teddy Astie wrote:
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
>> state for a vCPU before it is first scheduled.
> To me, "hypervisor-visible CSR state" sounds like some state of the
> guest the hypervisor has view on. But to what I read, it's more
> hypervisor state CSRs regarding what to intercept and various
> virtualization behavior.
I'll rephrase then to:
Introduce vcpu_csr_init() to initialise hypervisor CSRs that control
vCPU execution and virtualization behaviour before the vCPU is first
scheduled.
>
>> The function configures trap and interrupt delegation to VS-mode by
>> setting the appropriate bits in the hedeleg and hideleg registers,
>> initializes hstatus so that execution enters VS-mode when control is
>> passed to the guest, and restricts guest access to hardware performance
>> counters by initializing hcounteren, as unrestricted access would
>> require additional handling in Xen.
>> When the Smstateen and SSAIA extensions are available, access to AIA
>> CSRs and IMSIC guest interrupt files is enabled by setting the
>> corresponding bits in hstateen0, avoiding unnecessary traps into Xen
>> (note that SVSLCT(Supervisor Virtual Select) name is used intead of
>> CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
>> on it).
>> If the Svpbmt extension is supported, the PBMTE bit is set in
>> henvcfg to allow its use for VS-stage address translation. Guest
>> access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
>> hstateen0, as a guest may need to control certain characteristics of
>> the U-mode (VU-mode when V=1) execution environment.
>>
>> For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
>> hstateen0), the written value is re-read from hardware and cached in
>> vcpu->arch to avoid divergence between the software state and the actual
>> CSR contents.
>>
> AFAIU from RISC-V isa document, at least for some CSRs the read-only-0
> bits are well-defined and means "can't be configured" due to not having
> a meaning (e.g hedeleg defines as read-only "Environment call from
> HS-mode" because guest can't be in a "actual" HS-mode).
It was said about bits which hypervisor tries to set in the mentioned
registers and IIRC all of them could be r/o-0 as, for example, M-mode
can decide not to delegate them to HS-mode.
>
>> As hstatus is not part of struct arch_vcpu (it already resides in
>> struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
>> a uniform way to access hstatus and other guest CPU user registers.
>>
>> This establishes a consistent and well-defined initial CSR state for
>> vCPUs prior to their first context switch.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in v2:
>> - As hstatus isn't a part of arch_vcpu structure (as it is already a part of
>> cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
>> hstatus and other CPU user regs.
> Sounds like hstatus wants to be splitted from guest_cpu_user_regs (which
> are supposed to be GPR ?).
Generally, agree. But hstatus want to be saved during a trap even before guest
is started and considering that we already have the code which stores it in
guest_cpu_user_regs structure and handle it during the trap, I've decided just
to drop hstatus from arch_vcpu.
>
>> - Sort hideleg bit setting by value. Drop a stray blank.
>> - Drop | when the first initialization of hcounteren and hennvcfg happen.
>> - Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
>> instead of open-coding it.
>> - Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
>> of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
>> reason some bits of hedeleg and hideleg are r/o.
>> The similar patter is used for hstateen0 as some of the bits could be r/o.
>> - Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
>> SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
>> - Drop local variables hstatus, hideleg and hedeleg as they aren't used
>> anymore.
>> ---
>> xen/arch/riscv/cpufeature.c | 1 +
>> xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
>> xen/arch/riscv/include/asm/cpufeature.h | 1 +
>> xen/arch/riscv/include/asm/current.h | 2 +
>> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
>> 5 files changed, 79 insertions(+)
>>
>> diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
>> index 02b68aeaa49f..03e27b037be0 100644
>> --- a/xen/arch/riscv/cpufeature.c
>> +++ b/xen/arch/riscv/cpufeature.c
>> @@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
>> RISCV_ISA_EXT_DATA(zbb),
>> RISCV_ISA_EXT_DATA(zbs),
>> RISCV_ISA_EXT_DATA(smaia),
>> + RISCV_ISA_EXT_DATA(smstateen),
>> RISCV_ISA_EXT_DATA(ssaia),
>> RISCV_ISA_EXT_DATA(svade),
>> RISCV_ISA_EXT_DATA(svpbmt),
>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>> index 9c546267881b..3ae5fa3a8805 100644
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -5,6 +5,77 @@
>> #include <xen/sched.h>
>> #include <xen/vmap.h>
>>
>> +#include <asm/cpufeature.h>
>> +#include <asm/csr.h>
>> +#include <asm/riscv_encoding.h>
>> +
>> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
>> + BIT(CAUSE_FETCH_ACCESS, U) | \
>> + BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
>> + BIT(CAUSE_BREAKPOINT, U) | \
>> + BIT(CAUSE_MISALIGNED_LOAD, U) | \
>> + BIT(CAUSE_LOAD_ACCESS, U) | \
>> + BIT(CAUSE_MISALIGNED_STORE, U) | \
>> + BIT(CAUSE_STORE_ACCESS, U) | \
>> + BIT(CAUSE_USER_ECALL, U) | \
>> + BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
>> + BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
>> + BIT(CAUSE_STORE_PAGE_FAULT, U))
>> +
>> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
>> +
>> +static void vcpu_csr_init(struct vcpu *v)
>> +{
>> + register_t hstateen0;
>> +
>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>> +
>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>> +
>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>> +
> To me, that feels odd to set machine CSR here. Especially if (I guess)
> that we would update them anyway during context switches.
Yes, they will be updated anyway.
When this approach was initially suggested, I considered the case where
some code might attempt to use these bits before the context-switch logic
runs. In that situation, we could end up executing code that assumes the
feature is available (because the bit is set in|v->arch.some_reg|), only
to later discover that the corresponding CSR bit is read-only zero.
>
> I think it would be better to have :
> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
> touch machine CSR
> - context switching logic -> loads vcpu-specific machine CSR from vcpu
> structure
>
> We would have to make a context switch to enter the vcpu for the first
> time; but that's to be expected.
>
> With my proposal, we would also want to move the vcpu_csr_init()
> invocation to the place we initialize the vcpu_arch structure rather
> than the first time we schedule inside that vcpu.
I think I may be misunderstanding something here. vcpu_csr_init() is now
called from arch_vcpu_create(), which initialises architecture-specific
arch_vcpu fields. Am I missing something?
>
> That would also allow to take account of per-domain configuration if we
> need to (e.g feature flags).
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (2 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 03/16] xen/riscv: implement vcpu_csr_init() Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-29 16:44 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
` (11 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Based on Linux kernel v6.16.0.
Add lockless tracking of pending vCPU interrupts using atomic bitops.
Two bitmaps are introduced:
- irqs_pending — interrupts currently pending for the vCPU
- irqs_pending_mask — bits that have changed in irqs_pending
The design follows a multi-producer, single-consumer model, where the
consumer is the vCPU itself. Producers may set bits in
irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
performed only by the consumer via xchg_acquire(). The consumer must not
write to irqs_pending and must not act on bits that are not set in the
mask. Otherwise, extra synchronization should be provided.
On RISC-V interrupts are not injected via guest registers, so pending
interrupts must be recorded in irqs_pending (using the new
vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
HVIP before returning control to the guest. The consumer side is
implemented in a follow-up patch.
A barrier between updating irqs_pending and setting the corresponding
mask bit in vcpu_set_interrupt() / vcpu_unset_interrupt() guarantees
that if the consumer observes a mask bit set, the corresponding pending
bit is also visible. This prevents missed interrupts during the flush.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V2:
- Move the patch before an introduction of vtimer.
- Drop bitmap_zero() of irqs_pending and irqs_pending_mask bitmaps as
vcpu structure starts out all zeros.
- Drop const for irq argument of vcpu_{un}set_interrupt().
- Drop check "irq < IRQ_LOCAL_MAX" in vcpu_{un}set_interrupt() as it
could lead to overrun of irqs_pending and irqs_pending_mask bitmaps.
- Drop IRQ_LOCAL_MAX as there is no usage for it now.
---
xen/arch/riscv/domain.c | 41 +++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/domain.h | 19 +++++++++++++
2 files changed, 60 insertions(+)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 3ae5fa3a8805..3777888f34ea 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,7 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/bitops.h>
#include <asm/cpufeature.h>
#include <asm/csr.h>
#include <asm/riscv_encoding.h>
@@ -130,3 +131,43 @@ void arch_vcpu_destroy(struct vcpu *v)
{
vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
}
+
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
+{
+ /*
+ * We only allow VS-mode software, timer, and external
+ * interrupts when irq is one of the local interrupts
+ * defined by RISC-V privilege specification.
+ */
+ if ( irq != IRQ_VS_SOFT &&
+ irq != IRQ_VS_TIMER &&
+ irq != IRQ_VS_EXT )
+ return -EINVAL;
+
+ set_bit(irq, v->arch.irqs_pending);
+ smp_mb__before_atomic();
+ set_bit(irq, v->arch.irqs_pending_mask);
+
+ vcpu_kick(v);
+
+ return 0;
+}
+
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
+{
+ /*
+ * We only allow VS-mode software, timer, external
+ * interrupts when irq is one of the local interrupts
+ * defined by RISC-V privilege specification.
+ */
+ if ( irq != IRQ_VS_SOFT &&
+ irq != IRQ_VS_TIMER &&
+ irq != IRQ_VS_EXT )
+ return -EINVAL;
+
+ clear_bit(irq, v->arch.irqs_pending);
+ smp_mb__before_atomic();
+ set_bit(irq, v->arch.irqs_pending_mask);
+
+ return 0;
+}
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index ec7786c76199..b8178447c68f 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -79,6 +79,22 @@ struct arch_vcpu
register_t vstval;
register_t vsatp;
register_t vsepc;
+
+ /*
+ * VCPU interrupts
+ *
+ * We have a lockless approach for tracking pending VCPU interrupts
+ * implemented using atomic bitops. The irqs_pending bitmap represent
+ * pending interrupts whereas irqs_pending_mask represent bits changed
+ * in irqs_pending. Our approach is modeled around multiple producer
+ * and single consumer problem where the consumer is the VCPU itself.
+ *
+ * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
+ * on RV32 host.
+ */
+#define RISCV_VCPU_NR_IRQS 64
+ DECLARE_BITMAP(irqs_pending, RISCV_VCPU_NR_IRQS);
+ DECLARE_BITMAP(irqs_pending_mask, RISCV_VCPU_NR_IRQS);
} __cacheline_aligned;
struct paging_domain {
@@ -117,6 +133,9 @@ static inline void update_guest_memory_policy(struct vcpu *v,
static inline void arch_vcpu_block(struct vcpu *v) {}
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq);
+
#endif /* ASM__RISCV__DOMAIN_H */
/*
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
2026-01-22 16:47 ` [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
@ 2026-01-29 16:44 ` Jan Beulich
2026-02-02 10:16 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-29 16:44 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> Based on Linux kernel v6.16.0.
>
> Add lockless tracking of pending vCPU interrupts using atomic bitops.
> Two bitmaps are introduced:
> - irqs_pending — interrupts currently pending for the vCPU
> - irqs_pending_mask — bits that have changed in irqs_pending
>
> The design follows a multi-producer, single-consumer model, where the
> consumer is the vCPU itself. Producers may set bits in
> irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
> performed only by the consumer via xchg_acquire(). The consumer must not
> write to irqs_pending and must not act on bits that are not set in the
> mask. Otherwise, extra synchronization should be provided.
>
> On RISC-V interrupts are not injected via guest registers, so pending
> interrupts must be recorded in irqs_pending (using the new
> vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
> HVIP before returning control to the guest. The consumer side is
> implemented in a follow-up patch.
>
> A barrier between updating irqs_pending and setting the corresponding
> mask bit in vcpu_set_interrupt() / vcpu_unset_interrupt() guarantees
> that if the consumer observes a mask bit set, the corresponding pending
> bit is also visible. This prevents missed interrupts during the flush.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
The use of barriers here matches the (Linux) specification, so
Acked-by: Jan Beulich <jbeulich@suse.com>
However, ...
> @@ -130,3 +131,43 @@ void arch_vcpu_destroy(struct vcpu *v)
> {
> vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
> }
> +
> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
> +{
> + /*
> + * We only allow VS-mode software, timer, and external
> + * interrupts when irq is one of the local interrupts
> + * defined by RISC-V privilege specification.
> + */
> + if ( irq != IRQ_VS_SOFT &&
> + irq != IRQ_VS_TIMER &&
> + irq != IRQ_VS_EXT )
> + return -EINVAL;
> +
> + set_bit(irq, v->arch.irqs_pending);
> + smp_mb__before_atomic();
> + set_bit(irq, v->arch.irqs_pending_mask);
... isn't it too heavy a barrier here? You only need ordering of writes,
without any regard to reads, don't you?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
2026-01-29 16:44 ` Jan Beulich
@ 2026-02-02 10:16 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-02 10:16 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/29/26 5:44 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> Based on Linux kernel v6.16.0.
>>
>> Add lockless tracking of pending vCPU interrupts using atomic bitops.
>> Two bitmaps are introduced:
>> - irqs_pending — interrupts currently pending for the vCPU
>> - irqs_pending_mask — bits that have changed in irqs_pending
>>
>> The design follows a multi-producer, single-consumer model, where the
>> consumer is the vCPU itself. Producers may set bits in
>> irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
>> performed only by the consumer via xchg_acquire(). The consumer must not
>> write to irqs_pending and must not act on bits that are not set in the
>> mask. Otherwise, extra synchronization should be provided.
>>
>> On RISC-V interrupts are not injected via guest registers, so pending
>> interrupts must be recorded in irqs_pending (using the new
>> vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
>> HVIP before returning control to the guest. The consumer side is
>> implemented in a follow-up patch.
>>
>> A barrier between updating irqs_pending and setting the corresponding
>> mask bit in vcpu_set_interrupt() / vcpu_unset_interrupt() guarantees
>> that if the consumer observes a mask bit set, the corresponding pending
>> bit is also visible. This prevents missed interrupts during the flush.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> The use of barriers here matches the (Linux) specification, so
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> However, ...
>
>> @@ -130,3 +131,43 @@ void arch_vcpu_destroy(struct vcpu *v)
>> {
>> vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
>> }
>> +
>> +int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
>> +{
>> + /*
>> + * We only allow VS-mode software, timer, and external
>> + * interrupts when irq is one of the local interrupts
>> + * defined by RISC-V privilege specification.
>> + */
>> + if ( irq != IRQ_VS_SOFT &&
>> + irq != IRQ_VS_TIMER &&
>> + irq != IRQ_VS_EXT )
>> + return -EINVAL;
>> +
>> + set_bit(irq, v->arch.irqs_pending);
>> + smp_mb__before_atomic();
>> + set_bit(irq, v->arch.irqs_pending_mask);
> ... isn't it too heavy a barrier here? You only need ordering of writes,
> without any regard to reads, don't you?
It is true, we could have FENCE W, W here. I'll update in the next patch
version.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (3 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 04/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1 Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-29 17:01 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
` (10 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
This patch is based on Linux kernel 6.16.0.
Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
interrupt tracking introduced in part 1 (for producers). According, to the
design only one consumer is possible, and it is vCPU itself.
vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
to the lack of functionality) before the hypervisor returns control to the
guest.
Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
release semantics). The consumer must not write to irqs_pending and must not
act on bits that are not set in the mask. Otherwise, extra synchronization
should be provided.
The worst thing which could happen with such approach is that a new pending
bit will be set to irqs_pending bitmap during update of hvip variable in
vcpu_flush_interrupt() but it isn't problem as the new pending bit won't
be lost and just be proceded during the next flush.
It is possible a guest could have pending bit not result in the hardware
register without to be marked pending in irq_pending bitmap as:
According to the RISC-V ISA specification:
Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
interrupt-enable bits for VS-level software interrupts. VSSIP in hip
is an alias (writable) of the same bit in hvip.
Additionally:
When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
hie.VSSIE.
This means the guest may modify vsip.SSIP, which implicitly updates
hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
as according to the RISC-V spec:
These conditions for an interrupt trap to occur must be evaluated in a
bounded amount of time from when an interrupt becomes, or ceases to be,
pending in sip, and must also be evaluated immediately following the
execution of an SRET instruction or an explicit write to a CSR on which
these interrupt trap conditions expressly depend (including sip, sie and
sstatus).
What means that IRQ_VS_SOFT must be synchronized separately, what is done
in vcpu_sync_interrupts(). Note, also, that IRQ_PMU_OVF would want to be
synced for the similar reason as IRQ_VS_SOFT, but isn't sync-ed now as
PMU isn't supported now.
For the remaining VS-level interrupt types (IRQ_VS_TIMER and
IRQ_VS_EXT), the specification states they cannot be modified by the guest
and are read-only:
Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable
bits for VS-level external interrupts. VSEIP is read-only in hip, and is
the logical-OR of these interrupt sources:
• bit VSEIP of hvip;
• the bit of hgeip selected by hstatus.VGEIN; and
• any other platform-specific external interrupt signal directed to
VS-level.
Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable
bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the
logical-OR of hvip.VSTIP and any other platform-specific timer interrupt
signal directed to VS-level.
Thus, for these interrupt types, it is sufficient to use vcpu_set_interrupt()
and vcpu_unset_interrupt(), and flush them during the call of
vcpu_flush_interrupts().
As AIA specs introduced hviph register which would want to be updated when
guest related AIA code will be introduced vcpu_update_hvip() is introduced
instead of just open-code it in vcpu_flush_interrupts().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- New patch.
---
xen/arch/riscv/domain.c | 65 +++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/domain.h | 3 ++
2 files changed, 68 insertions(+)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 3777888f34ea..c078d595df9c 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -171,3 +171,68 @@ int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
return 0;
}
+
+static void vcpu_update_hvip(struct vcpu *v)
+{
+ csr_write(CSR_HVIP, v->arch.hvip);
+}
+
+void vcpu_flush_interrupts(struct vcpu *v)
+{
+ register_t *hvip = &v->arch.hvip;
+
+ unsigned long mask, val;
+
+ if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
+ {
+ mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
+ val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
+
+ *hvip &= ~mask;
+ *hvip |= val;
+ }
+
+ /*
+ * Flush AIA high interrupts.
+ *
+ * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
+ * now.
+ */
+#ifdef CONFIG_RISCV_32
+# error "Update hviph"
+#endif
+
+ vcpu_update_hvip(v);
+}
+
+void vcpu_sync_interrupts(struct vcpu *v)
+{
+ unsigned long hvip;
+
+ /* Read current HVIP and VSIE CSRs */
+ v->arch.vsie = csr_read(CSR_VSIE);
+
+ /* Sync-up HVIP.VSSIP bit changes does by Guest */
+ hvip = csr_read(CSR_HVIP);
+ if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) )
+ {
+ if ( !test_and_set_bit(IRQ_VS_SOFT,
+ &v->arch.irqs_pending_mask) )
+ {
+ if ( hvip & BIT(IRQ_VS_SOFT, UL) )
+ set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+ else
+ clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+ }
+ }
+
+ /*
+ * Sync-up AIA high interrupts.
+ *
+ * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
+ * now.
+ */
+#ifdef CONFIG_RISCV_32
+# error "Update vsieh"
+#endif
+}
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index b8178447c68f..fa083094b43e 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -136,6 +136,9 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq);
+void vcpu_flush_interrupts(struct vcpu *v);
+void vcpu_sync_interrupts(struct vcpu *v);
+
#endif /* ASM__RISCV__DOMAIN_H */
/*
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
2026-01-22 16:47 ` [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
@ 2026-01-29 17:01 ` Jan Beulich
2026-02-02 10:50 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-29 17:01 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> This patch is based on Linux kernel 6.16.0.
>
> Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
> interrupt tracking introduced in part 1 (for producers). According, to the
> design only one consumer is possible, and it is vCPU itself.
> vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
> to the lack of functionality) before the hypervisor returns control to the
> guest.
>
> Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
> irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
> release semantics). The consumer must not write to irqs_pending and must not
> act on bits that are not set in the mask. Otherwise, extra synchronization
> should be provided.
> The worst thing which could happen with such approach is that a new pending
> bit will be set to irqs_pending bitmap during update of hvip variable in
> vcpu_flush_interrupt() but it isn't problem as the new pending bit won't
> be lost and just be proceded during the next flush.
>
> It is possible a guest could have pending bit not result in the hardware
> register without to be marked pending in irq_pending bitmap as:
> According to the RISC-V ISA specification:
> Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
> interrupt-enable bits for VS-level software interrupts. VSSIP in hip
> is an alias (writable) of the same bit in hvip.
> Additionally:
> When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
> zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
> hie.VSSIE.
> This means the guest may modify vsip.SSIP, which implicitly updates
> hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
> as according to the RISC-V spec:
> These conditions for an interrupt trap to occur must be evaluated in a
> bounded amount of time from when an interrupt becomes, or ceases to be,
> pending in sip, and must also be evaluated immediately following the
> execution of an SRET instruction or an explicit write to a CSR on which
> these interrupt trap conditions expressly depend (including sip, sie and
> sstatus).
> What means that IRQ_VS_SOFT must be synchronized separately, what is done
> in vcpu_sync_interrupts().
And this function is going to be used from where? Exit from guest into the
hypervisor? Whereas vcpu_flush_interrupt() is to be called ahead of re-
entering the guest?
I ask because vcpu_sync_interrupts() very much looks like a producer to me,
yet the patch here supposedly is the consumer side.
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -171,3 +171,68 @@ int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
>
> return 0;
> }
> +
> +static void vcpu_update_hvip(struct vcpu *v)
Pointer-to-const?
> +{
> + csr_write(CSR_HVIP, v->arch.hvip);
> +}
> +
> +void vcpu_flush_interrupts(struct vcpu *v)
> +{
> + register_t *hvip = &v->arch.hvip;
> +
> + unsigned long mask, val;
These are used ...
> + if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
> + {
> + mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
> + val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
> +
> + *hvip &= ~mask;
> + *hvip |= val;
... solely in this more narrow scope.
> + }
> +
> + /*
> + * Flush AIA high interrupts.
> + *
> + * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
> + * now.
> + */
> +#ifdef CONFIG_RISCV_32
> +# error "Update hviph"
> +#endif
> +
> + vcpu_update_hvip(v);
Why would bits for which the mask bit wasn't set be written here?
> +void vcpu_sync_interrupts(struct vcpu *v)
> +{
> + unsigned long hvip;
> +
> + /* Read current HVIP and VSIE CSRs */
> + v->arch.vsie = csr_read(CSR_VSIE);
> +
> + /* Sync-up HVIP.VSSIP bit changes does by Guest */
Nit: s/does/done/ ?
> + hvip = csr_read(CSR_HVIP);
> + if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) )
> + {
> + if ( !test_and_set_bit(IRQ_VS_SOFT,
> + &v->arch.irqs_pending_mask) )
Why two separate, nested if()s?
> + {
> + if ( hvip & BIT(IRQ_VS_SOFT, UL) )
> + set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
> + else
> + clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
> + }
In the previous patch you set forth strict ordering rules, with a barrier in
the middle. All of this is violated here.
> + }
> +
> + /*
> + * Sync-up AIA high interrupts.
> + *
> + * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
> + * now.
> + */
> +#ifdef CONFIG_RISCV_32
> +# error "Update vsieh"
> +#endif
Here you mean the register or the struct vcpu field? It may be helpful to
disambiguate; assuming it's the latter, simply spell out v->arch.vsieh?
(Same then for the similar code in vcpu_flush_interrupts().)
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
2026-01-29 17:01 ` Jan Beulich
@ 2026-02-02 10:50 ` Oleksii Kurochko
2026-02-02 14:22 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-02 10:50 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/29/26 6:01 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> This patch is based on Linux kernel 6.16.0.
>>
>> Add the consumer side (vcpu_flush_interrupts()) of the lockless pending
>> interrupt tracking introduced in part 1 (for producers). According, to the
>> design only one consumer is possible, and it is vCPU itself.
>> vcpu_flush_interrupts() is expected to be ran (as guests aren't ran now due
>> to the lack of functionality) before the hypervisor returns control to the
>> guest.
>>
>> Producers may set bits in irqs_pending_mask without a lock. Clearing bits in
>> irqs_pending_mask is performed only by the consumer via xchg() (with aquire &
>> release semantics). The consumer must not write to irqs_pending and must not
>> act on bits that are not set in the mask. Otherwise, extra synchronization
>> should be provided.
>> The worst thing which could happen with such approach is that a new pending
>> bit will be set to irqs_pending bitmap during update of hvip variable in
>> vcpu_flush_interrupt() but it isn't problem as the new pending bit won't
>> be lost and just be proceded during the next flush.
>>
>> It is possible a guest could have pending bit not result in the hardware
>> register without to be marked pending in irq_pending bitmap as:
>> According to the RISC-V ISA specification:
>> Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
>> interrupt-enable bits for VS-level software interrupts. VSSIP in hip
>> is an alias (writable) of the same bit in hvip.
>> Additionally:
>> When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
>> zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
>> hie.VSSIE.
>> This means the guest may modify vsip.SSIP, which implicitly updates
>> hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
>> as according to the RISC-V spec:
>> These conditions for an interrupt trap to occur must be evaluated in a
>> bounded amount of time from when an interrupt becomes, or ceases to be,
>> pending in sip, and must also be evaluated immediately following the
>> execution of an SRET instruction or an explicit write to a CSR on which
>> these interrupt trap conditions expressly depend (including sip, sie and
>> sstatus).
>> What means that IRQ_VS_SOFT must be synchronized separately, what is done
>> in vcpu_sync_interrupts().
> And this function is going to be used from where? Exit from guest into the
> hypervisor? Whereas vcpu_flush_interrupt() is to be called ahead of re-
> entering the guest?
Both of them are called before returning control to a guest (missed to mention
that in the commit message) in do_trap() at the end:
static void check_for_pcpu_work(void)
{
...
vcpu_flush_interrupts(current);
vcpu_sync_interrupts(current);
}
void do_trap(struct cpu_user_regs *cpu_regs)
{
...
if ( cpu_regs->hstatus & HSTATUS_SPV )
check_for_pcpu_work();
}
>
> I ask because vcpu_sync_interrupts() very much looks like a producer to me,
> yet the patch here supposedly is the consumer side.
Yes, vcpu_sync_interrupts() should be in producer side, I'll move it to the prev.
patch.
>
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -171,3 +171,68 @@ int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
>>
>> return 0;
>> }
>> +
>> +static void vcpu_update_hvip(struct vcpu *v)
> Pointer-to-const?
>
>> +{
>> + csr_write(CSR_HVIP, v->arch.hvip);
>> +}
>> +
>> +void vcpu_flush_interrupts(struct vcpu *v)
>> +{
>> + register_t *hvip = &v->arch.hvip;
>> +
>> + unsigned long mask, val;
> These are used ...
>
>> + if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
>> + {
>> + mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
>> + val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
>> +
>> + *hvip &= ~mask;
>> + *hvip |= val;
> ... solely in this more narrow scope.
I'll declare them inside the if().
>
>> + }
>> +
>> + /*
>> + * Flush AIA high interrupts.
>> + *
>> + * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
>> + * now.
>> + */
>> +#ifdef CONFIG_RISCV_32
>> +# error "Update hviph"
>> +#endif
>> +
>> + vcpu_update_hvip(v);
> Why would bits for which the mask bit wasn't set be written here?
This function inside uses only v->arch.hvip which is updated above according to
the mask.
>
>> +void vcpu_sync_interrupts(struct vcpu *v)
>> +{
>> + unsigned long hvip;
>> +
>> + /* Read current HVIP and VSIE CSRs */
>> + v->arch.vsie = csr_read(CSR_VSIE);
>> +
>> + /* Sync-up HVIP.VSSIP bit changes does by Guest */
> Nit: s/does/done/ ?
>
>> + hvip = csr_read(CSR_HVIP);
>> + if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) )
>> + {
>> + if ( !test_and_set_bit(IRQ_VS_SOFT,
>> + &v->arch.irqs_pending_mask) )
> Why two separate, nested if()s?
Do you mean that it could be:
if ( !test_and_set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending_mask) && (hvip & BIT(IRQ_VS_SOFT, UL))
?
>
>> + {
>> + if ( hvip & BIT(IRQ_VS_SOFT, UL) )
>> + set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
>> + else
>> + clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
>> + }
> In the previous patch you set forth strict ordering rules, with a barrier in
> the middle. All of this is violated here.
It still respects the rule that the producer (|vcpu_sync_interrupts()| which
should be in the producer path) never clears the mask and only writes to
|irqs_pending| if it is the one that flipped the corresponding mask bit from 0
to 1.
Considering that the consumer cannot be called concurrently in this case
(since|vcpu_flush_interrupts()| and|vcpu_sync_interrupts()| are only invoked
sequentially in|check_for_pcpu_work()|, as mentioned above), nothing can
clear a bit in the mask in between. Therefore, I think it is acceptable to
slightly bend (and it should be explained in the comment above the
function or in the commit message) the rule that the|irqs_pending| bit must
be written first, followed by updating the corresponding bit in
|irqs_pending_mask() specifically for |vcpu_sync_interrupts().
>
>> + }
>> +
>> + /*
>> + * Sync-up AIA high interrupts.
>> + *
>> + * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
>> + * now.
>> + */
>> +#ifdef CONFIG_RISCV_32
>> +# error "Update vsieh"
>> +#endif
> Here you mean the register or the struct vcpu field? It may be helpful to
> disambiguate; assuming it's the latter, simply spell out v->arch.vsieh?
> (Same then for the similar code in vcpu_flush_interrupts().)
Agree, it would be better.
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2
2026-02-02 10:50 ` Oleksii Kurochko
@ 2026-02-02 14:22 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-02 14:22 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 02.02.2026 11:50, Oleksii Kurochko wrote:
> On 1/29/26 6:01 PM, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> This patch is based on Linux kernel 6.16.0.
>>> +void vcpu_sync_interrupts(struct vcpu *v)
>>> +{
>>> + unsigned long hvip;
>>> +
>>> + /* Read current HVIP and VSIE CSRs */
>>> + v->arch.vsie = csr_read(CSR_VSIE);
>>> +
>>> + /* Sync-up HVIP.VSSIP bit changes does by Guest */
>> Nit: s/does/done/ ?
>>
>>> + hvip = csr_read(CSR_HVIP);
>>> + if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) )
>>> + {
>>> + if ( !test_and_set_bit(IRQ_VS_SOFT,
>>> + &v->arch.irqs_pending_mask) )
>> Why two separate, nested if()s?
>
> Do you mean that it could be:
> if ( !test_and_set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending_mask) && (hvip & BIT(IRQ_VS_SOFT, UL))
> ?
That's combining with the if() ...
>>> + {
>>> + if ( hvip & BIT(IRQ_VS_SOFT, UL) )
... down here, which - ...
>>> + set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
>>> + else
... having an "else" - can't be folded like this, I think. I meant the two
if()s immediately ahead of my remark.
>>> + clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
>>> + }
>> In the previous patch you set forth strict ordering rules, with a barrier in
>> the middle. All of this is violated here.
>
> It still respects the rule that the producer (|vcpu_sync_interrupts()| which
> should be in the producer path) never clears the mask and only writes to
> |irqs_pending| if it is the one that flipped the corresponding mask bit from 0
> to 1.
>
> Considering that the consumer cannot be called concurrently in this case
> (since|vcpu_flush_interrupts()| and|vcpu_sync_interrupts()| are only invoked
> sequentially in|check_for_pcpu_work()|, as mentioned above), nothing can
> clear a bit in the mask in between. Therefore, I think it is acceptable to
> slightly bend (and it should be explained in the comment above the
> function or in the commit message) the rule that the|irqs_pending| bit must
> be written first, followed by updating the corresponding bit in
> |irqs_pending_mask() specifically for |vcpu_sync_interrupts().
With suitable commenting - yes.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (4 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 2 Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-29 8:48 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
` (9 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné,
Alistair Francis, Connor Davis, Romain Caritey
ticks_to_ns() and ns_to_ticks() are not architecture-specific, so provide a
common implementation that is more resilient to overflow than the historical
Arm version. This is not a practical issue for Arm, as the latest ARM ARM
that timer frequency should be fixed at 1 GHz and older platforms used much
lower rates, which is shy of 32-bit overflow. As the helpers are declared
as static inline, they should not affect x86, which does not use them.
On Arm, these helpers were historically implemented as out-of-line functions
because the counter frequency was originally defined as static and unavailable
to headers [1]. Later changes [2] removed this restriction, but the helpers
remained unchanged. Now they can be implemented as static inline without any
issues.
Centralising the helpers avoids duplication and removes subtle differences
between architectures while keeping the implementation simple.
Drop redundant <asm/time.h> includes where <xen/time.h> already pulls it in.
No functional change is intended.
[1] ddee56dc2994 arm: driver for the generic timer for ARMv7
[2] 096578b4e489 xen: move XEN_SYSCTL_physinfo, XEN_SYSCTL_numainfo and
XEN_SYSCTL_topologyinfo to common code
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v2:
- Move ns_to_ticks() and ticks_to_ns() to common code.
---
xen/arch/arm/include/asm/time.h | 3 ---
xen/arch/arm/time.c | 11 -----------
xen/arch/arm/vtimer.c | 2 +-
xen/arch/riscv/include/asm/time.h | 5 -----
xen/arch/riscv/time.c | 1 +
xen/include/xen/time.h | 11 +++++++++++
6 files changed, 13 insertions(+), 20 deletions(-)
diff --git a/xen/arch/arm/include/asm/time.h b/xen/arch/arm/include/asm/time.h
index 49ad8c1a6d47..c194dbb9f52d 100644
--- a/xen/arch/arm/include/asm/time.h
+++ b/xen/arch/arm/include/asm/time.h
@@ -101,9 +101,6 @@ extern void init_timer_interrupt(void);
/* Counter value at boot time */
extern uint64_t boot_count;
-extern s_time_t ticks_to_ns(uint64_t ticks);
-extern uint64_t ns_to_ticks(s_time_t ns);
-
void preinit_xen_time(void);
void force_update_vcpu_system_time(struct vcpu *v);
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index cc3fcf47b66a..a12912a106a0 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -27,7 +27,6 @@
#include <asm/cpufeature.h>
#include <asm/platform.h>
#include <asm/system.h>
-#include <asm/time.h>
#include <asm/vgic.h>
uint64_t __read_mostly boot_count;
@@ -47,16 +46,6 @@ unsigned int timer_get_irq(enum timer_ppi ppi)
return timer_irq[ppi];
}
-/*static inline*/ s_time_t ticks_to_ns(uint64_t ticks)
-{
- return muldiv64(ticks, SECONDS(1), 1000 * cpu_khz);
-}
-
-/*static inline*/ uint64_t ns_to_ticks(s_time_t ns)
-{
- return muldiv64(ns, 1000 * cpu_khz, SECONDS(1));
-}
-
static __initdata struct dt_device_node *timer;
#ifdef CONFIG_ACPI
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index d2124b175521..2e85ff2b6e62 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -12,13 +12,13 @@
#include <xen/lib.h>
#include <xen/perfc.h>
#include <xen/sched.h>
+#include <xen/time.h>
#include <xen/timer.h>
#include <asm/cpregs.h>
#include <asm/div64.h>
#include <asm/irq.h>
#include <asm/regs.h>
-#include <asm/time.h>
#include <asm/vgic.h>
#include <asm/vreg.h>
#include <asm/vtimer.h>
diff --git a/xen/arch/riscv/include/asm/time.h b/xen/arch/riscv/include/asm/time.h
index 1e7801e2ea0e..be3875b9984e 100644
--- a/xen/arch/riscv/include/asm/time.h
+++ b/xen/arch/riscv/include/asm/time.h
@@ -24,11 +24,6 @@ static inline cycles_t get_cycles(void)
return csr_read(CSR_TIME);
}
-static inline s_time_t ticks_to_ns(uint64_t ticks)
-{
- return muldiv64(ticks, MILLISECS(1), cpu_khz);
-}
-
void preinit_xen_time(void);
#endif /* ASM__RISCV__TIME_H */
diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
index e962f8518d78..2c7af0a5d63b 100644
--- a/xen/arch/riscv/time.c
+++ b/xen/arch/riscv/time.c
@@ -4,6 +4,7 @@
#include <xen/init.h>
#include <xen/lib.h>
#include <xen/sections.h>
+#include <xen/time.h>
#include <xen/types.h>
unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
diff --git a/xen/include/xen/time.h b/xen/include/xen/time.h
index fe0d7a578a99..2185dd26a439 100644
--- a/xen/include/xen/time.h
+++ b/xen/include/xen/time.h
@@ -8,6 +8,7 @@
#ifndef __XEN_TIME_H__
#define __XEN_TIME_H__
+#include <xen/muldiv64.h>
#include <xen/types.h>
#include <public/xen.h>
@@ -75,6 +76,16 @@ extern void send_timer_event(struct vcpu *v);
void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds);
+static inline s_time_t ticks_to_ns(uint64_t ticks)
+{
+ return muldiv64(ticks, MILLISECS(1), cpu_khz);
+}
+
+static inline uint64_t ns_to_ticks(s_time_t ns)
+{
+ return muldiv64(ns, cpu_khz, MILLISECS(1));
+}
+
#include <asm/time.h>
#endif /* __XEN_TIME_H__ */
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code
2026-01-22 16:47 ` [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
@ 2026-01-29 8:48 ` Jan Beulich
2026-02-04 8:13 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-29 8:48 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Alistair Francis, Connor Davis,
Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> ticks_to_ns() and ns_to_ticks() are not architecture-specific, so provide a
> common implementation that is more resilient to overflow than the historical
> Arm version. This is not a practical issue for Arm, as the latest ARM ARM
> that timer frequency should be fixed at 1 GHz and older platforms used much
> lower rates, which is shy of 32-bit overflow. As the helpers are declared
> as static inline, they should not affect x86, which does not use them.
>
> On Arm, these helpers were historically implemented as out-of-line functions
> because the counter frequency was originally defined as static and unavailable
> to headers [1]. Later changes [2] removed this restriction, but the helpers
> remained unchanged. Now they can be implemented as static inline without any
> issues.
>
> Centralising the helpers avoids duplication and removes subtle differences
> between architectures while keeping the implementation simple.
>
> Drop redundant <asm/time.h> includes where <xen/time.h> already pulls it in.
>
> No functional change is intended.
>
> [1] ddee56dc2994 arm: driver for the generic timer for ARMv7
> [2] 096578b4e489 xen: move XEN_SYSCTL_physinfo, XEN_SYSCTL_numainfo and
> XEN_SYSCTL_topologyinfo to common code
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
Nit: Flip the two (chronological order).
Reviewed-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code
2026-01-29 8:48 ` Jan Beulich
@ 2026-02-04 8:13 ` Jan Beulich
2026-02-04 9:00 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 8:13 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Alistair Francis, Connor Davis,
Romain Caritey, xen-devel
On 29.01.2026 09:48, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> ticks_to_ns() and ns_to_ticks() are not architecture-specific, so provide a
>> common implementation that is more resilient to overflow than the historical
>> Arm version. This is not a practical issue for Arm, as the latest ARM ARM
>> that timer frequency should be fixed at 1 GHz and older platforms used much
>> lower rates, which is shy of 32-bit overflow. As the helpers are declared
>> as static inline, they should not affect x86, which does not use them.
>>
>> On Arm, these helpers were historically implemented as out-of-line functions
>> because the counter frequency was originally defined as static and unavailable
>> to headers [1]. Later changes [2] removed this restriction, but the helpers
>> remained unchanged. Now they can be implemented as static inline without any
>> issues.
>>
>> Centralising the helpers avoids duplication and removes subtle differences
>> between architectures while keeping the implementation simple.
>>
>> Drop redundant <asm/time.h> includes where <xen/time.h> already pulls it in.
>>
>> No functional change is intended.
>>
>> [1] ddee56dc2994 arm: driver for the generic timer for ARMv7
>> [2] 096578b4e489 xen: move XEN_SYSCTL_physinfo, XEN_SYSCTL_numainfo and
>> XEN_SYSCTL_topologyinfo to common code
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> Suggested-by: Jan Beulich <jbeulich@suse.com>
>
> Nit: Flip the two (chronological order).
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Btw, if this got the necessary Arm ack, I think it could also go in ahead of
all earlier patches in the series?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code
2026-02-04 8:13 ` Jan Beulich
@ 2026-02-04 9:00 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-04 9:00 UTC (permalink / raw)
To: Jan Beulich
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Alistair Francis, Connor Davis,
Romain Caritey, xen-devel
On 2/4/26 9:13 AM, Jan Beulich wrote:
> On 29.01.2026 09:48, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> ticks_to_ns() and ns_to_ticks() are not architecture-specific, so provide a
>>> common implementation that is more resilient to overflow than the historical
>>> Arm version. This is not a practical issue for Arm, as the latest ARM ARM
>>> that timer frequency should be fixed at 1 GHz and older platforms used much
>>> lower rates, which is shy of 32-bit overflow. As the helpers are declared
>>> as static inline, they should not affect x86, which does not use them.
>>>
>>> On Arm, these helpers were historically implemented as out-of-line functions
>>> because the counter frequency was originally defined as static and unavailable
>>> to headers [1]. Later changes [2] removed this restriction, but the helpers
>>> remained unchanged. Now they can be implemented as static inline without any
>>> issues.
>>>
>>> Centralising the helpers avoids duplication and removes subtle differences
>>> between architectures while keeping the implementation simple.
>>>
>>> Drop redundant <asm/time.h> includes where <xen/time.h> already pulls it in.
>>>
>>> No functional change is intended.
>>>
>>> [1] ddee56dc2994 arm: driver for the generic timer for ARMv7
>>> [2] 096578b4e489 xen: move XEN_SYSCTL_physinfo, XEN_SYSCTL_numainfo and
>>> XEN_SYSCTL_topologyinfo to common code
>>>
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>> Suggested-by: Jan Beulich <jbeulich@suse.com>
>> Nit: Flip the two (chronological order).
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Btw, if this got the necessary Arm ack, I think it could also go in ahead of
> all earlier patches in the series?
Yes, it is independent from earlier patches in the series.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (5 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 06/16] xen/time: move ticks<->ns helpers to common code Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-03 15:47 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask() Oleksii Kurochko
` (8 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Lay the groundwork for guest timer support by introducing a per-vCPU
virtual timer backed by Xen’s common timer infrastructure.
The virtual timer is programmed in response to the guest SBI
sbi_set_timer() call and injects a virtual supervisor timer interrupt
into the vCPU when it expires.
While a dedicated struct vtimer is not strictly required at present,
it is expected to become necessary once SSTC support is introduced.
In particular, it will need to carry additional state such as whether
SSTC is enabled, the next compare value (e.g. for the VSTIMECMP CSR)
to be saved and restored across context switches, and time delta state
(e.g. HTIMEDELTA) required for use cases such as migration. Introducing
struct vtimer now avoids a later refactoring.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Drop domain_vtimer_init() as it does nothing.
- Drop "struct vcpu *v;" from struct vtimer as it could be taken
from arch_vcpu using container_of().
- Drop vtimer_initialized, use t->status == TIMER_STATUS_invalid
instead to understand if timer was or wasn't initialized.
- Drop inclusion of xen/domain.h as xen/sched.h already includes it.
- s/ xen/time.h/ xen.timer.h in vtimer.c.
- Drop ULL in if-conidtion in vtimer_set_timer() as with the cast
it isn't necessary to have suffix ULL.
- Add migrate timer to vtimer_set_timer() to be sure that vtimer
will occur on pCPU it was ran, so the signalling to that vCPU
will (commonly) be cheaper.
- Check if the timeout has already expired and just inject the event
in vtimer_vtimer_set_timer().
- Drop const for ticks argument of vtimer_set_timer().
- Merge two patches to one:
- xen/riscv: introduce vtimer
- xen/riscv: introduce vtimer_set_timer() and vtimer_expired()
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/domain.c | 8 +++-
xen/arch/riscv/include/asm/domain.h | 3 ++
xen/arch/riscv/include/asm/vtimer.h | 20 ++++++++
xen/arch/riscv/vtimer.c | 73 +++++++++++++++++++++++++++++
5 files changed, 103 insertions(+), 2 deletions(-)
create mode 100644 xen/arch/riscv/include/asm/vtimer.h
create mode 100644 xen/arch/riscv/vtimer.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 8863d4b15605..5bd180130165 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -22,6 +22,7 @@ obj-y += traps.o
obj-y += vmid.o
obj-y += vm_event.o
obj-y += vsbi/
+obj-y += vtimer.o
$(TARGET): $(TARGET)-syms
$(OBJCOPY) -O binary -S $< $@
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index c078d595df9c..e38c0db62cac 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -9,6 +9,7 @@
#include <asm/cpufeature.h>
#include <asm/csr.h>
#include <asm/riscv_encoding.h>
+#include <asm/vtimer.h>
#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
BIT(CAUSE_FETCH_ACCESS, U) | \
@@ -111,11 +112,14 @@ int arch_vcpu_create(struct vcpu *v)
if ( is_idle_vcpu(v) )
return rc;
+ if ( (rc = vcpu_vtimer_init(v)) )
+ goto fail;
+
/*
- * As the vtimer and interrupt controller (IC) are not yet implemented,
+ * As interrupt controller (IC) is not yet implemented,
* return an error.
*
- * TODO: Drop this once the vtimer and IC are implemented.
+ * TODO: Drop this once IC is implemented.
*/
rc = -EOPNOTSUPP;
goto fail;
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index fa083094b43e..482429d4ef33 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -8,6 +8,7 @@
#include <public/hvm/params.h>
#include <asm/p2m.h>
+#include <asm/vtimer.h>
struct vcpu_vmid {
uint64_t generation;
@@ -51,6 +52,8 @@ struct arch_vcpu
struct cpu_info *cpu_info;
+ struct vtimer vtimer;
+
/* CSRs */
register_t hedeleg;
register_t hideleg;
diff --git a/xen/arch/riscv/include/asm/vtimer.h b/xen/arch/riscv/include/asm/vtimer.h
new file mode 100644
index 000000000000..0d1555511755
--- /dev/null
+++ b/xen/arch/riscv/include/asm/vtimer.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * (c) 2023-2024 Vates
+ */
+
+#ifndef ASM__RISCV__VTIMER_H
+#define ASM__RISCV__VTIMER_H
+
+#include <xen/timer.h>
+
+struct vtimer {
+ struct timer timer;
+};
+
+int vcpu_vtimer_init(struct vcpu *v);
+void vcpu_timer_destroy(struct vcpu *v);
+
+void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
+
+#endif /* ASM__RISCV__VTIMER_H */
diff --git a/xen/arch/riscv/vtimer.c b/xen/arch/riscv/vtimer.c
new file mode 100644
index 000000000000..b6599fa383b8
--- /dev/null
+++ b/xen/arch/riscv/vtimer.c
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/sched.h>
+#include <xen/timer.h>
+
+#include <asm/vtimer.h>
+
+static void vtimer_expired(void *data)
+{
+ struct vtimer *t = data;
+ struct arch_vcpu *avcpu = container_of(t, struct arch_vcpu, vtimer);
+ struct vcpu *v = container_of(avcpu, struct vcpu, arch);
+
+ vcpu_set_interrupt(v, IRQ_VS_TIMER);
+}
+
+int vcpu_vtimer_init(struct vcpu *v)
+{
+ struct vtimer *t = &v->arch.vtimer;
+
+ init_timer(&t->timer, vtimer_expired, t, v->processor);
+
+ return 0;
+}
+
+void vcpu_timer_destroy(struct vcpu *v)
+{
+ struct vtimer *t = &v->arch.vtimer;
+
+ if ( t->timer.status == TIMER_STATUS_invalid )
+ return;
+
+ kill_timer(&v->arch.vtimer.timer);
+}
+
+void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
+{
+ struct arch_vcpu *avcpu = container_of(t, struct arch_vcpu, vtimer);
+ struct vcpu *v = container_of(avcpu, struct vcpu, arch);
+ s_time_t expires = ticks_to_ns(ticks - boot_clock_cycles);
+
+ vcpu_unset_interrupt(v, IRQ_VS_TIMER);
+
+ /*
+ * According to the RISC-V sbi spec:
+ * If the supervisor wishes to clear the timer interrupt without
+ * scheduling the next timer event, it can either request a timer
+ * interrupt infinitely far into the future (i.e., (uint64_t)-1),
+ * or it can instead mask the timer interrupt by clearing sie.STIE CSR
+ * bit.
+ */
+ if ( ticks == ((uint64_t)~0) )
+ {
+ stop_timer(&t->timer);
+
+ return;
+ }
+
+ if ( expires < NOW() )
+ {
+ /*
+ * Simplify the logic if the timeout has already expired and just
+ * inject the event.
+ */
+ stop_timer(&t->timer);
+ vcpu_set_interrupt(v, IRQ_VS_TIMER);
+
+ return;
+ }
+
+ migrate_timer(&t->timer, smp_processor_id());
+ set_timer(&t->timer, expires);
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests
2026-01-22 16:47 ` [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
@ 2026-02-03 15:47 ` Jan Beulich
2026-02-03 16:55 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-03 15:47 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> +void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
> +{
> + struct arch_vcpu *avcpu = container_of(t, struct arch_vcpu, vtimer);
> + struct vcpu *v = container_of(avcpu, struct vcpu, arch);
Why two container_of() when one will do? (Same again further down.)
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests
2026-02-03 15:47 ` Jan Beulich
@ 2026-02-03 16:55 ` Oleksii Kurochko
2026-02-03 17:04 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-03 16:55 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/3/26 4:47 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> +void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
>> +{
>> + struct arch_vcpu *avcpu = container_of(t, struct arch_vcpu, vtimer);
>> + struct vcpu *v = container_of(avcpu, struct vcpu, arch);
> Why two container_of() when one will do? (Same again further down.)
I didn't think that container_of(t, struct vcpu, arch.vtimer) would work
(as arch.vtimer is embedded inside struct vcpu. Is my assumption correct that
if it was arch->vtimer then it won't work?)
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests
2026-02-03 16:55 ` Oleksii Kurochko
@ 2026-02-03 17:04 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-03 17:04 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 03.02.2026 17:55, Oleksii Kurochko wrote:
> On 2/3/26 4:47 PM, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> +void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
>>> +{
>>> + struct arch_vcpu *avcpu = container_of(t, struct arch_vcpu, vtimer);
>>> + struct vcpu *v = container_of(avcpu, struct vcpu, arch);
>> Why two container_of() when one will do? (Same again further down.)
>
> I didn't think that container_of(t, struct vcpu, arch.vtimer) would work
> (as arch.vtimer is embedded inside struct vcpu. Is my assumption correct that
> if it was arch->vtimer then it won't work?)
Please simply go look at the container_of() definition.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask()
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (6 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 07/16] xen/riscv: introduce basic vtimer infrastructure for guests Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-29 16:32 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
` (7 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
RISC-V SMP support is not yet implemented, but smp_send_event_check_mask()
is required by common code and vcpu_kick(), which is introduced later.
Provide a temporary stub implementation that asserts the mask only targets
CPU0.
cpumask_subset() is used instead of cpumask_equal() because some callers
(e.g. cpumask_raise_softirq() or cpu_raise_softirq_batch_finish()) may
legitimately pass an empty mask, which would otherwise cause false
failures.
The BUG_ON() ensures that attempts to use this function with multiple CPUs
are caught early once SMP support is introduced.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- use BUG_ON(cpumask_subset(...)) instead of "#ifdef NR_CPUS > 1".
- Update the commit message.
---
xen/arch/riscv/smp.c | 7 +++++++
xen/arch/riscv/stubs.c | 5 -----
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/smp.c b/xen/arch/riscv/smp.c
index 4ca6a4e89200..d645364ea47d 100644
--- a/xen/arch/riscv/smp.c
+++ b/xen/arch/riscv/smp.c
@@ -1,3 +1,4 @@
+#include <xen/cpumask.h>
#include <xen/smp.h>
/*
@@ -13,3 +14,9 @@
struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
.processor_id = NR_CPUS,
}};
+
+void smp_send_event_check_mask(const cpumask_t *mask)
+{
+ /* Catch missing implementation once SMP support is introduced */
+ BUG_ON(!cpumask_subset(mask, cpumask_of(0)));
+}
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 9e30a9a3b50b..c5784a436574 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -65,11 +65,6 @@ int arch_monitor_domctl_event(struct domain *d,
/* smp.c */
-void smp_send_event_check_mask(const cpumask_t *mask)
-{
- BUG_ON("unimplemented");
-}
-
void smp_send_call_function_mask(const cpumask_t *mask)
{
BUG_ON("unimplemented");
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask()
2026-01-22 16:47 ` [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask() Oleksii Kurochko
@ 2026-01-29 16:32 ` Jan Beulich
2026-01-29 16:46 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-01-29 16:32 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> RISC-V SMP support is not yet implemented, but smp_send_event_check_mask()
> is required by common code and vcpu_kick(), which is introduced later.
> Provide a temporary stub implementation that asserts the mask only targets
> CPU0.
>
> cpumask_subset() is used instead of cpumask_equal() because some callers
> (e.g. cpumask_raise_softirq() or cpu_raise_softirq_batch_finish()) may
> legitimately pass an empty mask, which would otherwise cause false
> failures.
>
> The BUG_ON() ensures that attempts to use this function with multiple CPUs
> are caught early once SMP support is introduced.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Looks like this is independent of earlier patches in the series, and hence
could go in right away? Please confirm.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask()
2026-01-29 16:32 ` Jan Beulich
@ 2026-01-29 16:46 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-29 16:46 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 1/29/26 5:32 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> RISC-V SMP support is not yet implemented, but smp_send_event_check_mask()
>> is required by common code and vcpu_kick(), which is introduced later.
>> Provide a temporary stub implementation that asserts the mask only targets
>> CPU0.
>>
>> cpumask_subset() is used instead of cpumask_equal() because some callers
>> (e.g. cpumask_raise_softirq() or cpu_raise_softirq_batch_finish()) may
>> legitimately pass an empty mask, which would otherwise cause false
>> failures.
>>
>> The BUG_ON() ensures that attempts to use this function with multiple CPUs
>> are caught early once SMP support is introduced.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks.
>
> Looks like this is independent of earlier patches in the series, and hence
> could go in right away? Please confirm.
I confirm that it is independent and could go right away. I, also, checked
locally compilation of this patch on top of current staging and it is fine.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (7 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 08/16] xen/riscv: add temporary stub for smp_send_event_check_mask() Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-06 16:36 ` Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
` (6 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Add a RISC-V implementation of vcpu_kick(), which unblocks the target
vCPU and sends an event check IPI if the vCPU was running on another
processor. This mirrors the behavior of Arm and enables proper vCPU
wakeup handling on RISC-V.
Remove the stub implementation from stubs.c, as it is now provided by
arch/riscv/domain.c.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v2:
- Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
xen/arch/riscv/domain.c | 14 ++++++++++++++
xen/arch/riscv/stubs.c | 5 -----
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index e38c0db62cac..13ac384c4b76 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -1,8 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/cpumask.h>
#include <xen/init.h>
#include <xen/mm.h>
#include <xen/sched.h>
+#include <xen/smp.h>
#include <xen/vmap.h>
#include <asm/bitops.h>
@@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
# error "Update vsieh"
#endif
}
+
+void vcpu_kick(struct vcpu *v)
+{
+ bool running = v->is_running;
+
+ vcpu_unblock(v);
+ if ( running && v != current )
+ {
+ perfc_incr(vcpu_kick);
+ smp_send_event_check_mask(cpumask_of(v->processor));
+ }
+}
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index c5784a436574..1f0add97b361 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -208,11 +208,6 @@ void vcpu_block_unless_event_pending(struct vcpu *v)
BUG_ON("unimplemented");
}
-void vcpu_kick(struct vcpu *v)
-{
- BUG_ON("unimplemented");
-}
-
unsigned long
hypercall_create_continuation(unsigned int op, const char *format, ...)
{
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-01-22 16:47 ` [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
@ 2026-02-06 16:36 ` Oleksii Kurochko
2026-02-09 9:07 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-06 16:36 UTC (permalink / raw)
To: xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
> vCPU and sends an event check IPI if the vCPU was running on another
> processor. This mirrors the behavior of Arm and enables proper vCPU
> wakeup handling on RISC-V.
>
> Remove the stub implementation from stubs.c, as it is now provided by
> arch/riscv/domain.c.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v2:
> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
> ---
> xen/arch/riscv/domain.c | 14 ++++++++++++++
> xen/arch/riscv/stubs.c | 5 -----
> 2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
> index e38c0db62cac..13ac384c4b76 100644
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -1,8 +1,10 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
>
> +#include <xen/cpumask.h>
> #include <xen/init.h>
> #include <xen/mm.h>
> #include <xen/sched.h>
> +#include <xen/smp.h>
> #include <xen/vmap.h>
>
> #include <asm/bitops.h>
> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
> # error "Update vsieh"
> #endif
> }
> +
> +void vcpu_kick(struct vcpu *v)
> +{
> + bool running = v->is_running;
> +
> + vcpu_unblock(v);
> + if ( running && v != current )
> + {
> + perfc_incr(vcpu_kick);
Because of this it is needed to introduce:
PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
it in arch specific perfc_defn.h?
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-02-06 16:36 ` Oleksii Kurochko
@ 2026-02-09 9:07 ` Jan Beulich
2026-02-09 9:40 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-09 9:07 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 06.02.2026 17:36, Oleksii Kurochko wrote:
>
> On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
>> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
>> vCPU and sends an event check IPI if the vCPU was running on another
>> processor. This mirrors the behavior of Arm and enables proper vCPU
>> wakeup handling on RISC-V.
>>
>> Remove the stub implementation from stubs.c, as it is now provided by
>> arch/riscv/domain.c.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Changes in v2:
>> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
>> ---
>> xen/arch/riscv/domain.c | 14 ++++++++++++++
>> xen/arch/riscv/stubs.c | 5 -----
>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>> index e38c0db62cac..13ac384c4b76 100644
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -1,8 +1,10 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>>
>> +#include <xen/cpumask.h>
>> #include <xen/init.h>
>> #include <xen/mm.h>
>> #include <xen/sched.h>
>> +#include <xen/smp.h>
>> #include <xen/vmap.h>
>>
>> #include <asm/bitops.h>
>> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
>> # error "Update vsieh"
>> #endif
>> }
>> +
>> +void vcpu_kick(struct vcpu *v)
>> +{
>> + bool running = v->is_running;
>> +
>> + vcpu_unblock(v);
>> + if ( running && v != current )
>> + {
>> + perfc_incr(vcpu_kick);
>
> Because of this it is needed to introduce:
> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
> Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
>
> I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
> just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
> it in arch specific perfc_defn.h?
Arch-specific please - it's not used by x86 nor ppc.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-02-09 9:07 ` Jan Beulich
@ 2026-02-09 9:40 ` Oleksii Kurochko
2026-02-09 9:51 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 9:40 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/9/26 10:07 AM, Jan Beulich wrote:
> On 06.02.2026 17:36, Oleksii Kurochko wrote:
>> On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
>>> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
>>> vCPU and sends an event check IPI if the vCPU was running on another
>>> processor. This mirrors the behavior of Arm and enables proper vCPU
>>> wakeup handling on RISC-V.
>>>
>>> Remove the stub implementation from stubs.c, as it is now provided by
>>> arch/riscv/domain.c.
>>>
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>> ---
>>> Changes in v2:
>>> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
>>> ---
>>> xen/arch/riscv/domain.c | 14 ++++++++++++++
>>> xen/arch/riscv/stubs.c | 5 -----
>>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>>> index e38c0db62cac..13ac384c4b76 100644
>>> --- a/xen/arch/riscv/domain.c
>>> +++ b/xen/arch/riscv/domain.c
>>> @@ -1,8 +1,10 @@
>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>
>>> +#include <xen/cpumask.h>
>>> #include <xen/init.h>
>>> #include <xen/mm.h>
>>> #include <xen/sched.h>
>>> +#include <xen/smp.h>
>>> #include <xen/vmap.h>
>>>
>>> #include <asm/bitops.h>
>>> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
>>> # error "Update vsieh"
>>> #endif
>>> }
>>> +
>>> +void vcpu_kick(struct vcpu *v)
>>> +{
>>> + bool running = v->is_running;
>>> +
>>> + vcpu_unblock(v);
>>> + if ( running && v != current )
>>> + {
>>> + perfc_incr(vcpu_kick);
>> Because of this it is needed to introduce:
>> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>> Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
>>
>> I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
>> just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
>> it in arch specific perfc_defn.h?
> Arch-specific please - it's not used by x86 nor ppc.
Then I will do the following changes:
diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
index 3824f31c395c..86c56251d5d7 100644
--- a/xen/arch/riscv/include/asm/Makefile
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -7,7 +7,6 @@ generic-y += hypercall.h
generic-y += iocap.h
generic-y += irq-dt.h
generic-y += percpu.h
-generic-y += perfc_defn.h
generic-y += random.h
generic-y += softirq.h
generic-y += vm_event.h
diff --git a/xen/arch/riscv/include/asm/perfc_defn.h b/xen/arch/riscv/include/asm/perfc_defn.h
new file mode 100644
index 000000000000..4fc161f1abad
--- /dev/null
+++ b/xen/arch/riscv/include/asm/perfc_defn.h
@@ -0,0 +1,7 @@
+/* This file is intended to be included multiple times. */
+/*#ifndef __XEN_PERFC_DEFN_H__*/
+/*#define __XEN_PERFC_DEFN_H__*/
+
+PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
+
+/*#endif*/ /* __XEN_PERFC_DEFN_H__ */
and add the following to commit message:
Since vcpu_kick() calls perfc_incr(vcpu_kick), add perfcounter for
vcpu_kick to handle the case when CONFIG_PERF_COUNTERS=y. Although
CONFIG_PERF_COUNTERS is not enabled by default, it can be enabled,
for example, by randconfig what will lead to CI build issues.
Note that I keep __XEN_PERFC_DEFN_H__ as other archictectures use the same,
not something like ASM__<arch>__PERFC_DEFN_H.
Let me know if these changes are okay for you and if I can keep your
Acked-by for this patch.
~ Oleksii
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-02-09 9:40 ` Oleksii Kurochko
@ 2026-02-09 9:51 ` Jan Beulich
2026-02-09 12:35 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-09 9:51 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 09.02.2026 10:40, Oleksii Kurochko wrote:
>
> On 2/9/26 10:07 AM, Jan Beulich wrote:
>> On 06.02.2026 17:36, Oleksii Kurochko wrote:
>>> On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
>>>> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
>>>> vCPU and sends an event check IPI if the vCPU was running on another
>>>> processor. This mirrors the behavior of Arm and enables proper vCPU
>>>> wakeup handling on RISC-V.
>>>>
>>>> Remove the stub implementation from stubs.c, as it is now provided by
>>>> arch/riscv/domain.c.
>>>>
>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> Changes in v2:
>>>> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
>>>> ---
>>>> xen/arch/riscv/domain.c | 14 ++++++++++++++
>>>> xen/arch/riscv/stubs.c | 5 -----
>>>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>>>> index e38c0db62cac..13ac384c4b76 100644
>>>> --- a/xen/arch/riscv/domain.c
>>>> +++ b/xen/arch/riscv/domain.c
>>>> @@ -1,8 +1,10 @@
>>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>>
>>>> +#include <xen/cpumask.h>
>>>> #include <xen/init.h>
>>>> #include <xen/mm.h>
>>>> #include <xen/sched.h>
>>>> +#include <xen/smp.h>
>>>> #include <xen/vmap.h>
>>>>
>>>> #include <asm/bitops.h>
>>>> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
>>>> # error "Update vsieh"
>>>> #endif
>>>> }
>>>> +
>>>> +void vcpu_kick(struct vcpu *v)
>>>> +{
>>>> + bool running = v->is_running;
>>>> +
>>>> + vcpu_unblock(v);
>>>> + if ( running && v != current )
>>>> + {
>>>> + perfc_incr(vcpu_kick);
>>> Because of this it is needed to introduce:
>>> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>>> Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
>>>
>>> I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
>>> just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
>>> it in arch specific perfc_defn.h?
>> Arch-specific please - it's not used by x86 nor ppc.
>
> Then I will do the following changes:
>
> diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
> index 3824f31c395c..86c56251d5d7 100644
> --- a/xen/arch/riscv/include/asm/Makefile
> +++ b/xen/arch/riscv/include/asm/Makefile
> @@ -7,7 +7,6 @@ generic-y += hypercall.h
> generic-y += iocap.h
> generic-y += irq-dt.h
> generic-y += percpu.h
> -generic-y += perfc_defn.h
> generic-y += random.h
> generic-y += softirq.h
> generic-y += vm_event.h
> diff --git a/xen/arch/riscv/include/asm/perfc_defn.h b/xen/arch/riscv/include/asm/perfc_defn.h
> new file mode 100644
> index 000000000000..4fc161f1abad
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/perfc_defn.h
> @@ -0,0 +1,7 @@
> +/* This file is intended to be included multiple times. */
> +/*#ifndef __XEN_PERFC_DEFN_H__*/
> +/*#define __XEN_PERFC_DEFN_H__*/
> +
> +PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
> +
> +/*#endif*/ /* __XEN_PERFC_DEFN_H__ */
>
> and add the following to commit message:
> Since vcpu_kick() calls perfc_incr(vcpu_kick), add perfcounter for
> vcpu_kick to handle the case when CONFIG_PERF_COUNTERS=y. Although
> CONFIG_PERF_COUNTERS is not enabled by default, it can be enabled,
> for example, by randconfig what will lead to CI build issues.
>
> Note that I keep __XEN_PERFC_DEFN_H__ as other archictectures use the same,
> not something like ASM__<arch>__PERFC_DEFN_H.
Please don't copy this mistake. I actually question the commented-out pre-
processor directives altogether: Misra also has a rule against commented-
out code (directive 4.4, which we didn't accept [yet], but which exists
nevertheless). Yet at the very least what's commented out should not be
obviously wrong.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-02-09 9:51 ` Jan Beulich
@ 2026-02-09 12:35 ` Oleksii Kurochko
2026-02-09 12:54 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-09 12:35 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/9/26 10:51 AM, Jan Beulich wrote:
> On 09.02.2026 10:40, Oleksii Kurochko wrote:
>> On 2/9/26 10:07 AM, Jan Beulich wrote:
>>> On 06.02.2026 17:36, Oleksii Kurochko wrote:
>>>> On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
>>>>> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
>>>>> vCPU and sends an event check IPI if the vCPU was running on another
>>>>> processor. This mirrors the behavior of Arm and enables proper vCPU
>>>>> wakeup handling on RISC-V.
>>>>>
>>>>> Remove the stub implementation from stubs.c, as it is now provided by
>>>>> arch/riscv/domain.c.
>>>>>
>>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>>> ---
>>>>> Changes in v2:
>>>>> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
>>>>> ---
>>>>> xen/arch/riscv/domain.c | 14 ++++++++++++++
>>>>> xen/arch/riscv/stubs.c | 5 -----
>>>>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>>>>> index e38c0db62cac..13ac384c4b76 100644
>>>>> --- a/xen/arch/riscv/domain.c
>>>>> +++ b/xen/arch/riscv/domain.c
>>>>> @@ -1,8 +1,10 @@
>>>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>>>
>>>>> +#include <xen/cpumask.h>
>>>>> #include <xen/init.h>
>>>>> #include <xen/mm.h>
>>>>> #include <xen/sched.h>
>>>>> +#include <xen/smp.h>
>>>>> #include <xen/vmap.h>
>>>>>
>>>>> #include <asm/bitops.h>
>>>>> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
>>>>> # error "Update vsieh"
>>>>> #endif
>>>>> }
>>>>> +
>>>>> +void vcpu_kick(struct vcpu *v)
>>>>> +{
>>>>> + bool running = v->is_running;
>>>>> +
>>>>> + vcpu_unblock(v);
>>>>> + if ( running && v != current )
>>>>> + {
>>>>> + perfc_incr(vcpu_kick);
>>>> Because of this it is needed to introduce:
>>>> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>>>> Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
>>>>
>>>> I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
>>>> just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
>>>> it in arch specific perfc_defn.h?
>>> Arch-specific please - it's not used by x86 nor ppc.
>> Then I will do the following changes:
>>
>> diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
>> index 3824f31c395c..86c56251d5d7 100644
>> --- a/xen/arch/riscv/include/asm/Makefile
>> +++ b/xen/arch/riscv/include/asm/Makefile
>> @@ -7,7 +7,6 @@ generic-y += hypercall.h
>> generic-y += iocap.h
>> generic-y += irq-dt.h
>> generic-y += percpu.h
>> -generic-y += perfc_defn.h
>> generic-y += random.h
>> generic-y += softirq.h
>> generic-y += vm_event.h
>> diff --git a/xen/arch/riscv/include/asm/perfc_defn.h b/xen/arch/riscv/include/asm/perfc_defn.h
>> new file mode 100644
>> index 000000000000..4fc161f1abad
>> --- /dev/null
>> +++ b/xen/arch/riscv/include/asm/perfc_defn.h
>> @@ -0,0 +1,7 @@
>> +/* This file is intended to be included multiple times. */
>> +/*#ifndef __XEN_PERFC_DEFN_H__*/
>> +/*#define __XEN_PERFC_DEFN_H__*/
>> +
>> +PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>> +
>> +/*#endif*/ /* __XEN_PERFC_DEFN_H__ */
>>
>> and add the following to commit message:
>> Since vcpu_kick() calls perfc_incr(vcpu_kick), add perfcounter for
>> vcpu_kick to handle the case when CONFIG_PERF_COUNTERS=y. Although
>> CONFIG_PERF_COUNTERS is not enabled by default, it can be enabled,
>> for example, by randconfig what will lead to CI build issues.
>>
>> Note that I keep __XEN_PERFC_DEFN_H__ as other archictectures use the same,
>> not something like ASM__<arch>__PERFC_DEFN_H.
> Please don't copy this mistake. I actually question the commented-out pre-
> processor directives altogether: Misra also has a rule against commented-
> out code (directive 4.4, which we didn't accept [yet], but which exists
> nevertheless). Yet at the very least what's commented out should not be
> obviously wrong.
Do I understand correctly that it would be acceptable to simply drop the
commented-out preprocessor directives and keep only /* This file is intended
to be included multiple times. */ ?
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation
2026-02-09 12:35 ` Oleksii Kurochko
@ 2026-02-09 12:54 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-09 12:54 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 09.02.2026 13:35, Oleksii Kurochko wrote:
> On 2/9/26 10:51 AM, Jan Beulich wrote:
>> On 09.02.2026 10:40, Oleksii Kurochko wrote:
>>> On 2/9/26 10:07 AM, Jan Beulich wrote:
>>>> On 06.02.2026 17:36, Oleksii Kurochko wrote:
>>>>> On 1/22/26 5:47 PM, Oleksii Kurochko wrote:
>>>>>> Add a RISC-V implementation of vcpu_kick(), which unblocks the target
>>>>>> vCPU and sends an event check IPI if the vCPU was running on another
>>>>>> processor. This mirrors the behavior of Arm and enables proper vCPU
>>>>>> wakeup handling on RISC-V.
>>>>>>
>>>>>> Remove the stub implementation from stubs.c, as it is now provided by
>>>>>> arch/riscv/domain.c.
>>>>>>
>>>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>>>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> - Add Acked-by: Jan Beulich <jbeulich@suse.com>.
>>>>>> ---
>>>>>> xen/arch/riscv/domain.c | 14 ++++++++++++++
>>>>>> xen/arch/riscv/stubs.c | 5 -----
>>>>>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>>>>>> index e38c0db62cac..13ac384c4b76 100644
>>>>>> --- a/xen/arch/riscv/domain.c
>>>>>> +++ b/xen/arch/riscv/domain.c
>>>>>> @@ -1,8 +1,10 @@
>>>>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>>>>>
>>>>>> +#include <xen/cpumask.h>
>>>>>> #include <xen/init.h>
>>>>>> #include <xen/mm.h>
>>>>>> #include <xen/sched.h>
>>>>>> +#include <xen/smp.h>
>>>>>> #include <xen/vmap.h>
>>>>>>
>>>>>> #include <asm/bitops.h>
>>>>>> @@ -240,3 +242,15 @@ void vcpu_sync_interrupts(struct vcpu *v)
>>>>>> # error "Update vsieh"
>>>>>> #endif
>>>>>> }
>>>>>> +
>>>>>> +void vcpu_kick(struct vcpu *v)
>>>>>> +{
>>>>>> + bool running = v->is_running;
>>>>>> +
>>>>>> + vcpu_unblock(v);
>>>>>> + if ( running && v != current )
>>>>>> + {
>>>>>> + perfc_incr(vcpu_kick);
>>>>> Because of this it is needed to introduce:
>>>>> PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>>>>> Otherwise randconfig build will fail when CONFIG_PERF_COUNTERS=y.
>>>>>
>>>>> I would like to ask if it would be okay to add it xen/include/xen/perfc_defn.h
>>>>> just after PERFCOUNTER(need_flush_tlb_flush,...) or would it be better to have
>>>>> it in arch specific perfc_defn.h?
>>>> Arch-specific please - it's not used by x86 nor ppc.
>>> Then I will do the following changes:
>>>
>>> diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
>>> index 3824f31c395c..86c56251d5d7 100644
>>> --- a/xen/arch/riscv/include/asm/Makefile
>>> +++ b/xen/arch/riscv/include/asm/Makefile
>>> @@ -7,7 +7,6 @@ generic-y += hypercall.h
>>> generic-y += iocap.h
>>> generic-y += irq-dt.h
>>> generic-y += percpu.h
>>> -generic-y += perfc_defn.h
>>> generic-y += random.h
>>> generic-y += softirq.h
>>> generic-y += vm_event.h
>>> diff --git a/xen/arch/riscv/include/asm/perfc_defn.h b/xen/arch/riscv/include/asm/perfc_defn.h
>>> new file mode 100644
>>> index 000000000000..4fc161f1abad
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/perfc_defn.h
>>> @@ -0,0 +1,7 @@
>>> +/* This file is intended to be included multiple times. */
>>> +/*#ifndef __XEN_PERFC_DEFN_H__*/
>>> +/*#define __XEN_PERFC_DEFN_H__*/
>>> +
>>> +PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
>>> +
>>> +/*#endif*/ /* __XEN_PERFC_DEFN_H__ */
>>>
>>> and add the following to commit message:
>>> Since vcpu_kick() calls perfc_incr(vcpu_kick), add perfcounter for
>>> vcpu_kick to handle the case when CONFIG_PERF_COUNTERS=y. Although
>>> CONFIG_PERF_COUNTERS is not enabled by default, it can be enabled,
>>> for example, by randconfig what will lead to CI build issues.
>>>
>>> Note that I keep __XEN_PERFC_DEFN_H__ as other archictectures use the same,
>>> not something like ASM__<arch>__PERFC_DEFN_H.
>> Please don't copy this mistake. I actually question the commented-out pre-
>> processor directives altogether: Misra also has a rule against commented-
>> out code (directive 4.4, which we didn't accept [yet], but which exists
>> nevertheless). Yet at the very least what's commented out should not be
>> obviously wrong.
>
> Do I understand correctly that it would be acceptable to simply drop the
> commented-out preprocessor directives and keep only /* This file is intended
> to be included multiple times. */ ?
I think so, yes. You could simply check how the same situation is covered
for elsewhere.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (8 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 09/16] xen/riscv: introduce vcpu_kick() implementation Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-03 16:43 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests Oleksii Kurochko
` (5 subsequent siblings)
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introduce vtimer_ctx_switch_from() and vtimer_ctx_switch_to() to handle
virtual timer state across vCPU context switches.
At present, vtimer_ctx_switch_from() is a no-op because the RISC-V SSTC
extension, which provides a virtualization-aware timer, is not yet
supported. Xen therefore relies the virtual (SBI-based) timer.
The virtual timer uses Xen's internal timer infrastructure and must be
associated with the pCPU on which the vCPU is currently running so that
timer events can be delivered efficiently. As a result, vtimer_ctx_switch_to()
migrates the timer to the target pCPU when a vCPU is scheduled in.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Align the parameters names for vtimer_ctx_switch_from() and vtimer_ctx_switch_to() in
declarations to match the ones in the defintions to make Misra happy.
- s/vtimer_save/vtimer_ctx_switch_from.
- s/vtimer_restore/vtimer_ctx_switch_to.
- Update the commit message.
---
xen/arch/riscv/include/asm/vtimer.h | 3 +++
xen/arch/riscv/vtimer.c | 15 +++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/xen/arch/riscv/include/asm/vtimer.h b/xen/arch/riscv/include/asm/vtimer.h
index 0d1555511755..52b7fb7b1cbb 100644
--- a/xen/arch/riscv/include/asm/vtimer.h
+++ b/xen/arch/riscv/include/asm/vtimer.h
@@ -17,4 +17,7 @@ void vcpu_timer_destroy(struct vcpu *v);
void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
+void vtimer_ctx_switch_from(struct vcpu *p);
+void vtimer_ctx_switch_to(struct vcpu *n);
+
#endif /* ASM__RISCV__VTIMER_H */
diff --git a/xen/arch/riscv/vtimer.c b/xen/arch/riscv/vtimer.c
index b6599fa383b8..6dfd6d836260 100644
--- a/xen/arch/riscv/vtimer.c
+++ b/xen/arch/riscv/vtimer.c
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/bug.h>
#include <xen/sched.h>
#include <xen/timer.h>
@@ -71,3 +72,17 @@ void vtimer_set_timer(struct vtimer *t, const uint64_t ticks)
migrate_timer(&t->timer, smp_processor_id());
set_timer(&t->timer, expires);
}
+
+void vtimer_ctx_switch_from(struct vcpu *p)
+{
+ ASSERT(!is_idle_vcpu(p));
+
+ /* Nothing to do at the moment as SSTC isn't supported now. */
+}
+
+void vtimer_ctx_switch_to(struct vcpu *n)
+{
+ ASSERT(!is_idle_vcpu(n));
+
+ migrate_timer(&n->arch.vtimer.timer, n->processor);
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers
2026-01-22 16:47 ` [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
@ 2026-02-03 16:43 ` Jan Beulich
2026-02-03 16:56 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-03 16:43 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> Introduce vtimer_ctx_switch_from() and vtimer_ctx_switch_to() to handle
> virtual timer state across vCPU context switches.
>
> At present, vtimer_ctx_switch_from() is a no-op because the RISC-V SSTC
> extension, which provides a virtualization-aware timer, is not yet
> supported. Xen therefore relies the virtual (SBI-based) timer.
>
> The virtual timer uses Xen's internal timer infrastructure and must be
> associated with the pCPU on which the vCPU is currently running so that
> timer events can be delivered efficiently. As a result, vtimer_ctx_switch_to()
> migrates the timer to the target pCPU when a vCPU is scheduled in.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
However, ...
> --- a/xen/arch/riscv/include/asm/vtimer.h
> +++ b/xen/arch/riscv/include/asm/vtimer.h
> @@ -17,4 +17,7 @@ void vcpu_timer_destroy(struct vcpu *v);
>
> void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
>
> +void vtimer_ctx_switch_from(struct vcpu *p);
> +void vtimer_ctx_switch_to(struct vcpu *n);
... may I ask that you reconsider naming here? Both Arm and x86 have functions
where the prefix / infix is "ctxt", not just "ctx". Being able to find all by
grep-ing for e.g. ctxt_switch_from might be quite nice.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers
2026-02-03 16:43 ` Jan Beulich
@ 2026-02-03 16:56 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-03 16:56 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/3/26 5:43 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> Introduce vtimer_ctx_switch_from() and vtimer_ctx_switch_to() to handle
>> virtual timer state across vCPU context switches.
>>
>> At present, vtimer_ctx_switch_from() is a no-op because the RISC-V SSTC
>> extension, which provides a virtualization-aware timer, is not yet
>> supported. Xen therefore relies the virtual (SBI-based) timer.
>>
>> The virtual timer uses Xen's internal timer infrastructure and must be
>> associated with the pCPU on which the vCPU is currently running so that
>> timer events can be delivered efficiently. As a result, vtimer_ctx_switch_to()
>> migrates the timer to the target pCPU when a vCPU is scheduled in.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks.
>
> However, ...
>
>> --- a/xen/arch/riscv/include/asm/vtimer.h
>> +++ b/xen/arch/riscv/include/asm/vtimer.h
>> @@ -17,4 +17,7 @@ void vcpu_timer_destroy(struct vcpu *v);
>>
>> void vtimer_set_timer(struct vtimer *t, uint64_t ticks);
>>
>> +void vtimer_ctx_switch_from(struct vcpu *p);
>> +void vtimer_ctx_switch_to(struct vcpu *n);
> ... may I ask that you reconsider naming here? Both Arm and x86 have functions
> where the prefix / infix is "ctxt", not just "ctx". Being able to find all by
> grep-ing for e.g. ctxt_switch_from might be quite nice.
Sure, I will do that, it makes sense.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (9 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 10/16] xen/riscv: add vtimer context switch helpers Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
` (4 subsequent siblings)
15 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Add handling of the SBI_EXT_0_1_SET_TIMER function ID to the legacy
extension ecall handler. The handler now programs the vCPU’s virtual
timer via vtimer_set_timer() and returns SBI_SUCCESS.
This enables guests using the legacy SBI timer interface to schedule
timer events correctly.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v2:
- Add Acked-by: Jan Beulich <jbeulich@suse.com>.
---
xen/arch/riscv/vsbi/legacy-extension.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/xen/arch/riscv/vsbi/legacy-extension.c b/xen/arch/riscv/vsbi/legacy-extension.c
index 2e8df191c295..090c23440cea 100644
--- a/xen/arch/riscv/vsbi/legacy-extension.c
+++ b/xen/arch/riscv/vsbi/legacy-extension.c
@@ -7,6 +7,7 @@
#include <asm/processor.h>
#include <asm/vsbi.h>
+#include <asm/vtimer.h>
static void vsbi_print_char(char c)
{
@@ -44,6 +45,11 @@ static int vsbi_legacy_ecall_handler(unsigned long eid, unsigned long fid,
ret = SBI_ERR_NOT_SUPPORTED;
break;
+ case SBI_EXT_0_1_SET_TIMER:
+ vtimer_set_timer(¤t->arch.vtimer, regs->a0);
+ regs->a0 = SBI_SUCCESS;
+ break;
+
default:
/*
* TODO: domain_crash() is acceptable here while things are still under
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (10 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 11/16] xen/riscv: implement SBI legacy SET_TIMER support for guests Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-03 17:02 ` Jan Beulich
2026-02-04 6:50 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
` (3 subsequent siblings)
15 siblings, 2 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introduce a function pointer for sbi_set_timer(), since different OpenSBI
versions may implement the TIME extension with different extension IDs
and/or function IDs.
If the TIME extension is not available, fall back to the legacy timer
mechanism. This is useful when Xen runs as a guest under another Xen,
because the TIME extension is not currently virtualised and therefore
will not appear as available.
The sbi_set_timer() pointer will be used by reprogram_timer() to program
Xen’s physical timer as without SSTC extension there is no any other
option except SBI call to do that as only M-timer is available for us.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Move up defintion of SBI_EXT_TIME_SET_TIMER and use the same padding as
defintions around it.
- Add an extra comment about stime_value granuality above declaration of
sbi_set_timer function pointer.
- Refactor implemetation of sbi_set_timer_v02().
- Provide fallback for sbi_set_timer_v01().
- Update the commit message.
---
xen/arch/riscv/include/asm/sbi.h | 18 ++++++++++++++
xen/arch/riscv/sbi.c | 40 ++++++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
diff --git a/xen/arch/riscv/include/asm/sbi.h b/xen/arch/riscv/include/asm/sbi.h
index 79f7ff5c5501..e0e31d7afa20 100644
--- a/xen/arch/riscv/include/asm/sbi.h
+++ b/xen/arch/riscv/include/asm/sbi.h
@@ -29,6 +29,10 @@
#define SBI_EXT_BASE 0x10
#define SBI_EXT_RFENCE 0x52464E43
+#define SBI_EXT_TIME 0x54494D45
+
+/* SBI function IDs for TIME extension */
+#define SBI_EXT_TIME_SET_TIMER 0x0
/* SBI function IDs for BASE extension */
#define SBI_EXT_BASE_GET_SPEC_VERSION 0x0
@@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
size_t size, unsigned long vmid);
+/*
+ * Programs the clock for next event after stime_value time. This function also
+ * clears the pending timer interrupt bit.
+ * If the supervisor wishes to clear the timer interrupt without scheduling the
+ * next timer event, it can either request a timer interrupt infinitely far
+ * into the future (i.e., (uint64_t)-1), or it can instead mask the timer
+ * interrupt by clearing sie.STIE CSR bit.
+ * The stime_value parameter represents absolute time measured in ticks.
+ *
+ * This SBI call returns 0 upon success or an implementation specific negative
+ * error code.
+ */
+extern int (*sbi_set_timer)(uint64_t stime_value);
+
/*
* Initialize SBI library
*
diff --git a/xen/arch/riscv/sbi.c b/xen/arch/riscv/sbi.c
index 425dce44c679..2c7757c8839f 100644
--- a/xen/arch/riscv/sbi.c
+++ b/xen/arch/riscv/sbi.c
@@ -249,6 +249,38 @@ static int (* __ro_after_init sbi_rfence)(unsigned long fid,
unsigned long arg4,
unsigned long arg5);
+static int cf_check sbi_set_timer_v02(uint64_t stime_value)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value,
+#ifdef CONFIG_RISCV_32
+ stime_value >> 32,
+#else
+ 0,
+#endif
+ 0, 0, 0, 0);
+
+ return sbi_err_map_xen_errno(ret.error);
+}
+
+static int cf_check sbi_set_timer_v01(uint64_t stime_value)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_0_1_SET_TIMER, 0, stime_value,
+#ifdef CONFIG_RISCV_32
+ stime_value >> 32,
+#else
+ 0,
+#endif
+ 0, 0, 0, 0);
+
+ return sbi_err_map_xen_errno(ret.error);
+}
+
+int (* __ro_after_init sbi_set_timer)(uint64_t stime_value);
+
int sbi_remote_sfence_vma(const cpumask_t *cpu_mask, vaddr_t start,
size_t size)
{
@@ -326,6 +358,14 @@ int __init sbi_init(void)
sbi_rfence = sbi_rfence_v02;
printk("SBI v0.2 RFENCE extension detected\n");
}
+
+ if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
+ {
+ sbi_set_timer = sbi_set_timer_v02;
+ printk("SBI v0.2 TIME extension detected\n");
+ }
+ else
+ sbi_set_timer = sbi_set_timer_v01;
}
else
panic("Ooops. SBI spec version 0.1 detected. Need to add support");
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-01-22 16:47 ` [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
@ 2026-02-03 17:02 ` Jan Beulich
2026-02-04 9:29 ` Oleksii Kurochko
2026-02-04 6:50 ` Jan Beulich
1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-03 17:02 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/sbi.c
> +++ b/xen/arch/riscv/sbi.c
> @@ -249,6 +249,38 @@ static int (* __ro_after_init sbi_rfence)(unsigned long fid,
> unsigned long arg4,
> unsigned long arg5);
>
> +static int cf_check sbi_set_timer_v02(uint64_t stime_value)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value,
> +#ifdef CONFIG_RISCV_32
> + stime_value >> 32,
> +#else
> + 0,
> +#endif
> + 0, 0, 0, 0);
> +
> + return sbi_err_map_xen_errno(ret.error);
> +}
> +
> +static int cf_check sbi_set_timer_v01(uint64_t stime_value)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_0_1_SET_TIMER, 0, stime_value,
From this name I'm judging version 0.1 is meant. How does this fit with ...
> @@ -326,6 +358,14 @@ int __init sbi_init(void)
> sbi_rfence = sbi_rfence_v02;
> printk("SBI v0.2 RFENCE extension detected\n");
> }
> +
> + if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
> + {
> + sbi_set_timer = sbi_set_timer_v02;
> + printk("SBI v0.2 TIME extension detected\n");
> + }
> + else
> + sbi_set_timer = sbi_set_timer_v01;
> }
> else
> panic("Ooops. SBI spec version 0.1 detected. Need to add support");
... the panic() here?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-02-03 17:02 ` Jan Beulich
@ 2026-02-04 9:29 ` Oleksii Kurochko
2026-02-04 10:11 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-04 9:29 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/3/26 6:02 PM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/sbi.c
>> +++ b/xen/arch/riscv/sbi.c
>> @@ -249,6 +249,38 @@ static int (* __ro_after_init sbi_rfence)(unsigned long fid,
>> unsigned long arg4,
>> unsigned long arg5);
>>
>> +static int cf_check sbi_set_timer_v02(uint64_t stime_value)
>> +{
>> + struct sbiret ret;
>> +
>> + ret = sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value,
>> +#ifdef CONFIG_RISCV_32
>> + stime_value >> 32,
>> +#else
>> + 0,
>> +#endif
>> + 0, 0, 0, 0);
>> +
>> + return sbi_err_map_xen_errno(ret.error);
>> +}
>> +
>> +static int cf_check sbi_set_timer_v01(uint64_t stime_value)
>> +{
>> + struct sbiret ret;
>> +
>> + ret = sbi_ecall(SBI_EXT_0_1_SET_TIMER, 0, stime_value,
> From this name I'm judging version 0.1 is meant. How does this fit with ...
>
>> @@ -326,6 +358,14 @@ int __init sbi_init(void)
>> sbi_rfence = sbi_rfence_v02;
>> printk("SBI v0.2 RFENCE extension detected\n");
>> }
>> +
>> + if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
>> + {
>> + sbi_set_timer = sbi_set_timer_v02;
>> + printk("SBI v0.2 TIME extension detected\n");
>> + }
>> + else
>> + sbi_set_timer = sbi_set_timer_v01;
>> }
>> else
>> panic("Ooops. SBI spec version 0.1 detected. Need to add support");
> ... the panic() here?
panic() is still here as then we will want to add support for rfence v01 SBI call
too what hasn't been done yet.
Could it be an option to change panic to:
sbi_set_timer = sbi_set_timer_v01;
dprintk("SBI v0.1 isn't fully supported\n");
and then add sbi_rfence = sbi_rfence_v01 when i will be first used?
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-02-04 9:29 ` Oleksii Kurochko
@ 2026-02-04 10:11 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 10:11 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 04.02.2026 10:29, Oleksii Kurochko wrote:
> On 2/3/26 6:02 PM, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/sbi.c
>>> +++ b/xen/arch/riscv/sbi.c
>>> @@ -249,6 +249,38 @@ static int (* __ro_after_init sbi_rfence)(unsigned long fid,
>>> unsigned long arg4,
>>> unsigned long arg5);
>>>
>>> +static int cf_check sbi_set_timer_v02(uint64_t stime_value)
>>> +{
>>> + struct sbiret ret;
>>> +
>>> + ret = sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value,
>>> +#ifdef CONFIG_RISCV_32
>>> + stime_value >> 32,
>>> +#else
>>> + 0,
>>> +#endif
>>> + 0, 0, 0, 0);
>>> +
>>> + return sbi_err_map_xen_errno(ret.error);
>>> +}
>>> +
>>> +static int cf_check sbi_set_timer_v01(uint64_t stime_value)
>>> +{
>>> + struct sbiret ret;
>>> +
>>> + ret = sbi_ecall(SBI_EXT_0_1_SET_TIMER, 0, stime_value,
>> From this name I'm judging version 0.1 is meant. How does this fit with ...
>>
>>> @@ -326,6 +358,14 @@ int __init sbi_init(void)
>>> sbi_rfence = sbi_rfence_v02;
>>> printk("SBI v0.2 RFENCE extension detected\n");
>>> }
>>> +
>>> + if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
>>> + {
>>> + sbi_set_timer = sbi_set_timer_v02;
>>> + printk("SBI v0.2 TIME extension detected\n");
>>> + }
>>> + else
>>> + sbi_set_timer = sbi_set_timer_v01;
>>> }
>>> else
>>> panic("Ooops. SBI spec version 0.1 detected. Need to add support");
>> ... the panic() here?
>
> panic() is still here as then we will want to add support for rfence v01 SBI call
> too what hasn't been done yet.
>
> Could it be an option to change panic to:
> sbi_set_timer = sbi_set_timer_v01;
> dprintk("SBI v0.1 isn't fully supported\n");
> and then add sbi_rfence = sbi_rfence_v01 when i will be first used?
I don't mind keeping the panic(), but what you add here wants to be structured
such that things won't need moving around once the panic() goes away. I.e. you
want to avoid dealing with v0.1 in both the if() and the else. To accommodate
that, perhaps sbi_set_timer_v01 could simply be the initializer of the new
function pointer variable?
Beyond that, once again you want to clarify things in the commit message.
Adding support for a case which elsewhere you panic() on is, well, confusing
without some explanation.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-01-22 16:47 ` [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
2026-02-03 17:02 ` Jan Beulich
@ 2026-02-04 6:50 ` Jan Beulich
2026-02-04 9:36 ` Oleksii Kurochko
1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 6:50 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/sbi.h
> +++ b/xen/arch/riscv/include/asm/sbi.h
> @@ -29,6 +29,10 @@
>
> #define SBI_EXT_BASE 0x10
> #define SBI_EXT_RFENCE 0x52464E43
> +#define SBI_EXT_TIME 0x54494D45
> +
> +/* SBI function IDs for TIME extension */
> +#define SBI_EXT_TIME_SET_TIMER 0x0
Nit: Do you really mean to have the time extension IDs above ...
> /* SBI function IDs for BASE extension */
> #define SBI_EXT_BASE_GET_SPEC_VERSION 0x0
... the base extension ones?
> @@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
> int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
> size_t size, unsigned long vmid);
>
> +/*
> + * Programs the clock for next event after stime_value time. This function also
> + * clears the pending timer interrupt bit.
> + * If the supervisor wishes to clear the timer interrupt without scheduling the
> + * next timer event, it can either request a timer interrupt infinitely far
> + * into the future (i.e., (uint64_t)-1), or it can instead mask the timer
> + * interrupt by clearing sie.STIE CSR bit.
> + * The stime_value parameter represents absolute time measured in ticks.
> + *
> + * This SBI call returns 0 upon success or an implementation specific negative
> + * error code.
> + */
> +extern int (*sbi_set_timer)(uint64_t stime_value);
__read_mostly or even __ro_after_init?
> @@ -326,6 +358,14 @@ int __init sbi_init(void)
> sbi_rfence = sbi_rfence_v02;
> printk("SBI v0.2 RFENCE extension detected\n");
> }
> +
> + if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
> + {
> + sbi_set_timer = sbi_set_timer_v02;
> + printk("SBI v0.2 TIME extension detected\n");
Is this really relevant to log especially in release builds? IOW can this at
least be downgraded to dprintk()?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer()
2026-02-04 6:50 ` Jan Beulich
@ 2026-02-04 9:36 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-04 9:36 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/4/26 7:50 AM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/include/asm/sbi.h
>> +++ b/xen/arch/riscv/include/asm/sbi.h
>> @@ -29,6 +29,10 @@
>>
>> #define SBI_EXT_BASE 0x10
>> #define SBI_EXT_RFENCE 0x52464E43
>> +#define SBI_EXT_TIME 0x54494D45
>> +
>> +/* SBI function IDs for TIME extension */
>> +#define SBI_EXT_TIME_SET_TIMER 0x0
> Nit: Do you really mean to have the time extension IDs above ...
>
>> /* SBI function IDs for BASE extension */
>> #define SBI_EXT_BASE_GET_SPEC_VERSION 0x0
> ... the base extension ones?
I will move it after SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID to be aligned with how
extensions are declared.
>
>> @@ -134,6 +138,20 @@ int sbi_remote_hfence_gvma(const cpumask_t *cpu_mask, vaddr_t start,
>> int sbi_remote_hfence_gvma_vmid(const cpumask_t *cpu_mask, vaddr_t start,
>> size_t size, unsigned long vmid);
>>
>> +/*
>> + * Programs the clock for next event after stime_value time. This function also
>> + * clears the pending timer interrupt bit.
>> + * If the supervisor wishes to clear the timer interrupt without scheduling the
>> + * next timer event, it can either request a timer interrupt infinitely far
>> + * into the future (i.e., (uint64_t)-1), or it can instead mask the timer
>> + * interrupt by clearing sie.STIE CSR bit.
>> + * The stime_value parameter represents absolute time measured in ticks.
>> + *
>> + * This SBI call returns 0 upon success or an implementation specific negative
>> + * error code.
>> + */
>> +extern int (*sbi_set_timer)(uint64_t stime_value);
> __read_mostly or even __ro_after_init?
I will add __ro_after_init to be in sync with sbi_rfence.
>
>> @@ -326,6 +358,14 @@ int __init sbi_init(void)
>> sbi_rfence = sbi_rfence_v02;
>> printk("SBI v0.2 RFENCE extension detected\n");
>> }
>> +
>> + if ( sbi_probe_extension(SBI_EXT_TIME) > 0 )
>> + {
>> + sbi_set_timer = sbi_set_timer_v02;
>> + printk("SBI v0.2 TIME extension detected\n");
> Is this really relevant to log especially in release builds? IOW can this at
> least be downgraded to dprintk()?
Probably not, it could be useful for debugging to understand what kind and version
of extension is used. I am okay with using of dprintk().
Thanks.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (11 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 12/16] xen/riscv: introduce sbi_set_timer() Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-24 23:13 ` Teddy Astie
2026-02-04 10:39 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
` (2 subsequent siblings)
15 siblings, 2 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Implement reprogram_timer() on RISC-V using the standard SBI timer call.
The privileged architecture only defines machine-mode timer interrupts
(using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
is optional and is not supported on the boards available to me, so the
only viable approach today is to program the timer through SBI.
reprogram_timer() enables/disables the supervisor timer interrupt and
programs the next timer deadline using sbi_set_timer(). If the SBI call
fails, the code panics, because sbi_set_timer() is expected to return
either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
the latest ones). The SBI spec does not define a standard negative error
code for this call, and without SSTC there is no alternative method to
program the timer, so the SBI timer call must be available.
reprogram_timer() currently returns int for compatibility with the
existing prototype. While it might be cleaner to return bool, keeping the
existing signature avoids premature changes in case sbi_set_timer() ever
needs to return other values (based on which we could try to avoid
panic-ing) in the future.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Add TODO comment above sbi_set_timer() call.
- Update the commit message.
---
xen/arch/riscv/stubs.c | 5 -----
xen/arch/riscv/time.c | 43 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 1f0add97b361..cb7546558b8e 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -21,11 +21,6 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
/* time.c */
-int reprogram_timer(s_time_t timeout)
-{
- BUG_ON("unimplemented");
-}
-
void send_timer_event(struct vcpu *v)
{
BUG_ON("unimplemented");
diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
index 2c7af0a5d63b..f021ceab8ec4 100644
--- a/xen/arch/riscv/time.c
+++ b/xen/arch/riscv/time.c
@@ -7,6 +7,9 @@
#include <xen/time.h>
#include <xen/types.h>
+#include <asm/csr.h>
+#include <asm/sbi.h>
+
unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
uint64_t __ro_after_init boot_clock_cycles;
@@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
cpu_khz = rate / 1000;
}
+int reprogram_timer(s_time_t timeout)
+{
+ uint64_t deadline, now;
+ int rc;
+
+ if ( timeout == 0 )
+ {
+ /* Disable timers */
+ csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+ return 1;
+ }
+
+ deadline = ns_to_ticks(timeout) + boot_clock_cycles;
+ now = get_cycles();
+ if ( deadline <= now )
+ return 0;
+
+ /* Enable timer */
+ csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+ /*
+ * TODO: When the SSTC extension is supported, it would be preferable to
+ * use the supervisor timer registers directly here for better
+ * performance, since an SBI call and context switch would no longer
+ * be required.
+ *
+ * This would also reduce reliance on a specific SBI implementation.
+ * For example, it is not ideal to panic() if sbi_set_timer() returns
+ * a non-zero value. Currently it can return 0 or -ENOSUPP, and
+ * without SSTC we still need an implementation because only the
+ * M-mode timer is available, and it can only be programmed in
+ * M-mode.
+ */
+ if ( (rc = sbi_set_timer(deadline)) )
+ panic("%s: timer wasn't set because: %d\n", __func__, rc);
+
+ return 1;
+}
+
void __init preinit_xen_time(void)
{
if ( acpi_disabled )
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI
2026-01-22 16:47 ` [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
@ 2026-01-24 23:13 ` Teddy Astie
2026-01-26 10:20 ` Oleksii Kurochko
2026-02-04 10:39 ` Jan Beulich
1 sibling, 1 reply; 65+ messages in thread
From: Teddy Astie @ 2026-01-24 23:13 UTC (permalink / raw)
To: Oleksii Kurochko, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
> Implement reprogram_timer() on RISC-V using the standard SBI timer call.
>
> The privileged architecture only defines machine-mode timer interrupts
> (using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
> be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
> is optional and is not supported on the boards available to me, so the
> only viable approach today is to program the timer through SBI.
>
> reprogram_timer() enables/disables the supervisor timer interrupt and
> programs the next timer deadline using sbi_set_timer(). If the SBI call
> fails, the code panics, because sbi_set_timer() is expected to return
> either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
> the latest ones). The SBI spec does not define a standard negative error
> code for this call, and without SSTC there is no alternative method to
> program the timer, so the SBI timer call must be available.
>
> reprogram_timer() currently returns int for compatibility with the
> existing prototype. While it might be cleaner to return bool, keeping the
> existing signature avoids premature changes in case sbi_set_timer() ever
> needs to return other values (based on which we could try to avoid
> panic-ing) in the future.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v2:
> - Add TODO comment above sbi_set_timer() call.
> - Update the commit message.
> ---
> xen/arch/riscv/stubs.c | 5 -----
> xen/arch/riscv/time.c | 43 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 43 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
> index 1f0add97b361..cb7546558b8e 100644
> --- a/xen/arch/riscv/stubs.c
> +++ b/xen/arch/riscv/stubs.c
> @@ -21,11 +21,6 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>
> /* time.c */
>
> -int reprogram_timer(s_time_t timeout)
> -{
> - BUG_ON("unimplemented");
> -}
> -
> void send_timer_event(struct vcpu *v)
> {
> BUG_ON("unimplemented");
> diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
> index 2c7af0a5d63b..f021ceab8ec4 100644
> --- a/xen/arch/riscv/time.c
> +++ b/xen/arch/riscv/time.c
> @@ -7,6 +7,9 @@
> #include <xen/time.h>
> #include <xen/types.h>
>
> +#include <asm/csr.h>
> +#include <asm/sbi.h>
> +
> unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
> uint64_t __ro_after_init boot_clock_cycles;
>
> @@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
> cpu_khz = rate / 1000;
> }
>
> +int reprogram_timer(s_time_t timeout)
> +{
> + uint64_t deadline, now;
> + int rc;
> +
> + if ( timeout == 0 )
> + {
> + /* Disable timers */
> + csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
> +
> + return 1;
> + }
Do disabling the timers interrupt actually stops the timer or just
prevents Xen from receiving the timer interrupt ?
If it doesn't "stop the timer", we probably would want to swap "enabling
the timer interrupt" and "setting the timer through SBI" (to avoid
potentially receiving a timer interrupt between these 2 operations).
Though, it's unclear in SBI specification if the sbi_set_timer touches
the timer interrupt masking or not (at least it does if you set a timer
too far in the future).
> +
> + deadline = ns_to_ticks(timeout) + boot_clock_cycles;
> + now = get_cycles();
> + if ( deadline <= now )
> + return 0;
> +
> + /* Enable timer */
> + csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
> +
> + /*
> + * TODO: When the SSTC extension is supported, it would be preferable to
> + * use the supervisor timer registers directly here for better
> + * performance, since an SBI call and context switch would no longer
> + * be required.
> + *
> + * This would also reduce reliance on a specific SBI implementation.
> + * For example, it is not ideal to panic() if sbi_set_timer() returns
> + * a non-zero value. Currently it can return 0 or -ENOSUPP, and
> + * without SSTC we still need an implementation because only the
> + * M-mode timer is available, and it can only be programmed in
> + * M-mode.
> + */
> + if ( (rc = sbi_set_timer(deadline)) )
> + panic("%s: timer wasn't set because: %d\n", __func__, rc);
> +> + return 1;
> +}
> +
> void __init preinit_xen_time(void)
> {
> if ( acpi_disabled )
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI
2026-01-24 23:13 ` Teddy Astie
@ 2026-01-26 10:20 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-26 10:20 UTC (permalink / raw)
To: Teddy Astie, xen-devel
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey
On 1/25/26 12:13 AM, Teddy Astie wrote:
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> Implement reprogram_timer() on RISC-V using the standard SBI timer call.
>>
>> The privileged architecture only defines machine-mode timer interrupts
>> (using mtime/mtimecmp). Therefore, timer services for S/HS/VS mode must
>> be provided by M-mode via SBI calls. SSTC (Supervisor-mode Timer Control)
>> is optional and is not supported on the boards available to me, so the
>> only viable approach today is to program the timer through SBI.
>>
>> reprogram_timer() enables/disables the supervisor timer interrupt and
>> programs the next timer deadline using sbi_set_timer(). If the SBI call
>> fails, the code panics, because sbi_set_timer() is expected to return
>> either 0 or -ENOSUPP (this has been stable from early OpenSBI versions to
>> the latest ones). The SBI spec does not define a standard negative error
>> code for this call, and without SSTC there is no alternative method to
>> program the timer, so the SBI timer call must be available.
>>
>> reprogram_timer() currently returns int for compatibility with the
>> existing prototype. While it might be cleaner to return bool, keeping the
>> existing signature avoids premature changes in case sbi_set_timer() ever
>> needs to return other values (based on which we could try to avoid
>> panic-ing) in the future.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in v2:
>> - Add TODO comment above sbi_set_timer() call.
>> - Update the commit message.
>> ---
>> xen/arch/riscv/stubs.c | 5 -----
>> xen/arch/riscv/time.c | 43 ++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 43 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
>> index 1f0add97b361..cb7546558b8e 100644
>> --- a/xen/arch/riscv/stubs.c
>> +++ b/xen/arch/riscv/stubs.c
>> @@ -21,11 +21,6 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>>
>> /* time.c */
>>
>> -int reprogram_timer(s_time_t timeout)
>> -{
>> - BUG_ON("unimplemented");
>> -}
>> -
>> void send_timer_event(struct vcpu *v)
>> {
>> BUG_ON("unimplemented");
>> diff --git a/xen/arch/riscv/time.c b/xen/arch/riscv/time.c
>> index 2c7af0a5d63b..f021ceab8ec4 100644
>> --- a/xen/arch/riscv/time.c
>> +++ b/xen/arch/riscv/time.c
>> @@ -7,6 +7,9 @@
>> #include <xen/time.h>
>> #include <xen/types.h>
>>
>> +#include <asm/csr.h>
>> +#include <asm/sbi.h>
>> +
>> unsigned long __ro_after_init cpu_khz; /* CPU clock frequency in kHz. */
>> uint64_t __ro_after_init boot_clock_cycles;
>>
>> @@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
>> cpu_khz = rate / 1000;
>> }
>>
>> +int reprogram_timer(s_time_t timeout)
>> +{
>> + uint64_t deadline, now;
>> + int rc;
>> +
>> + if ( timeout == 0 )
>> + {
>> + /* Disable timers */
>> + csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
>> +
>> + return 1;
>> + }
> Do disabling the timers interrupt actually stops the timer or just
> prevents Xen from receiving the timer interrupt ?
According to OpenSBI spec:
If the supervisor wishes to clear the timer interrupt without scheduling
the next timer event, it can either request a timer interrupt infinitely
far into the future (i.e., (uint64_t)-1), or it can instead mask the timer
interrupt by clearing sie.STIE CSR bit.
It seems like it never stops it even if to use OpenSBI call with -1 arguemnt.
>
> If it doesn't "stop the timer", we probably would want to swap "enabling
> the timer interrupt" and "setting the timer through SBI" (to avoid
> potentially receiving a timer interrupt between these 2 operations).
>
> Though, it's unclear in SBI specification if the sbi_set_timer touches
> the timer interrupt masking or not (at least it does if you set a timer
> too far in the future).
Considering how it is implemented:
void sbi_timer_event_start(u64 next_event)
{
...
} else if (timer_dev && timer_dev->timer_event_start) {
timer_dev->timer_event_start(next_event);
csr_clear(CSR_MIP, MIP_STIP);
}
csr_set(CSR_MIE, MIP_MTIP);
}
It will clear timer pending bit even it was set before. So move this:
+ /* Enable timer */
+ csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
after sbi_set_timer() should work.
As an option we could just use sbi_set_timer(UINT64_MAX) instead of
" csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));" to "stop" a timer.
But first one option looks better for me.
~ Oleksii
>
>> +
>> + deadline = ns_to_ticks(timeout) + boot_clock_cycles;
>> + now = get_cycles();
>> + if ( deadline <= now )
>> + return 0;
>> +
>> + /* Enable timer */
>> + csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
>> +
>> + /*
>> + * TODO: When the SSTC extension is supported, it would be preferable to
>> + * use the supervisor timer registers directly here for better
>> + * performance, since an SBI call and context switch would no longer
>> + * be required.
>> + *
>> + * This would also reduce reliance on a specific SBI implementation.
>> + * For example, it is not ideal to panic() if sbi_set_timer() returns
>> + * a non-zero value. Currently it can return 0 or -ENOSUPP, and
>> + * without SSTC we still need an implementation because only the
>> + * M-mode timer is available, and it can only be programmed in
>> + * M-mode.
>> + */
>> + if ( (rc = sbi_set_timer(deadline)) )
>> + panic("%s: timer wasn't set because: %d\n", __func__, rc);
>> +> + return 1;
>> +}
>> +
>> void __init preinit_xen_time(void)
>> {
>> if ( acpi_disabled )
>
>
> --
> Teddy Astie | Vates XCP-ng Developer
>
> XCP-ng & Xen Orchestra - Vates solutions
>
> web: https://vates.tech
>
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI
2026-01-22 16:47 ` [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
2026-01-24 23:13 ` Teddy Astie
@ 2026-02-04 10:39 ` Jan Beulich
2026-02-05 16:25 ` Oleksii Kurochko
1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 10:39 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> @@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
> cpu_khz = rate / 1000;
> }
>
> +int reprogram_timer(s_time_t timeout)
> +{
> + uint64_t deadline, now;
> + int rc;
> +
> + if ( timeout == 0 )
> + {
> + /* Disable timers */
> + csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
For here and below: Is it guaranteed that the SIE bit is writable? The privileged
spec looks to have provisions for the case that it isn't (together with the
corresponding SIP bit).
As to the comment - why plural here, when ...
> + return 1;
> + }
> +
> + deadline = ns_to_ticks(timeout) + boot_clock_cycles;
> + now = get_cycles();
> + if ( deadline <= now )
> + return 0;
> +
> + /* Enable timer */
> + csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
... it's singular here? Also in both cases, isn't it the timer interrupt you
enable, not the timer itself?
> + /*
> + * TODO: When the SSTC extension is supported, it would be preferable to
> + * use the supervisor timer registers directly here for better
> + * performance, since an SBI call and context switch would no longer
> + * be required.
I think you mean a mode switch here, not a context one?
Jan
> + * This would also reduce reliance on a specific SBI implementation.
> + * For example, it is not ideal to panic() if sbi_set_timer() returns
> + * a non-zero value. Currently it can return 0 or -ENOSUPP, and
> + * without SSTC we still need an implementation because only the
> + * M-mode timer is available, and it can only be programmed in
> + * M-mode.
> + */
> + if ( (rc = sbi_set_timer(deadline)) )
> + panic("%s: timer wasn't set because: %d\n", __func__, rc);
> +
> + return 1;
> +}
> +
> void __init preinit_xen_time(void)
> {
> if ( acpi_disabled )
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI
2026-02-04 10:39 ` Jan Beulich
@ 2026-02-05 16:25 ` Oleksii Kurochko
0 siblings, 0 replies; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-05 16:25 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/4/26 11:39 AM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> @@ -40,6 +43,46 @@ static void __init preinit_dt_xen_time(void)
>> cpu_khz = rate / 1000;
>> }
>>
>> +int reprogram_timer(s_time_t timeout)
>> +{
>> + uint64_t deadline, now;
>> + int rc;
>> +
>> + if ( timeout == 0 )
>> + {
>> + /* Disable timers */
>> + csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
> For here and below: Is it guaranteed that the SIE bit is writable? The privileged
> spec looks to have provisions for the case that it isn't (together with the
> corresponding SIP bit).
My understanding is that yes if S-mode is present.
>
> As to the comment - why plural here, when ...
>
>> + return 1;
>> + }
>> +
>> + deadline = ns_to_ticks(timeout) + boot_clock_cycles;
>> + now = get_cycles();
>> + if ( deadline <= now )
>> + return 0;
>> +
>> + /* Enable timer */
>> + csr_set(CSR_SIE, BIT(IRQ_S_TIMER, UL));
> ... it's singular here? Also in both cases, isn't it the timer interrupt you
> enable, not the timer itself?
It is timer interrupt. I will correct the comments.
>
>> + /*
>> + * TODO: When the SSTC extension is supported, it would be preferable to
>> + * use the supervisor timer registers directly here for better
>> + * performance, since an SBI call and context switch would no longer
>> + * be required.
> I think you mean a mode switch here, not a context one?
Right, mode switch is better.
~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 14/16] xen/riscv: handle hypervisor timer interrupts
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (12 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 13/16] xen/riscv: implement reprogram_timer() via SBI Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-04 10:42 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
2026-01-22 16:47 ` [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
Introduce timer_interrupt() to process IRQ_S_TIMER interrupts.
The handler disables further timer interrupts by clearing
SIE.STIE and raises TIMER_SOFTIRQ so the generic timer subsystem
can perform its processing.
Update do_trap() to dispatch IRQ_S_TIMER to this new handler.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Drop cause argument of timer_interrupt() as it isn't used inside
the function and anyway it is pretty clear what is the cause inside
timer_interrupt().
---
xen/arch/riscv/traps.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index e9c967786312..53f96f143823 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -10,6 +10,7 @@
#include <xen/lib.h>
#include <xen/nospec.h>
#include <xen/sched.h>
+#include <xen/softirq.h>
#include <asm/intc.h>
#include <asm/processor.h>
@@ -108,6 +109,15 @@ static void do_unexpected_trap(const struct cpu_user_regs *regs)
die();
}
+static void timer_interrupt(void)
+{
+ /* Disable the timer to avoid more interrupts */
+ csr_clear(CSR_SIE, BIT(IRQ_S_TIMER, UL));
+
+ /* Signal the generic timer code to do its work */
+ raise_softirq(TIMER_SOFTIRQ);
+}
+
void do_trap(struct cpu_user_regs *cpu_regs)
{
register_t pc = cpu_regs->sepc;
@@ -148,6 +158,10 @@ void do_trap(struct cpu_user_regs *cpu_regs)
intc_handle_external_irqs(cpu_regs);
break;
+ case IRQ_S_TIMER:
+ timer_interrupt();
+ break;
+
default:
break;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 14/16] xen/riscv: handle hypervisor timer interrupts
2026-01-22 16:47 ` [PATCH v2 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
@ 2026-02-04 10:42 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 10:42 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> Introduce timer_interrupt() to process IRQ_S_TIMER interrupts.
> The handler disables further timer interrupts by clearing
> SIE.STIE and raises TIMER_SOFTIRQ so the generic timer subsystem
> can perform its processing.
>
> Update do_trap() to dispatch IRQ_S_TIMER to this new handler.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 15/16] xen/riscv: init tasklet subsystem
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (13 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 14/16] xen/riscv: handle hypervisor timer interrupts Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-01-29 16:33 ` Jan Beulich
2026-01-22 16:47 ` [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
As the tasklet subsystem is now initialized, it is necessary to implement
sync_local_execstate(), since it is invoked when something calls
tasklet_softirq_action(), which is registered in tasklet_subsys_init().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Update the commit message.
- Move implementation of sync_vcpu_execstate() to separate commit
as it doesn't connect to tasklet subsystem.
---
xen/arch/riscv/domain.c | 5 +++++
xen/arch/riscv/setup.c | 3 +++
xen/arch/riscv/stubs.c | 5 -----
3 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 13ac384c4b76..1458902aff82 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -254,3 +254,8 @@ void vcpu_kick(struct vcpu *v)
smp_send_event_check_mask(cpumask_of(v->processor));
}
}
+
+void sync_local_execstate(void)
+{
+ /* Nothing to do -- no lazy switching */
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 9b4835960d20..e8dbd55ce79e 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -12,6 +12,7 @@
#include <xen/serial.h>
#include <xen/shutdown.h>
#include <xen/smp.h>
+#include <xen/tasklet.h>
#include <xen/timer.h>
#include <xen/vmap.h>
#include <xen/xvmalloc.h>
@@ -133,6 +134,8 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
panic("Booting using ACPI isn't supported\n");
}
+ tasklet_subsys_init();
+
init_IRQ();
riscv_fill_hwcap();
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index cb7546558b8e..c912d46f1e42 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -91,11 +91,6 @@ void continue_running(struct vcpu *same)
BUG_ON("unimplemented");
}
-void sync_local_execstate(void)
-{
- BUG_ON("unimplemented");
-}
-
void sync_vcpu_execstate(struct vcpu *v)
{
BUG_ON("unimplemented");
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 15/16] xen/riscv: init tasklet subsystem
2026-01-22 16:47 ` [PATCH v2 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
@ 2026-01-29 16:33 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-01-29 16:33 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> As the tasklet subsystem is now initialized, it is necessary to implement
> sync_local_execstate(), since it is invoked when something calls
> tasklet_softirq_action(), which is registered in tasklet_subsys_init().
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate()
2026-01-22 16:47 [PATCH v2 00/16] xen/riscv: introduce vtimer related things Oleksii Kurochko
` (14 preceding siblings ...)
2026-01-22 16:47 ` [PATCH v2 15/16] xen/riscv: init tasklet subsystem Oleksii Kurochko
@ 2026-01-22 16:47 ` Oleksii Kurochko
2026-02-04 10:49 ` Jan Beulich
15 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-01-22 16:47 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Connor Davis, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Romain Caritey
The scheduler may call this function to force synchronization of given
vCPU's state. Although RISC-V does not support lazy context switching,
a full memory barrier is still required to order observation of the
saved context correctly.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- New patch.
---
xen/arch/riscv/domain.c | 18 ++++++++++++++++++
xen/arch/riscv/stubs.c | 5 -----
2 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 1458902aff82..48ba7584acaa 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -259,3 +259,21 @@ void sync_local_execstate(void)
{
/* Nothing to do -- no lazy switching */
}
+
+void sync_vcpu_execstate(struct vcpu *v)
+{
+ /*
+ * We don't support lazy switching.
+ *
+ * However the context may have been saved from a remote pCPU so we
+ * need a barrier to ensure it is observed before continuing.
+ *
+ * Per vcpu_context_saved(), the context can be observed when
+ * v->is_running is false (the caller should check it before calling
+ * this function).
+ *
+ * Note this is a full barrier to also prevent update of the context
+ * to happen before it was observed.
+ */
+ smp_mb();
+}
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index c912d46f1e42..26434166acc6 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -91,11 +91,6 @@ void continue_running(struct vcpu *same)
BUG_ON("unimplemented");
}
-void sync_vcpu_execstate(struct vcpu *v)
-{
- BUG_ON("unimplemented");
-}
-
void startup_cpu_idle_loop(void)
{
BUG_ON("unimplemented");
--
2.52.0
^ permalink raw reply related [flat|nested] 65+ messages in thread* Re: [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate()
2026-01-22 16:47 ` [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate() Oleksii Kurochko
@ 2026-02-04 10:49 ` Jan Beulich
2026-02-05 16:51 ` Oleksii Kurochko
0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2026-02-04 10:49 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 22.01.2026 17:47, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -259,3 +259,21 @@ void sync_local_execstate(void)
> {
> /* Nothing to do -- no lazy switching */
> }
> +
> +void sync_vcpu_execstate(struct vcpu *v)
> +{
> + /*
> + * We don't support lazy switching.
> + *
> + * However the context may have been saved from a remote pCPU so we
> + * need a barrier to ensure it is observed before continuing.
> + *
> + * Per vcpu_context_saved(), the context can be observed when
> + * v->is_running is false (the caller should check it before calling
> + * this function).
> + *
> + * Note this is a full barrier to also prevent update of the context
> + * to happen before it was observed.
> + */
> + smp_mb();
> +}
Where's the counterpart of this barrier (going to be)?
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate()
2026-02-04 10:49 ` Jan Beulich
@ 2026-02-05 16:51 ` Oleksii Kurochko
2026-02-05 16:53 ` Jan Beulich
0 siblings, 1 reply; 65+ messages in thread
From: Oleksii Kurochko @ 2026-02-05 16:51 UTC (permalink / raw)
To: Jan Beulich
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 2/4/26 11:49 AM, Jan Beulich wrote:
> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -259,3 +259,21 @@ void sync_local_execstate(void)
>> {
>> /* Nothing to do -- no lazy switching */
>> }
>> +
>> +void sync_vcpu_execstate(struct vcpu *v)
>> +{
>> + /*
>> + * We don't support lazy switching.
>> + *
>> + * However the context may have been saved from a remote pCPU so we
>> + * need a barrier to ensure it is observed before continuing.
>> + *
>> + * Per vcpu_context_saved(), the context can be observed when
>> + * v->is_running is false (the caller should check it before calling
>> + * this function).
>> + *
>> + * Note this is a full barrier to also prevent update of the context
>> + * to happen before it was observed.
>> + */
>> + smp_mb();
>> +}
> Where's the counterpart of this barrier (going to be)?
As it is mentioned in the comment it is invcpu_context_saved(). ~ Oleksii
^ permalink raw reply [flat|nested] 65+ messages in thread* Re: [PATCH v2 16/16] xen/riscv: implement sync_vcpu_execstate()
2026-02-05 16:51 ` Oleksii Kurochko
@ 2026-02-05 16:53 ` Jan Beulich
0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2026-02-05 16:53 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: Alistair Francis, Connor Davis, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Romain Caritey, xen-devel
On 05.02.2026 17:51, Oleksii Kurochko wrote:
>
> On 2/4/26 11:49 AM, Jan Beulich wrote:
>> On 22.01.2026 17:47, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/domain.c
>>> +++ b/xen/arch/riscv/domain.c
>>> @@ -259,3 +259,21 @@ void sync_local_execstate(void)
>>> {
>>> /* Nothing to do -- no lazy switching */
>>> }
>>> +
>>> +void sync_vcpu_execstate(struct vcpu *v)
>>> +{
>>> + /*
>>> + * We don't support lazy switching.
>>> + *
>>> + * However the context may have been saved from a remote pCPU so we
>>> + * need a barrier to ensure it is observed before continuing.
>>> + *
>>> + * Per vcpu_context_saved(), the context can be observed when
>>> + * v->is_running is false (the caller should check it before calling
>>> + * this function).
>>> + *
>>> + * Note this is a full barrier to also prevent update of the context
>>> + * to happen before it was observed.
>>> + */
>>> + smp_mb();
>>> +}
>> Where's the counterpart of this barrier (going to be)?
>
> As it is mentioned in the comment it is invcpu_context_saved(). ~ Oleksii
You may have seen the Arm side changes to this, as I did Cc you. From that
I think you should understand the background of the question.
Jan
^ permalink raw reply [flat|nested] 65+ messages in thread