* [PATCH 0/8] accel, cpus: clean up cpu->exit_request
@ 2025-08-08 18:58 Paolo Bonzini
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
` (7 more replies)
0 siblings, 8 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:58 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
cpu->exit_request is used to kick vCPU threads into qemu_wait_io_event().
The code that handles the signaling of cpu->exit_request is messy, mostly
due to no one ever taking a look at it as a whole. In fact already in commit
4b8523ee896 ("kvm: First step to push iothread lock out of inner run loop",
2015-07-01), the read of cpu->exit_request was placed outside the BQL
critical section without much attention to ordering; and it only got
worse from that point, in no small part due to a young and naive me.
This series is complementary to the cpu->interrupt_request cleanups
that Igor did in "memory: reintroduce BQL-free fine-grained PIO/MMIO"
(https://lore.kernel.org/qemu-devel/20250808120137.2208800-1-imammedo@redhat.com/T/),
and is organized like this:
- patches 1-3 clean up accesses to the variable to properly use
qatomic_* accessors
- patches 4-6 makes it possible to use cpu_exit() for all accelerators
- patch 7 removes from user-mode emulation a hack that is only needed
by system emulation
- patch 8 cleans up the way in which exit_request is cleared.
Paolo
Paolo Bonzini (8):
accel: use store_release/load_acquire for cross-thread exit_request
accel/hvf: check exit_request before running the vCPU
accel: use atomic accesses for exit_request
accel/tcg: introduce tcg_kick_vcpu_thread
cpus: remove TCG-ism from cpu_exit()
cpus: properly kick CPUs out of inner execution loop
tcg/user: do not set exit_request gratuitously
accel: make all calls to qemu_wait_io_event look the same
docs/devel/tcg-icount.rst | 2 +-
accel/tcg/tcg-accel-ops-mttcg.h | 3 ---
include/exec/cpu-common.h | 1 +
accel/dummy-cpus.c | 2 +-
accel/hvf/hvf-accel-ops.c | 2 +-
accel/kvm/kvm-accel-ops.c | 3 ++-
accel/kvm/kvm-all.c | 21 +++++++---------
accel/tcg/cpu-exec.c | 27 +++++++++++++++++---
accel/tcg/tcg-accel-ops-mttcg.c | 12 ++-------
accel/tcg/tcg-accel-ops-rr.c | 42 ++++++++++++++++---------------
accel/tcg/tcg-accel-ops.c | 4 +--
bsd-user/main.c | 2 +-
cpu-common.c | 3 ++-
hw/core/cpu-common.c | 7 +++---
hw/ppc/ppc.c | 2 ++
hw/ppc/spapr_hcall.c | 7 +++---
hw/ppc/spapr_rtas.c | 2 +-
linux-user/main.c | 2 +-
replay/replay-events.c | 3 ++-
system/cpu-timers.c | 6 ++---
system/cpus.c | 6 +++--
target/arm/hvf/hvf.c | 3 +++
target/arm/tcg/mte_helper.c | 2 +-
target/i386/hvf/hvf.c | 3 +++
target/i386/kvm/hyperv.c | 1 -
target/i386/kvm/kvm.c | 2 +-
target/i386/nvmm/nvmm-accel-ops.c | 8 +++---
target/i386/nvmm/nvmm-all.c | 9 +++----
target/i386/whpx/whpx-accel-ops.c | 6 ++---
target/i386/whpx/whpx-all.c | 11 ++++----
30 files changed, 106 insertions(+), 98 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
@ 2025-08-08 18:58 ` Paolo Bonzini
2025-08-09 22:41 ` Richard Henderson
2025-08-11 19:31 ` Peter Xu
2025-08-08 18:58 ` [PATCH 2/8] accel/hvf: check exit_request before running the vCPU Paolo Bonzini
` (6 subsequent siblings)
7 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:58 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
cpu->exit_request do not use a load-acquire/store-release pair right now,
but this means that cpu_exit() does not store it after any flags that are
read in the slow path.
Probably everything is protected one way or the other by the BQL, because
after reading cpu->exit_request the CPU thread often goes to sleep
(by waiting on the BQL-protected cpu->halt_cond), but it's not clear.
Use load-acquire/store-release consistently.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/kvm/kvm-all.c | 19 +++++++++----------
accel/tcg/cpu-exec.c | 7 +++++--
accel/tcg/tcg-accel-ops-rr.c | 2 +-
hw/core/cpu-common.c | 3 ++-
target/i386/nvmm/nvmm-all.c | 5 ++---
target/i386/whpx/whpx-all.c | 3 ++-
6 files changed, 21 insertions(+), 18 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 890d5ea9f86..57e35960125 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3029,10 +3029,6 @@ static void kvm_eat_signals(CPUState *cpu)
if (kvm_immediate_exit) {
qatomic_set(&cpu->kvm_run->immediate_exit, 0);
- /* Write kvm_run->immediate_exit before the cpu->exit_request
- * write in kvm_cpu_exec.
- */
- smp_wmb();
return;
}
@@ -3187,7 +3183,8 @@ int kvm_cpu_exec(CPUState *cpu)
}
kvm_arch_pre_run(cpu, run);
- if (qatomic_read(&cpu->exit_request)) {
+ /* Corresponding store-release is in cpu_exit. */
+ if (qatomic_load_acquire(&cpu->exit_request)) {
trace_kvm_interrupt_exit_request();
/*
* KVM requires us to reenter the kernel after IO exits to complete
@@ -3197,13 +3194,15 @@ int kvm_cpu_exec(CPUState *cpu)
kvm_cpu_kick_self();
}
- /* Read cpu->exit_request before KVM_RUN reads run->immediate_exit.
- * Matching barrier in kvm_eat_signals.
- */
- smp_rmb();
-
run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
+ /*
+ * After writing cpu->exit_request, cpu_exit() sends a signal that writes
+ * kvm->run->immediate_exit. The signal is always happening after the
+ * write to cpu->exit_request so, if KVM read kvm->run->immediate_exit
+ * as true, cpu->exit_request will always read as true.
+ */
+
attrs = kvm_arch_post_run(cpu, run);
#ifdef KVM_HAVE_MCE_INJECTION
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 82867f456c1..4bd9ee01c19 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -849,8 +849,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
bql_unlock();
}
- /* Finally, check if we need to exit to the main loop. */
- if (unlikely(qatomic_read(&cpu->exit_request)) || icount_exit_request(cpu)) {
+ /*
+ * Finally, check if we need to exit to the main loop.
+ * The corresponding store-release is in cpu_exit.
+ */
+ if (unlikely(qatomic_load_acquire(&cpu->exit_request)) || icount_exit_request(cpu)) {
qatomic_set(&cpu->exit_request, 0);
if (cpu->exception_index == -1) {
cpu->exception_index = EXCP_INTERRUPT;
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 6eec5c9eee9..1e551e92d6d 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -242,7 +242,7 @@ static void *rr_cpu_thread_fn(void *arg)
cpu = first_cpu;
}
- while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) {
+ while (cpu && cpu_work_list_empty(cpu) && !qatomic_load_acquire(&cpu->exit_request)) {
/* Store rr_current_cpu before evaluating cpu_can_run(). */
qatomic_set_mb(&rr_current_cpu, cpu);
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 39e674aca21..f189ce861c9 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -84,7 +84,8 @@ void cpu_reset_interrupt(CPUState *cpu, int mask)
void cpu_exit(CPUState *cpu)
{
- qatomic_set(&cpu->exit_request, 1);
+ /* Ensure cpu_exec will see the reason why the exit request was set. */
+ qatomic_store_release(&cpu->exit_request, 1);
/* Ensure cpu_exec will see the exit request after TCG has exited. */
smp_wmb();
qatomic_set(&cpu->neg.icount_decr.u16.high, -1);
diff --git a/target/i386/nvmm/nvmm-all.c b/target/i386/nvmm/nvmm-all.c
index c1ac74c4f04..a5e3485c1f8 100644
--- a/target/i386/nvmm/nvmm-all.c
+++ b/target/i386/nvmm/nvmm-all.c
@@ -743,7 +743,8 @@ nvmm_vcpu_loop(CPUState *cpu)
nvmm_vcpu_pre_run(cpu);
- if (qatomic_read(&cpu->exit_request)) {
+ /* Corresponding store-release is in cpu_exit. */
+ if (qatomic_load_acquire(&cpu->exit_request)) {
#if NVMM_USER_VERSION >= 2
nvmm_vcpu_stop(vcpu);
#else
@@ -751,8 +752,6 @@ nvmm_vcpu_loop(CPUState *cpu)
#endif
}
- /* Read exit_request before the kernel reads the immediate exit flag */
- smp_rmb();
ret = nvmm_vcpu_run(mach, vcpu);
if (ret == -1) {
error_report("NVMM: Failed to exec a virtual processor,"
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 878cdd1668c..9f88e368d4d 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1714,7 +1714,8 @@ static int whpx_vcpu_run(CPUState *cpu)
if (exclusive_step_mode == WHPX_STEP_NONE) {
whpx_vcpu_pre_run(cpu);
- if (qatomic_read(&cpu->exit_request)) {
+ /* Corresponding store-release is in cpu_exit. */
+ if (qatomic_load_acquire(&cpu->exit_request)) {
whpx_vcpu_kick(cpu);
}
}
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 2/8] accel/hvf: check exit_request before running the vCPU
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
@ 2025-08-08 18:58 ` Paolo Bonzini
2025-08-08 21:28 ` Philippe Mathieu-Daudé
2025-08-09 23:09 ` Richard Henderson
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
` (5 subsequent siblings)
7 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:58 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
This is done by all other accelerators, but was missing for
Hypervisor.framework.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/arm/hvf/hvf.c | 4 ++++
target/i386/hvf/hvf.c | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index b77db99079e..478bc75fee6 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -1927,6 +1927,10 @@ int hvf_vcpu_exec(CPUState *cpu)
flush_cpu_state(cpu);
bql_unlock();
+ /* Corresponding store-release is in cpu_exit. */
+ if (qatomic_load_acquire(&cpu->exit_request)) {
+ hv_vcpus_exit(&cpu->accel->fd, 1);
+ }
r = hv_vcpu_run(cpu->accel->fd);
bql_lock();
switch (r) {
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 8445cadecec..b7c4b849cdf 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -749,6 +749,10 @@ int hvf_vcpu_exec(CPUState *cpu)
return EXCP_HLT;
}
+ /* Corresponding store-release is in cpu_exit. */
+ if (qatomic_load_acquire(&cpu->exit_request)) {
+ hv_vcpu_interrupt(&cpu->accel->fd, 1);
+ }
hv_return_t r = hv_vcpu_run_until(cpu->accel->fd, HV_DEADLINE_FOREVER);
assert_hvf_ok(r);
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 3/8] accel: use atomic accesses for exit_request
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
2025-08-08 18:58 ` [PATCH 2/8] accel/hvf: check exit_request before running the vCPU Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-08 21:04 ` Philippe Mathieu-Daudé
` (3 more replies)
2025-08-08 18:59 ` [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread Paolo Bonzini
` (4 subsequent siblings)
7 siblings, 4 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
CPU threads write exit_request as a "note to self" that they need to
go out to a slow path. This write happens out of the BQL and can be
a data race with another threads' cpu_exit(); use atomic accesses
consistently.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/tcg/tcg-accel-ops-mttcg.c | 2 +-
accel/tcg/tcg-accel-ops-rr.c | 4 ++--
hw/ppc/spapr_hcall.c | 6 +++---
target/i386/kvm/kvm.c | 2 +-
target/i386/nvmm/nvmm-accel-ops.c | 2 +-
target/i386/nvmm/nvmm-all.c | 2 +-
target/i386/whpx/whpx-all.c | 6 +++---
7 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 337b993d3da..39237421757 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -85,7 +85,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
/* process any pending work */
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
do {
if (cpu_can_run(cpu)) {
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 1e551e92d6d..c06f3beef2e 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -212,7 +212,7 @@ static void *rr_cpu_thread_fn(void *arg)
cpu = first_cpu;
/* process any pending work */
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
while (1) {
/* Only used for icount_enabled() */
@@ -286,7 +286,7 @@ static void *rr_cpu_thread_fn(void *arg)
/* Does not need a memory barrier because a spurious wakeup is okay. */
qatomic_set(&rr_current_cpu, NULL);
- if (cpu && cpu->exit_request) {
+ if (cpu && qatomic_read(&cpu->exit_request)) {
qatomic_set_mb(&cpu->exit_request, 0);
}
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 1e936f35e44..03a62b047b3 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -509,7 +509,7 @@ static target_ulong h_cede(PowerPCCPU *cpu, SpaprMachineState *spapr,
if (!cpu_has_work(cs)) {
cs->halted = 1;
cs->exception_index = EXCP_HLT;
- cs->exit_request = 1;
+ qatomic_set(&cs->exit_request, 1);
ppc_maybe_interrupt(env);
}
@@ -531,7 +531,7 @@ static target_ulong h_confer_self(PowerPCCPU *cpu)
}
cs->halted = 1;
cs->exception_index = EXCP_HALTED;
- cs->exit_request = 1;
+ qatomic_set(&cs->exit_request, 1);
ppc_maybe_interrupt(&cpu->env);
return H_SUCCESS;
@@ -624,7 +624,7 @@ static target_ulong h_confer(PowerPCCPU *cpu, SpaprMachineState *spapr,
}
cs->exception_index = EXCP_YIELD;
- cs->exit_request = 1;
+ qatomic_set(&cs->exit_request, 1);
cpu_loop_exit(cs);
return H_SUCCESS;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 306430a0521..867eabc6969 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5604,7 +5604,7 @@ int kvm_arch_process_async_events(CPUState *cs)
if (env->exception_nr == EXCP08_DBLE) {
/* this means triple fault */
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
- cs->exit_request = 1;
+ qatomic_set(&cs->exit_request, 1);
return 0;
}
kvm_queue_exception(env, EXCP12_MCHK, 0, 0);
diff --git a/target/i386/nvmm/nvmm-accel-ops.c b/target/i386/nvmm/nvmm-accel-ops.c
index 3799260bbde..3658a583bc8 100644
--- a/target/i386/nvmm/nvmm-accel-ops.c
+++ b/target/i386/nvmm/nvmm-accel-ops.c
@@ -77,7 +77,7 @@ static void nvmm_start_vcpu_thread(CPUState *cpu)
*/
static void nvmm_kick_vcpu_thread(CPUState *cpu)
{
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
cpus_kick_thread(cpu);
}
diff --git a/target/i386/nvmm/nvmm-all.c b/target/i386/nvmm/nvmm-all.c
index a5e3485c1f8..d2d90f38976 100644
--- a/target/i386/nvmm/nvmm-all.c
+++ b/target/i386/nvmm/nvmm-all.c
@@ -414,7 +414,7 @@ nvmm_vcpu_pre_run(CPUState *cpu)
* or commit pending TPR access.
*/
if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT | CPU_INTERRUPT_TPR)) {
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
}
if (!has_event && cpu_test_interrupt(cpu, CPU_INTERRUPT_NMI)) {
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 9f88e368d4d..9b07716121a 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1489,10 +1489,10 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT | CPU_INTERRUPT_TPR)) {
if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT) &&
!(env->hflags & HF_SMM_MASK)) {
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
}
if (cpu_test_interrupt(cpu, CPU_INTERRUPT_TPR)) {
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
}
}
@@ -1539,7 +1539,7 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
if (tpr != vcpu->tpr) {
vcpu->tpr = tpr;
reg_values[reg_count].Reg64 = tpr;
- cpu->exit_request = 1;
+ qatomic_set(&cpu->exit_request, 1);
reg_names[reg_count] = WHvX64RegisterCr8;
reg_count += 1;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
` (2 preceding siblings ...)
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-08 21:15 ` Philippe Mathieu-Daudé
2025-08-09 23:26 ` Richard Henderson
2025-08-08 18:59 ` [PATCH 5/8] cpus: remove TCG-ism from cpu_exit() Paolo Bonzini
` (3 subsequent siblings)
7 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
qemu_cpu_kick() does not always have to set cpu->exit_request,
though it does for now. Introduce a function tcg_kick_vcpu_thread()
which can be changed in the future to not set cpu->exit_request,
and reserve cpu_exit() for when *all accelerators* need to set
that flag.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
docs/devel/tcg-icount.rst | 2 +-
accel/tcg/tcg-accel-ops-mttcg.h | 3 ---
include/exec/cpu-common.h | 1 +
accel/tcg/cpu-exec.c | 17 ++++++++++++++++-
accel/tcg/tcg-accel-ops-mttcg.c | 5 -----
accel/tcg/tcg-accel-ops-rr.c | 2 +-
accel/tcg/tcg-accel-ops.c | 2 +-
bsd-user/main.c | 2 +-
linux-user/main.c | 2 +-
9 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst
index 7df883446a7..a1dcd79e0fd 100644
--- a/docs/devel/tcg-icount.rst
+++ b/docs/devel/tcg-icount.rst
@@ -37,7 +37,7 @@ translator starts by allocating a budget of instructions to be
executed. The budget of instructions is limited by how long it will be
until the next timer will expire. We store this budget as part of a
vCPU icount_decr field which shared with the machinery for handling
-cpu_exit(). The whole field is checked at the start of every
+qemu_cpu_kick(). The whole field is checked at the start of every
translated block and will cause a return to the outer loop to deal
with whatever caused the exit.
diff --git a/accel/tcg/tcg-accel-ops-mttcg.h b/accel/tcg/tcg-accel-ops-mttcg.h
index 8ffa7a9a9fe..5c145cc8595 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.h
+++ b/accel/tcg/tcg-accel-ops-mttcg.h
@@ -10,9 +10,6 @@
#ifndef TCG_ACCEL_OPS_MTTCG_H
#define TCG_ACCEL_OPS_MTTCG_H
-/* kick MTTCG vCPU thread */
-void mttcg_kick_vcpu_thread(CPUState *cpu);
-
/* start an mttcg vCPU thread */
void mttcg_start_vcpu_thread(CPUState *cpu);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 9b658a3f48f..e843b09cc99 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -226,6 +226,7 @@ G_NORETURN void cpu_loop_exit(CPUState *cpu);
G_NORETURN void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc);
/* accel/tcg/cpu-exec.c */
+void tcg_kick_vcpu_thread(CPUState *cpu);
int cpu_exec(CPUState *cpu);
/**
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 4bd9ee01c19..1a973596d87 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -748,6 +748,20 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
return false;
}
+void tcg_kick_vcpu_thread(CPUState *cpu)
+{
+ /*
+ * Ensure cpu_exec will see the reason why the exit request was set.
+ * FIXME: this is not always needed. Other accelerators instead
+ * read interrupt_request and set exit_request on demand from the
+ * CPU thread; see kvm_arch_pre_run() for example.
+ */
+ qatomic_store_release(&cpu->exit_request, 1);
+
+ /* Ensure cpu_exec will see the exit request after TCG has exited. */
+ qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
+}
+
static inline bool icount_exit_request(CPUState *cpu)
{
if (!icount_enabled()) {
@@ -774,7 +788,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
/* Clear the interrupt flag now since we're processing
* cpu->interrupt_request and cpu->exit_request.
* Ensure zeroing happens before reading cpu->exit_request or
- * cpu->interrupt_request (see also smp_wmb in cpu_exit())
+ * cpu->interrupt_request (see also store-release in
+ * tcg_kick_vcpu_thread())
*/
qatomic_set_mb(&cpu->neg.icount_decr.u16.high, 0);
diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 39237421757..5757324a8c2 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -123,11 +123,6 @@ static void *mttcg_cpu_thread_fn(void *arg)
return NULL;
}
-void mttcg_kick_vcpu_thread(CPUState *cpu)
-{
- cpu_exit(cpu);
-}
-
void mttcg_start_vcpu_thread(CPUState *cpu)
{
char thread_name[VCPU_THREAD_NAME_SIZE];
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index c06f3beef2e..87b49377c78 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -43,7 +43,7 @@ void rr_kick_vcpu_thread(CPUState *unused)
CPUState *cpu;
CPU_FOREACH(cpu) {
- cpu_exit(cpu);
+ tcg_kick_vcpu_thread(cpu);
};
}
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 02c7600bb7d..f4d5372866a 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -207,7 +207,7 @@ static void tcg_accel_ops_init(AccelClass *ac)
if (qemu_tcg_mttcg_enabled()) {
ops->create_vcpu_thread = mttcg_start_vcpu_thread;
- ops->kick_vcpu_thread = mttcg_kick_vcpu_thread;
+ ops->kick_vcpu_thread = tcg_kick_vcpu_thread;
ops->handle_interrupt = tcg_handle_interrupt;
} else {
ops->create_vcpu_thread = rr_start_vcpu_thread;
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 7e5d4bbce09..20159207040 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -216,7 +216,7 @@ bool qemu_cpu_is_self(CPUState *cpu)
void qemu_cpu_kick(CPUState *cpu)
{
- cpu_exit(cpu);
+ tcg_kick_vcpu_thread(cpu);
}
/* Assumes contents are already zeroed. */
diff --git a/linux-user/main.c b/linux-user/main.c
index 68972f00a15..dc68fd448b7 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -192,7 +192,7 @@ bool qemu_cpu_is_self(CPUState *cpu)
void qemu_cpu_kick(CPUState *cpu)
{
- cpu_exit(cpu);
+ tcg_kick_vcpu_thread(cpu);
}
void task_settid(TaskState *ts)
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 5/8] cpus: remove TCG-ism from cpu_exit()
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
` (3 preceding siblings ...)
2025-08-08 18:59 ` [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-08 21:17 ` Philippe Mathieu-Daudé
2025-08-09 23:17 ` Richard Henderson
2025-08-08 18:59 ` [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop Paolo Bonzini
` (2 subsequent siblings)
7 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
Now that TCG has its own kick function, make cpu_exit() do the right kick
for all accelerators.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/core/cpu-common.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index f189ce861c9..045b1778236 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -86,9 +86,7 @@ void cpu_exit(CPUState *cpu)
{
/* Ensure cpu_exec will see the reason why the exit request was set. */
qatomic_store_release(&cpu->exit_request, 1);
- /* Ensure cpu_exec will see the exit request after TCG has exited. */
- smp_wmb();
- qatomic_set(&cpu->neg.icount_decr.u16.high, -1);
+ qemu_cpu_kick(cpu);
}
static int cpu_common_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg)
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
` (4 preceding siblings ...)
2025-08-08 18:59 ` [PATCH 5/8] cpus: remove TCG-ism from cpu_exit() Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-11 12:56 ` Alex Bennée
2025-08-08 18:59 ` [PATCH 7/8] tcg/user: do not set exit_request gratuitously Paolo Bonzini
2025-08-08 18:59 ` [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same Paolo Bonzini
7 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
Now that cpu_exit() actually kicks all accelerators, use it whenever
the message to another thread is processed in qemu_wait_io_event()
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-common.c | 3 ++-
hw/ppc/ppc.c | 2 ++
hw/ppc/spapr_hcall.c | 7 +++----
hw/ppc/spapr_rtas.c | 2 +-
replay/replay-events.c | 3 ++-
system/cpu-timers.c | 6 +++---
system/cpus.c | 5 +++--
target/arm/tcg/mte_helper.c | 2 +-
target/i386/kvm/hyperv.c | 1 -
9 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/cpu-common.c b/cpu-common.c
index ef5757d23bf..152661df8e9 100644
--- a/cpu-common.c
+++ b/cpu-common.c
@@ -137,7 +137,8 @@ static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
wi->done = false;
qemu_mutex_unlock(&cpu->work_mutex);
- qemu_cpu_kick(cpu);
+ /* exit the inner loop and reach qemu_wait_io_event_common(). */
+ cpu_exit(cpu);
}
void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 43d0d0e7553..3e436c70413 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -190,6 +190,7 @@ static void ppc970_set_irq(void *opaque, int pin, int level)
if (level) {
trace_ppc_irq_cpu("stop");
cs->halted = 1;
+ cpu_exit(cs);
} else {
trace_ppc_irq_cpu("restart");
cs->halted = 0;
@@ -386,6 +387,7 @@ static void ppc40x_set_irq(void *opaque, int pin, int level)
if (level) {
trace_ppc_irq_cpu("stop");
cs->halted = 1;
+ cpu_exit(cs);
} else {
trace_ppc_irq_cpu("restart");
cs->halted = 0;
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 03a62b047b3..a64320214b4 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -509,8 +509,8 @@ static target_ulong h_cede(PowerPCCPU *cpu, SpaprMachineState *spapr,
if (!cpu_has_work(cs)) {
cs->halted = 1;
cs->exception_index = EXCP_HLT;
- qatomic_set(&cs->exit_request, 1);
ppc_maybe_interrupt(env);
+ cpu_exit(env);
}
return H_SUCCESS;
@@ -531,8 +531,8 @@ static target_ulong h_confer_self(PowerPCCPU *cpu)
}
cs->halted = 1;
cs->exception_index = EXCP_HALTED;
- qatomic_set(&cs->exit_request, 1);
ppc_maybe_interrupt(&cpu->env);
+ cpu_exit(&cpu->env);
return H_SUCCESS;
}
@@ -624,8 +624,7 @@ static target_ulong h_confer(PowerPCCPU *cpu, SpaprMachineState *spapr,
}
cs->exception_index = EXCP_YIELD;
- qatomic_set(&cs->exit_request, 1);
- cpu_loop_exit(cs);
+ cpu_exit(cs);
return H_SUCCESS;
}
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 78309dbb09d..143bc8c3794 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -221,7 +221,7 @@ static void rtas_stop_self(PowerPCCPU *cpu, SpaprMachineState *spapr,
cs->halted = 1;
ppc_store_lpcr(cpu, env->spr[SPR_LPCR] & ~pcc->lpcr_pm);
kvmppc_set_reg_ppc_online(cpu, 0);
- qemu_cpu_kick(cs);
+ cpu_exit(cs);
}
static void rtas_ibm_suspend_me(PowerPCCPU *cpu, SpaprMachineState *spapr,
diff --git a/replay/replay-events.c b/replay/replay-events.c
index 8959da9f1fa..a96e47e7740 100644
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -118,7 +118,8 @@ void replay_add_event(ReplayAsyncEventKind event_kind,
g_assert(replay_mutex_locked());
QTAILQ_INSERT_TAIL(&events_list, event, events);
- qemu_cpu_kick(first_cpu);
+ /* Kick the TCG thread out of tcg_cpu_exec(). */
+ cpu_exit(first_cpu);
}
void replay_bh_schedule_event(QEMUBH *bh)
diff --git a/system/cpu-timers.c b/system/cpu-timers.c
index cb35fa62b8a..6a00691b8d5 100644
--- a/system/cpu-timers.c
+++ b/system/cpu-timers.c
@@ -246,14 +246,14 @@ void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
if (qemu_in_vcpu_thread()) {
/*
- * A CPU is currently running; kick it back out to the
+ * A CPU is currently running; kick it back out of the
* tcg_cpu_exec() loop so it will recalculate its
* icount deadline immediately.
*/
- qemu_cpu_kick(current_cpu);
+ cpu_exit(current_cpu);
} else if (first_cpu) {
/*
- * qemu_cpu_kick is not enough to kick a halted CPU out of
+ * cpu_exit() is not enough to kick a halted CPU out of
* qemu_tcg_wait_io_event. async_run_on_cpu, instead,
* causes cpu_thread_is_idle to return false. This way,
* handle_icount_deadline can run.
diff --git a/system/cpus.c b/system/cpus.c
index 8e543488c30..d2cfa2a9c4e 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -599,7 +599,7 @@ void cpu_pause(CPUState *cpu)
qemu_cpu_stop(cpu, true);
} else {
cpu->stop = true;
- qemu_cpu_kick(cpu);
+ cpu_exit(cpu);
}
}
@@ -639,6 +639,7 @@ void pause_all_vcpus(void)
while (!all_vcpus_paused()) {
qemu_cond_wait(&qemu_pause_cond, &bql);
+ /* FIXME: is this needed? */
CPU_FOREACH(cpu) {
qemu_cpu_kick(cpu);
}
@@ -667,7 +668,7 @@ void cpu_remove_sync(CPUState *cpu)
{
cpu->stop = true;
cpu->unplug = true;
- qemu_cpu_kick(cpu);
+ cpu_exit(cpu);
bql_unlock();
qemu_thread_join(cpu->thread);
bql_lock();
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index 0efc18a181e..302e899287c 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -591,7 +591,7 @@ static void mte_async_check_fail(CPUARMState *env, uint64_t dirty_ptr,
* which is rather sooner than "normal". But the alternative
* is waiting until the next syscall.
*/
- qemu_cpu_kick(env_cpu(env));
+ cpu_exit(env_cpu(env));
#endif
}
diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
index 9865120cc43..f7a81bd2700 100644
--- a/target/i386/kvm/hyperv.c
+++ b/target/i386/kvm/hyperv.c
@@ -81,7 +81,6 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
* necessary because memory hierarchy is being changed
*/
async_safe_run_on_cpu(CPU(cpu), async_synic_update, RUN_ON_CPU_NULL);
- cpu_exit(CPU(cpu));
return EXCP_INTERRUPT;
case KVM_EXIT_HYPERV_HCALL: {
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 7/8] tcg/user: do not set exit_request gratuitously
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
` (5 preceding siblings ...)
2025-08-08 18:59 ` [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-08 21:21 ` Philippe Mathieu-Daudé
2025-08-08 18:59 ` [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same Paolo Bonzini
7 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
User-mode emulation correctly uses cpu_exit() whenever it needs to go
all the way out of the cpu exec loop. It never uses qemu_cpu_kick();
therefore, there is no need for tcg_kick_vcpu_thread() to set
cpu->exit_request again.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/tcg/cpu-exec.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 1a973596d87..b9da2e3770e 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -750,6 +750,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
void tcg_kick_vcpu_thread(CPUState *cpu)
{
+#ifdef CONFIG_SYSTEM
/*
* Ensure cpu_exec will see the reason why the exit request was set.
* FIXME: this is not always needed. Other accelerators instead
@@ -757,6 +758,7 @@ void tcg_kick_vcpu_thread(CPUState *cpu)
* CPU thread; see kvm_arch_pre_run() for example.
*/
qatomic_store_release(&cpu->exit_request, 1);
+#endif
/* Ensure cpu_exec will see the exit request after TCG has exited. */
qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
` (6 preceding siblings ...)
2025-08-08 18:59 ` [PATCH 7/8] tcg/user: do not set exit_request gratuitously Paolo Bonzini
@ 2025-08-08 18:59 ` Paolo Bonzini
2025-08-08 21:24 ` Philippe Mathieu-Daudé
2025-08-09 23:34 ` Richard Henderson
7 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 18:59 UTC (permalink / raw)
To: qemu-devel; +Cc: imammedo, richard.henderson, peterx
There is no reason for some accelerators to use qemu_wait_io_event_common
(which is specifically separated for round robin). They can also check
on the first pass through the loop as well directly, without setting
cpu->exit_request for no particular reason.
There is also no need to use qatomic_set_mb() because the ordering of
the communication between I/O and vCPU threads is always the same.
In the I/O thread:
(a) store other memory locations that will be checked if cpu->exit_request
or cpu->interrupt_request is 1 (for example cpu->stop or cpu->work_list
for cpu->exit_request)
(b) cpu_exit(): store-release cpu->exit_request, or
(b) cpu_interrupt(): store-release cpu->interrupt_request
>>> at this point, cpu->halt_cond is broadcast and the BQL released
(c) do the accelerator-specific kick (e.g. write icount_decr for TCG,
pthread_kill for KVM, etc.)
In the vCPU thread instead the opposite order is respected:
(c) the accelerator's execution loop exits thanks to the kick
(b) then the inner execution loop checks cpu->interrupt_request
and cpu->exit_request. If needed cpu->interrupt_request is
converted into cpu->exit_request when work is needed outside
the execution loop.
(a) then the other memory locations are checked. Some may need
to be read under the BQL, and the vCPU thread may also take
for the vCPU thread can sleep on cpu->halt_cond; but in
principle this step is correct even without the BQL.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/dummy-cpus.c | 2 +-
accel/hvf/hvf-accel-ops.c | 2 +-
accel/kvm/kvm-accel-ops.c | 3 ++-
accel/kvm/kvm-all.c | 2 --
accel/tcg/cpu-exec.c | 1 -
accel/tcg/tcg-accel-ops-mttcg.c | 7 ++----
accel/tcg/tcg-accel-ops-rr.c | 38 ++++++++++++++++---------------
accel/tcg/tcg-accel-ops.c | 2 --
system/cpus.c | 1 +
target/i386/nvmm/nvmm-accel-ops.c | 6 ++---
target/i386/nvmm/nvmm-all.c | 2 --
target/i386/whpx/whpx-accel-ops.c | 6 ++---
target/i386/whpx/whpx-all.c | 2 --
13 files changed, 31 insertions(+), 43 deletions(-)
diff --git a/accel/dummy-cpus.c b/accel/dummy-cpus.c
index 03cfc0fa01e..1f74c727c42 100644
--- a/accel/dummy-cpus.c
+++ b/accel/dummy-cpus.c
@@ -43,6 +43,7 @@ static void *dummy_cpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
+ qemu_wait_io_event(cpu);
bql_unlock();
#ifndef _WIN32
do {
@@ -57,7 +58,6 @@ static void *dummy_cpu_thread_fn(void *arg)
qemu_sem_wait(&cpu->sem);
#endif
bql_lock();
- qemu_wait_io_event(cpu);
} while (!cpu->unplug);
bql_unlock();
diff --git a/accel/hvf/hvf-accel-ops.c b/accel/hvf/hvf-accel-ops.c
index d488d6afbac..4ba3e40831f 100644
--- a/accel/hvf/hvf-accel-ops.c
+++ b/accel/hvf/hvf-accel-ops.c
@@ -192,13 +192,13 @@ static void *hvf_cpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
+ qemu_wait_io_event(cpu);
if (cpu_can_run(cpu)) {
r = hvf_vcpu_exec(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
}
}
- qemu_wait_io_event(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
hvf_vcpu_destroy(cpu);
diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index b709187c7d7..80f0141a8a6 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -47,13 +47,14 @@ static void *kvm_vcpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
+ qemu_wait_io_event(cpu);
+
if (cpu_can_run(cpu)) {
r = kvm_cpu_exec(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
}
}
- qemu_wait_io_event(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
kvm_destroy_vcpu(cpu);
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 57e35960125..db95e06e768 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3155,7 +3155,6 @@ int kvm_cpu_exec(CPUState *cpu)
trace_kvm_cpu_exec();
if (kvm_arch_process_async_events(cpu)) {
- qatomic_set(&cpu->exit_request, 0);
return EXCP_HLT;
}
@@ -3345,7 +3344,6 @@ int kvm_cpu_exec(CPUState *cpu)
vm_stop(RUN_STATE_INTERNAL_ERROR);
}
- qatomic_set(&cpu->exit_request, 0);
return ret;
}
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index b9da2e3770e..f474ccb37f5 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -871,7 +871,6 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
* The corresponding store-release is in cpu_exit.
*/
if (unlikely(qatomic_load_acquire(&cpu->exit_request)) || icount_exit_request(cpu)) {
- qatomic_set(&cpu->exit_request, 0);
if (cpu->exception_index == -1) {
cpu->exception_index = EXCP_INTERRUPT;
}
diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 5757324a8c2..04012900a30 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -84,10 +84,9 @@ static void *mttcg_cpu_thread_fn(void *arg)
cpu_thread_signal_created(cpu);
qemu_guest_random_seed_thread_part2(cpu->random_seed);
- /* process any pending work */
- qatomic_set(&cpu->exit_request, 1);
-
do {
+ qemu_wait_io_event(cpu);
+
if (cpu_can_run(cpu)) {
int r;
bql_unlock();
@@ -112,8 +111,6 @@ static void *mttcg_cpu_thread_fn(void *arg)
break;
}
}
-
- qemu_wait_io_event(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
tcg_cpu_destroy(cpu);
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 87b49377c78..205f379b6f8 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -211,13 +211,30 @@ static void *rr_cpu_thread_fn(void *arg)
cpu = first_cpu;
- /* process any pending work */
- qatomic_set(&cpu->exit_request, 1);
-
while (1) {
/* Only used for icount_enabled() */
int64_t cpu_budget = 0;
+ if (cpu) {
+ /*
+ * This could even reset exit_request for all CPUs, but in practice
+ * races between CPU exits and changes to "cpu" are so rare that
+ * there's no advantage in doing so.
+ */
+ qatomic_set(&cpu->exit_request, 0);
+ }
+
+ if (icount_enabled() && all_cpu_threads_idle()) {
+ /*
+ * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
+ * in the main_loop, wake it up in order to start the warp timer.
+ */
+ qemu_notify_event();
+ }
+
+ rr_wait_io_event();
+ rr_deal_with_unplugged_cpus();
+
bql_unlock();
replay_mutex_lock();
bql_lock();
@@ -285,21 +302,6 @@ static void *rr_cpu_thread_fn(void *arg)
/* Does not need a memory barrier because a spurious wakeup is okay. */
qatomic_set(&rr_current_cpu, NULL);
-
- if (cpu && qatomic_read(&cpu->exit_request)) {
- qatomic_set_mb(&cpu->exit_request, 0);
- }
-
- if (icount_enabled() && all_cpu_threads_idle()) {
- /*
- * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
- * in the main_loop, wake it up in order to start the warp timer.
- */
- qemu_notify_event();
- }
-
- rr_wait_io_event();
- rr_deal_with_unplugged_cpus();
}
g_assert_not_reached();
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index f4d5372866a..ad3f29107e1 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -82,8 +82,6 @@ int tcg_cpu_exec(CPUState *cpu)
ret = cpu_exec(cpu);
cpu_exec_end(cpu);
- qatomic_set_mb(&cpu->exit_request, 0);
-
return ret;
}
diff --git a/system/cpus.c b/system/cpus.c
index d2cfa2a9c4e..0cc14eae6a0 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -458,6 +458,7 @@ void qemu_wait_io_event(CPUState *cpu)
{
bool slept = false;
+ qatomic_set(&cpu->exit_request, false);
while (cpu_thread_is_idle(cpu)) {
if (!slept) {
slept = true;
diff --git a/target/i386/nvmm/nvmm-accel-ops.c b/target/i386/nvmm/nvmm-accel-ops.c
index 3658a583bc8..6d2e2542d80 100644
--- a/target/i386/nvmm/nvmm-accel-ops.c
+++ b/target/i386/nvmm/nvmm-accel-ops.c
@@ -42,16 +42,14 @@ static void *qemu_nvmm_cpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
+ qemu_wait_io_event(cpu);
+
if (cpu_can_run(cpu)) {
r = nvmm_vcpu_exec(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
}
}
- while (cpu_thread_is_idle(cpu)) {
- qemu_cond_wait_bql(cpu->halt_cond);
- }
- qemu_wait_io_event_common(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
nvmm_destroy_vcpu(cpu);
diff --git a/target/i386/nvmm/nvmm-all.c b/target/i386/nvmm/nvmm-all.c
index d2d90f38976..09839d45b92 100644
--- a/target/i386/nvmm/nvmm-all.c
+++ b/target/i386/nvmm/nvmm-all.c
@@ -817,8 +817,6 @@ nvmm_vcpu_loop(CPUState *cpu)
cpu_exec_end(cpu);
bql_lock();
- qatomic_set(&cpu->exit_request, false);
-
return ret < 0;
}
diff --git a/target/i386/whpx/whpx-accel-ops.c b/target/i386/whpx/whpx-accel-ops.c
index da58805b1a6..611eeedeef7 100644
--- a/target/i386/whpx/whpx-accel-ops.c
+++ b/target/i386/whpx/whpx-accel-ops.c
@@ -42,16 +42,14 @@ static void *whpx_cpu_thread_fn(void *arg)
qemu_guest_random_seed_thread_part2(cpu->random_seed);
do {
+ qemu_wait_io_event(cpu);
+
if (cpu_can_run(cpu)) {
r = whpx_vcpu_exec(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
}
}
- while (cpu_thread_is_idle(cpu)) {
- qemu_cond_wait_bql(cpu->halt_cond);
- }
- qemu_wait_io_event_common(cpu);
} while (!cpu->unplug || cpu_can_run(cpu));
whpx_destroy_vcpu(cpu);
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 9b07716121a..2e248a0a6d5 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2050,8 +2050,6 @@ static int whpx_vcpu_run(CPUState *cpu)
whpx_last_vcpu_stopping(cpu);
}
- qatomic_set(&cpu->exit_request, false);
-
return ret < 0;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 3/8] accel: use atomic accesses for exit_request
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
@ 2025-08-08 21:04 ` Philippe Mathieu-Daudé
2025-08-09 23:10 ` Richard Henderson
` (2 subsequent siblings)
3 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:04 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, richard.henderson, peterx
On 8/8/25 20:59, Paolo Bonzini wrote:
> CPU threads write exit_request as a "note to self" that they need to
> go out to a slow path. This write happens out of the BQL and can be
> a data race with another threads' cpu_exit(); use atomic accesses
> consistently.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/tcg/tcg-accel-ops-mttcg.c | 2 +-
> accel/tcg/tcg-accel-ops-rr.c | 4 ++--
> hw/ppc/spapr_hcall.c | 6 +++---
> target/i386/kvm/kvm.c | 2 +-
> target/i386/nvmm/nvmm-accel-ops.c | 2 +-
> target/i386/nvmm/nvmm-all.c | 2 +-
> target/i386/whpx/whpx-all.c | 6 +++---
> 7 files changed, 12 insertions(+), 12 deletions(-)
Cool, I have the same one locally but was not sure about myself there.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-08 18:59 ` [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread Paolo Bonzini
@ 2025-08-08 21:15 ` Philippe Mathieu-Daudé
2025-08-09 23:16 ` Richard Henderson
2025-08-09 23:26 ` Richard Henderson
1 sibling, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:15 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, richard.henderson, peterx
On 8/8/25 20:59, Paolo Bonzini wrote:
> qemu_cpu_kick() does not always have to set cpu->exit_request,
> though it does for now. Introduce a function tcg_kick_vcpu_thread()
> which can be changed in the future to not set cpu->exit_request,
> and reserve cpu_exit() for when *all accelerators* need to set
> that flag.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> docs/devel/tcg-icount.rst | 2 +-
> accel/tcg/tcg-accel-ops-mttcg.h | 3 ---
> include/exec/cpu-common.h | 1 +
> accel/tcg/cpu-exec.c | 17 ++++++++++++++++-
> accel/tcg/tcg-accel-ops-mttcg.c | 5 -----
> accel/tcg/tcg-accel-ops-rr.c | 2 +-
> accel/tcg/tcg-accel-ops.c | 2 +-
> bsd-user/main.c | 2 +-
> linux-user/main.c | 2 +-
> 9 files changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst
> index 7df883446a7..a1dcd79e0fd 100644
> --- a/docs/devel/tcg-icount.rst
> +++ b/docs/devel/tcg-icount.rst
> @@ -37,7 +37,7 @@ translator starts by allocating a budget of instructions to be
> executed. The budget of instructions is limited by how long it will be
> until the next timer will expire. We store this budget as part of a
> vCPU icount_decr field which shared with the machinery for handling
> -cpu_exit(). The whole field is checked at the start of every
> +qemu_cpu_kick(). The whole field is checked at the start of every
> translated block and will cause a return to the outer loop to deal
> with whatever caused the exit.
>
> diff --git a/accel/tcg/tcg-accel-ops-mttcg.h b/accel/tcg/tcg-accel-ops-mttcg.h
> index 8ffa7a9a9fe..5c145cc8595 100644
> --- a/accel/tcg/tcg-accel-ops-mttcg.h
> +++ b/accel/tcg/tcg-accel-ops-mttcg.h
> @@ -10,9 +10,6 @@
> #ifndef TCG_ACCEL_OPS_MTTCG_H
> #define TCG_ACCEL_OPS_MTTCG_H
>
> -/* kick MTTCG vCPU thread */
> -void mttcg_kick_vcpu_thread(CPUState *cpu);
> -
> /* start an mttcg vCPU thread */
> void mttcg_start_vcpu_thread(CPUState *cpu);
>
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 9b658a3f48f..e843b09cc99 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -226,6 +226,7 @@ G_NORETURN void cpu_loop_exit(CPUState *cpu);
> G_NORETURN void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc);
>
> /* accel/tcg/cpu-exec.c */
> +void tcg_kick_vcpu_thread(CPUState *cpu);
"exec/cpu-common.h" is not the place, maybe "accel/tcg/cpu-ops.h"?
(Nitpicking, for clarity I'd rather a preliminary pach renaming
mttcg_kick_vcpu_thread -> tcg_kick_vcpu_thread and using it
instead of cpu_exit, then this patch only modifying
tcg_kick_vcpu_thread).
> int cpu_exec(CPUState *cpu);
>
> /**
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 4bd9ee01c19..1a973596d87 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -748,6 +748,20 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
> return false;
> }
>
> +void tcg_kick_vcpu_thread(CPUState *cpu)
> +{
> + /*
> + * Ensure cpu_exec will see the reason why the exit request was set.
> + * FIXME: this is not always needed. Other accelerators instead
> + * read interrupt_request and set exit_request on demand from the
> + * CPU thread; see kvm_arch_pre_run() for example.
> + */
> + qatomic_store_release(&cpu->exit_request, 1);
> +
> + /* Ensure cpu_exec will see the exit request after TCG has exited. */
> + qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
> +}
> +
> static inline bool icount_exit_request(CPUState *cpu)
> {
> if (!icount_enabled()) {
> @@ -774,7 +788,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> /* Clear the interrupt flag now since we're processing
> * cpu->interrupt_request and cpu->exit_request.
> * Ensure zeroing happens before reading cpu->exit_request or
> - * cpu->interrupt_request (see also smp_wmb in cpu_exit())
> + * cpu->interrupt_request (see also store-release in
> + * tcg_kick_vcpu_thread())
> */
> qatomic_set_mb(&cpu->neg.icount_decr.u16.high, 0);
>
> diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
> index 39237421757..5757324a8c2 100644
> --- a/accel/tcg/tcg-accel-ops-mttcg.c
> +++ b/accel/tcg/tcg-accel-ops-mttcg.c
> @@ -123,11 +123,6 @@ static void *mttcg_cpu_thread_fn(void *arg)
> return NULL;
> }
>
> -void mttcg_kick_vcpu_thread(CPUState *cpu)
> -{
> - cpu_exit(cpu);
> -}
> -
> void mttcg_start_vcpu_thread(CPUState *cpu)
> {
> char thread_name[VCPU_THREAD_NAME_SIZE];
> diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> index c06f3beef2e..87b49377c78 100644
> --- a/accel/tcg/tcg-accel-ops-rr.c
> +++ b/accel/tcg/tcg-accel-ops-rr.c
> @@ -43,7 +43,7 @@ void rr_kick_vcpu_thread(CPUState *unused)
> CPUState *cpu;
>
> CPU_FOREACH(cpu) {
> - cpu_exit(cpu);
> + tcg_kick_vcpu_thread(cpu);
> };
> }
>
> diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
> index 02c7600bb7d..f4d5372866a 100644
> --- a/accel/tcg/tcg-accel-ops.c
> +++ b/accel/tcg/tcg-accel-ops.c
> @@ -207,7 +207,7 @@ static void tcg_accel_ops_init(AccelClass *ac)
>
> if (qemu_tcg_mttcg_enabled()) {
> ops->create_vcpu_thread = mttcg_start_vcpu_thread;
> - ops->kick_vcpu_thread = mttcg_kick_vcpu_thread;
> + ops->kick_vcpu_thread = tcg_kick_vcpu_thread;
> ops->handle_interrupt = tcg_handle_interrupt;
> } else {
> ops->create_vcpu_thread = rr_start_vcpu_thread;
> diff --git a/bsd-user/main.c b/bsd-user/main.c
> index 7e5d4bbce09..20159207040 100644
> --- a/bsd-user/main.c
> +++ b/bsd-user/main.c
> @@ -216,7 +216,7 @@ bool qemu_cpu_is_self(CPUState *cpu)
>
> void qemu_cpu_kick(CPUState *cpu)
> {
> - cpu_exit(cpu);
> + tcg_kick_vcpu_thread(cpu);
> }
>
> /* Assumes contents are already zeroed. */
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 68972f00a15..dc68fd448b7 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -192,7 +192,7 @@ bool qemu_cpu_is_self(CPUState *cpu)
>
> void qemu_cpu_kick(CPUState *cpu)
> {
> - cpu_exit(cpu);
> + tcg_kick_vcpu_thread(cpu);
> }
>
> void task_settid(TaskState *ts)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 5/8] cpus: remove TCG-ism from cpu_exit()
2025-08-08 18:59 ` [PATCH 5/8] cpus: remove TCG-ism from cpu_exit() Paolo Bonzini
@ 2025-08-08 21:17 ` Philippe Mathieu-Daudé
2025-08-09 23:17 ` Richard Henderson
1 sibling, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:17 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, richard.henderson, peterx
On 8/8/25 20:59, Paolo Bonzini wrote:
> Now that TCG has its own kick function, make cpu_exit() do the right kick
> for all accelerators.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> hw/core/cpu-common.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
Thanks!
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 7/8] tcg/user: do not set exit_request gratuitously
2025-08-08 18:59 ` [PATCH 7/8] tcg/user: do not set exit_request gratuitously Paolo Bonzini
@ 2025-08-08 21:21 ` Philippe Mathieu-Daudé
2025-08-08 21:45 ` Paolo Bonzini
0 siblings, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:21 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, richard.henderson, peterx
On 8/8/25 20:59, Paolo Bonzini wrote:
> User-mode emulation correctly uses cpu_exit() whenever it needs to go
> all the way out of the cpu exec loop. It never uses qemu_cpu_kick();
> therefore, there is no need for tcg_kick_vcpu_thread() to set
> cpu->exit_request again.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/tcg/cpu-exec.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 1a973596d87..b9da2e3770e 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -750,6 +750,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
>
> void tcg_kick_vcpu_thread(CPUState *cpu)
> {
> +#ifdef CONFIG_SYSTEM
> /*
> * Ensure cpu_exec will see the reason why the exit request was set.
> * FIXME: this is not always needed. Other accelerators instead
> @@ -757,6 +758,7 @@ void tcg_kick_vcpu_thread(CPUState *cpu)
> * CPU thread; see kvm_arch_pre_run() for example.
> */
> qatomic_store_release(&cpu->exit_request, 1);
> +#endif
>
> /* Ensure cpu_exec will see the exit request after TCG has exited. */
> qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
What about cpu_handle_interrupt()?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same
2025-08-08 18:59 ` [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same Paolo Bonzini
@ 2025-08-08 21:24 ` Philippe Mathieu-Daudé
2025-08-09 23:34 ` Richard Henderson
1 sibling, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:24 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, richard.henderson, peterx
On 8/8/25 20:59, Paolo Bonzini wrote:
> There is no reason for some accelerators to use qemu_wait_io_event_common
> (which is specifically separated for round robin). They can also check
> on the first pass through the loop as well directly, without setting
> cpu->exit_request for no particular reason.
>
> There is also no need to use qatomic_set_mb() because the ordering of
> the communication between I/O and vCPU threads is always the same.
> In the I/O thread:
>
> (a) store other memory locations that will be checked if cpu->exit_request
> or cpu->interrupt_request is 1 (for example cpu->stop or cpu->work_list
> for cpu->exit_request)
>
> (b) cpu_exit(): store-release cpu->exit_request, or
> (b) cpu_interrupt(): store-release cpu->interrupt_request
>
>>>> at this point, cpu->halt_cond is broadcast and the BQL released
>
> (c) do the accelerator-specific kick (e.g. write icount_decr for TCG,
> pthread_kill for KVM, etc.)
>
> In the vCPU thread instead the opposite order is respected:
>
> (c) the accelerator's execution loop exits thanks to the kick
>
> (b) then the inner execution loop checks cpu->interrupt_request
> and cpu->exit_request. If needed cpu->interrupt_request is
> converted into cpu->exit_request when work is needed outside
> the execution loop.
>
> (a) then the other memory locations are checked. Some may need
> to be read under the BQL, and the vCPU thread may also take
> for the vCPU thread can sleep on cpu->halt_cond; but in
> principle this step is correct even without the BQL.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/dummy-cpus.c | 2 +-
> accel/hvf/hvf-accel-ops.c | 2 +-
> accel/kvm/kvm-accel-ops.c | 3 ++-
> accel/kvm/kvm-all.c | 2 --
> accel/tcg/cpu-exec.c | 1 -
> accel/tcg/tcg-accel-ops-mttcg.c | 7 ++----
> accel/tcg/tcg-accel-ops-rr.c | 38 ++++++++++++++++---------------
> accel/tcg/tcg-accel-ops.c | 2 --
> system/cpus.c | 1 +
> target/i386/nvmm/nvmm-accel-ops.c | 6 ++---
> target/i386/nvmm/nvmm-all.c | 2 --
> target/i386/whpx/whpx-accel-ops.c | 6 ++---
> target/i386/whpx/whpx-all.c | 2 --
> 13 files changed, 31 insertions(+), 43 deletions(-)
Great!
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/8] accel/hvf: check exit_request before running the vCPU
2025-08-08 18:58 ` [PATCH 2/8] accel/hvf: check exit_request before running the vCPU Paolo Bonzini
@ 2025-08-08 21:28 ` Philippe Mathieu-Daudé
2025-08-09 23:09 ` Richard Henderson
1 sibling, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-08 21:28 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: imammedo, richard.henderson, peterx, Mohamed Mediouni
On 8/8/25 20:58, Paolo Bonzini wrote:
> This is done by all other accelerators, but was missing for
> Hypervisor.framework.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/arm/hvf/hvf.c | 4 ++++
> target/i386/hvf/hvf.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
> index b77db99079e..478bc75fee6 100644
> --- a/target/arm/hvf/hvf.c
> +++ b/target/arm/hvf/hvf.c
> @@ -1927,6 +1927,10 @@ int hvf_vcpu_exec(CPUState *cpu)
> flush_cpu_state(cpu);
>
> bql_unlock();
> + /* Corresponding store-release is in cpu_exit. */
> + if (qatomic_load_acquire(&cpu->exit_request)) {
> + hv_vcpus_exit(&cpu->accel->fd, 1);
> + }
> r = hv_vcpu_run(cpu->accel->fd);
> bql_lock();
> switch (r) {
> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
> index 8445cadecec..b7c4b849cdf 100644
> --- a/target/i386/hvf/hvf.c
> +++ b/target/i386/hvf/hvf.c
> @@ -749,6 +749,10 @@ int hvf_vcpu_exec(CPUState *cpu)
> return EXCP_HLT;
> }
>
> + /* Corresponding store-release is in cpu_exit. */
> + if (qatomic_load_acquire(&cpu->exit_request)) {
> + hv_vcpu_interrupt(&cpu->accel->fd, 1);
> + }
> hv_return_t r = hv_vcpu_run_until(cpu->accel->fd, HV_DEADLINE_FOREVER);
> assert_hvf_ok(r);
>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 7/8] tcg/user: do not set exit_request gratuitously
2025-08-08 21:21 ` Philippe Mathieu-Daudé
@ 2025-08-08 21:45 ` Paolo Bonzini
0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-08 21:45 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, imammedo, richard.henderson, peterx
On Fri, Aug 8, 2025 at 11:21 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
> > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > index 1a973596d87..b9da2e3770e 100644
> > --- a/accel/tcg/cpu-exec.c
> > +++ b/accel/tcg/cpu-exec.c
> > @@ -750,6 +750,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
> >
> > void tcg_kick_vcpu_thread(CPUState *cpu)
> > {
> > +#ifdef CONFIG_SYSTEM
> > /*
> > * Ensure cpu_exec will see the reason why the exit request was set.
> > * FIXME: this is not always needed. Other accelerators instead
> > @@ -757,6 +758,7 @@ void tcg_kick_vcpu_thread(CPUState *cpu)
> > * CPU thread; see kvm_arch_pre_run() for example.
> > */
> > qatomic_store_release(&cpu->exit_request, 1);
> > +#endif
> >
> > /* Ensure cpu_exec will see the exit request after TCG has exited. */
> > qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
>
> What about cpu_handle_interrupt()?
The point of this patch isn't that qemu-user never reads exit_request
(as you point out, it does). The point is that qemu-user always uses
cpu_exit() rather than qemu_cpu_kick(), and therefore it's already
always writing exit_request.
For system emulation, writing cpu->exit_request should be moved from
tcg_kick_vcpu_thread to tcg_ops->cpu_exec_interrupt.
Paolo
Paolo
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
@ 2025-08-09 22:41 ` Richard Henderson
2025-08-11 19:31 ` Peter Xu
1 sibling, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 22:41 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:58, Paolo Bonzini wrote:
> cpu->exit_request do not use a load-acquire/store-release pair right now,
> but this means that cpu_exit() does not store it after any flags that are
> read in the slow path.
>
> Probably everything is protected one way or the other by the BQL, because
> after reading cpu->exit_request the CPU thread often goes to sleep
> (by waiting on the BQL-protected cpu->halt_cond), but it's not clear.
> Use load-acquire/store-release consistently.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/kvm/kvm-all.c | 19 +++++++++----------
> accel/tcg/cpu-exec.c | 7 +++++--
> accel/tcg/tcg-accel-ops-rr.c | 2 +-
> hw/core/cpu-common.c | 3 ++-
> target/i386/nvmm/nvmm-all.c | 5 ++---
> target/i386/whpx/whpx-all.c | 3 ++-
> 6 files changed, 21 insertions(+), 18 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
> index 39e674aca21..f189ce861c9 100644
> --- a/hw/core/cpu-common.c
> +++ b/hw/core/cpu-common.c
> @@ -84,7 +84,8 @@ void cpu_reset_interrupt(CPUState *cpu, int mask)
>
> void cpu_exit(CPUState *cpu)
> {
> - qatomic_set(&cpu->exit_request, 1);
> + /* Ensure cpu_exec will see the reason why the exit request was set. */
> + qatomic_store_release(&cpu->exit_request, 1);
While you're touching the lines, since exit_request is bool, let's use true (and elsewhere
in other patches, false).
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/8] accel/hvf: check exit_request before running the vCPU
2025-08-08 18:58 ` [PATCH 2/8] accel/hvf: check exit_request before running the vCPU Paolo Bonzini
2025-08-08 21:28 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:09 ` Richard Henderson
2025-08-11 7:06 ` Paolo Bonzini
1 sibling, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:09 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:58, Paolo Bonzini wrote:
> This is done by all other accelerators, but was missing for
> Hypervisor.framework.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/arm/hvf/hvf.c | 4 ++++
> target/i386/hvf/hvf.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
> index b77db99079e..478bc75fee6 100644
> --- a/target/arm/hvf/hvf.c
> +++ b/target/arm/hvf/hvf.c
> @@ -1927,6 +1927,10 @@ int hvf_vcpu_exec(CPUState *cpu)
> flush_cpu_state(cpu);
>
> bql_unlock();
> + /* Corresponding store-release is in cpu_exit. */
> + if (qatomic_load_acquire(&cpu->exit_request)) {
> + hv_vcpus_exit(&cpu->accel->fd, 1);
> + }
> r = hv_vcpu_run(cpu->accel->fd);
This looks odd.
hv_vcpus_exit: "Forces an immediate exit of a set of vCPUs of the VM".
But we know for a fact fd isn't running, because that's to start in the next line. I
suppose this must set a flag so that the kernel's hv_vcpu_run exits immediately with CANCELED?
Does executing of hv_vcpu_run buy us anything else? Is it less confusing to simply return
0, modulo bql fiddling?
> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
> index 8445cadecec..b7c4b849cdf 100644
> --- a/target/i386/hvf/hvf.c
> +++ b/target/i386/hvf/hvf.c
> @@ -749,6 +749,10 @@ int hvf_vcpu_exec(CPUState *cpu)
> return EXCP_HLT;
> }
>
> + /* Corresponding store-release is in cpu_exit. */
> + if (qatomic_load_acquire(&cpu->exit_request)) {
> + hv_vcpu_interrupt(&cpu->accel->fd, 1);
> + }
> hv_return_t r = hv_vcpu_run_until(cpu->accel->fd, HV_DEADLINE_FOREVER);
> assert_hvf_ok(r);
This, similarly, I guess returns EXCP_INTERRUPT. Is that better or worse than 0?
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/8] accel: use atomic accesses for exit_request
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
2025-08-08 21:04 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:10 ` Richard Henderson
2025-08-11 14:49 ` Alex Bennée
2025-08-11 19:33 ` Peter Xu
3 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:10 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:59, Paolo Bonzini wrote:
> CPU threads write exit_request as a "note to self" that they need to
> go out to a slow path. This write happens out of the BQL and can be
> a data race with another threads' cpu_exit(); use atomic accesses
> consistently.
>
> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> ---
> accel/tcg/tcg-accel-ops-mttcg.c | 2 +-
> accel/tcg/tcg-accel-ops-rr.c | 4 ++--
> hw/ppc/spapr_hcall.c | 6 +++---
> target/i386/kvm/kvm.c | 2 +-
> target/i386/nvmm/nvmm-accel-ops.c | 2 +-
> target/i386/nvmm/nvmm-all.c | 2 +-
> target/i386/whpx/whpx-all.c | 6 +++---
> 7 files changed, 12 insertions(+), 12 deletions(-)
s/1/true/
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-08 21:15 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:16 ` Richard Henderson
0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:16 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 07:15, Philippe Mathieu-Daudé wrote:
> On 8/8/25 20:59, Paolo Bonzini wrote:
>> qemu_cpu_kick() does not always have to set cpu->exit_request,
>> though it does for now. Introduce a function tcg_kick_vcpu_thread()
>> which can be changed in the future to not set cpu->exit_request,
>> and reserve cpu_exit() for when *all accelerators* need to set
>> that flag.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> docs/devel/tcg-icount.rst | 2 +-
>> accel/tcg/tcg-accel-ops-mttcg.h | 3 ---
>> include/exec/cpu-common.h | 1 +
>> accel/tcg/cpu-exec.c | 17 ++++++++++++++++-
>> accel/tcg/tcg-accel-ops-mttcg.c | 5 -----
>> accel/tcg/tcg-accel-ops-rr.c | 2 +-
>> accel/tcg/tcg-accel-ops.c | 2 +-
>> bsd-user/main.c | 2 +-
>> linux-user/main.c | 2 +-
>> 9 files changed, 22 insertions(+), 14 deletions(-)
>>
>> diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst
>> index 7df883446a7..a1dcd79e0fd 100644
>> --- a/docs/devel/tcg-icount.rst
>> +++ b/docs/devel/tcg-icount.rst
>> @@ -37,7 +37,7 @@ translator starts by allocating a budget of instructions to be
>> executed. The budget of instructions is limited by how long it will be
>> until the next timer will expire. We store this budget as part of a
>> vCPU icount_decr field which shared with the machinery for handling
>> -cpu_exit(). The whole field is checked at the start of every
>> +qemu_cpu_kick(). The whole field is checked at the start of every
>> translated block and will cause a return to the outer loop to deal
>> with whatever caused the exit.
>> diff --git a/accel/tcg/tcg-accel-ops-mttcg.h b/accel/tcg/tcg-accel-ops-mttcg.h
>> index 8ffa7a9a9fe..5c145cc8595 100644
>> --- a/accel/tcg/tcg-accel-ops-mttcg.h
>> +++ b/accel/tcg/tcg-accel-ops-mttcg.h
>> @@ -10,9 +10,6 @@
>> #ifndef TCG_ACCEL_OPS_MTTCG_H
>> #define TCG_ACCEL_OPS_MTTCG_H
>> -/* kick MTTCG vCPU thread */
>> -void mttcg_kick_vcpu_thread(CPUState *cpu);
>> -
>> /* start an mttcg vCPU thread */
>> void mttcg_start_vcpu_thread(CPUState *cpu);
>> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
>> index 9b658a3f48f..e843b09cc99 100644
>> --- a/include/exec/cpu-common.h
>> +++ b/include/exec/cpu-common.h
>> @@ -226,6 +226,7 @@ G_NORETURN void cpu_loop_exit(CPUState *cpu);
>> G_NORETURN void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc);
>> /* accel/tcg/cpu-exec.c */
>> +void tcg_kick_vcpu_thread(CPUState *cpu);
>
> "exec/cpu-common.h" is not the place, maybe "accel/tcg/cpu-ops.h"?
Agreed. At some point we should split the non-system qemu_cpu_kick() etc out of
{linux,bsd}-user to common-user, or something.
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 5/8] cpus: remove TCG-ism from cpu_exit()
2025-08-08 18:59 ` [PATCH 5/8] cpus: remove TCG-ism from cpu_exit() Paolo Bonzini
2025-08-08 21:17 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:17 ` Richard Henderson
1 sibling, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:17 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:59, Paolo Bonzini wrote:
> Now that TCG has its own kick function, make cpu_exit() do the right kick
> for all accelerators.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> hw/core/cpu-common.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
> index f189ce861c9..045b1778236 100644
> --- a/hw/core/cpu-common.c
> +++ b/hw/core/cpu-common.c
> @@ -86,9 +86,7 @@ void cpu_exit(CPUState *cpu)
> {
> /* Ensure cpu_exec will see the reason why the exit request was set. */
> qatomic_store_release(&cpu->exit_request, 1);
> - /* Ensure cpu_exec will see the exit request after TCG has exited. */
> - smp_wmb();
> - qatomic_set(&cpu->neg.icount_decr.u16.high, -1);
> + qemu_cpu_kick(cpu);
> }
>
> static int cpu_common_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-08 18:59 ` [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread Paolo Bonzini
2025-08-08 21:15 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:26 ` Richard Henderson
2025-08-11 6:10 ` Paolo Bonzini
1 sibling, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:26 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:59, Paolo Bonzini wrote:
> +void tcg_kick_vcpu_thread(CPUState *cpu)
> +{
> + /*
> + * Ensure cpu_exec will see the reason why the exit request was set.
> + * FIXME: this is not always needed. Other accelerators instead
> + * read interrupt_request and set exit_request on demand from the
> + * CPU thread; see kvm_arch_pre_run() for example.
> + */
> + qatomic_store_release(&cpu->exit_request, 1);
> +
> + /* Ensure cpu_exec will see the exit request after TCG has exited. */
> + qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
> +}
So, now both cpu_exit and cpu_kick set exit_request.
You ifdef this out again for user-only in patch 7, but this does suggest that kick and
exit are essentially interchangeable. You rearrange things a bit in patch 6, but it's
still not clear to me what the difference between the two should be. There's certainly
nothing at all in include/hw/core/cpu.h to differentiate them.
Should we instead eliminate one of kick or exit, unifying the paths?
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same
2025-08-08 18:59 ` [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same Paolo Bonzini
2025-08-08 21:24 ` Philippe Mathieu-Daudé
@ 2025-08-09 23:34 ` Richard Henderson
1 sibling, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2025-08-09 23:34 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: imammedo, peterx
On 8/9/25 04:59, Paolo Bonzini wrote:
> There is no reason for some accelerators to use qemu_wait_io_event_common
> (which is specifically separated for round robin). They can also check
> on the first pass through the loop as well directly, without setting
> cpu->exit_request for no particular reason.
>
> There is also no need to use qatomic_set_mb() because the ordering of
> the communication between I/O and vCPU threads is always the same.
> In the I/O thread:
>
> (a) store other memory locations that will be checked if cpu->exit_request
> or cpu->interrupt_request is 1 (for example cpu->stop or cpu->work_list
> for cpu->exit_request)
>
> (b) cpu_exit(): store-release cpu->exit_request, or
> (b) cpu_interrupt(): store-release cpu->interrupt_request
>
>>>> at this point, cpu->halt_cond is broadcast and the BQL released
>
> (c) do the accelerator-specific kick (e.g. write icount_decr for TCG,
> pthread_kill for KVM, etc.)
>
> In the vCPU thread instead the opposite order is respected:
>
> (c) the accelerator's execution loop exits thanks to the kick
>
> (b) then the inner execution loop checks cpu->interrupt_request
> and cpu->exit_request. If needed cpu->interrupt_request is
> converted into cpu->exit_request when work is needed outside
> the execution loop.
>
> (a) then the other memory locations are checked. Some may need
> to be read under the BQL, and the vCPU thread may also take
> for the vCPU thread can sleep on cpu->halt_cond; but in
> principle this step is correct even without the BQL.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/dummy-cpus.c | 2 +-
> accel/hvf/hvf-accel-ops.c | 2 +-
> accel/kvm/kvm-accel-ops.c | 3 ++-
> accel/kvm/kvm-all.c | 2 --
> accel/tcg/cpu-exec.c | 1 -
> accel/tcg/tcg-accel-ops-mttcg.c | 7 ++----
> accel/tcg/tcg-accel-ops-rr.c | 38 ++++++++++++++++---------------
> accel/tcg/tcg-accel-ops.c | 2 --
> system/cpus.c | 1 +
> target/i386/nvmm/nvmm-accel-ops.c | 6 ++---
> target/i386/nvmm/nvmm-all.c | 2 --
> target/i386/whpx/whpx-accel-ops.c | 6 ++---
> target/i386/whpx/whpx-all.c | 2 --
> 13 files changed, 31 insertions(+), 43 deletions(-)
I think this is doing two separate things: rearranging the qemu_wait_io_event, and ...
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 57e35960125..db95e06e768 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -3155,7 +3155,6 @@ int kvm_cpu_exec(CPUState *cpu)
> trace_kvm_cpu_exec();
>
> if (kvm_arch_process_async_events(cpu)) {
> - qatomic_set(&cpu->exit_request, 0);
> return EXCP_HLT;
> }
>
> @@ -3345,7 +3344,6 @@ int kvm_cpu_exec(CPUState *cpu)
> vm_stop(RUN_STATE_INTERNAL_ERROR);
> }
>
> - qatomic_set(&cpu->exit_request, 0);
> return ret;
> }
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index b9da2e3770e..f474ccb37f5 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -871,7 +871,6 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> * The corresponding store-release is in cpu_exit.
> */
> if (unlikely(qatomic_load_acquire(&cpu->exit_request)) || icount_exit_request(cpu)) {
> - qatomic_set(&cpu->exit_request, 0);
> if (cpu->exception_index == -1) {
> cpu->exception_index = EXCP_INTERRUPT;
> }
> diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
> index f4d5372866a..ad3f29107e1 100644
> --- a/accel/tcg/tcg-accel-ops.c
> +++ b/accel/tcg/tcg-accel-ops.c
> @@ -82,8 +82,6 @@ int tcg_cpu_exec(CPUState *cpu)
> ret = cpu_exec(cpu);
> cpu_exec_end(cpu);
>
> - qatomic_set_mb(&cpu->exit_request, 0);
> -
> return ret;
> }
>
> diff --git a/system/cpus.c b/system/cpus.c
> index d2cfa2a9c4e..0cc14eae6a0 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -458,6 +458,7 @@ void qemu_wait_io_event(CPUState *cpu)
> {
> bool slept = false;
>
> + qatomic_set(&cpu->exit_request, false);
> while (cpu_thread_is_idle(cpu)) {
> if (!slept) {
> slept = true;
> diff --git a/target/i386/nvmm/nvmm-all.c b/target/i386/nvmm/nvmm-all.c
> index d2d90f38976..09839d45b92 100644
> --- a/target/i386/nvmm/nvmm-all.c
> +++ b/target/i386/nvmm/nvmm-all.c
> @@ -817,8 +817,6 @@ nvmm_vcpu_loop(CPUState *cpu)
> cpu_exec_end(cpu);
> bql_lock();
>
> - qatomic_set(&cpu->exit_request, false);
> -
> return ret < 0;
> }
>
> diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
> index 9b07716121a..2e248a0a6d5 100644
> --- a/target/i386/whpx/whpx-all.c
> +++ b/target/i386/whpx/whpx-all.c
> @@ -2050,8 +2050,6 @@ static int whpx_vcpu_run(CPUState *cpu)
> whpx_last_vcpu_stopping(cpu);
> }
>
> - qatomic_set(&cpu->exit_request, false);
> -
> return ret < 0;
> }
>
... sinking the clear of exit_request.
It would be nice to actually unify the run loop that you're manipulating here. But I
suppose that can be a follow-up.
r~
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-09 23:26 ` Richard Henderson
@ 2025-08-11 6:10 ` Paolo Bonzini
2025-08-11 8:33 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-11 6:10 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: imammedo, peterx
On 8/10/25 01:26, Richard Henderson wrote:
> On 8/9/25 04:59, Paolo Bonzini wrote:
>> +void tcg_kick_vcpu_thread(CPUState *cpu)
>> +{
>> + /*
>> + * Ensure cpu_exec will see the reason why the exit request was set.
>> + * FIXME: this is not always needed. Other accelerators instead
>> + * read interrupt_request and set exit_request on demand from the
>> + * CPU thread; see kvm_arch_pre_run() for example.
>> + */
>> + qatomic_store_release(&cpu->exit_request, 1);
>> +
>> + /* Ensure cpu_exec will see the exit request after TCG has
>> exited. */
>> + qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
>> +}
>
> So, now both cpu_exit and cpu_kick set exit_request.
>
> You ifdef this out again for user-only in patch 7, but this does suggest
> that kick and exit are essentially interchangeable. You rearrange
> things a bit in patch 6, but it's still not clear to me what the
> difference between the two should be. There's certainly nothing at all
> in include/hw/core/cpu.h to differentiate them.
>
> Should we instead eliminate one of kick or exit, unifying the paths?
In cpu-exec.c terms, qemu_cpu_kick() *should* go out to
cpu_handle_interrupt() whereas cpu_exit() *should* go out to
cpu_handle_exception(). The difference matters for some accelerators
where qemu_cpu_kick() tries not to take the BQL in the vCPU thread.
Until now TCG's implementation of kick_vcpu_thread set both exit_request
and interrupt_request, and I'm not changing that yet for system
emulation. Patch 7 does that for user-mode emulation, because it's
trivial: neither linux-user not bsd-user use qemu_cpu_kick() directly.
Paolo
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/8] accel/hvf: check exit_request before running the vCPU
2025-08-09 23:09 ` Richard Henderson
@ 2025-08-11 7:06 ` Paolo Bonzini
0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-08-11 7:06 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, Phil Dennis-Jordan; +Cc: imammedo, peterx
On 8/10/25 01:09, Richard Henderson wrote:
> On 8/9/25 04:58, Paolo Bonzini wrote:
>> This is done by all other accelerators, but was missing for
>> Hypervisor.framework.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> target/arm/hvf/hvf.c | 4 ++++
>> target/i386/hvf/hvf.c | 4 ++++
>> 2 files changed, 8 insertions(+)
>>
>> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
>> index b77db99079e..478bc75fee6 100644
>> --- a/target/arm/hvf/hvf.c
>> +++ b/target/arm/hvf/hvf.c
>> @@ -1927,6 +1927,10 @@ int hvf_vcpu_exec(CPUState *cpu)
>> flush_cpu_state(cpu);
>> bql_unlock();
>> + /* Corresponding store-release is in cpu_exit. */
>> + if (qatomic_load_acquire(&cpu->exit_request)) {
>> + hv_vcpus_exit(&cpu->accel->fd, 1);
>> + }
>> r = hv_vcpu_run(cpu->accel->fd);
>
> This looks odd.
>
> hv_vcpus_exit: "Forces an immediate exit of a set of vCPUs of the VM".
>
> But we know for a fact fd isn't running, because that's to start in the
> next line. I suppose this must set a flag so that the kernel's
> hv_vcpu_run exits immediately with CANCELED?
>
> Does executing of hv_vcpu_run buy us anything else? Is it less
> confusing to simply return 0, modulo bql fiddling?
>
>> diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
>> index 8445cadecec..b7c4b849cdf 100644
>> --- a/target/i386/hvf/hvf.c
>> +++ b/target/i386/hvf/hvf.c
>> @@ -749,6 +749,10 @@ int hvf_vcpu_exec(CPUState *cpu)
>> return EXCP_HLT;
>> }
>> + /* Corresponding store-release is in cpu_exit. */
>> + if (qatomic_load_acquire(&cpu->exit_request)) {
>> + hv_vcpu_interrupt(&cpu->accel->fd, 1);
>> + }
>> hv_return_t r = hv_vcpu_run_until(cpu->accel->fd,
>> HV_DEADLINE_FOREVER);
>> assert_hvf_ok(r);
>
> This, similarly, I guess returns EXCP_INTERRUPT. Is that better or
> worse than 0?
The advantage is that you can reuse the code for when another thread
kicks you out of the execution loop. For x86, for example, the effects
of hvf_inject_interrupts() are undone by hv_vcpu_run() + hvf_store_events().
But on second thought this patch is not needed. The kick is handled
completely by Apple code and that's where any synchronization must
happen. In fact, there should be no need for hvf_kick_vcpu_thread() to
call cpus_kick_thread() even (though I'd leave it to more HVF-literate
people like Phil to do that).
Paolo
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-11 6:10 ` Paolo Bonzini
@ 2025-08-11 8:33 ` Philippe Mathieu-Daudé
2025-08-11 13:34 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-11 8:33 UTC (permalink / raw)
To: Paolo Bonzini, Richard Henderson, qemu-devel; +Cc: imammedo, peterx
On 11/8/25 08:10, Paolo Bonzini wrote:
> On 8/10/25 01:26, Richard Henderson wrote:
>> On 8/9/25 04:59, Paolo Bonzini wrote:
>>> +void tcg_kick_vcpu_thread(CPUState *cpu)
>>> +{
>>> + /*
>>> + * Ensure cpu_exec will see the reason why the exit request was
>>> set.
>>> + * FIXME: this is not always needed. Other accelerators instead
>>> + * read interrupt_request and set exit_request on demand from the
>>> + * CPU thread; see kvm_arch_pre_run() for example.
>>> + */
>>> + qatomic_store_release(&cpu->exit_request, 1);
>>> +
>>> + /* Ensure cpu_exec will see the exit request after TCG has
>>> exited. */
>>> + qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
>>> +}
>>
>> So, now both cpu_exit and cpu_kick set exit_request.
>>
>> You ifdef this out again for user-only in patch 7, but this does
>> suggest that kick and exit are essentially interchangeable. You
>> rearrange things a bit in patch 6, but it's still not clear to me what
>> the difference between the two should be. There's certainly nothing
>> at all in include/hw/core/cpu.h to differentiate them.
>>
>> Should we instead eliminate one of kick or exit, unifying the paths?
> In cpu-exec.c terms, qemu_cpu_kick() *should* go out to
> cpu_handle_interrupt() whereas cpu_exit() *should* go out to
> cpu_handle_exception(). The difference matters for some accelerators
> where qemu_cpu_kick() tries not to take the BQL in the vCPU thread.
While I'm still learning this area, this answer makes sense to me.
If this is correct (reviewed by others), can we document it in the
corresponding prototype declarations?
>
> Until now TCG's implementation of kick_vcpu_thread set both exit_request
> and interrupt_request, and I'm not changing that yet for system
> emulation.
For long I tried to figure out why both are set there.
> Patch 7 does that for user-mode emulation, because it's
> trivial: neither linux-user not bsd-user use qemu_cpu_kick() directly.
>
> Paolo
>
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop
2025-08-08 18:59 ` [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop Paolo Bonzini
@ 2025-08-11 12:56 ` Alex Bennée
0 siblings, 0 replies; 31+ messages in thread
From: Alex Bennée @ 2025-08-11 12:56 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, imammedo, richard.henderson, peterx
Paolo Bonzini <pbonzini@redhat.com> writes:
> Now that cpu_exit() actually kicks all accelerators, use it whenever
> the message to another thread is processed in qemu_wait_io_event()
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
<snip>
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -509,8 +509,8 @@ static target_ulong h_cede(PowerPCCPU *cpu, SpaprMachineState *spapr,
> if (!cpu_has_work(cs)) {
> cs->halted = 1;
> cs->exception_index = EXCP_HLT;
> - qatomic_set(&cs->exit_request, 1);
> ppc_maybe_interrupt(env);
> + cpu_exit(env);
Should be cs
> }
>
> return H_SUCCESS;
> @@ -531,8 +531,8 @@ static target_ulong h_confer_self(PowerPCCPU *cpu)
> }
> cs->halted = 1;
> cs->exception_index = EXCP_HALTED;
> - qatomic_set(&cs->exit_request, 1);
> ppc_maybe_interrupt(&cpu->env);
> + cpu_exit(&cpu->env);
Should be cs
<snip>
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread
2025-08-11 8:33 ` Philippe Mathieu-Daudé
@ 2025-08-11 13:34 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-11 13:34 UTC (permalink / raw)
To: Paolo Bonzini, Richard Henderson, qemu-devel; +Cc: imammedo, peterx
On 11/8/25 10:33, Philippe Mathieu-Daudé wrote:
> On 11/8/25 08:10, Paolo Bonzini wrote:
>> On 8/10/25 01:26, Richard Henderson wrote:
>>> On 8/9/25 04:59, Paolo Bonzini wrote:
>>>> +void tcg_kick_vcpu_thread(CPUState *cpu)
>>>> +{
>>>> + /*
>>>> + * Ensure cpu_exec will see the reason why the exit request was
>>>> set.
>>>> + * FIXME: this is not always needed. Other accelerators instead
>>>> + * read interrupt_request and set exit_request on demand from the
>>>> + * CPU thread; see kvm_arch_pre_run() for example.
>>>> + */
>>>> + qatomic_store_release(&cpu->exit_request, 1);
>>>> +
>>>> + /* Ensure cpu_exec will see the exit request after TCG has
>>>> exited. */
>>>> + qatomic_store_release(&cpu->neg.icount_decr.u16.high, -1);
>>>> +}
>>>
>>> So, now both cpu_exit and cpu_kick set exit_request.
>>>
>>> You ifdef this out again for user-only in patch 7, but this does
>>> suggest that kick and exit are essentially interchangeable. You
>>> rearrange things a bit in patch 6, but it's still not clear to me
>>> what the difference between the two should be. There's certainly
>>> nothing at all in include/hw/core/cpu.h to differentiate them.
>>>
>>> Should we instead eliminate one of kick or exit, unifying the paths?
>> In cpu-exec.c terms, qemu_cpu_kick() *should* go out to
>> cpu_handle_interrupt() whereas cpu_exit() *should* go out to
>> cpu_handle_exception(). The difference matters for some accelerators
>> where qemu_cpu_kick() tries not to take the BQL in the vCPU thread.
>
> While I'm still learning this area, this answer makes sense to me.
> If this is correct (reviewed by others), can we document it in the
> corresponding prototype declarations?
Maybe we can rename as something like cpu_kick_for_interrupt() and
cpu_exit_for_exception()? Or cpu_kick_maybe_[handle]_interrupt()
and cpu_exit_maybe_[handle]_exception()...
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/8] accel: use atomic accesses for exit_request
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
2025-08-08 21:04 ` Philippe Mathieu-Daudé
2025-08-09 23:10 ` Richard Henderson
@ 2025-08-11 14:49 ` Alex Bennée
2025-08-11 19:33 ` Peter Xu
3 siblings, 0 replies; 31+ messages in thread
From: Alex Bennée @ 2025-08-11 14:49 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, imammedo, richard.henderson, peterx
Paolo Bonzini <pbonzini@redhat.com> writes:
> CPU threads write exit_request as a "note to self" that they need to
> go out to a slow path. This write happens out of the BQL and can be
> a data race with another threads' cpu_exit(); use atomic accesses
> consistently.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> accel/tcg/tcg-accel-ops-mttcg.c | 2 +-
> accel/tcg/tcg-accel-ops-rr.c | 4 ++--
> hw/ppc/spapr_hcall.c | 6 +++---
> target/i386/kvm/kvm.c | 2 +-
> target/i386/nvmm/nvmm-accel-ops.c | 2 +-
> target/i386/nvmm/nvmm-all.c | 2 +-
> target/i386/whpx/whpx-all.c | 6 +++---
> 7 files changed, 12 insertions(+), 12 deletions(-)
Could you please update include/hw/core/cpu.h with a doc patch to
describe exit_request and note it should be read/written with qatomic
primitives please.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
2025-08-09 22:41 ` Richard Henderson
@ 2025-08-11 19:31 ` Peter Xu
1 sibling, 0 replies; 31+ messages in thread
From: Peter Xu @ 2025-08-11 19:31 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, imammedo, richard.henderson
On Fri, Aug 08, 2025 at 08:58:58PM +0200, Paolo Bonzini wrote:
> cpu->exit_request do not use a load-acquire/store-release pair right now,
> but this means that cpu_exit() does not store it after any flags that are
> read in the slow path.
>
> Probably everything is protected one way or the other by the BQL, because
> after reading cpu->exit_request the CPU thread often goes to sleep
> (by waiting on the BQL-protected cpu->halt_cond), but it's not clear.
> Use load-acquire/store-release consistently.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/8] accel: use atomic accesses for exit_request
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
` (2 preceding siblings ...)
2025-08-11 14:49 ` Alex Bennée
@ 2025-08-11 19:33 ` Peter Xu
3 siblings, 0 replies; 31+ messages in thread
From: Peter Xu @ 2025-08-11 19:33 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, imammedo, richard.henderson
On Fri, Aug 08, 2025 at 08:59:00PM +0200, Paolo Bonzini wrote:
> CPU threads write exit_request as a "note to self" that they need to
> go out to a slow path. This write happens out of the BQL and can be
> a data race with another threads' cpu_exit(); use atomic accesses
> consistently.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2025-08-11 19:35 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-08 18:58 [PATCH 0/8] accel, cpus: clean up cpu->exit_request Paolo Bonzini
2025-08-08 18:58 ` [PATCH 1/8] accel: use store_release/load_acquire for cross-thread exit_request Paolo Bonzini
2025-08-09 22:41 ` Richard Henderson
2025-08-11 19:31 ` Peter Xu
2025-08-08 18:58 ` [PATCH 2/8] accel/hvf: check exit_request before running the vCPU Paolo Bonzini
2025-08-08 21:28 ` Philippe Mathieu-Daudé
2025-08-09 23:09 ` Richard Henderson
2025-08-11 7:06 ` Paolo Bonzini
2025-08-08 18:59 ` [PATCH 3/8] accel: use atomic accesses for exit_request Paolo Bonzini
2025-08-08 21:04 ` Philippe Mathieu-Daudé
2025-08-09 23:10 ` Richard Henderson
2025-08-11 14:49 ` Alex Bennée
2025-08-11 19:33 ` Peter Xu
2025-08-08 18:59 ` [PATCH 4/8] accel/tcg: introduce tcg_kick_vcpu_thread Paolo Bonzini
2025-08-08 21:15 ` Philippe Mathieu-Daudé
2025-08-09 23:16 ` Richard Henderson
2025-08-09 23:26 ` Richard Henderson
2025-08-11 6:10 ` Paolo Bonzini
2025-08-11 8:33 ` Philippe Mathieu-Daudé
2025-08-11 13:34 ` Philippe Mathieu-Daudé
2025-08-08 18:59 ` [PATCH 5/8] cpus: remove TCG-ism from cpu_exit() Paolo Bonzini
2025-08-08 21:17 ` Philippe Mathieu-Daudé
2025-08-09 23:17 ` Richard Henderson
2025-08-08 18:59 ` [PATCH 6/8] cpus: properly kick CPUs out of inner execution loop Paolo Bonzini
2025-08-11 12:56 ` Alex Bennée
2025-08-08 18:59 ` [PATCH 7/8] tcg/user: do not set exit_request gratuitously Paolo Bonzini
2025-08-08 21:21 ` Philippe Mathieu-Daudé
2025-08-08 21:45 ` Paolo Bonzini
2025-08-08 18:59 ` [PATCH 8/8] accel: make all calls to qemu_wait_io_event look the same Paolo Bonzini
2025-08-08 21:24 ` Philippe Mathieu-Daudé
2025-08-09 23:34 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).