From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>,
Phil Dennis-Jordan <phil@philjordan.eu>
Subject: [PULL 20/31] qemu-thread: Avoid futex abstraction for non-Linux
Date: Fri, 6 Jun 2025 14:34:34 +0200
Message-ID: <20250606123447.538131-21-pbonzini@redhat.com>
In-Reply-To: <20250606123447.538131-1-pbonzini@redhat.com>
From: Akihiko Odaki <akihiko.odaki@daynix.com>

On POSIX systems other than Linux, qemu-thread used to emulate futex
with pthread primitives for the QemuEvent implementation. However,
this emulation differs from a real futex in one key respect: pthread
primitives require explicit destruction, and the destruction must be
ordered after all wait and wake operations.

Destruction would be easier if a wait operation ensured that the
corresponding wake operation had finished, as a POSIX semaphore does,
but that requires protecting the state accesses in qemu_event_set() and
qemu_event_wait() with a mutex. A real futex, on the other hand, does
not need such protection but needs complex barriers and atomic
operations to ensure ordering between the two functions.

Add special implementations of qemu_event_set() and qemu_event_wait()
using pthread primitives. qemu_event_wait() now ensures that
qemu_event_set() has finished, and these functions no longer need
complex barriers and atomic operations to ensure ordering between them.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Link: https://lore.kernel.org/r/20250526-event-v4-5-5b784cc8e1de@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
util/qemu-thread-posix.c | 84 +++++++++++++++++++++++++---------------
1 file changed, 53 insertions(+), 31 deletions(-)
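
For readers unfamiliar with the pattern, the mutex-protected scheme
described in the commit message can be sketched in isolation as below.
This is only an illustration under stated assumptions: SketchEvent and
the sketch_* functions are hypothetical names, a plain bool replaces the
EV_SET/EV_FREE/EV_BUSY states, and none of it is QEMU's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for QemuEvent; not the QEMU definition. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool set;               /* protected by lock */
} SketchEvent;

static void sketch_event_init(SketchEvent *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->set = false;
}

/* The state change and the wakeup happen in one critical section. */
static void sketch_event_set(SketchEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/*
 * The waiter always takes the lock, so it cannot return while the
 * setter is still inside its critical section.
 */
static void sketch_event_wait(SketchEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->set) {
        pthread_cond_wait(&ev->cond, &ev->lock);
    }
    pthread_mutex_unlock(&ev->lock);
}

static void sketch_event_destroy(SketchEvent *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}

static void *sketch_setter(void *opaque)
{
    sketch_event_set(opaque);
    return NULL;
}

/* Returns 1 if the waiter saw the event set, -1 on thread creation failure. */
static int sketch_demo(void)
{
    SketchEvent ev;
    pthread_t tid;
    int seen;

    sketch_event_init(&ev);
    if (pthread_create(&tid, NULL, sketch_setter, &ev) != 0) {
        sketch_event_destroy(&ev);
        return -1;
    }
    sketch_event_wait(&ev);
    seen = ev.set ? 1 : 0;
    /* The setter has left its critical section, so destroy comes after it. */
    sketch_event_destroy(&ev);
    pthread_join(tid, NULL);
    return seen;
}
```

Because the waiter has to reacquire the mutex before returning, the
setter's broadcast and unlock are ordered before the destroy call. No
barriers are needed beyond those the mutex already provides.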
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 3dc4d30052e..7fafbedbc4f 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -319,38 +319,23 @@ void qemu_sem_wait(QemuSemaphore *sem)
#ifdef CONFIG_LINUX
#include "qemu/futex.h"
-#else
-static inline void qemu_futex_wake(QemuEvent *ev, int n)
-{
- assert(ev->initialized);
- pthread_mutex_lock(&ev->lock);
- if (n == 1) {
- pthread_cond_signal(&ev->cond);
- } else {
- pthread_cond_broadcast(&ev->cond);
- }
- pthread_mutex_unlock(&ev->lock);
-}
-
-static inline void qemu_futex_wait(QemuEvent *ev, unsigned val)
-{
- assert(ev->initialized);
- pthread_mutex_lock(&ev->lock);
- if (ev->value == val) {
- pthread_cond_wait(&ev->cond, &ev->lock);
- }
- pthread_mutex_unlock(&ev->lock);
-}
#endif
/* Valid transitions:
- * - free->set, when setting the event
- * - busy->set, when setting the event, followed by qemu_futex_wake_all
- * - set->free, when resetting the event
- * - free->busy, when waiting
+ * - FREE -> SET (qemu_event_set)
+ * - BUSY -> SET (qemu_event_set)
+ * - SET -> FREE (qemu_event_reset)
+ * - FREE -> BUSY (qemu_event_wait)
*
- * set->busy does not happen (it can be observed from the outside but
- * it really is set->free->busy).
+ * With futex, the waking and blocking operations follow
+ * BUSY -> SET and FREE -> BUSY, respectively.
+ *
+ * Without futex, BUSY -> SET and FREE -> BUSY never happen. Instead, the waking
+ * operation follows FREE -> SET and the blocking operation will happen in
+ * qemu_event_wait() if the event is not SET.
+ *
+ * SET->BUSY does not happen (it can be observed from the outside but
+ * it really is SET->FREE->BUSY).
*
* busy->free provably cannot happen; to enforce it, the set->free transition
* is done with an OR, which becomes a no-op if the event has concurrently
@@ -386,6 +371,7 @@ void qemu_event_set(QemuEvent *ev)
{
assert(ev->initialized);
+#ifdef CONFIG_LINUX
/*
* Pairs with both qemu_event_reset() and qemu_event_wait().
*
@@ -403,12 +389,20 @@ void qemu_event_set(QemuEvent *ev)
qemu_futex_wake_all(ev);
}
}
+#else
+ pthread_mutex_lock(&ev->lock);
+ /* Pairs with qemu_event_reset()'s load acquire. */
+ qatomic_store_release(&ev->value, EV_SET);
+ pthread_cond_broadcast(&ev->cond);
+ pthread_mutex_unlock(&ev->lock);
+#endif
}
void qemu_event_reset(QemuEvent *ev)
{
assert(ev->initialized);
+#ifdef CONFIG_LINUX
/*
* If there was a concurrent reset (or even reset+wait),
* do nothing. Otherwise change EV_SET->EV_FREE.
@@ -420,21 +414,42 @@ void qemu_event_reset(QemuEvent *ev)
* Pairs with the first memory barrier in qemu_event_set().
*/
smp_mb__after_rmw();
+#else
+ /*
+ * If futexes are not available, there are no EV_FREE->EV_BUSY
+ * transitions because wakeups are done entirely through the
+ * condition variable. Since qatomic_set() only writes EV_FREE,
+ * the load seems useless but in reality, the acquire synchronizes
+ * with qemu_event_set()'s store release: if qemu_event_reset()
+ * sees EV_SET here, then the caller will certainly see a
+ * successful condition and skip qemu_event_wait():
+ *
+ * done = 1; if (done == 0)
+ * qemu_event_set() { qemu_event_reset() {
+ * lock();
+ * ev->value = EV_SET -----> load ev->value
+ * ev->value = old value | EV_FREE
+ * cond_broadcast()
+ * unlock(); }
+ * } if (done == 0)
+ * // qemu_event_wait() not called
+ */
+ qatomic_set(&ev->value, qatomic_load_acquire(&ev->value) | EV_FREE);
+#endif
}
void qemu_event_wait(QemuEvent *ev)
{
- unsigned value;
-
assert(ev->initialized);
+#ifdef CONFIG_LINUX
while (true) {
/*
* qemu_event_wait must synchronize with qemu_event_set even if it does
* not go down the slow path, so this load-acquire is needed that
* synchronizes with the first memory barrier in qemu_event_set().
*/
- value = qatomic_load_acquire(&ev->value);
+ unsigned value = qatomic_load_acquire(&ev->value);
if (value == EV_SET) {
break;
}
@@ -463,6 +478,13 @@ void qemu_event_wait(QemuEvent *ev)
*/
qemu_futex_wait(ev, EV_BUSY);
}
+#else
+ pthread_mutex_lock(&ev->lock);
+ while (qatomic_read(&ev->value) != EV_SET) {
+ pthread_cond_wait(&ev->cond, &ev->lock);
+ }
+ pthread_mutex_unlock(&ev->lock);
+#endif
}
static __thread NotifierList thread_exit;
--
2.49.0