From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] [PATCH 13/30] qemu-thread: report RCU quiescent states
Date: Fri, 28 Jun 2013 20:26:32 +0200
Message-ID: <1372444009-11544-14-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1372444009-11544-1-git-send-email-pbonzini@redhat.com>
Most threads will use mutexes and other sleeping synchronization primitives
(condition variables, semaphores, events) periodically. For these threads,
the synchronization primitives are natural places to report a quiescent
state (possibly an extended one).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
docs/rcu.txt | 33 ++++++++++++++++++++++++++++++++-
util/qemu-thread-posix.c | 30 ++++++++++++++++++++++++++----
util/qemu-thread-win32.c | 16 +++++++++++++++-
util/rcu.c | 3 ---
4 files changed, 73 insertions(+), 9 deletions(-)
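Reviewer note (not part of the commit): the first manual-annotation case
described in docs/rcu.txt below, a thread that sleeps in poll(), could be
handled along these lines. This is only a sketch. The rcu_thread_offline()
and rcu_thread_online() bodies here are counting stand-ins so that the
snippet builds outside the QEMU tree, and poll_quiescent() is a hypothetical
helper name, not something this series adds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>

/* Stand-ins for QEMU's rcu_thread_offline()/rcu_thread_online().
 * They only count calls; the real functions live in util/rcu.c. */
static int offline_depth;   /* > 0 while the thread is offline */
static int offline_calls;   /* how many times we went offline  */

static void rcu_thread_offline(void)
{
    offline_depth++;
    offline_calls++;
}

static void rcu_thread_online(void)
{
    assert(offline_depth > 0);
    offline_depth--;
}

/* Hypothetical helper: sleep in poll() while reporting an extended
 * quiescent state.  The caller must not be inside an RCU read-side
 * critical section, and must not touch RCU-protected data between
 * the offline and online calls. */
static int poll_quiescent(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ret;

    rcu_thread_offline();
    ret = poll(fds, nfds, timeout_ms);
    rcu_thread_online();
    return ret;
}
```

The offline/online pair brackets only the system call, so the thread is
back online before it inspects any result that might lead it to
dereference RCU-protected pointers.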
diff --git a/docs/rcu.txt b/docs/rcu.txt
index a3510b9..6c4a852 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -168,6 +168,35 @@ of "quiescent states", i.e. points where no RCU read-side critical
section can be active. All threads created with qemu_thread_create
participate in the RCU mechanism and need to annotate such points.
+Luckily, in most cases no manual annotation is needed, because waiting
+on condition variables (qemu_cond_wait), semaphores (qemu_sem_wait,
+qemu_sem_timedwait) or events (qemu_event_wait) implicitly marks the thread
+as quiescent for the whole duration of the wait. (There is an exception
+for semaphore waits with a zero timeout.)
+
+Manual annotation is still needed in the following cases:
+
+- threads that spend their sleeping time in the kernel, for example
+ in a call to select(), poll(), sigwait() or WaitForMultipleObjects().
+ The QEMU I/O thread is an example of this case. When running under
+ KVM, VCPUs are also in a quiescent state while running the guest.
+
+- threads that perform a lot of I/O. In QEMU, the workers used for
+ aio=thread are an example of this case (see aio_worker in block/raw-*).
+
+- threads that run continuously until they exit. The migration thread
+ is an example of this case.
+
+Regarding the second case, note that the workers run in the QEMU thread
+pool. The thread pool uses semaphores for synchronization, hence it does
+report quiescent states periodically. However, in some cases (e.g. NFS
+mounted with the "hard" option) the workers can take an arbitrarily long
+amount of time. When this happens, synchronize_rcu() will not return and
+call_rcu() callbacks will be delayed arbitrarily. It is therefore a
+good idea to mark I/O system calls as quiescence points in the worker
+functions.
+
+
Marking quiescent states is done with the following three APIs:
void rcu_quiescent_state(void);
@@ -229,7 +258,9 @@ DIFFERENCES WITH LINUX
type of the callback's argument to be the type of the first argument.
call_rcu1 is the same as Linux's call_rcu.
-- Quiescent points must be marked explicitly in the thread.
+- Quiescent points must be marked explicitly unless the thread uses
+ condvars/semaphores/events for synchronization. Note that mutexes
+ do not report quiescent points (see the first item above).
RCU PATTERNS
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 2df3382..21190be 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -119,7 +119,9 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
{
int err;
+ rcu_thread_offline();
err = pthread_cond_wait(&cond->cond, &mutex->lock);
+ rcu_thread_online();
if (err)
error_exit(err, __func__);
}
@@ -212,6 +214,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
int rc;
struct timespec ts;
+ if (ms) {
+ rcu_thread_offline();
+ }
+
#if defined(__APPLE__) || defined(__NetBSD__)
compute_abs_deadline(&ts, ms);
pthread_mutex_lock(&sem->lock);
@@ -227,7 +233,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
}
}
pthread_mutex_unlock(&sem->lock);
- return (rc == ETIMEDOUT ? -1 : 0);
+ if (rc == ETIMEDOUT) {
+ rc = -1;
+ }
+
#else
if (ms <= 0) {
/* This is cheaper than sem_timedwait. */
@@ -235,7 +244,7 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
rc = sem_trywait(&sem->sem);
} while (rc == -1 && errno == EINTR);
if (rc == -1 && errno == EAGAIN) {
- return -1;
+ goto out;
}
} else {
compute_abs_deadline(&ts, ms);
@@ -243,18 +252,25 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
rc = sem_timedwait(&sem->sem, &ts);
} while (rc == -1 && errno == EINTR);
if (rc == -1 && errno == ETIMEDOUT) {
- return -1;
+ goto out;
}
}
if (rc < 0) {
error_exit(errno, __func__);
}
- return 0;
#endif
+
+out:
+ if (ms) {
+ rcu_thread_online();
+ }
+ return rc;
}
void qemu_sem_wait(QemuSemaphore *sem)
{
+ rcu_thread_offline();
+
#if defined(__APPLE__) || defined(__NetBSD__)
pthread_mutex_lock(&sem->lock);
--sem->count;
@@ -272,6 +288,8 @@ void qemu_sem_wait(QemuSemaphore *sem)
error_exit(errno, __func__);
}
#endif
+
+ rcu_thread_online();
}
#ifdef __linux__
@@ -380,7 +398,11 @@ void qemu_event_wait(QemuEvent *ev)
return;
}
}
+ rcu_thread_offline();
futex_wait(ev, EV_BUSY);
+ rcu_thread_online();
+ } else {
+ rcu_quiescent_state();
}
}
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 18978be..9c14cf1 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -12,6 +12,7 @@
*/
#include "qemu-common.h"
#include "qemu/thread.h"
+#include "qemu/rcu.h"
#include <process.h>
#include <assert.h>
#include <limits.h>
@@ -187,7 +188,9 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
* leaving mutex unlocked before we wait on semaphore.
*/
qemu_mutex_unlock(mutex);
+ rcu_thread_offline();
WaitForSingleObject(cond->sema, INFINITE);
+ rcu_thread_online();
/* Now waiters must rendez-vous with the signaling thread and
* let it continue. For cond_broadcast this has heavy contention
@@ -227,7 +230,16 @@ void qemu_sem_post(QemuSemaphore *sem)
int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
{
- int rc = WaitForSingleObject(sem->sema, ms);
+ int rc;
+
+ if (ms) {
+ rcu_thread_offline();
+ }
+ rc = WaitForSingleObject(sem->sema, ms);
+ if (ms) {
+ rcu_thread_online();
+ }
+
if (rc == WAIT_OBJECT_0) {
return 0;
}
@@ -267,7 +279,9 @@ void qemu_event_reset(QemuEvent *ev)
void qemu_event_wait(QemuEvent *ev)
{
+ rcu_thread_offline();
WaitForSingleObject(ev->event, INFINITE);
+ rcu_thread_online();
}
struct QemuThreadData {
diff --git a/util/rcu.c b/util/rcu.c
index f1c5736..654f3bb 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -232,9 +232,6 @@ static void *call_rcu_thread(void *opaque)
{
struct rcu_head *node;
- /* This thread is just a writer. */
- rcu_thread_offline();
-
for (;;) {
int tries = 0;
int n = atomic_read(&rcu_call_count);
--
1.8.1.4
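Reviewer note: the zero-timeout exception in qemu_sem_timedwait() can be
illustrated with a standalone sketch on top of plain POSIX semaphores. It
assumes a platform where unnamed semaphores work (for example Linux). The
RCU functions are again counting stand-ins, sem_timedwait_quiescent() is a
hypothetical name, and, unlike the patch, every failure is folded into -1
instead of calling error_exit().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Counting stand-ins for QEMU's rcu_thread_offline()/rcu_thread_online(). */
static int offline_depth;
static int offline_calls;

static void rcu_thread_offline(void)
{
    offline_depth++;
    offline_calls++;
}

static void rcu_thread_online(void)
{
    assert(offline_depth > 0);
    offline_depth--;
}

/* Hypothetical wrapper: returns 0 if the semaphore was taken, -1 otherwise.
 * As in the patch, a zero timeout skips the offline/online pair because
 * the call cannot sleep. */
static int sem_timedwait_quiescent(sem_t *sem, int ms)
{
    struct timespec ts;
    int rc;

    if (ms) {
        rcu_thread_offline();
    }

    if (ms <= 0) {
        /* Non-blocking attempt; retry only if interrupted by a signal. */
        do {
            rc = sem_trywait(sem);
        } while (rc == -1 && errno == EINTR);
    } else {
        /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec += ms / 1000;
        ts.tv_nsec += (long)(ms % 1000) * 1000000L;
        if (ts.tv_nsec >= 1000000000L) {
            ts.tv_sec++;
            ts.tv_nsec -= 1000000000L;
        }
        do {
            rc = sem_timedwait(sem, &ts);
        } while (rc == -1 && errno == EINTR);
    }

    /* Single exit path, so every offline call has a matching online call. */
    if (ms) {
        rcu_thread_online();
    }
    return rc == 0 ? 0 : -1;
}
```

Funnelling every return through one exit path is the same reason the patch
replaces the early "return -1" statements with "goto out".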