* [PATCH 00/18] block: Introduce a block graph rwlock
@ 2022-12-07 13:18 Kevin Wolf
2022-12-07 13:18 ` [PATCH 01/18] block: Factor out bdrv_drain_all_begin_nopoll() Kevin Wolf
` (18 more replies)
0 siblings, 19 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
This series supersedes the first half of Emanuele's "Protect the block
layer with a rwlock: part 1". It introduces the basic infrastructure for
protecting the block graph (specifically parent/child links) with a
rwlock. Actually taking the reader lock in all necessary places is left
for future series.
Compared to Emanuele's series, this one adds patches to make use of
clang's Thread Safety Analysis (TSA) feature in order to statically
check at compile time that the places where we assert that we hold the
lock actually do hold it. Once we cover all relevant places, the check
can be extended to verify that all accesses of bs->children and
bs->parents hold the lock.
For reference, here is the more detailed version of our plan in
Emanuele's words from his series:
The aim is to replace the current AioContext lock with much
finer-grained locks, each aimed at protecting only specific data. Currently
the AioContext lock is used pretty much everywhere, and it's not
even clear what it is protecting exactly.
The aim of the rwlock is to cover graph modifications: more
precisely, when a BlockDriverState parent or child list is modified
or read, since it can be concurrently accessed by the main loop and
iothreads.
The main assumption is that the main loop is the only one allowed to
perform graph modifications, and so far this assumption has always
held in the current code.
The rwlock is inspired by the cpus-common.c implementation, and aims
to reduce cacheline bouncing by having a per-AioContext counter of
readers. All details and implementation of the lock are in patch 2.
We distinguish between writer (main loop, under BQL) that modifies
the graph, and readers (all other coroutines running in various
AioContext), that go through the graph edges, reading ->parents
and ->children. The writer (main loop) has "exclusive" access,
so it first waits for current read to finish, and then prevents
incoming ones from entering while it has the exclusive access. The
readers (coroutines in multiple AioContext) are free to access the
graph as long as the writer is not modifying it. If it is,
they go into a CoQueue and sleep until the writer is done.
In this and the following series, we try to apply the following locking
pattern:
- bdrv_co_* functions that call BlockDriver callbacks always expect
the lock to be taken, therefore they assert.
- blk_co_* functions are called from external code outside the block
layer, which should not have to care about the block layer's
internal locking. Usually they also call blk_wait_while_drained().
Therefore they take the lock internally.
The long term goal of this series is to eventually replace the
AioContext lock, so that we can get rid of it once and for all.
Emanuele Giuseppe Esposito (7):
graph-lock: Implement guard macros
async: Register/unregister aiocontext in graph lock list
block: wrlock in bdrv_replace_child_noperm
block: remove unnecessary assert_bdrv_graph_writable()
block: assert that graph read and writes are performed correctly
block-coroutine-wrapper.py: introduce annotations that take the graph
rdlock
block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock
Kevin Wolf (10):
block: Factor out bdrv_drain_all_begin_nopoll()
Import clang-tsa.h
clang-tsa: Add TSA_ASSERT() macro
clang-tsa: Add macros for shared locks
configure: Enable -Wthread-safety if present
test-bdrv-drain: Fix incorrect drain assumptions
block: Fix locking in external_snapshot_prepare()
graph-lock: TSA annotations for lock/unlock functions
Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK
block: GRAPH_RDLOCK for functions only called by co_wrappers
Paolo Bonzini (1):
graph-lock: Introduce a lock to protect block graph operations
configure | 1 +
block/coroutines.h | 19 +-
include/block/aio.h | 9 +
include/block/block-common.h | 9 +-
include/block/block-global-state.h | 1 +
include/block/block-io.h | 53 +++--
include/block/block_int-common.h | 24 +--
include/block/block_int-global-state.h | 17 --
include/block/block_int.h | 1 +
include/block/graph-lock.h | 280 +++++++++++++++++++++++++
include/qemu/clang-tsa.h | 114 ++++++++++
block.c | 24 ++-
block/graph-lock.c | 275 ++++++++++++++++++++++++
block/io.c | 21 +-
blockdev.c | 4 +
stubs/graph-lock.c | 10 +
tests/unit/test-bdrv-drain.c | 18 ++
util/async.c | 4 +
scripts/block-coroutine-wrapper.py | 12 ++
block/meson.build | 1 +
stubs/meson.build | 1 +
21 files changed, 820 insertions(+), 78 deletions(-)
create mode 100644 include/block/graph-lock.h
create mode 100644 include/qemu/clang-tsa.h
create mode 100644 block/graph-lock.c
create mode 100644 stubs/graph-lock.c
--
2.38.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 01/18] block: Factor out bdrv_drain_all_begin_nopoll()
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 02/18] graph-lock: Introduce a lock to protect block graph operations Kevin Wolf
` (17 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
Provide a separate function that just quiesces the users of a node to
prevent new requests from coming in, but without waiting for the already
in-flight I/O to complete.
This function can be used in contexts where polling is not allowed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/block-global-state.h | 1 +
block/io.c | 19 +++++++++++++------
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 1f8b54f2df..b0a3cfe6b8 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -152,6 +152,7 @@ int bdrv_inactivate_all(void);
int bdrv_flush_all(void);
void bdrv_close_all(void);
void bdrv_drain_all_begin(void);
+void bdrv_drain_all_begin_nopoll(void);
void bdrv_drain_all_end(void);
void bdrv_drain_all(void);
diff --git a/block/io.c b/block/io.c
index f4444b7777..d160d2e273 100644
--- a/block/io.c
+++ b/block/io.c
@@ -466,16 +466,11 @@ static bool bdrv_drain_all_poll(void)
* NOTE: no new block jobs or BlockDriverStates can be created between
* the bdrv_drain_all_begin() and bdrv_drain_all_end() calls.
*/
-void bdrv_drain_all_begin(void)
+void bdrv_drain_all_begin_nopoll(void)
{
BlockDriverState *bs = NULL;
GLOBAL_STATE_CODE();
- if (qemu_in_coroutine()) {
- bdrv_co_yield_to_drain(NULL, true, NULL, true);
- return;
- }
-
/*
* bdrv queue is managed by record/replay,
* waiting for finishing the I/O requests may
@@ -500,6 +495,18 @@ void bdrv_drain_all_begin(void)
bdrv_do_drained_begin(bs, NULL, false);
aio_context_release(aio_context);
}
+}
+
+void bdrv_drain_all_begin(void)
+{
+ BlockDriverState *bs = NULL;
+
+ if (qemu_in_coroutine()) {
+ bdrv_co_yield_to_drain(NULL, true, NULL, true);
+ return;
+ }
+
+ bdrv_drain_all_begin_nopoll();
/* Now poll the in-flight requests */
AIO_WAIT_WHILE(NULL, bdrv_drain_all_poll());
--
2.38.1
* [PATCH 02/18] graph-lock: Introduce a lock to protect block graph operations
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
2022-12-07 13:18 ` [PATCH 01/18] block: Factor out bdrv_drain_all_begin_nopoll() Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 03/18] graph-lock: Implement guard macros Kevin Wolf
` (16 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Paolo Bonzini <pbonzini@redhat.com>
Block layer graph operations are always run under BQL in the main loop.
This is enforced by the assertion qemu_in_main_thread() and its wrapper
macro GLOBAL_STATE_CODE().
However, there are also concurrent coroutines running in other iothreads
that traverse the graph. Currently this is protected (among various
other things) by the AioContext lock, but once this is removed, we need
to make sure that reads do not happen while the graph is being modified.
We distinguish between writer (main loop, under BQL) that modifies the
graph, and readers (all other coroutines running in various AioContext),
that go through the graph edges, reading ->parents and ->children.
The writer (main loop) has "exclusive" access, so it first waits for any
current read to finish, and then prevents incoming ones from entering
while it has the exclusive access.
The readers (coroutines in multiple AioContext) are free to access the
graph as long as the writer is not modifying it. If it is, they go
into a CoQueue and sleep until the writer is done.
If a coroutine changes AioContext, the counters in the original and new
AioContext are left intact, since the writer does not care where the
reader is, but only whether there is one.
As a result, some AioContexts might have a negative reader count, to
balance the positive count of the AioContext that took the lock. This
also means that when an AioContext is deleted it may have a nonzero
reader count. In that case we transfer the count to a global shared
counter so that the writer is always aware of all readers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/aio.h | 9 ++
include/block/block_int.h | 1 +
include/block/graph-lock.h | 139 ++++++++++++++++++++
block/graph-lock.c | 261 +++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
5 files changed, 411 insertions(+)
create mode 100644 include/block/graph-lock.h
create mode 100644 block/graph-lock.c
diff --git a/include/block/aio.h b/include/block/aio.h
index d128558f1d..0f65a3cc9e 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -22,6 +22,7 @@
#include "qemu/event_notifier.h"
#include "qemu/thread.h"
#include "qemu/timer.h"
+#include "block/graph-lock.h"
typedef struct BlockAIOCB BlockAIOCB;
typedef void BlockCompletionFunc(void *opaque, int ret);
@@ -127,6 +128,14 @@ struct AioContext {
/* Used by AioContext users to protect from multi-threaded access. */
QemuRecMutex lock;
+ /*
+ * Keep track of readers and writers of the block layer graph.
+ * This is essential to avoid performing additions and removal
+ * of nodes and edges from block graph while some
+ * other thread is traversing it.
+ */
+ BdrvGraphRWlock *bdrv_graph;
+
/* The list of registered AIO handlers. Protected by ctx->list_lock. */
AioHandlerList aio_handlers;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7d50b6bbd1..b35b0138ed 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -26,6 +26,7 @@
#include "block_int-global-state.h"
#include "block_int-io.h"
+#include "block/graph-lock.h"
/* DO NOT ADD ANYTHING IN HERE. USE ONE OF THE HEADERS INCLUDED ABOVE */
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
new file mode 100644
index 0000000000..82edb62cfa
--- /dev/null
+++ b/include/block/graph-lock.h
@@ -0,0 +1,139 @@
+/*
+ * Graph lock: rwlock to protect block layer graph manipulations (add/remove
+ * edges and nodes)
+ *
+ * Copyright (c) 2022 Red Hat
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef GRAPH_LOCK_H
+#define GRAPH_LOCK_H
+
+#include "qemu/osdep.h"
+
+#include "qemu/coroutine.h"
+
+/**
+ * Graph Lock API
+ * This API provides a rwlock used to protect block layer
+ * graph modifications like edge (BdrvChild) and node (BlockDriverState)
+ * addition and removal.
+ * Currently we have 1 writer only, the Main loop, and many
readers, mostly coroutines running in other AioContexts and thus other threads.
+ *
+ * We distinguish between writer (main loop, under BQL) that modifies the
+ * graph, and readers (all other coroutines running in various AioContext),
+ * that go through the graph edges, reading
BlockDriverState ->parents and ->children.
+ *
+ * The writer (main loop) has an "exclusive" access, so it first waits for
+ * current read to finish, and then prevents incoming ones from
+ * entering while it has the exclusive access.
+ *
+ * The readers (coroutines in multiple AioContext) are free to
+ * access the graph as long as the writer is not modifying it.
+ * If it is, they go into a CoQueue and sleep until the writer
+ * is done.
+ *
+ * If a coroutine changes AioContext, the counters in the original and new
+ * AioContext are left intact, since the writer does not care where the
+ * reader is, but only whether there is one.
+ * As a result, some AioContexts might have a negative reader count, to
+ * balance the positive count of the AioContext that took the lock.
+ * This also means that when an AioContext is deleted it may have a nonzero
+ * reader count. In that case we transfer the count to a global shared counter
+ * so that the writer is always aware of all readers.
+ */
+typedef struct BdrvGraphRWlock BdrvGraphRWlock;
+
+/*
+ * register_aiocontext:
+ * Add AioContext @ctx to the list of AioContexts.
+ * This list is used to obtain the total number of readers
+ * currently accessing the graph.
+ */
+void register_aiocontext(AioContext *ctx);
+
+/*
+ * unregister_aiocontext:
+ * Remove AioContext @ctx from the list of AioContexts.
+ */
+void unregister_aiocontext(AioContext *ctx);
+
+/*
+ * bdrv_graph_wrlock:
+ * Start an exclusive write operation to modify the graph. This means we are
+ * adding or removing an edge or a node in the block layer graph. Nobody else
+ * is allowed to access the graph.
+ *
+ * Must only be called from outside bdrv_graph_co_rdlock.
+ *
+ * The wrlock can only be taken from the main loop, with BQL held, as only the
+ * main loop is allowed to modify the graph.
+ *
+ * This function polls. Callers must not hold the lock of any AioContext other
+ * than the current one.
+ */
+void bdrv_graph_wrlock(void);
+
+/*
+ * bdrv_graph_wrunlock:
+ * Write finished, reset global has_writer to 0 and restart
+ * all readers that are waiting.
+ */
+void bdrv_graph_wrunlock(void);
+
+/*
+ * bdrv_graph_co_rdlock:
+ * Read the bs graph. This usually means traversing all nodes in
+ * the graph, therefore it can't happen while another thread is
+ * modifying it.
+ * Increases the reader counter of the current AioContext,
+ * and if has_writer is set, it means that the writer is modifying
+ * the graph, therefore wait in a coroutine queue.
+ * The writer will then wake this coroutine once it is done.
+ *
+ * This lock should be taken from Iothreads (IO_CODE() class of functions)
+ * because it signals the writer that there are some
+ * readers currently running, or waits until the current
+ * write is finished before continuing.
+ * Calling this function from the Main Loop with BQL held
+ * is not necessary, since the Main Loop itself is the only
+ * writer, and thus cannot read and write at the same time.
+ * The only exception is when we can't take the lock in the
+ * function/coroutine itself; then we need the caller (usually the
+ * main loop) to take it and wait until the coroutine ends, so
+ * that a running reader is always signalled.
+ */
+void coroutine_fn bdrv_graph_co_rdlock(void);
+
+/*
+ * bdrv_graph_co_rdunlock:
+ * Read terminated, decrease the count of readers in the current aiocontext.
+ * If the writer is waiting for reads to finish (has_writer == 1), signal
+ * the writer that we are done via aio_wait_kick() to let it continue.
+ */
+void coroutine_fn bdrv_graph_co_rdunlock(void);
+
+/*
+ * bdrv_graph_rd{un}lock_main_loop:
+ * Just placeholders to mark where the graph rdlock should be taken
+ * in the main loop. They only assert that we are not
+ * in a coroutine and are in GLOBAL_STATE_CODE.
+ */
+void bdrv_graph_rdlock_main_loop(void);
+void bdrv_graph_rdunlock_main_loop(void);
+
+#endif /* GRAPH_LOCK_H */
+
diff --git a/block/graph-lock.c b/block/graph-lock.c
new file mode 100644
index 0000000000..e033c6d9ac
--- /dev/null
+++ b/block/graph-lock.c
@@ -0,0 +1,261 @@
+/*
+ * Graph lock: rwlock to protect block layer graph manipulations (add/remove
+ * edges and nodes)
+ *
+ * Copyright (c) 2022 Red Hat
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "block/graph-lock.h"
+#include "block/block.h"
+#include "block/block_int.h"
+
+/* Protects the list of aiocontext and orphaned_reader_count */
+static QemuMutex aio_context_list_lock;
+
+/* Written and read with atomic operations. */
+static int has_writer;
+
+/*
+ * A reader coroutine could move from one AioContext to another.
+ * If this happens, there is no problem from the point of view of
+ * counters. The problem is that the total count becomes
+ * unbalanced if one of the two AioContexts gets deleted.
+ * The count of readers must remain correct, so the AioContext's
+ * balance is transferred to this global variable.
+ * Protected by aio_context_list_lock.
+ */
+static uint32_t orphaned_reader_count;
+
+/* Queue of readers waiting for the writer to finish */
+static CoQueue reader_queue;
+
+struct BdrvGraphRWlock {
+ /* How many readers are currently reading the graph. */
+ uint32_t reader_count;
+
+ /*
+ * List of BdrvGraphRWlock kept in graph-lock.c
+ * Protected by aio_context_list_lock
+ */
+ QTAILQ_ENTRY(BdrvGraphRWlock) next_aio;
+};
+
+/*
+ * List of BdrvGraphRWlock. This list ensures that each BdrvGraphRWlock
+ * can safely modify only its own counter, avoiding reads/writes of the
+ * others and thus improving performance by avoiding cacheline bounces.
+ */
+static QTAILQ_HEAD(, BdrvGraphRWlock) aio_context_list =
+ QTAILQ_HEAD_INITIALIZER(aio_context_list);
+
+static void __attribute__((__constructor__)) bdrv_init_graph_lock(void)
+{
+ qemu_mutex_init(&aio_context_list_lock);
+ qemu_co_queue_init(&reader_queue);
+}
+
+void register_aiocontext(AioContext *ctx)
+{
+ ctx->bdrv_graph = g_new0(BdrvGraphRWlock, 1);
+ QEMU_LOCK_GUARD(&aio_context_list_lock);
+ assert(ctx->bdrv_graph->reader_count == 0);
+ QTAILQ_INSERT_TAIL(&aio_context_list, ctx->bdrv_graph, next_aio);
+}
+
+void unregister_aiocontext(AioContext *ctx)
+{
+ QEMU_LOCK_GUARD(&aio_context_list_lock);
+ orphaned_reader_count += ctx->bdrv_graph->reader_count;
+ QTAILQ_REMOVE(&aio_context_list, ctx->bdrv_graph, next_aio);
+ g_free(ctx->bdrv_graph);
+}
+
+static uint32_t reader_count(void)
+{
+ BdrvGraphRWlock *bdrv_graph;
+ uint32_t rd;
+
+ QEMU_LOCK_GUARD(&aio_context_list_lock);
+
+ /* rd can temporarily be negative, but the total will *always* be >= 0 */
+ rd = orphaned_reader_count;
+ QTAILQ_FOREACH(bdrv_graph, &aio_context_list, next_aio) {
+ rd += qatomic_read(&bdrv_graph->reader_count);
+ }
+
+ /* shouldn't overflow unless there are 2^31 readers */
+ assert((int32_t)rd >= 0);
+ return rd;
+}
+
+void bdrv_graph_wrlock(void)
+{
+ GLOBAL_STATE_CODE();
+ assert(!qatomic_read(&has_writer));
+
+ /* Make sure that constantly arriving new I/O doesn't cause starvation */
+ bdrv_drain_all_begin_nopoll();
+
+ /*
+ * reader_count == 0: this means readers will read has_writer as 1
+ * reader_count >= 1: we don't know if readers read has_writer == 0 or 1,
+ * but we need to wait.
+ * Wait by allowing other coroutines (and possible readers) to continue.
+ */
+ do {
+ /*
+ * has_writer must be 0 while polling, otherwise we get a deadlock if
+ * any callback involved during AIO_WAIT_WHILE() tries to acquire the
+ * reader lock.
+ */
+ qatomic_set(&has_writer, 0);
+ AIO_WAIT_WHILE(qemu_get_aio_context(), reader_count() >= 1);
+ qatomic_set(&has_writer, 1);
+
+ /*
+ * We want to only check reader_count() after has_writer = 1 is visible
+ * to other threads. That way no more readers can sneak in after we've
+ * determined reader_count() == 0.
+ */
+ smp_mb();
+ } while (reader_count() >= 1);
+
+ bdrv_drain_all_end();
+}
+
+void bdrv_graph_wrunlock(void)
+{
+ GLOBAL_STATE_CODE();
+ QEMU_LOCK_GUARD(&aio_context_list_lock);
+ assert(qatomic_read(&has_writer));
+
+ /*
+ * No need for memory barriers, this works in pair with
+ * the slow path of rdlock() and both take the lock.
+ */
+ qatomic_store_release(&has_writer, 0);
+
+ /* Wake up all coroutines that are waiting to read the graph */
+ qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
+}
+
+void coroutine_fn bdrv_graph_co_rdlock(void)
+{
+ BdrvGraphRWlock *bdrv_graph;
+ bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
+
+ /* Do not lock if in main thread */
+ if (qemu_in_main_thread()) {
+ return;
+ }
+
+ for (;;) {
+ qatomic_set(&bdrv_graph->reader_count,
+ bdrv_graph->reader_count + 1);
+ /* make sure writer sees reader_count before we check has_writer */
+ smp_mb();
+
+ /*
+ * has_writer == 0: this means writer will read reader_count as >= 1
+ * has_writer == 1: we don't know if writer read reader_count == 0
+ * or > 0, but we need to wait anyway because
+ * it will write.
+ */
+ if (!qatomic_read(&has_writer)) {
+ break;
+ }
+
+ /*
+ * Synchronize access with reader_count() in bdrv_graph_wrlock().
+ * Case 1:
+ * If this critical section gets executed first, reader_count will
+ * decrease and the reader will go to sleep.
+ * Then the writer will read reader_count that does not take into
+ * account this reader, and if there's no other reader it will
+ * enter the write section.
+ * Case 2:
+ * If reader_count() critical section gets executed first,
+ * then writer will read reader_count >= 1.
+ * It will wait in AIO_WAIT_WHILE(), but once it releases the lock
+ * we will enter this critical section and call aio_wait_kick().
+ */
+ WITH_QEMU_LOCK_GUARD(&aio_context_list_lock) {
+ /*
+ * Additional check when we use the above lock to synchronize
+ * with bdrv_graph_wrunlock().
+ * Case 1:
+ * If this gets executed first, has_writer is still 1, so we reduce
+ * reader_count and go to sleep.
+ * Then the writer will set has_writer to 0 and wake up all readers,
+ * us included.
+ * Case 2:
+ * If bdrv_graph_wrunlock() critical section gets executed first,
+ * then it will set has_writer to 0 and wake up all other readers.
+ * Then we execute this critical section, and therefore must check
+ * again for has_writer, otherwise we sleep without any writer
+ * actually running.
+ */
+ if (!qatomic_read(&has_writer)) {
+ return;
+ }
+
+ /* slow path where reader sleeps */
+ bdrv_graph->reader_count--;
+ aio_wait_kick();
+ qemu_co_queue_wait(&reader_queue, &aio_context_list_lock);
+ }
+ }
+}
+
+void coroutine_fn bdrv_graph_co_rdunlock(void)
+{
+ BdrvGraphRWlock *bdrv_graph;
+ bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
+
+ /* Do not lock if in main thread */
+ if (qemu_in_main_thread()) {
+ return;
+ }
+
+ qatomic_store_release(&bdrv_graph->reader_count,
+ bdrv_graph->reader_count - 1);
+ /* make sure writer sees reader_count before we check has_writer */
+ smp_mb();
+
+ /*
+ * has_writer == 0: this means the writer will read the decreased count
+ * has_writer == 1: we don't know if the writer read the old or the
+ * new reader_count. Therefore, kick again so on the next iteration
+ * the writer will for sure read the updated value.
+ */
+ if (qatomic_read(&has_writer)) {
+ aio_wait_kick();
+ }
+}
+
+void bdrv_graph_rdlock_main_loop(void)
+{
+ GLOBAL_STATE_CODE();
+ assert(!qemu_in_coroutine());
+}
+
+void bdrv_graph_rdunlock_main_loop(void)
+{
+ GLOBAL_STATE_CODE();
+ assert(!qemu_in_coroutine());
+}
diff --git a/block/meson.build b/block/meson.build
index c48a59571e..90011a2805 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -10,6 +10,7 @@ block_ss.add(files(
'blkverify.c',
'block-backend.c',
'block-copy.c',
+ 'graph-lock.c',
'commit.c',
'copy-on-read.c',
'preallocate.c',
--
2.38.1
* [PATCH 03/18] graph-lock: Implement guard macros
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
2022-12-07 13:18 ` [PATCH 01/18] block: Factor out bdrv_drain_all_begin_nopoll() Kevin Wolf
2022-12-07 13:18 ` [PATCH 02/18] graph-lock: Introduce a lock to protect block graph operations Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 04/18] async: Register/unregister aiocontext in graph lock list Kevin Wolf
` (15 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Similar to the implementation in lockable.h, implement macros to
automatically take and release the rdlock.
Create the empty GraphLockable and GraphLockableMainloop structs only to
use them as types for G_DEFINE_AUTOPTR_CLEANUP_FUNC.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/graph-lock.h | 66 ++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 82edb62cfa..b27d8a5fb1 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -135,5 +135,71 @@ void coroutine_fn bdrv_graph_co_rdunlock(void);
void bdrv_graph_rdlock_main_loop(void);
void bdrv_graph_rdunlock_main_loop(void);
+typedef struct GraphLockable { } GraphLockable;
+
+/*
+ * In C, compound literals have the lifetime of an automatic variable.
+ * In C++ it would be different, but then C++ wouldn't need QemuLockable
+ * either...
+ */
+#define GML_OBJ_() (&(GraphLockable) { })
+
+static inline GraphLockable *graph_lockable_auto_lock(GraphLockable *x)
+{
+ bdrv_graph_co_rdlock();
+ return x;
+}
+
+static inline void graph_lockable_auto_unlock(GraphLockable *x)
+{
+ bdrv_graph_co_rdunlock();
+}
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(GraphLockable, graph_lockable_auto_unlock)
+
+#define WITH_GRAPH_RDLOCK_GUARD_(var) \
+ for (g_autoptr(GraphLockable) var = graph_lockable_auto_lock(GML_OBJ_()); \
+ var; \
+ graph_lockable_auto_unlock(var), var = NULL)
+
+#define WITH_GRAPH_RDLOCK_GUARD() \
+ WITH_GRAPH_RDLOCK_GUARD_(glue(graph_lockable_auto, __COUNTER__))
+
+#define GRAPH_RDLOCK_GUARD(x) \
+ g_autoptr(GraphLockable) \
+ glue(graph_lockable_auto, __COUNTER__) G_GNUC_UNUSED = \
+ graph_lockable_auto_lock(GML_OBJ_())
+
+
+typedef struct GraphLockableMainloop { } GraphLockableMainloop;
+
+/*
+ * In C, compound literals have the lifetime of an automatic variable.
+ * In C++ it would be different, but then C++ wouldn't need QemuLockable
+ * either...
+ */
+#define GMLML_OBJ_() (&(GraphLockableMainloop) { })
+
+static inline GraphLockableMainloop *
+graph_lockable_auto_lock_mainloop(GraphLockableMainloop *x)
+{
+ bdrv_graph_rdlock_main_loop();
+ return x;
+}
+
+static inline void
+graph_lockable_auto_unlock_mainloop(GraphLockableMainloop *x)
+{
+ bdrv_graph_rdunlock_main_loop();
+}
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(GraphLockableMainloop,
+ graph_lockable_auto_unlock_mainloop)
+
+#define GRAPH_RDLOCK_GUARD_MAINLOOP(x) \
+ g_autoptr(GraphLockableMainloop) \
+ glue(graph_lockable_auto, __COUNTER__) G_GNUC_UNUSED = \
+ graph_lockable_auto_lock_mainloop(GMLML_OBJ_())
+
#endif /* GRAPH_LOCK_H */
--
2.38.1
* [PATCH 04/18] async: Register/unregister aiocontext in graph lock list
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (2 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 03/18] graph-lock: Implement guard macros Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 05/18] Import clang-tsa.h Kevin Wolf
` (14 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Add/remove the AioContext in aio_context_list in graph-lock.c when it is
created/destroyed. This allows using the graph locking operations from
this AioContext.
In order to allow linking util/async.c with binaries that don't include
the block layer, introduce stubs for (un)register_aiocontext().
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
stubs/graph-lock.c | 10 ++++++++++
util/async.c | 4 ++++
stubs/meson.build | 1 +
3 files changed, 15 insertions(+)
create mode 100644 stubs/graph-lock.c
diff --git a/stubs/graph-lock.c b/stubs/graph-lock.c
new file mode 100644
index 0000000000..177aa0a8ba
--- /dev/null
+++ b/stubs/graph-lock.c
@@ -0,0 +1,10 @@
+#include "qemu/osdep.h"
+#include "block/graph-lock.h"
+
+void register_aiocontext(AioContext *ctx)
+{
+}
+
+void unregister_aiocontext(AioContext *ctx)
+{
+}
diff --git a/util/async.c b/util/async.c
index 63434ddae4..14d63b3091 100644
--- a/util/async.c
+++ b/util/async.c
@@ -27,6 +27,7 @@
#include "qapi/error.h"
#include "block/aio.h"
#include "block/thread-pool.h"
+#include "block/graph-lock.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
#include "qemu/rcu_queue.h"
@@ -376,6 +377,7 @@ aio_ctx_finalize(GSource *source)
qemu_rec_mutex_destroy(&ctx->lock);
qemu_lockcnt_destroy(&ctx->list_lock);
timerlistgroup_deinit(&ctx->tlg);
+ unregister_aiocontext(ctx);
aio_context_destroy(ctx);
}
@@ -574,6 +576,8 @@ AioContext *aio_context_new(Error **errp)
ctx->thread_pool_min = 0;
ctx->thread_pool_max = THREAD_POOL_MAX_THREADS_DEFAULT;
+ register_aiocontext(ctx);
+
return ctx;
fail:
g_source_destroy(&ctx->source);
diff --git a/stubs/meson.build b/stubs/meson.build
index c96a74f095..981585cbdf 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -13,6 +13,7 @@ stub_ss.add(files('error-printf.c'))
stub_ss.add(files('fdset.c'))
stub_ss.add(files('gdbstub.c'))
stub_ss.add(files('get-vm-name.c'))
+stub_ss.add(files('graph-lock.c'))
if linux_io_uring.found()
stub_ss.add(files('io_uring.c'))
endif
--
2.38.1
* [PATCH 05/18] Import clang-tsa.h
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (3 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 04/18] async: Register/unregister aiocontext in graph lock list Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 06/18] clang-tsa: Add TSA_ASSERT() macro Kevin Wolf
` (13 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
This defines macros that allow clang to perform Thread Safety Analysis
based on function and variable annotations that specify the locking
rules. On non-clang compilers, the annotations are ignored.
Imported tsa.h from the original repository with the pthread_mutex_t
wrapper removed:
https://github.com/jhi/clang-thread-safety-analysis-for-c.git
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/qemu/clang-tsa.h | 101 +++++++++++++++++++++++++++++++++++++++
1 file changed, 101 insertions(+)
create mode 100644 include/qemu/clang-tsa.h
diff --git a/include/qemu/clang-tsa.h b/include/qemu/clang-tsa.h
new file mode 100644
index 0000000000..0a3361dfc8
--- /dev/null
+++ b/include/qemu/clang-tsa.h
@@ -0,0 +1,101 @@
+#ifndef CLANG_TSA_H
+#define CLANG_TSA_H
+
+/*
+ * Copyright 2018 Jarkko Hietaniemi <jhi@iki.fi>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without
+ * limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/* http://clang.llvm.org/docs/ThreadSafetyAnalysis.html
+ *
+ * TSA is available since clang 3.6-ish.
+ */
+#ifdef __clang__
+# define TSA(x) __attribute__((x))
+#else
+# define TSA(x) /* No TSA, make TSA attributes no-ops. */
+#endif
+
+/* TSA_CAPABILITY() is used to annotate typedefs:
+ *
+ * typedef pthread_mutex_t TSA_CAPABILITY("mutex") tsa_mutex;
+ */
+#define TSA_CAPABILITY(x) TSA(capability(x))
+
+/* TSA_GUARDED_BY() is used to annotate global variables,
+ * the data is guarded:
+ *
+ * Foo foo TSA_GUARDED_BY(mutex);
+ */
+#define TSA_GUARDED_BY(x) TSA(guarded_by(x))
+
+/* TSA_PT_GUARDED_BY() is used to annotate global pointers, the data
+ * behind the pointer is guarded.
+ *
+ * Foo* ptr TSA_PT_GUARDED_BY(mutex);
+ */
+#define TSA_PT_GUARDED_BY(x) TSA(pt_guarded_by(x))
+
+/* The TSA_REQUIRES() is used to annotate functions: the caller of the
+ * function MUST hold the resource, the function will NOT release it.
+ *
+ * More than one mutex may be specified, comma-separated.
+ *
+ * void Foo(void) TSA_REQUIRES(mutex);
+ */
+#define TSA_REQUIRES(...) TSA(requires_capability(__VA_ARGS__))
+
+/* TSA_EXCLUDES() is used to annotate functions: the caller of the
+ * function MUST NOT hold resource, the function first acquires the
+ * resource, and then releases it.
+ *
+ * More than one mutex may be specified, comma-separated.
+ *
+ * void Foo(void) TSA_EXCLUDES(mutex);
+ */
+#define TSA_EXCLUDES(...) TSA(locks_excluded(__VA_ARGS__))
+
+/* TSA_ACQUIRE() is used to annotate functions: the caller of the
+ * function MUST NOT hold the resource, the function will acquire the
+ * resource, but NOT release it.
+ *
+ * More than one mutex may be specified, comma-separated.
+ *
+ * void Foo(void) TSA_ACQUIRE(mutex);
+ */
+#define TSA_ACQUIRE(...) TSA(acquire_capability(__VA_ARGS__))
+
+/* TSA_RELEASE() is used to annotate functions: the caller of the
+ * function MUST hold the resource, but the function will then release it.
+ *
+ * More than one mutex may be specified, comma-separated.
+ *
+ * void Foo(void) TSA_RELEASE(mutex);
+ */
+#define TSA_RELEASE(...) TSA(release_capability(__VA_ARGS__))
+
+/* TSA_NO_TSA is used to annotate functions. Use only when you need to.
+ *
+ * void Foo(void) TSA_NO_TSA;
+ */
+#define TSA_NO_TSA TSA(no_thread_safety_analysis)
+
+#endif /* #ifndef CLANG_TSA_H */
--
2.38.1
* [PATCH 06/18] clang-tsa: Add TSA_ASSERT() macro
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (4 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 05/18] Import clang-tsa.h Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 07/18] clang-tsa: Add macros for shared locks Kevin Wolf
` (12 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/qemu/clang-tsa.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/qemu/clang-tsa.h b/include/qemu/clang-tsa.h
index 0a3361dfc8..211ee0ae73 100644
--- a/include/qemu/clang-tsa.h
+++ b/include/qemu/clang-tsa.h
@@ -98,4 +98,13 @@
*/
#define TSA_NO_TSA TSA(no_thread_safety_analysis)
+/*
+ * TSA_ASSERT() is used to annotate functions: This function will assert that
+ * the lock is held. When it returns, the caller of the function is assumed to
+ * already hold the resource.
+ *
+ * More than one mutex may be specified, comma-separated.
+ */
+#define TSA_ASSERT(...) TSA(assert_capability(__VA_ARGS__))
+
#endif /* #ifndef CLANG_TSA_H */
--
2.38.1
* [PATCH 07/18] clang-tsa: Add macros for shared locks
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (5 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 06/18] clang-tsa: Add TSA_ASSERT() macro Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 08/18] configure: Enable -Wthread-safety if present Kevin Wolf
` (11 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/qemu/clang-tsa.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/qemu/clang-tsa.h b/include/qemu/clang-tsa.h
index 211ee0ae73..ba06fb8c92 100644
--- a/include/qemu/clang-tsa.h
+++ b/include/qemu/clang-tsa.h
@@ -62,6 +62,7 @@
* void Foo(void) TSA_REQUIRES(mutex);
*/
#define TSA_REQUIRES(...) TSA(requires_capability(__VA_ARGS__))
+#define TSA_REQUIRES_SHARED(...) TSA(requires_shared_capability(__VA_ARGS__))
/* TSA_EXCLUDES() is used to annotate functions: the caller of the
* function MUST NOT hold resource, the function first acquires the
@@ -82,6 +83,7 @@
* void Foo(void) TSA_ACQUIRE(mutex);
*/
#define TSA_ACQUIRE(...) TSA(acquire_capability(__VA_ARGS__))
+#define TSA_ACQUIRE_SHARED(...) TSA(acquire_shared_capability(__VA_ARGS__))
/* TSA_RELEASE() is used to annotate functions: the caller of the
* function MUST hold the resource, but the function will then release it.
@@ -91,6 +93,7 @@
* void Foo(void) TSA_RELEASE(mutex);
*/
#define TSA_RELEASE(...) TSA(release_capability(__VA_ARGS__))
+#define TSA_RELEASE_SHARED(...) TSA(release_shared_capability(__VA_ARGS__))
/* TSA_NO_TSA is used to annotate functions. Use only when you need to.
*
@@ -106,5 +109,6 @@
* More than one mutex may be specified, comma-separated.
*/
#define TSA_ASSERT(...) TSA(assert_capability(__VA_ARGS__))
+#define TSA_ASSERT_SHARED(...) TSA(assert_shared_capability(__VA_ARGS__))
#endif /* #ifndef CLANG_TSA_H */
--
2.38.1
* [PATCH 08/18] configure: Enable -Wthread-safety if present
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (6 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 07/18] clang-tsa: Add macros for shared locks Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 09/18] test-bdrv-drain: Fix incorrect drain assumptions Kevin Wolf
` (10 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
This enables clang's thread safety analysis (TSA), which we'll use to
statically check the block graph locking.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
configure | 1 +
1 file changed, 1 insertion(+)
diff --git a/configure b/configure
index 26c7bc5154..45d693a241 100755
--- a/configure
+++ b/configure
@@ -1208,6 +1208,7 @@ add_to warn_flags -Wnested-externs
add_to warn_flags -Wendif-labels
add_to warn_flags -Wexpansion-to-defined
add_to warn_flags -Wimplicit-fallthrough=2
+add_to warn_flags -Wthread-safety
nowarn_flags=
add_to nowarn_flags -Wno-initializer-overrides
--
2.38.1
* [PATCH 09/18] test-bdrv-drain: Fix incorrect drain assumptions
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (7 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 08/18] configure: Enable -Wthread-safety if present Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 10/18] block: Fix locking in external_snapshot_prepare() Kevin Wolf
` (9 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
The test case assumes that a drain only happens in one specific place
where it drains explicitly. This assumption happened to hold true until
now, but block layer functions may drain internally (any graph
modifications are going to do that through bdrv_graph_wrlock()), so this
is incorrect. Make sure that the test code in .drained_begin only runs
where we actually want it to run.
When scheduling a BH from .drained_begin, we also need to increase the
in_flight counter to make sure that the operation is actually completed
in time before the node that it works on goes away.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/unit/test-bdrv-drain.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 2686a8acee..8cedea4959 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1107,6 +1107,7 @@ struct detach_by_parent_data {
BlockDriverState *c;
BdrvChild *child_c;
bool by_parent_cb;
+ bool detach_on_drain;
};
static struct detach_by_parent_data detach_by_parent_data;
@@ -1114,6 +1115,7 @@ static void detach_indirect_bh(void *opaque)
{
struct detach_by_parent_data *data = opaque;
+ bdrv_dec_in_flight(data->child_b->bs);
bdrv_unref_child(data->parent_b, data->child_b);
bdrv_ref(data->c);
@@ -1128,12 +1130,21 @@ static void detach_by_parent_aio_cb(void *opaque, int ret)
g_assert_cmpint(ret, ==, 0);
if (data->by_parent_cb) {
+ bdrv_inc_in_flight(data->child_b->bs);
detach_indirect_bh(data);
}
}
static void detach_by_driver_cb_drained_begin(BdrvChild *child)
{
+ struct detach_by_parent_data *data = &detach_by_parent_data;
+
+ if (!data->detach_on_drain) {
+ return;
+ }
+ data->detach_on_drain = false;
+
+ bdrv_inc_in_flight(data->child_b->bs);
aio_bh_schedule_oneshot(qemu_get_current_aio_context(),
detach_indirect_bh, &detach_by_parent_data);
child_of_bds.drained_begin(child);
@@ -1174,8 +1185,14 @@ static void test_detach_indirect(bool by_parent_cb)
detach_by_driver_cb_class = child_of_bds;
detach_by_driver_cb_class.drained_begin =
detach_by_driver_cb_drained_begin;
+ detach_by_driver_cb_class.drained_end = NULL;
+ detach_by_driver_cb_class.drained_poll = NULL;
}
+ detach_by_parent_data = (struct detach_by_parent_data) {
+ .detach_on_drain = false,
+ };
+
/* Create all involved nodes */
parent_a = bdrv_new_open_driver(&bdrv_test, "parent-a", BDRV_O_RDWR,
&error_abort);
@@ -1227,6 +1244,7 @@ static void test_detach_indirect(bool by_parent_cb)
.child_b = child_b,
.c = c,
.by_parent_cb = by_parent_cb,
+ .detach_on_drain = true,
};
acb = blk_aio_preadv(blk, 0, &qiov, 0, detach_by_parent_aio_cb, NULL);
g_assert(acb != NULL);
--
2.38.1
* [PATCH 10/18] block: Fix locking in external_snapshot_prepare()
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (8 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 09/18] test-bdrv-drain: Fix incorrect drain assumptions Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 11/18] block: wrlock in bdrv_replace_child_noperm Kevin Wolf
` (8 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
bdrv_img_create() polls internally (when calling bdrv_create(), which is
a co_wrapper), so it can't be called while holding the lock of any
AioContext except the current one without causing deadlocks. Drop the
lock around the call in external_snapshot_prepare().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 4 ++++
blockdev.c | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/block.c b/block.c
index 6191ac1f44..44d59362d6 100644
--- a/block.c
+++ b/block.c
@@ -6924,6 +6924,10 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
return true;
}
+/*
+ * Must not be called while holding the lock of an AioContext other than the
+ * current one.
+ */
void bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags, bool quiet,
diff --git a/blockdev.c b/blockdev.c
index 8ffb3d9537..011e48df7b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1524,10 +1524,14 @@ static void external_snapshot_prepare(BlkActionState *common,
goto out;
}
bdrv_refresh_filename(state->old_bs);
+
+ aio_context_release(aio_context);
bdrv_img_create(new_image_file, format,
state->old_bs->filename,
state->old_bs->drv->format_name,
NULL, size, flags, false, &local_err);
+ aio_context_acquire(aio_context);
+
if (local_err) {
error_propagate(errp, local_err);
goto out;
--
2.38.1
* [PATCH 11/18] block: wrlock in bdrv_replace_child_noperm
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (9 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 10/18] block: Fix locking in external_snapshot_prepare() Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 12/18] block: remove unnecessary assert_bdrv_graph_writable() Kevin Wolf
` (7 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Protect the main function where the graph is modified.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/block.c b/block.c
index 44d59362d6..df52c6b012 100644
--- a/block.c
+++ b/block.c
@@ -2836,8 +2836,6 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm)
*
* If @new_bs is non-NULL, the parent of @child must already be drained through
* @child.
- *
- * This function does not poll.
*/
static void bdrv_replace_child_noperm(BdrvChild *child,
BlockDriverState *new_bs)
@@ -2875,23 +2873,24 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
}
+ /* TODO Pull this up into the callers to avoid polling here */
+ bdrv_graph_wrlock();
if (old_bs) {
if (child->klass->detach) {
child->klass->detach(child);
}
- assert_bdrv_graph_writable(old_bs);
QLIST_REMOVE(child, next_parent);
}
child->bs = new_bs;
if (new_bs) {
- assert_bdrv_graph_writable(new_bs);
QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
if (child->klass->attach) {
child->klass->attach(child);
}
}
+ bdrv_graph_wrunlock();
/*
* If the parent was drained through this BdrvChild previously, but new_bs
--
2.38.1
* [PATCH 12/18] block: remove unnecessary assert_bdrv_graph_writable()
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (10 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 11/18] block: wrlock in bdrv_replace_child_noperm Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 13/18] block: assert that graph read and writes are performed correctly Kevin Wolf
` (6 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
We don't protect bdrv->aio_context with the graph rwlock,
so these assertions are not needed.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/block.c b/block.c
index df52c6b012..bdffadcdaa 100644
--- a/block.c
+++ b/block.c
@@ -7214,7 +7214,6 @@ static void bdrv_detach_aio_context(BlockDriverState *bs)
if (bs->quiesce_counter) {
aio_enable_external(bs->aio_context);
}
- assert_bdrv_graph_writable(bs);
bs->aio_context = NULL;
}
@@ -7228,7 +7227,6 @@ static void bdrv_attach_aio_context(BlockDriverState *bs,
aio_disable_external(new_context);
}
- assert_bdrv_graph_writable(bs);
bs->aio_context = new_context;
if (bs->drv && bs->drv->bdrv_attach_aio_context) {
@@ -7309,7 +7307,6 @@ static void bdrv_set_aio_context_commit(void *opaque)
BlockDriverState *bs = (BlockDriverState *) state->bs;
AioContext *new_context = state->new_ctx;
AioContext *old_context = bdrv_get_aio_context(bs);
- assert_bdrv_graph_writable(bs);
/*
* Take the old AioContext when detaching it from bs.
--
2.38.1
* [PATCH 13/18] block: assert that graph read and writes are performed correctly
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (11 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 12/18] block: remove unnecessary assert_bdrv_graph_writable() Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 14/18] graph-lock: TSA annotations for lock/unlock functions Kevin Wolf
` (5 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Remove the old assert_bdrv_graph_writable() and replace it with the
new version using the graph-lock API.
See the function documentation for more information.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/block_int-global-state.h | 17 -----------------
include/block/graph-lock.h | 15 +++++++++++++++
block.c | 4 ++--
block/graph-lock.c | 11 +++++++++++
4 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index b49f4eb35b..2f0993f6e9 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -310,21 +310,4 @@ void bdrv_remove_aio_context_notifier(BlockDriverState *bs,
*/
void bdrv_drain_all_end_quiesce(BlockDriverState *bs);
-/**
- * Make sure that the function is running under both drain and BQL.
- * The latter protects from concurrent writings
- * from the GS API, while the former prevents concurrent reads
- * from I/O.
- */
-static inline void assert_bdrv_graph_writable(BlockDriverState *bs)
-{
- /*
- * TODO: this function is incomplete. Because the users of this
- * assert lack the necessary drains, check only for BQL.
- * Once the necessary drains are added,
- * assert also for qatomic_read(&bs->quiesce_counter) > 0
- */
- assert(qemu_in_main_thread());
-}
-
#endif /* BLOCK_INT_GLOBAL_STATE_H */
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index b27d8a5fb1..85e8a53b73 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -135,6 +135,21 @@ void coroutine_fn bdrv_graph_co_rdunlock(void);
void bdrv_graph_rdlock_main_loop(void);
void bdrv_graph_rdunlock_main_loop(void);
+/*
+ * assert_bdrv_graph_readable:
+ * Make sure that the reader is either the main loop,
+ * or there is at least a reader holding the rdlock.
+ * In this way an incoming writer is aware of the read and waits.
+ */
+void assert_bdrv_graph_readable(void);
+
+/*
+ * assert_bdrv_graph_writable:
+ * Make sure that the writer is the main loop and has set @has_writer,
+ * so that incoming readers will pause.
+ */
+void assert_bdrv_graph_writable(void);
+
typedef struct GraphLockable { } GraphLockable;
/*
diff --git a/block.c b/block.c
index bdffadcdaa..ff53b41af3 100644
--- a/block.c
+++ b/block.c
@@ -1406,7 +1406,7 @@ static void bdrv_child_cb_attach(BdrvChild *child)
{
BlockDriverState *bs = child->opaque;
- assert_bdrv_graph_writable(bs);
+ assert_bdrv_graph_writable();
QLIST_INSERT_HEAD(&bs->children, child, next);
if (bs->drv->is_filter || (child->role & BDRV_CHILD_FILTERED)) {
/*
@@ -1452,7 +1452,7 @@ static void bdrv_child_cb_detach(BdrvChild *child)
bdrv_backing_detach(child);
}
- assert_bdrv_graph_writable(bs);
+ assert_bdrv_graph_writable();
QLIST_REMOVE(child, next);
if (child == bs->backing) {
assert(child != bs->file);
diff --git a/block/graph-lock.c b/block/graph-lock.c
index e033c6d9ac..c4d9d2c274 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -259,3 +259,14 @@ void bdrv_graph_rdunlock_main_loop(void)
GLOBAL_STATE_CODE();
assert(!qemu_in_coroutine());
}
+
+void assert_bdrv_graph_readable(void)
+{
+ assert(qemu_in_main_thread() || reader_count());
+}
+
+void assert_bdrv_graph_writable(void)
+{
+ assert(qemu_in_main_thread());
+ assert(qatomic_read(&has_writer));
+}
--
2.38.1
* [PATCH 14/18] graph-lock: TSA annotations for lock/unlock functions
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (12 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 13/18] block: assert that graph read and writes are performed correctly Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 15/18] Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK Kevin Wolf
` (4 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/graph-lock.h | 80 +++++++++++++++++++++++++++++++++-----
block/graph-lock.c | 3 ++
2 files changed, 73 insertions(+), 10 deletions(-)
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 85e8a53b73..50b7e7b1b6 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -21,6 +21,7 @@
#define GRAPH_LOCK_H
#include "qemu/osdep.h"
+#include "qemu/clang-tsa.h"
#include "qemu/coroutine.h"
@@ -57,6 +58,35 @@
*/
typedef struct BdrvGraphRWlock BdrvGraphRWlock;
+/* Dummy lock object to use for Thread Safety Analysis (TSA) */
+typedef struct TSA_CAPABILITY("graph-lock") BdrvGraphLock {
+} BdrvGraphLock;
+
+extern BdrvGraphLock graph_lock;
+
+/*
+ * clang doesn't check consistency in locking annotations between forward
+ * declarations and the function definition. Having the annotation on the
+ * definition, but not the declaration in a header file, may give the reader
+ * a false sense of security because the condition actually remains unchecked
+ * for callers in other source files.
+ *
+ * Therefore, as a convention, for public functions, GRAPH_RDLOCK and
+ * GRAPH_WRLOCK annotations should be present only in the header file.
+ */
+#define GRAPH_WRLOCK TSA_REQUIRES(graph_lock)
+#define GRAPH_RDLOCK TSA_REQUIRES_SHARED(graph_lock)
+
+/*
+ * TSA annotations are not part of function types, so checks are defeated when
+ * using a function pointer. As a workaround, annotate function pointers with
+ * this macro that will require that the lock is at least taken while reading
+ * the pointer. In most cases this is equivalent to actually protecting the
+ * function call.
+ */
+#define GRAPH_RDLOCK_PTR TSA_GUARDED_BY(graph_lock)
+#define GRAPH_WRLOCK_PTR TSA_GUARDED_BY(graph_lock)
+
/*
* register_aiocontext:
* Add AioContext @ctx to the list of AioContext.
@@ -85,14 +115,14 @@ void unregister_aiocontext(AioContext *ctx);
* This function polls. Callers must not hold the lock of any AioContext other
* than the current one.
*/
-void bdrv_graph_wrlock(void);
+void bdrv_graph_wrlock(void) TSA_ACQUIRE(graph_lock) TSA_NO_TSA;
/*
* bdrv_graph_wrunlock:
* Write finished, reset global has_writer to 0 and restart
* all readers that are waiting.
*/
-void bdrv_graph_wrunlock(void);
+void bdrv_graph_wrunlock(void) TSA_RELEASE(graph_lock) TSA_NO_TSA;
/*
* bdrv_graph_co_rdlock:
@@ -116,7 +146,8 @@ void bdrv_graph_wrunlock(void);
* loop) to take it and wait that the coroutine ends, so that
* we always signal that a reader is running.
*/
-void coroutine_fn bdrv_graph_co_rdlock(void);
+void coroutine_fn TSA_ACQUIRE_SHARED(graph_lock) TSA_NO_TSA
+bdrv_graph_co_rdlock(void);
/*
* bdrv_graph_rdunlock:
@@ -124,7 +155,8 @@ void coroutine_fn bdrv_graph_co_rdlock(void);
* If the writer is waiting for reads to finish (has_writer == 1), signal
* the writer that we are done via aio_wait_kick() to let it continue.
*/
-void coroutine_fn bdrv_graph_co_rdunlock(void);
+void coroutine_fn TSA_RELEASE_SHARED(graph_lock) TSA_NO_TSA
+bdrv_graph_co_rdunlock(void);
/*
* bdrv_graph_rd{un}lock_main_loop:
@@ -132,8 +164,11 @@ void coroutine_fn bdrv_graph_co_rdunlock(void);
* in the main loop. It is just asserting that we are not
* in a coroutine and in GLOBAL_STATE_CODE.
*/
-void bdrv_graph_rdlock_main_loop(void);
-void bdrv_graph_rdunlock_main_loop(void);
+void TSA_ACQUIRE_SHARED(graph_lock) TSA_NO_TSA
+bdrv_graph_rdlock_main_loop(void);
+
+void TSA_RELEASE_SHARED(graph_lock) TSA_NO_TSA
+bdrv_graph_rdunlock_main_loop(void);
/*
* assert_bdrv_graph_readable:
@@ -150,6 +185,17 @@ void assert_bdrv_graph_readable(void);
*/
void assert_bdrv_graph_writable(void);
+/*
+ * Calling this function tells TSA that we know that the lock is effectively
+ * taken even though we cannot prove it (yet) with GRAPH_RDLOCK. This can be
+ * useful in intermediate stages of a conversion to using the GRAPH_RDLOCK
+ * macro.
+ */
+static inline void TSA_ASSERT_SHARED(graph_lock) TSA_NO_TSA
+assume_graph_lock(void)
+{
+}
+
typedef struct GraphLockable { } GraphLockable;
/*
@@ -159,13 +205,21 @@ typedef struct GraphLockable { } GraphLockable;
*/
#define GML_OBJ_() (&(GraphLockable) { })
-static inline GraphLockable *graph_lockable_auto_lock(GraphLockable *x)
+/*
+ * This is not marked as TSA_ACQUIRE() because TSA doesn't understand the
+ * cleanup attribute and would therefore complain that the graph is never
+ * unlocked. TSA_ASSERT() makes sure that the following calls know that we
+ * hold the lock while unlocking is left unchecked.
+ */
+static inline GraphLockable * TSA_ASSERT(graph_lock) TSA_NO_TSA
+graph_lockable_auto_lock(GraphLockable *x)
{
bdrv_graph_co_rdlock();
return x;
}
-static inline void graph_lockable_auto_unlock(GraphLockable *x)
+static inline void TSA_NO_TSA
+graph_lockable_auto_unlock(GraphLockable *x)
{
bdrv_graph_co_rdunlock();
}
@@ -195,14 +249,20 @@ typedef struct GraphLockableMainloop { } GraphLockableMainloop;
*/
#define GMLML_OBJ_() (&(GraphLockableMainloop) { })
-static inline GraphLockableMainloop *
+/*
+ * This is not marked as TSA_ACQUIRE() because TSA doesn't understand the
+ * cleanup attribute and would therefore complain that the graph is never
+ * unlocked. TSA_ASSERT() makes sure that the following calls know that we
+ * hold the lock while unlocking is left unchecked.
+ */
+static inline GraphLockableMainloop * TSA_ASSERT(graph_lock) TSA_NO_TSA
graph_lockable_auto_lock_mainloop(GraphLockableMainloop *x)
{
bdrv_graph_rdlock_main_loop();
return x;
}
-static inline void
+static inline void TSA_NO_TSA
graph_lockable_auto_unlock_mainloop(GraphLockableMainloop *x)
{
bdrv_graph_rdunlock_main_loop();
diff --git a/block/graph-lock.c b/block/graph-lock.c
index c4d9d2c274..454c31e691 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -24,6 +24,9 @@
#include "block/block.h"
#include "block/block_int.h"
+/* Dummy lock object to use for Thread Safety Analysis (TSA) */
+BdrvGraphLock graph_lock;
+
/* Protects the list of aiocontext and orphaned_reader_count */
static QemuMutex aio_context_list_lock;
--
2.38.1
* [PATCH 15/18] Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (13 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 14/18] graph-lock: TSA annotations for lock/unlock functions Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 16/18] block-coroutine-wrapper.py: introduce annotations that take the graph rdlock Kevin Wolf
` (3 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/block_int-common.h | 4 ++--
include/block/graph-lock.h | 4 ++--
block.c | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index a6bc6b7fe9..b1f0d88307 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -898,8 +898,8 @@ struct BdrvChildClass {
void (*activate)(BdrvChild *child, Error **errp);
int (*inactivate)(BdrvChild *child);
- void (*attach)(BdrvChild *child);
- void (*detach)(BdrvChild *child);
+ void GRAPH_WRLOCK_PTR (*attach)(BdrvChild *child);
+ void GRAPH_WRLOCK_PTR (*detach)(BdrvChild *child);
/*
* Notifies the parent that the filename of its child has changed (e.g.
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 50b7e7b1b6..0c66386167 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -176,14 +176,14 @@ bdrv_graph_rdunlock_main_loop(void);
* or there is at least a reader holding the rdlock.
* In this way an incoming writer is aware of the read and waits.
*/
-void assert_bdrv_graph_readable(void);
+void GRAPH_RDLOCK assert_bdrv_graph_readable(void);
/*
* assert_bdrv_graph_writable:
* Make sure that the writer is the main loop and has set @has_writer,
* so that incoming readers will pause.
*/
-void assert_bdrv_graph_writable(void);
+void GRAPH_WRLOCK assert_bdrv_graph_writable(void);
/*
* Calling this function tells TSA that we know that the lock is effectively
diff --git a/block.c b/block.c
index ff53b41af3..1a82fd101a 100644
--- a/block.c
+++ b/block.c
@@ -1402,7 +1402,7 @@ static void bdrv_inherited_options(BdrvChildRole role, bool parent_is_format,
*child_flags = flags;
}
-static void bdrv_child_cb_attach(BdrvChild *child)
+static void GRAPH_WRLOCK bdrv_child_cb_attach(BdrvChild *child)
{
BlockDriverState *bs = child->opaque;
@@ -1444,7 +1444,7 @@ static void bdrv_child_cb_attach(BdrvChild *child)
}
}
-static void bdrv_child_cb_detach(BdrvChild *child)
+static void GRAPH_WRLOCK bdrv_child_cb_detach(BdrvChild *child)
{
BlockDriverState *bs = child->opaque;
--
2.38.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 16/18] block-coroutine-wrapper.py: introduce annotations that take the graph rdlock
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (14 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 15/18] Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 17/18] block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock Kevin Wolf
` (2 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Add co_wrapper_bdrv_rdlock and co_wrapper_mixed_bdrv_rdlock option to
the block-coroutine-wrapper.py script.
The "_bdrv_rdlock" suffix makes the generated wrapper take and release
the graph rdlock around the created coroutine.
This means that when combined with "_mixed", a function marked with
co_wrapper_mixed_bdrv_rdlock supports both the coroutine and the
non-coroutine case: in the latter it creates a coroutine that takes and
releases the rdlock; in the former, the caller must already hold the
graph lock.
Example:
void co_wrapper_mixed_bdrv_rdlock bdrv_f1();
Becomes
static void bdrv_co_enter_f1()
{
    bdrv_graph_co_rdlock();
    bdrv_co_function();
    bdrv_graph_co_rdunlock();
}

void bdrv_f1()
{
    if (qemu_in_coroutine()) {
        assume_graph_lock();
        bdrv_co_function();
    } else {
        qemu_co_enter(bdrv_co_enter_f1);
        ...
    }
}
When "_bdrv_rdlock" is used without "_mixed", the function must not be
called from coroutine context; when called from non-coroutine context it
creates a new coroutine that takes and releases the rdlock
automatically.
Example:
void co_wrapper_bdrv_rdlock bdrv_f1();
Becomes
static void bdrv_co_enter_f1()
{
    bdrv_graph_co_rdlock();
    bdrv_co_function();
    bdrv_graph_co_rdunlock();
}

void bdrv_f1()
{
    assert(!qemu_in_coroutine());
    qemu_co_enter(bdrv_co_enter_f1);
    ...
}
About their usage:
- co_wrapper does not take the rdlock, so it can also be used outside
  the block layer.
- co_wrapper_mixed will be used by many blk_* functions, since the
  coroutine function needs to call blk_wait_while_drained() and the
  rdlock *must* be taken afterwards, otherwise we deadlock.
  In the future this annotation will go away, and blk_* will use
  co_wrapper directly.
- co_wrapper_bdrv_rdlock will be used by BlockDriver callbacks, ideally
  by all of them in the future.
- co_wrapper_mixed_bdrv_rdlock will be used by the remaining functions
  that are still called in both coroutine and non-coroutine context. In
  the future this annotation will go away, as we will split such mixed
  functions.
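The mapping from annotation variant to emitted locking code can be
sketched in a simplified Python model mirroring the logic this patch
adds to the script (the function `wrapper_snippets` and its return shape
are invented for illustration; they are not the script's actual API):

```python
# Simplified model of how block-coroutine-wrapper.py chooses locking
# code based on the annotation ("variant") attached to a declaration.

def wrapper_snippets(variant: str) -> dict:
    """Return the code fragments a generated wrapper would contain."""
    mixed = 'mixed' in variant
    rdlock = 'bdrv_rdlock' in variant

    # The coroutine entry point takes/releases the rdlock only for
    # *_bdrv_rdlock variants.
    co_entry = []
    if rdlock:
        co_entry.append('bdrv_graph_co_rdlock();')
    co_entry.append('s->ret = bdrv_co_function(...);')
    if rdlock:
        co_entry.append('bdrv_graph_co_rdunlock();')

    # Mixed wrappers callable from coroutine context just assume the
    # caller already holds the lock.
    direct_call = None
    if mixed:
        prefix = 'assume_graph_lock(); ' if rdlock else ''
        direct_call = prefix + 'return bdrv_co_function(...);'

    return {'co_entry': co_entry, 'direct_call': direct_call}
```

For example, `wrapper_snippets('co_wrapper_mixed_bdrv_rdlock')` yields a
coroutine entry wrapped in rdlock/rdunlock plus a direct call guarded by
`assume_graph_lock()`, while plain `co_wrapper` yields only the bare
coroutine entry.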
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/block-common.h | 9 ++++++++-
scripts/block-coroutine-wrapper.py | 12 ++++++++++++
2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/include/block/block-common.h b/include/block/block-common.h
index 6cf603ab06..4749c46a5e 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -40,14 +40,21 @@
*
* Usage: read docs/devel/block-coroutine-wrapper.rst
*
- * There are 2 kind of specifiers:
+ * There are 4 kind of specifiers:
* - co_wrapper functions can be called by only non-coroutine context, because
* they always generate a new coroutine.
* - co_wrapper_mixed functions can be called by both coroutine and
* non-coroutine context.
+ * - co_wrapper_bdrv_rdlock are co_wrapper functions but automatically take and
+ * release the graph rdlock when creating a new coroutine
+ * - co_wrapper_mixed_bdrv_rdlock are co_wrapper_mixed functions but
+ * automatically take and release the graph rdlock when creating a new
+ * coroutine.
*/
#define co_wrapper
#define co_wrapper_mixed
+#define co_wrapper_bdrv_rdlock
+#define co_wrapper_mixed_bdrv_rdlock
#include "block/dirty-bitmap.h"
#include "block/blockjob.h"
diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py
index 71a06e917a..6e087fa0b7 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -69,6 +69,7 @@ def __init__(self, return_type: str, name: str, args: str,
self.struct_name = snake_to_camel(self.name)
self.args = [ParamDecl(arg.strip()) for arg in args.split(',')]
self.create_only_co = 'mixed' not in variant
+ self.graph_rdlock = 'bdrv_rdlock' in variant
subsystem, subname = self.name.split('_', 1)
self.co_name = f'{subsystem}_co_{subname}'
@@ -123,10 +124,13 @@ def create_mixed_wrapper(func: FuncDecl) -> str:
"""
name = func.co_name
struct_name = func.struct_name
+ graph_assume_lock = 'assume_graph_lock();' if func.graph_rdlock else ''
+
return f"""\
{func.return_type} {func.name}({ func.gen_list('{decl}') })
{{
if (qemu_in_coroutine()) {{
+ {graph_assume_lock}
return {name}({ func.gen_list('{name}') });
}} else {{
{struct_name} s = {{
@@ -174,6 +178,12 @@ def gen_wrapper(func: FuncDecl) -> str:
name = func.co_name
struct_name = func.struct_name
+ graph_lock=''
+ graph_unlock=''
+ if func.graph_rdlock:
+ graph_lock=' bdrv_graph_co_rdlock();'
+ graph_unlock=' bdrv_graph_co_rdunlock();'
+
creation_function = create_mixed_wrapper
if func.create_only_co:
creation_function = create_co_wrapper
@@ -193,7 +203,9 @@ def gen_wrapper(func: FuncDecl) -> str:
{{
{struct_name} *s = opaque;
+{graph_lock}
s->ret = {name}({ func.gen_list('s->{name}') });
+{graph_unlock}
s->poll_state.in_progress = false;
aio_wait_kick();
--
2.38.1
* [PATCH 17/18] block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (15 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 16/18] block-coroutine-wrapper.py: introduce annotations that take the graph rdlock Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 13:18 ` [PATCH 18/18] block: GRAPH_RDLOCK for functions only called by co_wrappers Kevin Wolf
2022-12-07 14:12 ` [PATCH 00/18] block: Introduce a block graph rwlock Emanuele Giuseppe Esposito
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Take the rdlock already, before we add the assertions.
All these functions either read the graph recursively, or call
BlockDriver callbacks that will eventually need to be protected by the
graph rdlock.
Do it now for all functions together, because many of these call each
other recursively.
For example, bdrv_co_truncate calls BlockDriver->bdrv_co_truncate, and
some driver callbacks implement their own .bdrv_co_truncate by calling
bdrv_flush inside. So if bdrv_flush asserts but bdrv_truncate does not
take the rdlock yet, the assertion will always fail.
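The conversion-ordering problem can be illustrated with a toy model
(Python stand-ins for the C functions; a sketch of the assertion
behaviour, not the actual lock implementation in block/graph-lock.c):

```python
import threading

# Toy model: bdrv_co_flush() has been annotated to assert the graph
# rdlock, but callers are converted one by one. Until the caller takes
# the rdlock too, the nested assertion always trips.

_tls = threading.local()

def rdlock_held() -> bool:
    return getattr(_tls, 'readers', 0) > 0

def bdrv_graph_co_rdlock():
    _tls.readers = getattr(_tls, 'readers', 0) + 1

def bdrv_graph_co_rdunlock():
    _tls.readers -= 1

def bdrv_co_flush():
    # Already-converted function: asserts that the caller holds the lock.
    assert rdlock_held(), "graph rdlock not held"

def bdrv_co_truncate_unconverted():
    bdrv_co_flush()  # caller takes no lock -> assertion fires

def bdrv_co_truncate_converted():
    bdrv_graph_co_rdlock()
    try:
        bdrv_co_flush()  # fine: lock held across the nested call
    finally:
        bdrv_graph_co_rdunlock()

bdrv_co_truncate_converted()  # passes

try:
    bdrv_co_truncate_unconverted()
    tripped = False
except AssertionError:
    tripped = True
```

This is why the annotations have to land for all mutually recursive
functions in one patch.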
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/coroutines.h | 2 +-
include/block/block-io.h | 53 +++++++++++++++++++++++-----------------
2 files changed, 32 insertions(+), 23 deletions(-)
diff --git a/block/coroutines.h b/block/coroutines.h
index 17da4db963..48e9081aa1 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -71,7 +71,7 @@ nbd_co_do_establish_connection(BlockDriverState *bs, bool blocking,
* the "I/O or GS" API.
*/
-int co_wrapper_mixed
+int co_wrapper_mixed_bdrv_rdlock
bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool include_base,
diff --git a/include/block/block-io.h b/include/block/block-io.h
index 52869ea08e..2ed6214909 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -39,19 +39,24 @@
* to catch when they are accidentally called by the wrong API.
*/
-int co_wrapper_mixed bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
- int64_t bytes,
- BdrvRequestFlags flags);
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset, int64_t bytes,
+ BdrvRequestFlags flags);
+
int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
-int co_wrapper_mixed bdrv_pread(BdrvChild *child, int64_t offset,
- int64_t bytes, void *buf,
- BdrvRequestFlags flags);
-int co_wrapper_mixed bdrv_pwrite(BdrvChild *child, int64_t offset,
- int64_t bytes, const void *buf,
- BdrvRequestFlags flags);
-int co_wrapper_mixed bdrv_pwrite_sync(BdrvChild *child, int64_t offset,
- int64_t bytes, const void *buf,
- BdrvRequestFlags flags);
+
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_pread(BdrvChild *child, int64_t offset, int64_t bytes, void *buf,
+ BdrvRequestFlags flags);
+
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_pwrite(BdrvChild *child, int64_t offset,int64_t bytes,
+ const void *buf, BdrvRequestFlags flags);
+
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_pwrite_sync(BdrvChild *child, int64_t offset, int64_t bytes,
+ const void *buf, BdrvRequestFlags flags);
+
int coroutine_fn bdrv_co_pwrite_sync(BdrvChild *child, int64_t offset,
int64_t bytes, const void *buf,
BdrvRequestFlags flags);
@@ -287,22 +292,26 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, int64_t src_offset,
void bdrv_drain(BlockDriverState *bs);
-int co_wrapper_mixed
+int co_wrapper_mixed_bdrv_rdlock
bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
-int co_wrapper_mixed bdrv_check(BlockDriverState *bs, BdrvCheckResult *res,
- BdrvCheckMode fix);
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
/* Invalidate any cached metadata used by image formats */
-int co_wrapper_mixed bdrv_invalidate_cache(BlockDriverState *bs,
- Error **errp);
-int co_wrapper_mixed bdrv_flush(BlockDriverState *bs);
-int co_wrapper_mixed bdrv_pdiscard(BdrvChild *child, int64_t offset,
- int64_t bytes);
-int co_wrapper_mixed
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
+
+int co_wrapper_mixed_bdrv_rdlock bdrv_flush(BlockDriverState *bs);
+
+int co_wrapper_mixed_bdrv_rdlock
+bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+
+int co_wrapper_mixed_bdrv_rdlock
bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
-int co_wrapper_mixed
+
+int co_wrapper_mixed_bdrv_rdlock
bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
/**
--
2.38.1
* [PATCH 18/18] block: GRAPH_RDLOCK for functions only called by co_wrappers
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (16 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 17/18] block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock Kevin Wolf
@ 2022-12-07 13:18 ` Kevin Wolf
2022-12-07 14:12 ` [PATCH 00/18] block: Introduce a block graph rwlock Emanuele Giuseppe Esposito
18 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 13:18 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, hreitz, eesposit, pbonzini, vsementsov, qemu-devel
The generated coroutine wrappers already take care to take the lock in
the non-coroutine path, and assume that the lock is already taken in the
coroutine path.
The only thing left to do for the wrapped functions is to add the
GRAPH_RDLOCK annotation. Doing so also allows us to mark the
corresponding BlockDriver callbacks as GRAPH_RDLOCK_PTR.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/coroutines.h | 17 ++++++++++-------
include/block/block_int-common.h | 20 +++++++++-----------
block.c | 2 ++
block/io.c | 2 ++
4 files changed, 23 insertions(+), 18 deletions(-)
diff --git a/block/coroutines.h b/block/coroutines.h
index 48e9081aa1..2a1e0b3c9d 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -37,9 +37,11 @@
* the I/O API.
*/
-int coroutine_fn bdrv_co_check(BlockDriverState *bs,
- BdrvCheckResult *res, BdrvCheckMode fix);
-int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
+
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
int coroutine_fn
bdrv_co_common_block_status_above(BlockDriverState *bs,
@@ -53,10 +55,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
BlockDriverState **file,
int *depth);
-int coroutine_fn bdrv_co_readv_vmstate(BlockDriverState *bs,
- QEMUIOVector *qiov, int64_t pos);
-int coroutine_fn bdrv_co_writev_vmstate(BlockDriverState *bs,
- QEMUIOVector *qiov, int64_t pos);
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
int coroutine_fn
nbd_co_do_establish_connection(BlockDriverState *bs, bool blocking,
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index b1f0d88307..c34c525fa6 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -641,8 +641,8 @@ struct BlockDriver {
/*
* Invalidate any cached meta-data.
*/
- void coroutine_fn (*bdrv_co_invalidate_cache)(BlockDriverState *bs,
- Error **errp);
+ void coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_invalidate_cache)(
+ BlockDriverState *bs, Error **errp);
/*
* Flushes all data for all layers by calling bdrv_co_flush for underlying
@@ -701,12 +701,11 @@ struct BlockDriver {
Error **errp);
BlockStatsSpecific *(*bdrv_get_specific_stats)(BlockDriverState *bs);
- int coroutine_fn (*bdrv_save_vmstate)(BlockDriverState *bs,
- QEMUIOVector *qiov,
- int64_t pos);
- int coroutine_fn (*bdrv_load_vmstate)(BlockDriverState *bs,
- QEMUIOVector *qiov,
- int64_t pos);
+ int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_save_vmstate)(
+ BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+
+ int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_load_vmstate)(
+ BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
/* removable device specific */
bool (*bdrv_is_inserted)(BlockDriverState *bs);
@@ -724,9 +723,8 @@ struct BlockDriver {
* Returns 0 for completed check, -errno for internal errors.
* The check results are stored in result.
*/
- int coroutine_fn (*bdrv_co_check)(BlockDriverState *bs,
- BdrvCheckResult *result,
- BdrvCheckMode fix);
+ int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_check)(
+ BlockDriverState *bs, BdrvCheckResult *result, BdrvCheckMode fix);
void (*bdrv_debug_event)(BlockDriverState *bs, BlkdebugEvent event);
diff --git a/block.c b/block.c
index 1a82fd101a..9c2ac757e4 100644
--- a/block.c
+++ b/block.c
@@ -5402,6 +5402,7 @@ int coroutine_fn bdrv_co_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix)
{
IO_CODE();
+ assert_bdrv_graph_readable();
if (bs->drv == NULL) {
return -ENOMEDIUM;
}
@@ -6617,6 +6618,7 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
IO_CODE();
assert(!(bs->open_flags & BDRV_O_INACTIVE));
+ assert_bdrv_graph_readable();
if (bs->drv->bdrv_co_invalidate_cache) {
bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
diff --git a/block/io.c b/block/io.c
index d160d2e273..d87788dfbb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2697,6 +2697,7 @@ bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
BlockDriverState *child_bs = bdrv_primary_bs(bs);
int ret;
IO_CODE();
+ assert_bdrv_graph_readable();
ret = bdrv_check_qiov_request(pos, qiov->size, qiov, 0, NULL);
if (ret < 0) {
@@ -2729,6 +2730,7 @@ bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
BlockDriverState *child_bs = bdrv_primary_bs(bs);
int ret;
IO_CODE();
+ assert_bdrv_graph_readable();
ret = bdrv_check_qiov_request(pos, qiov->size, qiov, 0, NULL);
if (ret < 0) {
--
2.38.1
* Re: [PATCH 00/18] block: Introduce a block graph rwlock
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
` (17 preceding siblings ...)
2022-12-07 13:18 ` [PATCH 18/18] block: GRAPH_RDLOCK for functions only called by co_wrappers Kevin Wolf
@ 2022-12-07 14:12 ` Emanuele Giuseppe Esposito
2022-12-07 16:08 ` Kevin Wolf
2022-12-12 17:14 ` Kevin Wolf
18 siblings, 2 replies; 22+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-12-07 14:12 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: hreitz, pbonzini, vsementsov, qemu-devel
On 07/12/2022 at 14:18, Kevin Wolf wrote:
> This series supersedes the first half of Emanuele's "Protect the block
> layer with a rwlock: part 1". It introduces the basic infrastructure for
> protecting the block graph (specifically parent/child links) with a
> rwlock. Actually taking the reader lock in all necessary places is left
> for future series.
>
> Compared to Emanuele's series, this one adds patches to make use of
> clang's Thread Safety Analysis (TSA) feature in order to statically
> check at compile time that the places where we assert that we hold the
> lock actually do hold it. Once we cover all relevant places, the check
> can be extended to verify that all accesses of bs->children and
> bs->parents hold the lock.
>
> For reference, here is the more detailed version of our plan in
> Emanuele's words from his series:
>
> The aim is to replace the current AioContext lock with much
> fine-grained locks, aimed to protect only specific data. Currently
> the AioContext lock is used pretty much everywhere, and it's not
> even clear what it is protecting exactly.
>
> The aim of the rwlock is to cover graph modifications: more
> precisely, when a BlockDriverState parent or child list is modified
> or read, since it can be concurrently accessed by the main loop and
> iothreads.
>
> The main assumption is that the main loop is the only one allowed to
> perform graph modifications, and so far this has always been held by
> the current code.
>
> The rwlock is inspired from cpus-common.c implementation, and aims
> to reduce cacheline bouncing by having per-aiocontext counter of
> readers. All details and implementation of the lock are in patch 2.
>
> We distinguish between writer (main loop, under BQL) that modifies
> the graph, and readers (all other coroutines running in various
> AioContext), that go through the graph edges, reading ->parents
> and ->children. The writer (main loop) has "exclusive" access,
> so it first waits for current read to finish, and then prevents
> incoming ones from entering while it has the exclusive access. The
> readers (coroutines in multiple AioContext) are free to access the
> graph as long as the writer is not modifying the graph. In case it is,
> they go in a CoQueue and sleep until the writer is done.
>
> In this and following series, we try to follow the following locking
> pattern:
>
> - bdrv_co_* functions that call BlockDriver callbacks always expect
> the lock to be taken, therefore they assert.
>
> - blk_co_* functions are called from external code outside the block
> layer, which should not have to care about the block layer's
> internal locking. Usually they also call blk_wait_while_drained().
> Therefore they take the lock internally.
>
> The long term goal of this series is to eventually replace the
> AioContext lock, so that we can get rid of it once and for all.
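The reader/writer scheme described above can be reduced to a minimal
single-process sketch (the per-AioContext reader counters are collapsed
into one counter and the CoQueue becomes a condition variable; this is
an illustration, not the implementation in block/graph-lock.c):

```python
import threading

# Toy reduction of the graph rwlock: readers bump a counter; the single
# writer (main loop) sets has_writer, waits for the reader count to
# drain, and readers arriving meanwhile sleep until it clears.

class GraphLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._has_writer = False

    def rdlock(self):
        with self._cond:
            while self._has_writer:   # incoming readers wait for writer
                self._cond.wait()
            self._readers += 1

    def rdunlock(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()   # possibly wake a waiting writer

    def wrlock(self):
        with self._cond:
            self._has_writer = True   # block new readers first...
            while self._readers:      # ...then drain existing ones
                self._cond.wait()

    def wrunlock(self):
        with self._cond:
            self._has_writer = False
            self._cond.notify_all()   # release the queued readers
```

Under this scheme the uncontended reader path touches only its own
counter, which is what keeps cacheline bouncing low in the real
per-AioContext version.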
>
> Emanuele Giuseppe Esposito (7):
> graph-lock: Implement guard macros
> async: Register/unregister aiocontext in graph lock list
> block: wrlock in bdrv_replace_child_noperm
> block: remove unnecessary assert_bdrv_graph_writable()
> block: assert that graph read and writes are performed correctly
> block-coroutine-wrapper.py: introduce annotations that take the graph
> rdlock
> block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock
>
> Kevin Wolf (10):
> block: Factor out bdrv_drain_all_begin_nopoll()
> Import clang-tsa.h
> clang-tsa: Add TSA_ASSERT() macro
> clang-tsa: Add macros for shared locks
> configure: Enable -Wthread-safety if present
> test-bdrv-drain: Fix incorrrect drain assumptions
> block: Fix locking in external_snapshot_prepare()
> graph-lock: TSA annotations for lock/unlock functions
> Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK
> block: GRAPH_RDLOCK for functions only called by co_wrappers
>
> Paolo Bonzini (1):
> graph-lock: Introduce a lock to protect block graph operations
>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
^ I am curious to see if I am allowed to have my r-b also on my patches :)
> configure | 1 +
> block/coroutines.h | 19 +-
> include/block/aio.h | 9 +
> include/block/block-common.h | 9 +-
> include/block/block-global-state.h | 1 +
> include/block/block-io.h | 53 +++--
> include/block/block_int-common.h | 24 +--
> include/block/block_int-global-state.h | 17 --
> include/block/block_int.h | 1 +
> include/block/graph-lock.h | 280 +++++++++++++++++++++++++
> include/qemu/clang-tsa.h | 114 ++++++++++
> block.c | 24 ++-
> block/graph-lock.c | 275 ++++++++++++++++++++++++
> block/io.c | 21 +-
> blockdev.c | 4 +
> stubs/graph-lock.c | 10 +
> tests/unit/test-bdrv-drain.c | 18 ++
> util/async.c | 4 +
> scripts/block-coroutine-wrapper.py | 12 ++
> block/meson.build | 1 +
> stubs/meson.build | 1 +
> 21 files changed, 820 insertions(+), 78 deletions(-)
> create mode 100644 include/block/graph-lock.h
> create mode 100644 include/qemu/clang-tsa.h
> create mode 100644 block/graph-lock.c
> create mode 100644 stubs/graph-lock.c
>
* Re: [PATCH 00/18] block: Introduce a block graph rwlock
2022-12-07 14:12 ` [PATCH 00/18] block: Introduce a block graph rwlock Emanuele Giuseppe Esposito
@ 2022-12-07 16:08 ` Kevin Wolf
2022-12-12 17:14 ` Kevin Wolf
1 sibling, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-07 16:08 UTC (permalink / raw)
To: Emanuele Giuseppe Esposito
Cc: qemu-block, hreitz, pbonzini, vsementsov, qemu-devel
On 07.12.2022 at 15:12, Emanuele Giuseppe Esposito wrote:
> > Emanuele Giuseppe Esposito (7):
> > graph-lock: Implement guard macros
> > async: Register/unregister aiocontext in graph lock list
> > block: wrlock in bdrv_replace_child_noperm
> > block: remove unnecessary assert_bdrv_graph_writable()
> > block: assert that graph read and writes are performed correctly
> > block-coroutine-wrapper.py: introduce annotations that take the graph
> > rdlock
> > block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock
> >
> > Kevin Wolf (10):
> > block: Factor out bdrv_drain_all_begin_nopoll()
> > Import clang-tsa.h
> > clang-tsa: Add TSA_ASSERT() macro
> > clang-tsa: Add macros for shared locks
> > configure: Enable -Wthread-safety if present
> > test-bdrv-drain: Fix incorrrect drain assumptions
> > block: Fix locking in external_snapshot_prepare()
> > graph-lock: TSA annotations for lock/unlock functions
> > Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK
> > block: GRAPH_RDLOCK for functions only called by co_wrappers
> >
> > Paolo Bonzini (1):
> > graph-lock: Introduce a lock to protect block graph operations
> >
> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Thanks!
> ^ I am curious to see if I am allowed to have my r-b also on my patches :)
That's actually a good question. I wondered myself whether I should add
my R-b to patches that I picked up from you, but which already have my
S-o-b now, of course, and are possibly modified by me.
I would say you're allowed as long as you actually reviewed them in the
version I sent to make sure that I didn't mess them up. :-)
And similarly I'll probably add my R-b on patches that contain code from
you.
Kevin
* Re: [PATCH 00/18] block: Introduce a block graph rwlock
2022-12-07 14:12 ` [PATCH 00/18] block: Introduce a block graph rwlock Emanuele Giuseppe Esposito
2022-12-07 16:08 ` Kevin Wolf
@ 2022-12-12 17:14 ` Kevin Wolf
1 sibling, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2022-12-12 17:14 UTC (permalink / raw)
To: Emanuele Giuseppe Esposito
Cc: qemu-block, hreitz, pbonzini, vsementsov, qemu-devel
On 07.12.2022 at 15:12, Emanuele Giuseppe Esposito wrote:
> [...]
> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Thanks, applied to block-next.
Kevin
end of thread, other threads:[~2022-12-12 17:15 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-07 13:18 [PATCH 00/18] block: Introduce a block graph rwlock Kevin Wolf
2022-12-07 13:18 ` [PATCH 01/18] block: Factor out bdrv_drain_all_begin_nopoll() Kevin Wolf
2022-12-07 13:18 ` [PATCH 02/18] graph-lock: Introduce a lock to protect block graph operations Kevin Wolf
2022-12-07 13:18 ` [PATCH 03/18] graph-lock: Implement guard macros Kevin Wolf
2022-12-07 13:18 ` [PATCH 04/18] async: Register/unregister aiocontext in graph lock list Kevin Wolf
2022-12-07 13:18 ` [PATCH 05/18] Import clang-tsa.h Kevin Wolf
2022-12-07 13:18 ` [PATCH 06/18] clang-tsa: Add TSA_ASSERT() macro Kevin Wolf
2022-12-07 13:18 ` [PATCH 07/18] clang-tsa: Add macros for shared locks Kevin Wolf
2022-12-07 13:18 ` [PATCH 08/18] configure: Enable -Wthread-safety if present Kevin Wolf
2022-12-07 13:18 ` [PATCH 09/18] test-bdrv-drain: Fix incorrrect drain assumptions Kevin Wolf
2022-12-07 13:18 ` [PATCH 10/18] block: Fix locking in external_snapshot_prepare() Kevin Wolf
2022-12-07 13:18 ` [PATCH 11/18] block: wrlock in bdrv_replace_child_noperm Kevin Wolf
2022-12-07 13:18 ` [PATCH 12/18] block: remove unnecessary assert_bdrv_graph_writable() Kevin Wolf
2022-12-07 13:18 ` [PATCH 13/18] block: assert that graph read and writes are performed correctly Kevin Wolf
2022-12-07 13:18 ` [PATCH 14/18] graph-lock: TSA annotations for lock/unlock functions Kevin Wolf
2022-12-07 13:18 ` [PATCH 15/18] Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK Kevin Wolf
2022-12-07 13:18 ` [PATCH 16/18] block-coroutine-wrapper.py: introduce annotations that take the graph rdlock Kevin Wolf
2022-12-07 13:18 ` [PATCH 17/18] block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock Kevin Wolf
2022-12-07 13:18 ` [PATCH 18/18] block: GRAPH_RDLOCK for functions only called by co_wrappers Kevin Wolf
2022-12-07 14:12 ` [PATCH 00/18] block: Introduce a block graph rwlock Emanuele Giuseppe Esposito
2022-12-07 16:08 ` Kevin Wolf
2022-12-12 17:14 ` Kevin Wolf