qemu-devel.nongnu.org archive mirror
* [PATCH 0/8] block: Honour graph read lock even in the main thread
@ 2023-05-10 20:35 Kevin Wolf
  2023-05-10 20:35 ` [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked Kevin Wolf
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

Fiona asked why it's correct that bdrv_graph_co_rd_lock/unlock() do
nothing if qemu_in_main_thread() returns true. As far as I can tell,
it's not correct: the coroutine can yield while it believes it holds
the lock, and the main loop can then call some code that takes a write
lock without waiting for the coroutine to finish.

So this series - or more specifically the last patch - fixes this
problem by enabling the locking even in the main thread. The preceding
patches fix bugs that had gone undiscovered while they could only be
triggered with iothreads, and they are necessary so that all test
cases still pass after the final patch.

Kevin Wolf (8):
  block: Call .bdrv_co_create(_opts) unlocked
  block/export: Fix null pointer dereference in error path
  qcow2: Unlock the graph in qcow2_do_open() where necessary
  qemu-img: Take graph lock more selectively
  test-bdrv-drain: Take graph lock more selectively
  test-bdrv-drain: Call bdrv_co_unref() in coroutine context
  blockjob: Adhere to rate limit even when reentered early
  graph-lock: Honour read locks even in the main thread

 include/block/block-global-state.h |  8 +++---
 include/block/block_int-common.h   |  4 +--
 include/block/blockjob_int.h       | 14 +++++++---
 block.c                            |  1 -
 block/commit.c                     |  7 ++---
 block/create.c                     |  1 -
 block/crypto.c                     | 25 +++++++++--------
 block/export/export.c              |  6 +++--
 block/graph-lock.c                 | 10 -------
 block/mirror.c                     | 23 +++++++---------
 block/parallels.c                  |  6 ++---
 block/qcow.c                       |  6 ++---
 block/qcow2.c                      | 43 +++++++++++++++++++++---------
 block/qed.c                        |  6 ++---
 block/raw-format.c                 |  2 +-
 block/stream.c                     |  7 ++---
 block/vdi.c                        | 11 ++++----
 block/vhdx.c                       |  8 +++---
 block/vmdk.c                       | 27 +++++++++----------
 block/vpc.c                        |  6 ++---
 blockjob.c                         | 22 +++++++++++++--
 qemu-img.c                         |  5 ++--
 tests/unit/test-bdrv-drain.c       |  6 ++---
 23 files changed, 138 insertions(+), 116 deletions(-)

-- 
2.40.1



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 16:12   ` Eric Blake
  2023-05-10 20:35 ` [PATCH 2/8] block/export: Fix null pointer dereference in error path Kevin Wolf
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.

Many of them go through blk_*(), which will always take the lock itself.
Direct calls of bdrv_*() need to take the reader lock. Note that while
locking for bdrv_co_*() calls is checked by TSA, this is not the case
for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
when they are called from coroutine context like here!
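In pseudocode, the convention this patch establishes for a create
function looks roughly like this (a sketch using QEMU's graph-lock
names, not code taken from the patch):

```c
/* Sketch, not real code: the create callback runs GRAPH_UNLOCKED */
static int coroutine_fn GRAPH_UNLOCKED
some_co_create(BlockdevCreateOptions *opts, Error **errp)
{
    /* blk_*() helpers take the graph lock internally */
    ret = blk_co_pwrite(blk, offset, bytes, buf, 0);

    /* direct bdrv_*() calls need the reader lock taken explicitly */
    bdrv_graph_co_rdlock();
    bdrv_co_delete_file_noerr(bs);
    bdrv_graph_co_rdunlock();
}
```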

This effectively reverts 4ec8df0183, but adds some internal locking
instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block-global-state.h |  8 +++----
 include/block/block_int-common.h   |  4 ++--
 block.c                            |  1 -
 block/create.c                     |  1 -
 block/crypto.c                     | 25 ++++++++++----------
 block/parallels.c                  |  6 ++---
 block/qcow.c                       |  6 ++---
 block/qcow2.c                      | 37 +++++++++++++++++++-----------
 block/qed.c                        |  6 ++---
 block/raw-format.c                 |  2 +-
 block/vdi.c                        | 11 +++++----
 block/vhdx.c                       |  8 ++++---
 block/vmdk.c                       | 27 ++++++++++------------
 block/vpc.c                        |  6 ++---
 14 files changed, 78 insertions(+), 70 deletions(-)

diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 2d93423d35..f347199bff 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -58,14 +58,14 @@ BlockDriver *bdrv_find_protocol(const char *filename,
                                 Error **errp);
 BlockDriver *bdrv_find_format(const char *format_name);
 
-int coroutine_fn GRAPH_RDLOCK
+int coroutine_fn GRAPH_UNLOCKED
 bdrv_co_create(BlockDriver *drv, const char *filename, QemuOpts *opts,
                Error **errp);
 
-int co_wrapper_bdrv_rdlock bdrv_create(BlockDriver *drv, const char *filename,
-                                       QemuOpts *opts, Error **errp);
+int co_wrapper bdrv_create(BlockDriver *drv, const char *filename,
+                           QemuOpts *opts, Error **errp);
 
-int coroutine_fn GRAPH_RDLOCK
+int coroutine_fn GRAPH_UNLOCKED
 bdrv_co_create_file(const char *filename, QemuOpts *opts, Error **errp);
 
 BlockDriverState *bdrv_new(void);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 4909876756..a19b85a6e1 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -245,10 +245,10 @@ struct BlockDriver {
         BlockDriverState *bs, QDict *options, int flags, Error **errp);
     void (*bdrv_close)(BlockDriverState *bs);
 
-    int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_create)(
+    int coroutine_fn GRAPH_UNLOCKED_PTR (*bdrv_co_create)(
         BlockdevCreateOptions *opts, Error **errp);
 
-    int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_create_opts)(
+    int coroutine_fn GRAPH_UNLOCKED_PTR (*bdrv_co_create_opts)(
         BlockDriver *drv, const char *filename, QemuOpts *opts, Error **errp);
 
     int (*bdrv_amend_options)(BlockDriverState *bs,
diff --git a/block.c b/block.c
index dad9a4fa43..e8b51a4391 100644
--- a/block.c
+++ b/block.c
@@ -533,7 +533,6 @@ int coroutine_fn bdrv_co_create(BlockDriver *drv, const char *filename,
     int ret;
     GLOBAL_STATE_CODE();
     ERRP_GUARD();
-    assert_bdrv_graph_readable();
 
     if (!drv->bdrv_co_create_opts) {
         error_setg(errp, "Driver '%s' does not support image creation",
diff --git a/block/create.c b/block/create.c
index bf67b9947c..6b23a21675 100644
--- a/block/create.c
+++ b/block/create.c
@@ -43,7 +43,6 @@ static int coroutine_fn blockdev_create_run(Job *job, Error **errp)
     int ret;
 
     GLOBAL_STATE_CODE();
-    GRAPH_RDLOCK_GUARD();
 
     job_progress_set_remaining(&s->common, 1);
     ret = s->drv->bdrv_co_create(s->opts, errp);
diff --git a/block/crypto.c b/block/crypto.c
index 30093cff9b..6ee8d46d30 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -99,12 +99,10 @@ struct BlockCryptoCreateData {
 };
 
 
-static int block_crypto_create_write_func(QCryptoBlock *block,
-                                          size_t offset,
-                                          const uint8_t *buf,
-                                          size_t buflen,
-                                          void *opaque,
-                                          Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+block_crypto_create_write_func(QCryptoBlock *block, size_t offset,
+                               const uint8_t *buf, size_t buflen, void *opaque,
+                               Error **errp)
 {
     struct BlockCryptoCreateData *data = opaque;
     ssize_t ret;
@@ -117,10 +115,9 @@ static int block_crypto_create_write_func(QCryptoBlock *block,
     return 0;
 }
 
-static int block_crypto_create_init_func(QCryptoBlock *block,
-                                         size_t headerlen,
-                                         void *opaque,
-                                         Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+block_crypto_create_init_func(QCryptoBlock *block, size_t headerlen,
+                              void *opaque, Error **errp)
 {
     struct BlockCryptoCreateData *data = opaque;
     Error *local_error = NULL;
@@ -314,7 +311,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 }
 
 
-static int coroutine_fn
+static int coroutine_fn GRAPH_UNLOCKED
 block_crypto_co_create_generic(BlockDriverState *bs, int64_t size,
                                QCryptoBlockCreateOptions *opts,
                                PreallocMode prealloc, Error **errp)
@@ -627,7 +624,7 @@ static int block_crypto_open_luks(BlockDriverState *bs,
                                      bs, options, flags, errp);
 }
 
-static int coroutine_fn
+static int coroutine_fn GRAPH_UNLOCKED
 block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
 {
     BlockdevCreateOptionsLUKS *luks_opts;
@@ -665,7 +662,7 @@ fail:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 block_crypto_co_create_opts_luks(BlockDriver *drv, const char *filename,
                                  QemuOpts *opts, Error **errp)
 {
@@ -727,7 +724,9 @@ fail:
      * beforehand, it has been truncated and corrupted in the process.
      */
     if (ret) {
+        bdrv_graph_co_rdlock();
         bdrv_co_delete_file_noerr(bs);
+        bdrv_graph_co_rdunlock();
     }
 
     bdrv_co_unref(bs);
diff --git a/block/parallels.c b/block/parallels.c
index b49c35929e..d8a3f13e24 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -522,8 +522,8 @@ out:
 }
 
 
-static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
-                                            Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+parallels_co_create(BlockdevCreateOptions* opts, Error **errp)
 {
     BlockdevCreateOptionsParallels *parallels_opts;
     BlockDriverState *bs;
@@ -622,7 +622,7 @@ exit:
     goto out;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 parallels_co_create_opts(BlockDriver *drv, const char *filename,
                          QemuOpts *opts, Error **errp)
 {
diff --git a/block/qcow.c b/block/qcow.c
index a0c701f578..3644bbf5cb 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -800,8 +800,8 @@ static void qcow_close(BlockDriverState *bs)
     error_free(s->migration_blocker);
 }
 
-static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
-                                       Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+qcow_co_create(BlockdevCreateOptions *opts, Error **errp)
 {
     BlockdevCreateOptionsQcow *qcow_opts;
     int header_size, backing_filename_len, l1_size, shift, i;
@@ -921,7 +921,7 @@ exit:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 qcow_co_create_opts(BlockDriver *drv, const char *filename,
                     QemuOpts *opts, Error **errp)
 {
diff --git a/block/qcow2.c b/block/qcow2.c
index 5bde3b8401..73f36447f9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -118,8 +118,9 @@ static int qcow2_crypto_hdr_read_func(QCryptoBlock *block, size_t offset,
 }
 
 
-static int qcow2_crypto_hdr_init_func(QCryptoBlock *block, size_t headerlen,
-                                      void *opaque, Error **errp)
+static int coroutine_fn GRAPH_RDLOCK
+qcow2_crypto_hdr_init_func(QCryptoBlock *block, size_t headerlen, void *opaque,
+                           Error **errp)
 {
     BlockDriverState *bs = opaque;
     BDRVQcow2State *s = bs->opaque;
@@ -144,9 +145,7 @@ static int qcow2_crypto_hdr_init_func(QCryptoBlock *block, size_t headerlen,
      */
     clusterlen = size_to_clusters(s, headerlen) * s->cluster_size;
     assert(qcow2_pre_write_overlap_check(bs, 0, ret, clusterlen, false) == 0);
-    ret = bdrv_pwrite_zeroes(bs->file,
-                             ret,
-                             clusterlen, 0);
+    ret = bdrv_co_pwrite_zeroes(bs->file, ret, clusterlen, 0);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not zero fill encryption header");
         return -1;
@@ -156,9 +155,11 @@ static int qcow2_crypto_hdr_init_func(QCryptoBlock *block, size_t headerlen,
 }
 
 
-static int qcow2_crypto_hdr_write_func(QCryptoBlock *block, size_t offset,
-                                       const uint8_t *buf, size_t buflen,
-                                       void *opaque, Error **errp)
+/* The graph lock must be held when called in coroutine context */
+static int coroutine_mixed_fn
+qcow2_crypto_hdr_write_func(QCryptoBlock *block, size_t offset,
+                            const uint8_t *buf, size_t buflen,
+                            void *opaque, Error **errp)
 {
     BlockDriverState *bs = opaque;
     BDRVQcow2State *s = bs->opaque;
@@ -3137,9 +3138,10 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
     return qcow2_update_header(bs);
 }
 
-static int qcow2_set_up_encryption(BlockDriverState *bs,
-                                   QCryptoBlockCreateOptions *cryptoopts,
-                                   Error **errp)
+static int coroutine_fn GRAPH_RDLOCK
+qcow2_set_up_encryption(BlockDriverState *bs,
+                        QCryptoBlockCreateOptions *cryptoopts,
+                        Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;
     QCryptoBlock *crypto = NULL;
@@ -3426,7 +3428,7 @@ static uint64_t qcow2_opt_get_refcount_bits_del(QemuOpts *opts, int version,
     return refcount_bits;
 }
 
-static int coroutine_fn
+static int coroutine_fn GRAPH_UNLOCKED
 qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 {
     BlockdevCreateOptionsQcow2 *qcow2_opts;
@@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         goto out;
     }
 
+    bdrv_graph_co_rdlock();
     ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
     if (ret < 0) {
+        bdrv_graph_co_rdunlock();
         error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
                          "header and refcount table");
         goto out;
@@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     /* Create a full header (including things like feature table) */
     ret = qcow2_update_header(blk_bs(blk));
+    bdrv_graph_co_rdunlock();
+
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not update qcow2 header");
         goto out;
@@ -3776,7 +3782,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     /* Want encryption? There you go. */
     if (qcow2_opts->encrypt) {
+        bdrv_graph_co_rdlock();
         ret = qcow2_set_up_encryption(blk_bs(blk), qcow2_opts->encrypt, errp);
+        bdrv_graph_co_rdunlock();
+
         if (ret < 0) {
             goto out;
         }
@@ -3813,7 +3822,7 @@ out:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 qcow2_co_create_opts(BlockDriver *drv, const char *filename, QemuOpts *opts,
                      Error **errp)
 {
@@ -3933,8 +3942,10 @@ qcow2_co_create_opts(BlockDriver *drv, const char *filename, QemuOpts *opts,
     ret = qcow2_co_create(create_options, errp);
 finish:
     if (ret < 0) {
+        bdrv_graph_co_rdlock();
         bdrv_co_delete_file_noerr(bs);
         bdrv_co_delete_file_noerr(data_bs);
+        bdrv_graph_co_rdunlock();
     } else {
         ret = 0;
     }
diff --git a/block/qed.c b/block/qed.c
index be9ff0fb34..9a0350b534 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -630,8 +630,8 @@ static void bdrv_qed_close(BlockDriverState *bs)
     qemu_vfree(s->l1_table);
 }
 
-static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
-                                           Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+bdrv_qed_co_create(BlockdevCreateOptions *opts, Error **errp)
 {
     BlockdevCreateOptionsQed *qed_opts;
     BlockBackend *blk = NULL;
@@ -751,7 +751,7 @@ out:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 bdrv_qed_co_create_opts(BlockDriver *drv, const char *filename,
                         QemuOpts *opts, Error **errp)
 {
diff --git a/block/raw-format.c b/block/raw-format.c
index fd9e61f58e..b4c8e7cf61 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -435,7 +435,7 @@ static int raw_has_zero_init(BlockDriverState *bs)
     return bdrv_has_zero_init(bs->file->bs);
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 raw_co_create_opts(BlockDriver *drv, const char *filename,
                    QemuOpts *opts, Error **errp)
 {
diff --git a/block/vdi.c b/block/vdi.c
index 08331d2dd7..6c35309e04 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -734,8 +734,9 @@ nonallocating_write:
     return ret;
 }
 
-static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
-                                         size_t block_size, Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+vdi_co_do_create(BlockdevCreateOptions *create_options, size_t block_size,
+                 Error **errp)
 {
     BlockdevCreateOptionsVdi *vdi_opts;
     int ret = 0;
@@ -892,13 +893,13 @@ exit:
     return ret;
 }
 
-static int coroutine_fn vdi_co_create(BlockdevCreateOptions *create_options,
-                                      Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+vdi_co_create(BlockdevCreateOptions *create_options, Error **errp)
 {
     return vdi_co_do_create(create_options, DEFAULT_CLUSTER_SIZE, errp);
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vdi_co_create_opts(BlockDriver *drv, const char *filename,
                    QemuOpts *opts, Error **errp)
 {
diff --git a/block/vhdx.c b/block/vhdx.c
index b20b1edf11..89913cba87 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1506,7 +1506,7 @@ exit:
  * There are 2 headers, and the highest sequence number will represent
  * the active header
  */
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vhdx_create_new_headers(BlockBackend *blk, uint64_t image_size,
                         uint32_t log_size)
 {
@@ -1515,6 +1515,8 @@ vhdx_create_new_headers(BlockBackend *blk, uint64_t image_size,
     int ret = 0;
     VHDXHeader *hdr = NULL;
 
+    GRAPH_RDLOCK_GUARD();
+
     hdr = g_new0(VHDXHeader, 1);
 
     hdr->signature       = VHDX_HEADER_SIGNATURE;
@@ -1898,7 +1900,7 @@ exit:
  *    .---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
  *   1MB
  */
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vhdx_co_create(BlockdevCreateOptions *opts, Error **errp)
 {
     BlockdevCreateOptionsVhdx *vhdx_opts;
@@ -2060,7 +2062,7 @@ delete_and_exit:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vhdx_co_create_opts(BlockDriver *drv, const char *filename,
                     QemuOpts *opts, Error **errp)
 {
diff --git a/block/vmdk.c b/block/vmdk.c
index fddbd1c86c..e3e86608ec 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2165,10 +2165,9 @@ vmdk_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
     return ret;
 }
 
-static int vmdk_init_extent(BlockBackend *blk,
-                            int64_t filesize, bool flat,
-                            bool compress, bool zeroed_grain,
-                            Error **errp)
+static int GRAPH_UNLOCKED
+vmdk_init_extent(BlockBackend *blk, int64_t filesize, bool flat, bool compress,
+                 bool zeroed_grain, Error **errp)
 {
     int ret, i;
     VMDK4Header header;
@@ -2277,7 +2276,7 @@ exit:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vmdk_create_extent(const char *filename, int64_t filesize, bool flat,
                    bool compress, bool zeroed_grain, BlockBackend **pbb,
                    QemuOpts *opts, Error **errp)
@@ -2358,7 +2357,7 @@ static int filename_decompose(const char *filename, char *path, char *prefix,
  *           non-split format.
  * idx >= 1: get the n-th extent if in a split subformat
  */
-typedef BlockBackend * coroutine_fn /* GRAPH_RDLOCK */
+typedef BlockBackend * coroutine_fn GRAPH_UNLOCKED_PTR
     (*vmdk_create_extent_fn)(int64_t size, int idx, bool flat, bool split,
                              bool compress, bool zeroed_grain, void *opaque,
                              Error **errp);
@@ -2374,7 +2373,7 @@ static void vmdk_desc_add_extent(GString *desc,
     g_free(basename);
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vmdk_co_do_create(int64_t size,
                   BlockdevVmdkSubformat subformat,
                   BlockdevVmdkAdapterType adapter_type,
@@ -2605,7 +2604,7 @@ typedef struct {
     QemuOpts *opts;
 } VMDKCreateOptsData;
 
-static BlockBackend * coroutine_fn GRAPH_RDLOCK
+static BlockBackend * coroutine_fn GRAPH_UNLOCKED
 vmdk_co_create_opts_cb(int64_t size, int idx, bool flat, bool split,
                        bool compress, bool zeroed_grain, void *opaque,
                        Error **errp)
@@ -2647,7 +2646,7 @@ exit:
     return blk;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vmdk_co_create_opts(BlockDriver *drv, const char *filename,
                     QemuOpts *opts, Error **errp)
 {
@@ -2756,11 +2755,9 @@ exit:
     return ret;
 }
 
-static BlockBackend * coroutine_fn vmdk_co_create_cb(int64_t size, int idx,
-                                                     bool flat, bool split,
-                                                     bool compress,
-                                                     bool zeroed_grain,
-                                                     void *opaque, Error **errp)
+static BlockBackend * coroutine_fn GRAPH_UNLOCKED
+vmdk_co_create_cb(int64_t size, int idx, bool flat, bool split, bool compress,
+                  bool zeroed_grain, void *opaque, Error **errp)
 {
     int ret;
     BlockDriverState *bs;
@@ -2809,7 +2806,7 @@ static BlockBackend * coroutine_fn vmdk_co_create_cb(int64_t size, int idx,
     return blk;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vmdk_co_create(BlockdevCreateOptions *create_options, Error **errp)
 {
     BlockdevCreateOptionsVmdk *opts;
diff --git a/block/vpc.c b/block/vpc.c
index 07ddda5b99..7ee7c7b4e0 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -967,8 +967,8 @@ static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
     return 0;
 }
 
-static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
-                                      Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+vpc_co_create(BlockdevCreateOptions *opts, Error **errp)
 {
     BlockdevCreateOptionsVpc *vpc_opts;
     BlockBackend *blk = NULL;
@@ -1087,7 +1087,7 @@ out:
     return ret;
 }
 
-static int coroutine_fn GRAPH_RDLOCK
+static int coroutine_fn GRAPH_UNLOCKED
 vpc_co_create_opts(BlockDriver *drv, const char *filename,
                    QemuOpts *opts, Error **errp)
 {
-- 
2.40.1




* [PATCH 2/8] block/export: Fix null pointer dereference in error path
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
  2023-05-10 20:35 ` [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 15:31   ` Peter Maydell
  2023-05-12 16:16   ` Eric Blake
  2023-05-10 20:35 ` [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary Kevin Wolf
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.

Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).

Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/export/export.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/export/export.c b/block/export/export.c
index 62c7c22d45..a5c8f42f53 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -192,8 +192,10 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
     return exp;
 
 fail:
-    blk_set_dev_ops(exp->blk, NULL, NULL);
-    blk_unref(blk);
+    if (blk) {
+        blk_set_dev_ops(blk, NULL, NULL);
+        blk_unref(blk);
+    }
     aio_context_release(ctx);
     if (exp) {
         g_free(exp->id);
-- 
2.40.1




* [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
  2023-05-10 20:35 ` [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked Kevin Wolf
  2023-05-10 20:35 ` [PATCH 2/8] block/export: Fix null pointer dereference in error path Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 16:19   ` Eric Blake
  2023-05-10 20:35 ` [PATCH 4/8] qemu-img: Take graph lock more selectively Kevin Wolf
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
the graph lock internally as a writer. Therefore, it can't hold the
reader lock across these calls; doing so causes deadlocks. Drop the
lock temporarily around the calls.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 73f36447f9..b00b4e7575 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1619,9 +1619,11 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
 
     if (open_data_file) {
         /* Open external data file */
+        bdrv_graph_co_rdunlock();
         s->data_file = bdrv_co_open_child(NULL, options, "data-file", bs,
                                           &child_of_bds, BDRV_CHILD_DATA,
                                           true, errp);
+        bdrv_graph_co_rdlock();
         if (*errp) {
             ret = -EINVAL;
             goto fail;
@@ -1629,10 +1631,12 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
 
         if (s->incompatible_features & QCOW2_INCOMPAT_DATA_FILE) {
             if (!s->data_file && s->image_data_file) {
+                bdrv_graph_co_rdunlock();
                 s->data_file = bdrv_co_open_child(s->image_data_file, options,
                                                   "data-file", bs,
                                                   &child_of_bds,
                                                   BDRV_CHILD_DATA, false, errp);
+                bdrv_graph_co_rdlock();
                 if (!s->data_file) {
                     ret = -EINVAL;
                     goto fail;
@@ -1857,7 +1861,9 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
  fail:
     g_free(s->image_data_file);
     if (open_data_file && has_data_file(bs)) {
+        bdrv_graph_co_rdunlock();
         bdrv_unref_child(bs, s->data_file);
+        bdrv_graph_co_rdlock();
         s->data_file = NULL;
     }
     g_free(s->unknown_header_fields);
-- 
2.40.1




* [PATCH 4/8] qemu-img: Take graph lock more selectively
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
                   ` (2 preceding siblings ...)
  2023-05-10 20:35 ` [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 16:20   ` Eric Blake
  2023-05-10 20:35 ` [PATCH 5/8] test-bdrv-drain: " Kevin Wolf
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qemu-img.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 9f9f0a7629..27f48051b0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2938,8 +2938,6 @@ static BlockGraphInfoList *collect_image_info_list(bool image_opts,
         }
         bs = blk_bs(blk);
 
-        GRAPH_RDLOCK_GUARD_MAINLOOP();
-
         /*
          * Note that the returned BlockGraphInfo object will not have
          * information about this image's backing node, because we have opened
@@ -2947,7 +2945,10 @@ static BlockGraphInfoList *collect_image_info_list(bool image_opts,
          * duplicate the backing chain information that we obtain by walking
          * the chain manually here.
          */
+        bdrv_graph_rdlock_main_loop();
         bdrv_query_block_graph_info(bs, &info, &err);
+        bdrv_graph_rdunlock_main_loop();
+
         if (err) {
             error_report_err(err);
             blk_unref(blk);
-- 
2.40.1




* [PATCH 5/8] test-bdrv-drain: Take graph lock more selectively
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
                   ` (3 preceding siblings ...)
  2023-05-10 20:35 ` [PATCH 4/8] qemu-img: Take graph lock more selectively Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 16:26   ` Eric Blake
  2023-05-10 20:35 ` [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context Kevin Wolf
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/unit/test-bdrv-drain.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 9a4c5e59d6..ae4299ccfa 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1004,8 +1004,6 @@ static void coroutine_fn test_co_delete_by_drain(void *opaque)
     void *buffer = g_malloc(65536);
     QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buffer, 65536);
 
-    GRAPH_RDLOCK_GUARD();
-
     /* Pretend some internal write operation from parent to child.
      * Important: We have to read from the child, not from the parent!
      * Draining works by first propagating it all up the tree to the
@@ -1014,7 +1012,9 @@ static void coroutine_fn test_co_delete_by_drain(void *opaque)
      * everything will be drained before we go back down the tree, but
      * we do not want that.  We want to be in the middle of draining
      * when this following requests returns. */
+    bdrv_graph_co_rdlock();
     bdrv_co_preadv(tts->wait_child, 0, 65536, &qiov, 0);
+    bdrv_graph_co_rdunlock();
 
     g_assert_cmpint(bs->refcnt, ==, 1);
 
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
                   ` (4 preceding siblings ...)
  2023-05-10 20:35 ` [PATCH 5/8] test-bdrv-drain: " Kevin Wolf
@ 2023-05-10 20:35 ` Kevin Wolf
  2023-05-12 16:27   ` Eric Blake
  2023-05-10 20:36 ` [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early Kevin Wolf
  2023-05-10 20:36 ` [PATCH 8/8] graph-lock: Honour read locks even in the main thread Kevin Wolf
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:35 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

bdrv_unref() is a no_coroutine_fn, so calling it from coroutine context
is invalid. Use bdrv_co_unref() instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/unit/test-bdrv-drain.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index ae4299ccfa..08bb0f9984 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1019,7 +1019,7 @@ static void coroutine_fn test_co_delete_by_drain(void *opaque)
     g_assert_cmpint(bs->refcnt, ==, 1);
 
     if (!dbdd->detach_instead_of_delete) {
-        blk_unref(blk);
+        blk_co_unref(blk);
     } else {
         BdrvChild *c, *next_c;
         QLIST_FOREACH_SAFE(c, &bs->children, next, next_c) {
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
                   ` (5 preceding siblings ...)
  2023-05-10 20:35 ` [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context Kevin Wolf
@ 2023-05-10 20:36 ` Kevin Wolf
  2023-05-12 16:35   ` Eric Blake
  2023-05-10 20:36 ` [PATCH 8/8] graph-lock: Honour read locks even in the main thread Kevin Wolf
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:36 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

When jobs are sleeping, for example to enforce a given rate limit, they
can be reentered early, in particular in order to get paused, to update
the rate limit or to get cancelled.

Before this patch, they behave in this case as if they had fully
completed their rate limiting delay. This means that requests are sped
up beyond their limit, violating the constraints that the user gave us.

Change the block jobs to sleep in a loop until the necessary delay is
completed, while still allowing cancelling them immediately, as well as
pausing (handled by the pause point in job_sleep_ns()) and updating the
rate limit.

This change is also motivated by iotests cases being prone to fail
because drain operations pause and unpause them so often that block jobs
complete earlier than they are supposed to. In particular, the next
commit would fail iotests 030 without this change.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/blockjob_int.h | 14 ++++++++++----
 block/commit.c               |  7 ++-----
 block/mirror.c               | 23 ++++++++++-------------
 block/stream.c               |  7 ++-----
 blockjob.c                   | 22 ++++++++++++++++++++--
 5 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index f008446285..104824040c 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -126,12 +126,18 @@ void block_job_user_resume(Job *job);
  */
 
 /**
- * block_job_ratelimit_get_delay:
+ * block_job_ratelimit_processed_bytes:
  *
- * Calculate and return delay for the next request in ns. See the documentation
- * of ratelimit_calculate_delay() for details.
+ * To be called after some work has been done. Adjusts the delay for the next
+ * request. See the documentation of ratelimit_calculate_delay() for details.
  */
-int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n);
+void block_job_ratelimit_processed_bytes(BlockJob *job, uint64_t n);
+
+/**
+ * Put the job to sleep (assuming that it wasn't canceled) to throttle it to the
+ * right speed according to its rate limiting.
+ */
+void block_job_ratelimit_sleep(BlockJob *job);
 
 /**
  * block_job_error_action:
diff --git a/block/commit.c b/block/commit.c
index 2b20fd0fd4..aa45beb0f0 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -116,7 +116,6 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
 {
     CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
     int64_t offset;
-    uint64_t delay_ns = 0;
     int ret = 0;
     int64_t n = 0; /* bytes */
     QEMU_AUTO_VFREE void *buf = NULL;
@@ -149,7 +148,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
          */
-        job_sleep_ns(&s->common.job, delay_ns);
+        block_job_ratelimit_sleep(&s->common);
         if (job_is_cancelled(&s->common.job)) {
             break;
         }
@@ -184,9 +183,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
         job_progress_update(&s->common.job, n);
 
         if (copy) {
-            delay_ns = block_job_ratelimit_get_delay(&s->common, n);
-        } else {
-            delay_ns = 0;
+            block_job_ratelimit_processed_bytes(&s->common, n);
         }
     }
 
diff --git a/block/mirror.c b/block/mirror.c
index 717442ca4d..b7d92d1378 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -471,12 +471,11 @@ static unsigned mirror_perform(MirrorBlockJob *s, int64_t offset,
     return bytes_handled;
 }
 
-static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
+static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->mirror_top_bs->backing->bs;
     MirrorOp *pseudo_op;
     int64_t offset;
-    uint64_t delay_ns = 0, ret = 0;
     /* At least the first dirty chunk is mirrored in one iteration. */
     int nb_chunks = 1;
     bool write_zeroes_ok = bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
@@ -608,16 +607,13 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         assert(io_bytes);
         offset += io_bytes;
         nb_chunks -= DIV_ROUND_UP(io_bytes, s->granularity);
-        delay_ns = block_job_ratelimit_get_delay(&s->common, io_bytes_acct);
+        block_job_ratelimit_processed_bytes(&s->common, io_bytes_acct);
     }
 
-    ret = delay_ns;
 fail:
     QTAILQ_REMOVE(&s->ops_in_flight, pseudo_op, next);
     qemu_co_queue_restart_all(&pseudo_op->waiting_requests);
     g_free(pseudo_op);
-
-    return ret;
 }
 
 static void mirror_free_init(MirrorBlockJob *s)
@@ -1011,7 +1007,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
     assert(!s->dbi);
     s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap);
     for (;;) {
-        uint64_t delay_ns = 0;
         int64_t cnt, delta;
         bool should_complete;
 
@@ -1051,7 +1046,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
                 mirror_wait_for_free_in_flight_slot(s);
                 continue;
             } else if (cnt != 0) {
-                delay_ns = mirror_iteration(s);
+                mirror_iteration(s);
             }
         }
 
@@ -1114,12 +1109,14 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
         }
 
         if (job_is_ready(&s->common.job) && !should_complete) {
-            delay_ns = (s->in_flight == 0 &&
-                        cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
+            if (s->in_flight == 0 && cnt == 0) {
+                trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
+                                          BLOCK_JOB_SLICE_TIME);
+                job_sleep_ns(&s->common.job, BLOCK_JOB_SLICE_TIME);
+            }
+        } else {
+            block_job_ratelimit_sleep(&s->common);
         }
-        trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
-                                  delay_ns);
-        job_sleep_ns(&s->common.job, delay_ns);
         s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }
 
diff --git a/block/stream.c b/block/stream.c
index 7f9e1ecdbb..e522bbdec5 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -131,7 +131,6 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
     int64_t len;
     int64_t offset = 0;
-    uint64_t delay_ns = 0;
     int error = 0;
     int64_t n = 0; /* bytes */
 
@@ -155,7 +154,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
          */
-        job_sleep_ns(&s->common.job, delay_ns);
+        block_job_ratelimit_sleep(&s->common);
         if (job_is_cancelled(&s->common.job)) {
             break;
         }
@@ -204,9 +203,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
         /* Publish progress */
         job_progress_update(&s->common.job, n);
         if (copy) {
-            delay_ns = block_job_ratelimit_get_delay(&s->common, n);
-        } else {
-            delay_ns = 0;
+            block_job_ratelimit_processed_bytes(&s->common, n);
         }
     }
 
diff --git a/blockjob.c b/blockjob.c
index 659c3cb9de..913da3cbf7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -319,10 +319,28 @@ static bool block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
     return block_job_set_speed_locked(job, speed, errp);
 }
 
-int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n)
+void block_job_ratelimit_processed_bytes(BlockJob *job, uint64_t n)
 {
     IO_CODE();
-    return ratelimit_calculate_delay(&job->limit, n);
+    ratelimit_calculate_delay(&job->limit, n);
+}
+
+void block_job_ratelimit_sleep(BlockJob *job)
+{
+    uint64_t delay_ns;
+
+    /*
+     * Sleep at least once. If the job is reentered early, keep waiting until
+     * we've waited for the full time that is necessary to keep the job at the
+     * right speed.
+     *
+     * Make sure to recalculate the delay after each (possibly interrupted)
+     * sleep because the speed can change while the job has yielded.
+     */
+    do {
+        delay_ns = ratelimit_calculate_delay(&job->limit, 0);
+        job_sleep_ns(&job->job, delay_ns);
+    } while (delay_ns && !job_is_cancelled(&job->job));
 }
 
 BlockJobInfo *block_job_query_locked(BlockJob *job, Error **errp)
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/8] graph-lock: Honour read locks even in the main thread
  2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
                   ` (6 preceding siblings ...)
  2023-05-10 20:36 ` [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early Kevin Wolf
@ 2023-05-10 20:36 ` Kevin Wolf
  2023-05-12 16:37   ` Eric Blake
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-10 20:36 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, hreitz, stefanha, f.ebner, qemu-devel

There are some conditions under which we don't actually need to do
anything for taking a reader lock: Writing the graph is only possible
from the main context while holding the BQL. So if a reader is running
in the main context under the BQL and knows that it won't be interrupted
until the next writer runs, taking the reader lock can be a no-op.

This is the case if the reader code neither has a nested event loop
(this is forbidden anyway while you hold the lock) nor is a coroutine
(because a writer could run when the coroutine has yielded).

These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a
coroutine.

This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking
the reader lock in the main thread.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/graph-lock.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/block/graph-lock.c b/block/graph-lock.c
index 377884c3a9..9c42bd9799 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -162,11 +162,6 @@ void coroutine_fn bdrv_graph_co_rdlock(void)
     BdrvGraphRWlock *bdrv_graph;
     bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
 
-    /* Do not lock if in main thread */
-    if (qemu_in_main_thread()) {
-        return;
-    }
-
     for (;;) {
         qatomic_set(&bdrv_graph->reader_count,
                     bdrv_graph->reader_count + 1);
@@ -230,11 +225,6 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
     BdrvGraphRWlock *bdrv_graph;
     bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
 
-    /* Do not lock if in main thread */
-    if (qemu_in_main_thread()) {
-        return;
-    }
-
     qatomic_store_release(&bdrv_graph->reader_count,
                           bdrv_graph->reader_count - 1);
     /* make sure writer sees reader_count before we check has_writer */
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] block/export: Fix null pointer dereference in error path
  2023-05-10 20:35 ` [PATCH 2/8] block/export: Fix null pointer dereference in error path Kevin Wolf
@ 2023-05-12 15:31   ` Peter Maydell
  2023-05-12 16:16   ` Eric Blake
  1 sibling, 0 replies; 21+ messages in thread
From: Peter Maydell @ 2023-05-12 15:31 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel

On Wed, 10 May 2023 at 21:38, Kevin Wolf <kwolf@redhat.com> wrote:
>
> There are some error paths in blk_exp_add() that jump to 'fail:' before
> 'exp' is even created. So we can't just unconditionally access exp->blk.
>
> Add a NULL check, and switch from exp->blk to blk, which is available
> earlier, just to be extra sure that we really cover all cases where
> BlockDevOps could have been set for it (in practice, this only happens
> in drv->create() today, so this part of the change isn't strictly
> necessary).
>
> Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---

Coverity noticed this bug, incidentally: CID 1509238.

-- PMM


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked
  2023-05-10 20:35 ` [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked Kevin Wolf
@ 2023-05-12 16:12   ` Eric Blake
  2023-05-15 16:19     ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:54PM +0200, Kevin Wolf wrote:
> 
> These are functions that modify the graph, so they must be able to take
> a writer lock. This is impossible if they already hold the reader lock.
> If they need a reader lock for some of their operations, they should
> take it internally.
> 
> Many of them go through blk_*(), which will always take the lock itself.
> Direct calls of bdrv_*() need to take the reader lock. Note that while
> locking for bdrv_co_*() calls is checked by TSA, this is not the case
> for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
> when they are called from coroutine context like here!
> 
> This effectively reverts 4ec8df0183, but adds some internal locking
> instead.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---

> +++ b/block/qcow2.c

> -static int coroutine_fn
> +static int coroutine_fn GRAPH_UNLOCKED
>  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  {
>      BlockdevCreateOptionsQcow2 *qcow2_opts;
> @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>          goto out;
>      }
>  
> +    bdrv_graph_co_rdlock();
>      ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
>      if (ret < 0) {
> +        bdrv_graph_co_rdunlock();
>          error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
>                           "header and refcount table");
>          goto out;
> @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  
>      /* Create a full header (including things like feature table) */
>      ret = qcow2_update_header(blk_bs(blk));
> +    bdrv_graph_co_rdunlock();
> +

If we ever inject any 'goto out' in the elided lines, we're in
trouble.  Would this be any safer by wrapping the intervening
statements in a scope-guarded lock?

But what you have is correct, despite my worries about potential
maintenance concerns.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] block/export: Fix null pointer dereference in error path
  2023-05-10 20:35 ` [PATCH 2/8] block/export: Fix null pointer dereference in error path Kevin Wolf
  2023-05-12 15:31   ` Peter Maydell
@ 2023-05-12 16:16   ` Eric Blake
  2023-05-12 18:46     ` Eric Blake
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:55PM +0200, Kevin Wolf wrote:
> 
> There are some error paths in blk_exp_add() that jump to 'fail:' before
> 'exp' is even created. So we can't just unconditionally access exp->blk.
> 
> Add a NULL check, and switch from exp->blk to blk, which is available
> earlier, just to be extra sure that we really cover all cases where
> BlockDevOps could have been set for it (in practice, this only happens
> in drv->create() today, so this part of the change isn't strictly
> necessary).
> 
> Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2

Sorry for missing that on my first review, and this does look better.

I'm assuming you plan to take this in with the rest of the series
through your tree, but let me know if I should push it faster through
the NBD tree.

> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/export/export.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block/export/export.c b/block/export/export.c
> index 62c7c22d45..a5c8f42f53 100644
> --- a/block/export/export.c
> +++ b/block/export/export.c
> @@ -192,8 +192,10 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>      return exp;
>  
>  fail:
> -    blk_set_dev_ops(exp->blk, NULL, NULL);
> -    blk_unref(blk);
> +    if (blk) {
> +        blk_set_dev_ops(blk, NULL, NULL);
> +        blk_unref(blk);
> +    }
>      aio_context_release(ctx);
>      if (exp) {
>          g_free(exp->id);
> -- 
> 2.40.1
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary
  2023-05-10 20:35 ` [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary Kevin Wolf
@ 2023-05-12 16:19   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:19 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:56PM +0200, Kevin Wolf wrote:
> 
> qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
> the graph lock internally as a writer. Therefore, it can't hold the
> reader lock across these calls, it causes deadlocks. Drop the lock
> temporarily around the calls.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] qemu-img: Take graph lock more selectively
  2023-05-10 20:35 ` [PATCH 4/8] qemu-img: Take graph lock more selectively Kevin Wolf
@ 2023-05-12 16:20   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:20 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:57PM +0200, Kevin Wolf wrote:
> 
> If we take a reader lock, we can't call any functions that take a writer
> lock internally without causing deadlocks once the reader lock is
> actually enforced in the main thread, too. Take the reader lock only
> where it is actually needed.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  qemu-img.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/8] test-bdrv-drain: Take graph lock more selectively
  2023-05-10 20:35 ` [PATCH 5/8] test-bdrv-drain: " Kevin Wolf
@ 2023-05-12 16:26   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:58PM +0200, Kevin Wolf wrote:
> 
> If we take a reader lock, we can't call any functions that take a writer
> lock internally without causing deadlocks once the reader lock is
> actually enforced in the main thread, too. Take the reader lock only
> where it is actually needed.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  tests/unit/test-bdrv-drain.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context
  2023-05-10 20:35 ` [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context Kevin Wolf
@ 2023-05-12 16:27   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:27 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:35:59PM +0200, Kevin Wolf wrote:
> 
> bdrv_unref() is a no_coroutine_fn, so calling it from coroutine context
> is invalid. Use bdrv_co_unref() instead.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  tests/unit/test-bdrv-drain.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
> index ae4299ccfa..08bb0f9984 100644
> --- a/tests/unit/test-bdrv-drain.c
> +++ b/tests/unit/test-bdrv-drain.c
> @@ -1019,7 +1019,7 @@ static void coroutine_fn test_co_delete_by_drain(void *opaque)
>      g_assert_cmpint(bs->refcnt, ==, 1);
>  
>      if (!dbdd->detach_instead_of_delete) {
> -        blk_unref(blk);
> +        blk_co_unref(blk);
>      } else {
>          BdrvChild *c, *next_c;
>          QLIST_FOREACH_SAFE(c, &bs->children, next, next_c) {
> -- 
> 2.40.1
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early
  2023-05-10 20:36 ` [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early Kevin Wolf
@ 2023-05-12 16:35   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:35 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:36:00PM +0200, Kevin Wolf wrote:
> 
> When jobs are sleeping, for example to enforce a given rate limit, they
> can be reentered early, in particular in order to get paused, to update
> the rate limit or to get cancelled.
> 
> Before this patch, they behave in this case as if they had fully
> completed their rate limiting delay. This means that requests are sped
> up beyond their limit, violating the constraints that the user gave us.

Aha - our tests ARE non-deterministic!  Good find.

> 
> Change the block jobs to sleep in a loop until the necessary delay is
> completed, while still allowing cancelling them immediately, as well as
> pausing (handled by the pause point in job_sleep_ns()) and updating the
> rate limit.
> 
> This change is also motivated by iotests cases being prone to fail
> because drain operations pause and unpause them so often that block jobs
> complete earlier than they are supposed to. In particular, the next
> commit would fail iotests 030 without this change.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/blockjob_int.h | 14 ++++++++++----
>  block/commit.c               |  7 ++-----
>  block/mirror.c               | 23 ++++++++++-------------
>  block/stream.c               |  7 ++-----
>  blockjob.c                   | 22 ++++++++++++++++++++--
>  5 files changed, 44 insertions(+), 29 deletions(-)
> 
> +++ b/blockjob.c
> @@ -319,10 +319,28 @@ static bool block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
>      return block_job_set_speed_locked(job, speed, errp);
>  }
>  
> -int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n)
> +void block_job_ratelimit_processed_bytes(BlockJob *job, uint64_t n)
>  {
>      IO_CODE();
> -    return ratelimit_calculate_delay(&job->limit, n);
> +    ratelimit_calculate_delay(&job->limit, n);
> +}
> +
> +void block_job_ratelimit_sleep(BlockJob *job)
> +{
> +    uint64_t delay_ns;
> +
> +    /*
> +     * Sleep at least once. If the job is reentered early, keep waiting until
> +     * we've waited for the full time that is necessary to keep the job at the
> +     * right speed.
> +     *
> +     * Make sure to recalculate the delay after each (possibly interrupted)
> +     * sleep because the speed can change while the job has yielded.
> +     */
> +    do {
> +        delay_ns = ratelimit_calculate_delay(&job->limit, 0);
> +        job_sleep_ns(&job->job, delay_ns);
> +    } while (delay_ns && !job_is_cancelled(&job->job));
>  }

Yes, that looks more robust.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/8] graph-lock: Honour read locks even in the main thread
  2023-05-10 20:36 ` [PATCH 8/8] graph-lock: Honour read locks even in the main thread Kevin Wolf
@ 2023-05-12 16:37   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 16:37 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Wed, May 10, 2023 at 10:36:01PM +0200, Kevin Wolf wrote:
> 
> There are some conditions under which we don't actually need to do
> anything for taking a reader lock: Writing the graph is only possible
> from the main context while holding the BQL. So if a reader is running
> in the main context under the BQL and knows that it won't be interrupted
> until the next writer runs, taking the reader lock can be a no-op.
> 
> This is the case if the reader code neither has a nested event loop
> (this is forbidden anyway while you hold the lock) nor is a coroutine
> (because a writer could run when the coroutine has yielded).
> 
> These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
> They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a
> coroutine.
> 
> This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking
> the reader lock in the main thread.
> 
> Reported-by: Fiona Ebner <f.ebner@proxmox.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/graph-lock.c | 10 ----------
>  1 file changed, 10 deletions(-)

Bug fix by deletion is always fun.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] block/export: Fix null pointer dereference in error path
  2023-05-12 16:16   ` Eric Blake
@ 2023-05-12 18:46     ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-12 18:46 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Fri, May 12, 2023 at 11:16:03AM -0500, Eric Blake wrote:
> 
> 
> On Wed, May 10, 2023 at 10:35:55PM +0200, Kevin Wolf wrote:
> > 
> > There are some error paths in blk_exp_add() that jump to 'fail:' before
> > 'exp' is even created. So we can't just unconditionally access exp->blk.
> > 
> > Add a NULL check, and switch from exp->blk to blk, which is available
> > earlier, just to be extra sure that we really cover all cases where
> > BlockDevOps could have been set for it (in practice, this only happens
> > in drv->create() today, so this part of the change isn't strictly
> > necessary).
> > 
> > Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2
> 
> Sorry for missing that on my first review, and this does look better.
> 
> I'm assuming you plan to take this in with the rest of the series
> through your tree, but let me know if I should push it faster through
> the NBD tree.

Because the iotest
./check 307 -nbd

fails without this patch but passes with it,

Tested-by: Eric Blake <eblake@redhat.com>

[and I should really remember to run more iotests than just the subset
run by 'make check' when preparing a pull request...]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked
  2023-05-12 16:12   ` Eric Blake
@ 2023-05-15 16:19     ` Kevin Wolf
  2023-05-15 21:08       ` Eric Blake
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2023-05-15 16:19 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel

Am 12.05.2023 um 18:12 hat Eric Blake geschrieben:
> 
> On Wed, May 10, 2023 at 10:35:54PM +0200, Kevin Wolf wrote:
> > 
> > These are functions that modify the graph, so they must be able to take
> > a writer lock. This is impossible if they already hold the reader lock.
> > If they need a reader lock for some of their operations, they should
> > take it internally.
> > 
> > Many of them go through blk_*(), which will always take the lock itself.
> > Direct calls of bdrv_*() need to take the reader lock. Note that while
> > locking for bdrv_co_*() calls is checked by TSA, this is not the case
> > for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
> > when they are called from coroutine context like here!
> > 
> > This effectively reverts 4ec8df0183, but adds some internal locking
> > instead.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> 
> > +++ b/block/qcow2.c
> 
> > -static int coroutine_fn
> > +static int coroutine_fn GRAPH_UNLOCKED
> >  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> >  {
> >      BlockdevCreateOptionsQcow2 *qcow2_opts;
> > @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> >          goto out;
> >      }
> >  
> > +    bdrv_graph_co_rdlock();
> >      ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
> >      if (ret < 0) {
> > +        bdrv_graph_co_rdunlock();
> >          error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
> >                           "header and refcount table");
> >          goto out;
> > @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> >  
> >      /* Create a full header (including things like feature table) */
> >      ret = qcow2_update_header(blk_bs(blk));
> > +    bdrv_graph_co_rdunlock();
> > +
> 
> If we ever inject any 'goto out' in the elided lines, we're in
> trouble.  Would this be any safer by wrapping the intervening
> statements in a scope-guarded lock?

TSA doesn't understand these guards, which is why they are only
annotated as assertions (I think we talked about this in my previous
series), at the cost of leaving unlocking unchecked. So in cases where
the scope isn't the full function, individual calls are better at the
moment. Once clang implements support for __attribute__((cleanup)), we
can maybe change this.

Of course, TSA solves the very maintenance problem you're concerned
about: With a 'goto out' added, compilation on clang fails because it
sees that there is a code path that doesn't unlock. So at least it makes
the compromise not terrible.

For example, if I comment out the unlock in the error case of the first
hunk, this is what I get:

../block/qcow2.c:3825:5: error: mutex 'graph_lock' is not held on every path through here [-Werror,-Wthread-safety-analysis]
    blk_co_unref(blk);
    ^
../block/qcow2.c:3735:5: note: mutex acquired here
    bdrv_graph_co_rdlock();
    ^
1 error generated.

Kevin




* Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked
  2023-05-15 16:19     ` Kevin Wolf
@ 2023-05-15 21:08       ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2023-05-15 21:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, stefanha, f.ebner, qemu-devel


On Mon, May 15, 2023 at 06:19:41PM +0200, Kevin Wolf wrote:
> > > @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> > >          goto out;
> > >      }
> > >  
> > > +    bdrv_graph_co_rdlock();
> > >      ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
> > >      if (ret < 0) {
> > > +        bdrv_graph_co_rdunlock();
> > >          error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
> > >                           "header and refcount table");
> > >          goto out;
> > > @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> > >  
> > >      /* Create a full header (including things like feature table) */
> > >      ret = qcow2_update_header(blk_bs(blk));
> > > +    bdrv_graph_co_rdunlock();
> > > +
> > 
> > If we ever inject any 'goto out' in the elided lines, we're in
> > trouble.  Would this be any safer by wrapping the intervening
> > statements in a scope-guarded lock?
> 
> TSA doesn't understand these guards, which is why they are only
> annotated as assertions (I think we talked about this in my previous
> series), at the cost of leaving unlocking unchecked. So in cases where
> the scope isn't the full function, individual calls are better at the
> moment. Once clang implements support for __attribute__((cleanup)), we
> can maybe change this.

Yeah, LOTS of people are waiting on clang __attribute__((cleanup))
analysis sanity ;)

> 
> Of course, TSA solves the very maintenance problem you're concerned
> about: With a 'goto out' added, compilation on clang fails because it
> sees that there is a code path that doesn't unlock. So at least it makes
> the compromise not terrible.
> 
> For example, if I comment out the unlock in the error case of the first
> hunk, this is what I get:
> 
> ../block/qcow2.c:3825:5: error: mutex 'graph_lock' is not held on every path through here [-Werror,-Wthread-safety-analysis]
>     blk_co_unref(blk);
>     ^
> ../block/qcow2.c:3735:5: note: mutex acquired here
>     bdrv_graph_co_rdlock();
>     ^
> 1 error generated.

I'm sold!  The very reason you can't use a cleanup scope-guard
(because clang can't see through the annotation) is also the reason
why clang is able to flag your error if you don't properly clean up by
hand.  So while it is more tedious to maintain, we finally have
compiler assistance for something that used to be human-only, which is
a step forward even if it means more typing in the short term.

Thanks for testing that.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




end of thread, other threads:[~2023-05-15 21:09 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 20:35 [PATCH 0/8] block: Honour graph read lock even in the main thread Kevin Wolf
2023-05-10 20:35 ` [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked Kevin Wolf
2023-05-12 16:12   ` Eric Blake
2023-05-15 16:19     ` Kevin Wolf
2023-05-15 21:08       ` Eric Blake
2023-05-10 20:35 ` [PATCH 2/8] block/export: Fix null pointer dereference in error path Kevin Wolf
2023-05-12 15:31   ` Peter Maydell
2023-05-12 16:16   ` Eric Blake
2023-05-12 18:46     ` Eric Blake
2023-05-10 20:35 ` [PATCH 3/8] qcow2: Unlock the graph in qcow2_do_open() where necessary Kevin Wolf
2023-05-12 16:19   ` Eric Blake
2023-05-10 20:35 ` [PATCH 4/8] qemu-img: Take graph lock more selectively Kevin Wolf
2023-05-12 16:20   ` Eric Blake
2023-05-10 20:35 ` [PATCH 5/8] test-bdrv-drain: " Kevin Wolf
2023-05-12 16:26   ` Eric Blake
2023-05-10 20:35 ` [PATCH 6/8] test-bdrv-drain: Call bdrv_co_unref() in coroutine context Kevin Wolf
2023-05-12 16:27   ` Eric Blake
2023-05-10 20:36 ` [PATCH 7/8] blockjob: Adhere to rate limit even when reentered early Kevin Wolf
2023-05-12 16:35   ` Eric Blake
2023-05-10 20:36 ` [PATCH 8/8] graph-lock: Honour read locks even in the main thread Kevin Wolf
2023-05-12 16:37   ` Eric Blake
