qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC 00/11] do not drain while holding the graph lock
@ 2025-05-08 14:09 Fiona Ebner
  2025-05-08 14:09 ` [PATCH 01/11] block: remove outdated comments about AioContext locking Fiona Ebner
                   ` (10 more replies)
  0 siblings, 11 replies; 30+ messages in thread
From: Fiona Ebner @ 2025-05-08 14:09 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, kwolf, den, andrey.drobyshev, hreitz, stefanha,
	eblake, jsnow, vsementsov

This series is an attempt to fix a deadlock issue reported by Andrey
here [0].

bdrv_drained_begin() polls and is not allowed to be called with the
block graph lock held. Mark the function as GRAPH_UNLOCKED.

This does not catch the issue reported by Andrey, because there
is a bdrv_graph_rdunlock_main_loop() before bdrv_drained_begin() in
the function bdrv_change_aio_context(). That unlock is of course
ineffective if the exclusive lock is held, but it seems to prevent TSA
from finding the issue.

Thus the bdrv_drained_begin() call from inside
bdrv_change_aio_context() needs to be moved up the call stack before
acquiring the locks. This required the most changes, patch 09/11.

Granular draining is not trivially possible, because many of the
affected functions can recursively call themselves.

In place where bdrv_drained_begin() calls were removed, assertions
are added, checking the quiesced_counter to ensure that the nodes
already got drained further up in the call stack.

In bdrv_graph_wrlock() there is a comment that it uses
bdrv_drain_all_begin_nopoll() to make sure that constantly arriving
new I/O doesn't cause starvation. The changes from this series are at
odds with that, but there doesn't seem to be any (new) test failures.

[0]: https://lore.kernel.org/qemu-devel/73839c04-7616-407e-b057-80ca69e63f51@virtuozzo.com/

I did not CC the maintainers for blkverify, vmdk, etc. because this is
just an RFC and the changes there are mechanical, for adapting some
callers of block layer functions.

Andrey Drobyshev (1):
  iotests/graph-changes-while-io: add test case with removal of lower
    snapshot

Fiona Ebner (10):
  block: remove outdated comments about AioContext locking
  block: move drain outside of read-locked bdrv_reopen_queue_child()
  block/snapshot: move drain outside of read-locked
    bdrv_snapshot_delete()
  block: drain while unlocked in bdrv_reopen_parse_file_or_backing()
  block: move drain outside of read-locked bdrv_inactivate_recurse()
  blockdev: drain while unlocked in internal_snapshot_action()
  blockdev: drain while unlocked in external_snapshot_action()
  block: mark bdrv_drained_begin() as GRAPH_UNLOCKED
  block: move drain out of bdrv_change_aio_context()
  block/graph-lock: add drain flag to bdrv_graph_wr{,un}lock

 block.c                                       | 170 +++++++++++-------
 block/backup.c                                |   4 +-
 block/blklogwrites.c                          |   8 +-
 block/blkverify.c                             |   4 +-
 block/block-backend.c                         |   8 +-
 block/commit.c                                |  16 +-
 block/graph-lock.c                            |  22 ++-
 block/mirror.c                                |  22 +--
 block/qcow2.c                                 |   4 +-
 block/quorum.c                                |   8 +-
 block/replication.c                           |  14 +-
 block/snapshot.c                              |  22 ++-
 block/stream.c                                |  18 +-
 block/vmdk.c                                  |  20 +--
 blockdev.c                                    |  59 ++++--
 blockjob.c                                    |  12 +-
 include/block/block-global-state.h            |   8 +-
 include/block/block-io.h                      |   3 +-
 include/block/graph-lock.h                    |   8 +-
 qemu-img.c                                    |   2 +
 scripts/block-coroutine-wrapper.py            |   4 +-
 .../qemu-iotests/tests/graph-changes-while-io | 101 ++++++++++-
 .../tests/graph-changes-while-io.out          |   4 +-
 tests/unit/test-bdrv-drain.c                  |  40 ++---
 tests/unit/test-bdrv-graph-mod.c              |  20 +--
 25 files changed, 384 insertions(+), 217 deletions(-)

-- 
2.39.5




^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-05-20  8:43 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-08 14:09 [RFC 00/11] do not drain while holding the graph lock Fiona Ebner
2025-05-08 14:09 ` [PATCH 01/11] block: remove outdated comments about AioContext locking Fiona Ebner
2025-05-14 16:22   ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 02/11] block: move drain outside of read-locked bdrv_reopen_queue_child() Fiona Ebner
2025-05-14 16:36   ` Kevin Wolf
2025-05-15 11:48     ` Fiona Ebner
2025-05-15 12:28       ` Kevin Wolf
2025-05-15 13:41         ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 03/11] block/snapshot: move drain outside of read-locked bdrv_snapshot_delete() Fiona Ebner
2025-05-14 16:52   ` Kevin Wolf
2025-05-15 12:03     ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 04/11] block: drain while unlocked in bdrv_reopen_parse_file_or_backing() Fiona Ebner
2025-05-14 16:59   ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 05/11] block: move drain outside of read-locked bdrv_inactivate_recurse() Fiona Ebner
2025-05-14 17:06   ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 06/11] blockdev: drain while unlocked in internal_snapshot_action() Fiona Ebner
2025-05-14 17:26   ` Kevin Wolf
2025-05-19 12:37     ` Fiona Ebner
2025-05-08 14:09 ` [PATCH 07/11] blockdev: drain while unlocked in external_snapshot_action() Fiona Ebner
2025-05-08 14:09 ` [PATCH 08/11] block: mark bdrv_drained_begin() as GRAPH_UNLOCKED Fiona Ebner
2025-05-14 17:50   ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 09/11] block: move drain out of bdrv_change_aio_context() Fiona Ebner
2025-05-14 19:45   ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 10/11] block/graph-lock: add drain flag to bdrv_graph_wr{, un}lock Fiona Ebner
2025-05-14 19:54   ` [PATCH 10/11] block/graph-lock: add drain flag to bdrv_graph_wr{,un}lock Kevin Wolf
2025-05-19 12:10     ` Fiona Ebner
2025-05-20  6:09       ` Fiona Ebner
2025-05-20  8:42         ` Kevin Wolf
2025-05-08 14:09 ` [PATCH 11/11] iotests/graph-changes-while-io: add test case with removal of lower snapshot Fiona Ebner
2025-05-14 20:14   ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).