* [PATCH v2 0/3] fix two edge cases related to stream block jobs
@ 2024-03-21 13:33 Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 1/3] block/io: accept NULL qiov in bdrv_pad_request Fiona Ebner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Fiona Ebner @ 2024-03-21 13:33 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, qemu-stable, hreitz, kwolf, fam, stefanha,
t.lamprecht, w.bumiller
Changes in v2:
* Ran into another issue while writing the IO test Stefan wanted
to have (good call :)), so include a fix for that and add the
test. I didn't notice during manual testing, because I hadn't
used a scripted QMP 'quit', so there was no race.
Fiona Ebner (2):
block-backend: fix edge case in bdrv_next() where BDS associated to BB
changes
iotests: add test for stream job with an unaligned prefetch read
Stefan Reiter (1):
block/io: accept NULL qiov in bdrv_pad_request
block/block-backend.c | 7 +-
block/io.c | 31 ++++---
.../tests/stream-unaligned-prefetch | 86 +++++++++++++++++++
.../tests/stream-unaligned-prefetch.out | 5 ++
4 files changed, 113 insertions(+), 16 deletions(-)
create mode 100755 tests/qemu-iotests/tests/stream-unaligned-prefetch
create mode 100644 tests/qemu-iotests/tests/stream-unaligned-prefetch.out
--
2.39.2
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 1/3] block/io: accept NULL qiov in bdrv_pad_request
2024-03-21 13:33 [PATCH v2 0/3] fix two edge cases related to stream block jobs Fiona Ebner
@ 2024-03-21 13:33 ` Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 2/3] block-backend: fix edge case in bdrv_next() where BDS associated to BB changes Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 3/3] iotests: add test for stream job with an unaligned prefetch read Fiona Ebner
2 siblings, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2024-03-21 13:33 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, qemu-stable, hreitz, kwolf, fam, stefanha,
t.lamprecht, w.bumiller
From: Stefan Reiter <s.reiter@proxmox.com>
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case, they will pass NULL
as the target QEMUIOVector, which will however trip bdrv_pad_request,
since it wants to extend its passed vector. In particular, this is the
case for the blk_co_preadv() call in stream_populate().
If there is no qiov, no operation can be done with it, but the bytes
and offset still need to be updated, so the subsequent aligned read
will actually be aligned and not run into an assertion failure.
In particular, this can happen when the request alignment of the top
node is larger than the allocated part of the bottom node, in which
case padding becomes necessary. For example:
> ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M -o cluster_size=32768
> ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2
> ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "compress", "file": "node0", "node-name": "node1" } }
> {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node1" } }
> EOF
Originally-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: do update bytes and offset in any case
add reproducer to commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v2.
block/io.c | 31 +++++++++++++++++++------------
1 file changed, 19 insertions(+), 12 deletions(-)
diff --git a/block/io.c b/block/io.c
index 33150c0359..395bea3bac 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1726,22 +1726,29 @@ static int bdrv_pad_request(BlockDriverState *bs,
return 0;
}
- sliced_iov = qemu_iovec_slice(*qiov, *qiov_offset, *bytes,
- &sliced_head, &sliced_tail,
- &sliced_niov);
+ /*
+ * For prefetching in stream_populate(), no qiov is passed along, because
+ * only copy-on-read matters.
+ */
+ if (qiov && *qiov) {
+ sliced_iov = qemu_iovec_slice(*qiov, *qiov_offset, *bytes,
+ &sliced_head, &sliced_tail,
+ &sliced_niov);
- /* Guaranteed by bdrv_check_request32() */
- assert(*bytes <= SIZE_MAX);
- ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
- sliced_head, *bytes);
- if (ret < 0) {
- bdrv_padding_finalize(pad);
- return ret;
+ /* Guaranteed by bdrv_check_request32() */
+ assert(*bytes <= SIZE_MAX);
+ ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
+ sliced_head, *bytes);
+ if (ret < 0) {
+ bdrv_padding_finalize(pad);
+ return ret;
+ }
+ *qiov = &pad->local_qiov;
+ *qiov_offset = 0;
}
+
*bytes += pad->head + pad->tail;
*offset -= pad->head;
- *qiov = &pad->local_qiov;
- *qiov_offset = 0;
if (padded) {
*padded = true;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v2 2/3] block-backend: fix edge case in bdrv_next() where BDS associated to BB changes
2024-03-21 13:33 [PATCH v2 0/3] fix two edge cases related to stream block jobs Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 1/3] block/io: accept NULL qiov in bdrv_pad_request Fiona Ebner
@ 2024-03-21 13:33 ` Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 3/3] iotests: add test for stream job with an unaligned prefetch read Fiona Ebner
2 siblings, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2024-03-21 13:33 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, qemu-stable, hreitz, kwolf, fam, stefanha,
t.lamprecht, w.bumiller
The old_bs variable in bdrv_next() is currently determined by looking
at the old block backend. However, if the block graph changes before
the next bdrv_next() call, it might be that the associated BDS is not
the same that was referenced previously. In that case, the wrong BDS
is unreferenced, leading to an assertion failure later:
> bdrv_unref: Assertion `bs->refcnt > 0' failed.
In particular, this can happen in the context of bdrv_flush_all(),
when polling for bdrv_co_flush() in the generated co-wrapper leads to
a graph change (for example with a stream block job [0]).
A racy reproducer:
> #!/bin/bash
> rm -f /tmp/backing.qcow2
> rm -f /tmp/top.qcow2
> ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M
> ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2
> ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node0" } }
> {"execute": "quit"}
> EOF
[0]:
> #0 bdrv_replace_child_tran (child=..., new_bs=..., tran=...)
> #1 bdrv_replace_node_noperm (from=..., to=..., auto_skip=..., tran=..., errp=...)
> #2 bdrv_replace_node_common (from=..., to=..., auto_skip=..., detach_subchain=..., errp=...)
> #3 bdrv_drop_filter (bs=..., errp=...)
> #4 bdrv_cor_filter_drop (cor_filter_bs=...)
> #5 stream_prepare (job=...)
> #6 job_prepare_locked (job=...)
> #7 job_txn_apply_locked (fn=..., job=...)
> #8 job_do_finalize_locked (job=...)
> #9 job_exit (opaque=...)
> #10 aio_bh_poll (ctx=...)
> #11 aio_poll (ctx=..., blocking=...)
> #12 bdrv_poll_co (s=...)
> #13 bdrv_flush (bs=...)
> #14 bdrv_flush_all ()
> #15 do_vm_stop (state=..., send_stop=...)
> #16 vm_shutdown ()
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Not sure if this is the correct fix, or if the call site should rather
be adapted somehow?
New in v2.
block/block-backend.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c
index 9c4de79e6b..28af1eb17a 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -599,14 +599,14 @@ BlockDriverState *bdrv_next(BdrvNextIterator *it)
/* Must be called from the main loop */
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+ old_bs = it->bs;
+
/* First, return all root nodes of BlockBackends. In order to avoid
* returning a BDS twice when multiple BBs refer to it, we only return it
* if the BB is the first one in the parent list of the BDS. */
if (it->phase == BDRV_NEXT_BACKEND_ROOTS) {
BlockBackend *old_blk = it->blk;
- old_bs = old_blk ? blk_bs(old_blk) : NULL;
-
do {
it->blk = blk_all_next(it->blk);
bs = it->blk ? blk_bs(it->blk) : NULL;
@@ -620,11 +620,10 @@ BlockDriverState *bdrv_next(BdrvNextIterator *it)
if (bs) {
bdrv_ref(bs);
bdrv_unref(old_bs);
+ it->bs = bs;
return bs;
}
it->phase = BDRV_NEXT_MONITOR_OWNED;
- } else {
- old_bs = it->bs;
}
/* Then return the monitor-owned BDSes without a BB attached. Ignore all
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v2 3/3] iotests: add test for stream job with an unaligned prefetch read
2024-03-21 13:33 [PATCH v2 0/3] fix two edge cases related to stream block jobs Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 1/3] block/io: accept NULL qiov in bdrv_pad_request Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 2/3] block-backend: fix edge case in bdrv_next() where BDS associated to BB changes Fiona Ebner
@ 2024-03-21 13:33 ` Fiona Ebner
2 siblings, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2024-03-21 13:33 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, qemu-stable, hreitz, kwolf, fam, stefanha,
t.lamprecht, w.bumiller
Previously, bdrv_pad_request() could not deal with a NULL qiov when
a read needed to be aligned. During prefetch, a stream job will pass a
NULL qiov. Add a test case to cover this scenario.
By accident, also covers a previous race during shutdown, where block
graph changes during iteration in bdrv_flush_all() could lead to
unreferencing the wrong block driver state and an assertion failure
later.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
New in v2.
.../tests/stream-unaligned-prefetch | 86 +++++++++++++++++++
.../tests/stream-unaligned-prefetch.out | 5 ++
2 files changed, 91 insertions(+)
create mode 100755 tests/qemu-iotests/tests/stream-unaligned-prefetch
create mode 100644 tests/qemu-iotests/tests/stream-unaligned-prefetch.out
diff --git a/tests/qemu-iotests/tests/stream-unaligned-prefetch b/tests/qemu-iotests/tests/stream-unaligned-prefetch
new file mode 100755
index 0000000000..546db1d369
--- /dev/null
+++ b/tests/qemu-iotests/tests/stream-unaligned-prefetch
@@ -0,0 +1,86 @@
+#!/usr/bin/env python3
+# group: rw quick
+#
+# Test what happens when a stream job does an unaligned prefetch read
+# which requires padding while having a NULL qiov.
+#
+# Copyright (C) Proxmox Server Solutions GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, qemu_io, QMPTestCase
+
+image_size = 1 * 1024 * 1024
+cluster_size = 64 * 1024
+base = os.path.join(iotests.test_dir, 'base.img')
+top = os.path.join(iotests.test_dir, 'top.img')
+
+class TestStreamUnalignedPrefetch(QMPTestCase):
+ def setUp(self) -> None:
+ """
+ Create two images:
+ - base image {base} with {cluster_size // 2} bytes allocated
+ - top image {top} without any data allocated and coarser
+ cluster size
+
+ Attach a compress filter for the top image, because that
+ requires that the request alignment is the top image's cluster
+ size.
+ """
+ qemu_img_create('-f', imgfmt,
+ '-o', 'cluster_size={}'.format(cluster_size // 2),
+ base, str(image_size))
+ qemu_io('-c', f'write 0 {cluster_size // 2}', base)
+ qemu_img_create('-f', imgfmt,
+ '-o', 'cluster_size={}'.format(cluster_size),
+ top, str(image_size))
+
+ self.vm = iotests.VM()
+ self.vm.add_blockdev(self.vm.qmp_to_opts({
+ 'driver': imgfmt,
+ 'node-name': 'base',
+ 'file': {
+ 'driver': 'file',
+ 'filename': base
+ }
+ }))
+ self.vm.add_blockdev(self.vm.qmp_to_opts({
+ 'driver': 'compress',
+ 'node-name': 'compress-top',
+ 'file': {
+ 'driver': imgfmt,
+ 'node-name': 'top',
+ 'file': {
+ 'driver': 'file',
+ 'filename': top
+ },
+ 'backing': 'base'
+ }
+ }))
+ self.vm.launch()
+
+ def tearDown(self) -> None:
+ self.vm.shutdown()
+ os.remove(top)
+ os.remove(base)
+
+ def test_stream_unaligned_prefetch(self) -> None:
+ self.vm.cmd('block-stream', job_id='stream', device='compress-top')
+
+
+if __name__ == '__main__':
+ iotests.main(supported_fmts=['qcow2'], supported_protocols=['file'])
diff --git a/tests/qemu-iotests/tests/stream-unaligned-prefetch.out b/tests/qemu-iotests/tests/stream-unaligned-prefetch.out
new file mode 100644
index 0000000000..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/tests/stream-unaligned-prefetch.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-03-21 13:35 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-21 13:33 [PATCH v2 0/3] fix two edge cases related to stream block jobs Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 1/3] block/io: accept NULL qiov in bdrv_pad_request Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 2/3] block-backend: fix edge case in bdrv_next() where BDS associated to BB changes Fiona Ebner
2024-03-21 13:33 ` [PATCH v2 3/3] iotests: add test for stream job with an unaligned prefetch read Fiona Ebner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).