From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] [PULL 05/28] block: avoid recursive block_status call if possible
Date: Mon, 3 Jun 2019 17:02:10 +0200
Message-ID: <20190603150233.6614-6-kwolf@redhat.com>
In-Reply-To: <20190603150233.6614-1-kwolf@redhat.com>
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
bdrv_co_block_status digs into bs->file for an additional, more accurate
search for holes inside a region that bs reports as DATA; it has done so
since 5daa74a6ebc.
This accuracy is not free. Assume we have a qcow2 disk. qcow2 already
knows where the holes are and where the data is, but every block_status
request additionally calls lseek. On a big disk full of data, any
iterative copying block job (or img convert) calls lseek(HOLE) on every
iteration, and each of these lseeks has to iterate through all metadata
up to the end of the file. This is obviously inefficient, and in many
scenarios we don't need this lseek at all.
However, lseek is needed when we have a metadata-preallocated image.
So, let's detect the metadata-preallocation case and not dig into qcow2's
protocol file in other cases.
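As a rough illustration of this gating, here is a standalone C sketch
modelled on the hunks in block/io.c and block/qcow2.c in this patch. It is
not QEMU code: the EX_ names, the flag values, and both helper functions
are assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; only the flag names mirror the patch. */
enum {
    EX_BLOCK_DATA         = 0x01,
    EX_BLOCK_ZERO         = 0x02,
    EX_BLOCK_OFFSET_VALID = 0x04,
    EX_BLOCK_RECURSE      = 0x40,
};

/* Driver side: request recursion only for mapped data in an image that
 * was detected as metadata-preallocated. */
static int64_t ex_driver_status(int64_t status, bool metadata_preallocation)
{
    if (metadata_preallocation &&
        (status & EX_BLOCK_DATA) && (status & EX_BLOCK_OFFSET_VALID)) {
        status |= EX_BLOCK_RECURSE;
    }
    return status;
}

/* Generic side: descend into the file child only when the driver asked. */
static bool ex_should_recurse(bool want_zero, int64_t status,
                              bool has_file_child)
{
    if (status & EX_BLOCK_RECURSE) {
        /* Same contract as the assertions added to the block layer. */
        assert(status & EX_BLOCK_DATA);
        assert(status & EX_BLOCK_OFFSET_VALID);
        assert(!(status & EX_BLOCK_ZERO));
    }
    return want_zero && (status & EX_BLOCK_RECURSE) && has_file_child &&
           (status & EX_BLOCK_DATA) && !(status & EX_BLOCK_ZERO) &&
           (status & EX_BLOCK_OFFSET_VALID);
}
```

With metadata_preallocation false, the driver never sets the flag, so the
generic side skips the extra lookup in the protocol file.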
The idea is to compare the allocation size from the filesystem's point of
view with the allocation size from qcow2's point of view (by refcounts).
If the allocation in the fs is significantly lower, consider it a
metadata-preallocation case.
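The comparison can be sketched as a standalone C function. This is a
simplified model, not the QEMU implementation: the function name and the
plain refcount array are assumptions for illustration, while the threshold
formula follows qcow2_detect_metadata_preallocation() in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: refcounts[i] is the refcount of host cluster i and
 * fs_allocated_bytes is what the filesystem reports as really allocated.
 */
static bool looks_metadata_preallocated(const uint64_t *refcounts,
                                        size_t nb_clusters,
                                        int64_t fs_allocated_bytes,
                                        int64_t cluster_size)
{
    assert(cluster_size > 0 && fs_allocated_bytes >= 0);

    int64_t real_clusters = fs_allocated_bytes / cluster_size;
    /* "Significantly lower": qcow2 references about 11% more clusters
     * than the fs has allocated, and at least two clusters more. */
    int64_t scaled = real_clusters * 10 / 9;
    int64_t padded = real_clusters + 2;
    int64_t threshold = scaled > padded ? scaled : padded;
    int64_t used = 0;

    /* Stop counting as soon as the threshold is reached. */
    for (size_t i = 0; i < nb_clusters && used < threshold; i++) {
        used += refcounts[i] != 0;
    }
    return used >= threshold;
}
```

For example, with 16 referenced clusters and only 4 clusters allocated in
the filesystem, the threshold is 6 and the image counts as
metadata-preallocated; with all 16 clusters allocated, the threshold is 18
and it does not.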
iotest 102 changed, as our detector can't detect a shrunk file as
metadata-preallocated. This doesn't seem wrong, as with metadata
preallocation we always have a valid file length.
Two other iotests have a slight change in their QMP output sequence:
Active 'block-commit' returns earlier because the job coroutine yields
earlier on a blocking operation. This operation is loading the refcount
blocks in qcow2_detect_metadata_preallocation().
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2.h | 4 ++++
include/block/block.h | 8 +++++++-
block/io.c | 9 ++++++++-
block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
block/qcow2.c | 11 +++++++++++
tests/qemu-iotests/102 | 2 +-
tests/qemu-iotests/102.out | 3 ++-
tests/qemu-iotests/141.out | 2 +-
tests/qemu-iotests/144.out | 2 +-
9 files changed, 67 insertions(+), 6 deletions(-)
diff --git a/block/qcow2.h b/block/qcow2.h
index 567375e56c..fc1b0d3c1e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -356,6 +356,9 @@ typedef struct BDRVQcow2State {
int nb_threads;
BdrvChild *data_file;
+
+ bool metadata_preallocation_checked;
+ bool metadata_preallocation;
} BDRVQcow2State;
typedef struct Qcow2COWRegion {
@@ -655,6 +658,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
void *cb_opaque, Error **errp);
int qcow2_shrink_reftable(BlockDriverState *bs);
int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size);
+int qcow2_detect_metadata_preallocation(BlockDriverState *bs);
/* qcow2-cluster.c functions */
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
diff --git a/include/block/block.h b/include/block/block.h
index 9b083e2bca..531cf595cf 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -156,10 +156,15 @@ typedef struct HDGeometry {
* BDRV_BLOCK_EOF: the returned pnum covers through end of file for this
* layer, set by block layer
*
- * Internal flag:
+ * Internal flags:
* BDRV_BLOCK_RAW: for use by passthrough drivers, such as raw, to request
* that the block layer recompute the answer from the returned
* BDS; must be accompanied by just BDRV_BLOCK_OFFSET_VALID.
+ * BDRV_BLOCK_RECURSE: request that the block layer will recursively search for
+ * zeroes in file child of current block node inside
+ * returned region. Only valid together with both
+ * BDRV_BLOCK_DATA and BDRV_BLOCK_OFFSET_VALID. Should not
+ * appear with BDRV_BLOCK_ZERO.
*
* If BDRV_BLOCK_OFFSET_VALID is set, the map parameter represents the
* host offset within the returned BDS that is allocated for the
@@ -184,6 +189,7 @@ typedef struct HDGeometry {
#define BDRV_BLOCK_RAW 0x08
#define BDRV_BLOCK_ALLOCATED 0x10
#define BDRV_BLOCK_EOF 0x20
+#define BDRV_BLOCK_RECURSE 0x40
#define BDRV_BLOCK_OFFSET_MASK BDRV_SECTOR_MASK
typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
diff --git a/block/io.c b/block/io.c
index 3134a60a48..150358c3b1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2092,6 +2092,12 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
*/
assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) &&
align > offset - aligned_offset);
+ if (ret & BDRV_BLOCK_RECURSE) {
+ assert(ret & BDRV_BLOCK_DATA);
+ assert(ret & BDRV_BLOCK_OFFSET_VALID);
+ assert(!(ret & BDRV_BLOCK_ZERO));
+ }
+
*pnum -= offset - aligned_offset;
if (*pnum > bytes) {
*pnum = bytes;
@@ -2122,7 +2128,8 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
}
}
- if (want_zero && local_file && local_file != bs &&
+ if (want_zero && ret & BDRV_BLOCK_RECURSE &&
+ local_file && local_file != bs &&
(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
(ret & BDRV_BLOCK_OFFSET_VALID)) {
int64_t file_pnum;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 4c1794f9af..3a2c673a5e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3444,3 +3444,35 @@ int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size)
"There are no references in the refcount table.");
return -EIO;
}
+
+int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
+{
+ BDRVQcow2State *s = bs->opaque;
+ int64_t i, end_cluster, cluster_count = 0, threshold;
+ int64_t file_length, real_allocation, real_clusters;
+
+ file_length = bdrv_getlength(bs->file->bs);
+ if (file_length < 0) {
+ return file_length;
+ }
+
+ real_allocation = bdrv_get_allocated_file_size(bs->file->bs);
+ if (real_allocation < 0) {
+ return real_allocation;
+ }
+
+ real_clusters = real_allocation / s->cluster_size;
+ threshold = MAX(real_clusters * 10 / 9, real_clusters + 2);
+
+ end_cluster = size_to_clusters(s, file_length);
+ for (i = 0; i < end_cluster && cluster_count < threshold; i++) {
+ uint64_t refcount;
+ int ret = qcow2_get_refcount(bs, i, &refcount);
+ if (ret < 0) {
+ return ret;
+ }
+ cluster_count += !!refcount;
+ }
+
+ return cluster_count >= threshold;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index f2cb131048..14f914117f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1895,6 +1895,12 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
unsigned int bytes;
int status = 0;
+ if (!s->metadata_preallocation_checked) {
+ ret = qcow2_detect_metadata_preallocation(bs);
+ s->metadata_preallocation = (ret == 1);
+ s->metadata_preallocation_checked = true;
+ }
+
bytes = MIN(INT_MAX, count);
qemu_co_mutex_lock(&s->lock);
ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
@@ -1917,6 +1923,11 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
} else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
status |= BDRV_BLOCK_DATA;
}
+ if (s->metadata_preallocation && (status & BDRV_BLOCK_DATA) &&
+ (status & BDRV_BLOCK_OFFSET_VALID))
+ {
+ status |= BDRV_BLOCK_RECURSE;
+ }
return status;
}
diff --git a/tests/qemu-iotests/102 b/tests/qemu-iotests/102
index 749ff66b8a..b898df436f 100755
--- a/tests/qemu-iotests/102
+++ b/tests/qemu-iotests/102
@@ -55,7 +55,7 @@ $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
$QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
$QEMU_IO -c map "$TEST_IMG"
-$QEMU_IMG map "$TEST_IMG"
+$QEMU_IMG map "$TEST_IMG" | _filter_qemu_img_map
echo
echo '=== Testing map on an image file truncated outside of qemu ==='
diff --git a/tests/qemu-iotests/102.out b/tests/qemu-iotests/102.out
index 4401b08fee..cd2fdc7f96 100644
--- a/tests/qemu-iotests/102.out
+++ b/tests/qemu-iotests/102.out
@@ -7,7 +7,8 @@ wrote 65536/65536 bytes at offset 0
64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
Image resized.
64 KiB (0x10000) bytes allocated at offset 0 bytes (0x0)
-Offset Length Mapped to File
+Offset Length File
+0 0x10000 TEST_DIR/t.IMGFMT
=== Testing map on an image file truncated outside of qemu ===
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index 41c7291258..4d71d9dcae 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -42,9 +42,9 @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
+{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
-{"return": {}}
{"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: commit"}}
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job0"}}
diff --git a/tests/qemu-iotests/144.out b/tests/qemu-iotests/144.out
index 55299201e4..a9a8216bea 100644
--- a/tests/qemu-iotests/144.out
+++ b/tests/qemu-iotests/144.out
@@ -14,10 +14,10 @@ Formatting 'TEST_DIR/tmp.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "virtio0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "virtio0"}}
+{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "virtio0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "virtio0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
{"return": {}}
-{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "virtio0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "virtio0"}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "virtio0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
--
2.20.1