* [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path
@ 2023-04-07 15:32 Paolo Bonzini
2023-04-07 15:32 ` [PATCH 1/8] block: move has_variable_length to BlockLimits Paolo Bonzini
` (8 more replies)
0 siblings, 9 replies; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:32 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
The introduction of the graph lock is causing blk_get_geometry, a hot
function used in the I/O path, to create a coroutine for the call to
bdrv_co_refresh_total_sectors.
In theory the call to bdrv_co_refresh_total_sectors should only matter
in the rare case of host CD-ROM devices, whose size changes when a medium
is added or removed. However, the call is actually keyed by a field in
BlockDriver, drv->has_variable_length, and the field is true in the common
case of the raw driver! This is because the host CD-ROM is usually
layered below the raw driver.
So, this series starts by moving has_variable_length from BlockDriver to
BlockLimits. This is patches 1-4, which also include a fix for a small
latent bug (patch 3).
The second half of the series then cleans up the functions to retrieve
the BlockDriverState's size (patches 5-7) to limit the amount of duplicated
code introduced by the hand-written wrappers of patch 8. The final result
is that blk_get_geometry will not anymore create a coroutine.
This series applies to qemu.git, or to the block-next branch if commit
d8fbf9aa85ae ("block/export: Fix graph locking in blk_get_geometry()
call", 2023-03-27) is cherry picked. Commit d8fbf9aa85ae is also where
bdrv_co_get_geometry() was introduced and with it the performance
regression. It is quite a recent change, and therefore this is
probably a regression in 8.0 that had not been detected yet (except by
Stefan who talked to Kevin and me about it yesterday). I'm not sure how
we can avoid the regression, if not by disabling completely the graph lock
(!) or applying this large series.
I'm throwing this out before disappearing for a couple days for Easter;
I have only tested it with qemu-iotests and "make check-unit".
Thanks,
Paolo
Paolo Bonzini (8):
block: move has_variable_length to BlockLimits
block: remove has_variable_length from filters
block: refresh bs->total_sectors on reopen
block: remove has_variable_length from BlockDriver
migration/block: replace uses of blk_nb_sectors that do not check
result
block-backend: inline bdrv_co_get_geometry
block-backend: ignore inserted state in blk_co_nb_sectors
block, block-backend: write some hot coroutine wrappers by hand
block.c | 35 ++++++++++++++++++--------
block/block-backend.c | 42 ++++++++++++++++++++++++-------
block/copy-on-read.c | 1 -
block/file-posix.c | 12 ++++++---
block/file-win32.c | 2 +-
block/filter-compress.c | 1 -
block/io.c | 4 +++
block/preallocate.c | 1 -
block/raw-format.c | 3 ++-
block/replication.c | 1 -
include/block/block-io.h | 5 +---
include/block/block_int-common.h | 10 ++++++--
include/sysemu/block-backend-io.h | 5 ++--
migration/block.c | 5 ++--
14 files changed, 85 insertions(+), 42 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/8] block: move has_variable_length to BlockLimits
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
@ 2023-04-07 15:32 ` Paolo Bonzini
2023-04-07 19:38 ` Eric Blake
2023-04-07 15:32 ` [PATCH 2/8] block: remove has_variable_length from filters Paolo Bonzini
` (7 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:32 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
At the protocol level, has_variable_length only needs to be true in the
very special case of host CD-ROM drives, so that they do not need an
explicit monitor command to read the new size when a disc is loaded
in the tray.
However, at the format level has_variable_length has to be true for all
raw blockdevs and for all filters, even though in practice the length
depends on the underlying file and thus will not change except in the
case of host CD-ROM drives.
As a first step towards computing an accurate value of has_variable_length,
add the value into the BlockLimits structure and initialize the field
from the BlockDriver.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 2 +-
block/io.c | 6 ++++++
include/block/block_int-common.h | 8 ++++++++
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index 89a79c321fab..b1b7c7efe036 100644
--- a/block.c
+++ b/block.c
@@ -5849,7 +5849,7 @@ int64_t coroutine_fn bdrv_co_nb_sectors(BlockDriverState *bs)
if (!drv)
return -ENOMEDIUM;
- if (drv->has_variable_length) {
+ if (bs->bl.has_variable_length) {
int ret = bdrv_co_refresh_total_sectors(bs, bs->total_sectors);
if (ret < 0) {
return ret;
diff --git a/block/io.c b/block/io.c
index db438c765757..c49917c74677 100644
--- a/block/io.c
+++ b/block/io.c
@@ -182,6 +182,8 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
drv->bdrv_aio_preadv ||
drv->bdrv_co_preadv_part) ? 1 : 512;
+ bs->bl.has_variable_length = drv->has_variable_length;
+
/* Take some limits from the children as a default */
have_limits = false;
QLIST_FOREACH(c, &bs->children, next) {
@@ -190,6 +192,10 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
bdrv_merge_limits(&bs->bl, &c->bs->bl);
have_limits = true;
}
+
+ if (c->role & BDRV_CHILD_FILTERED) {
+ bs->bl.has_variable_length |= c->bs->bl.has_variable_length;
+ }
}
if (!have_limits) {
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index ce51c1f7f999..95c934589571 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -855,6 +855,14 @@ typedef struct BlockLimits {
/* maximum number of iovec elements */
int max_iov;
+
+ /*
+ * true if the length of the underlying file can change, and QEMU
+ * is expected to adjust automatically. Mostly for CD-ROM drives,
+ * whose length is zero when the tray is empty (they don't need
+ * an explicit monitor command to load the disk inside the guest).
+ */
+ bool has_variable_length;
} BlockLimits;
typedef struct BdrvOpBlocker BdrvOpBlocker;
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/8] block: remove has_variable_length from filters
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
2023-04-07 15:32 ` [PATCH 1/8] block: move has_variable_length to BlockLimits Paolo Bonzini
@ 2023-04-07 15:32 ` Paolo Bonzini
2023-04-07 19:40 ` Eric Blake
2023-04-07 15:32 ` [PATCH 3/8] block: refresh bs->total_sectors on reopen Paolo Bonzini
` (6 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:32 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
Filters automatically get has_variable_length from their underlying
BlockDriverState. There is no need to mark them as variable-length
in the BlockDriver.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/copy-on-read.c | 1 -
block/filter-compress.c | 1 -
block/preallocate.c | 1 -
block/replication.c | 1 -
4 files changed, 4 deletions(-)
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index cc0f848b0f10..b4d6b7efc30f 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -259,7 +259,6 @@ static BlockDriver bdrv_copy_on_read = {
.bdrv_co_eject = cor_co_eject,
.bdrv_co_lock_medium = cor_co_lock_medium,
- .has_variable_length = true,
.is_filter = true,
};
diff --git a/block/filter-compress.c b/block/filter-compress.c
index ac285f4b6657..320d9576fa1c 100644
--- a/block/filter-compress.c
+++ b/block/filter-compress.c
@@ -146,7 +146,6 @@ static BlockDriver bdrv_compress = {
.bdrv_co_eject = compress_co_eject,
.bdrv_co_lock_medium = compress_co_lock_medium,
- .has_variable_length = true,
.is_filter = true,
};
diff --git a/block/preallocate.c b/block/preallocate.c
index 71c360180945..4d821250366e 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -558,7 +558,6 @@ BlockDriver bdrv_preallocate_filter = {
.bdrv_set_perm = preallocate_set_perm,
.bdrv_child_perm = preallocate_child_perm,
- .has_variable_length = true,
.is_filter = true,
};
diff --git a/block/replication.c b/block/replication.c
index de01f9618467..ea4bf1aa8012 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -762,7 +762,6 @@ static BlockDriver bdrv_replication = {
.is_filter = true,
- .has_variable_length = true,
.strong_runtime_opts = replication_strong_runtime_opts,
};
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/8] block: refresh bs->total_sectors on reopen
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
2023-04-07 15:32 ` [PATCH 1/8] block: move has_variable_length to BlockLimits Paolo Bonzini
2023-04-07 15:32 ` [PATCH 2/8] block: remove has_variable_length from filters Paolo Bonzini
@ 2023-04-07 15:32 ` Paolo Bonzini
2023-04-07 19:42 ` Eric Blake
2023-04-07 15:32 ` [PATCH 4/8] block: remove has_variable_length from BlockDriver Paolo Bonzini
` (5 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:32 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
After reopening a BlockDriverState, it's possible that the size of the
underlying file has changed. This for example is covered by test 171.
Right now, this is handled by the raw driver's has_variable_length = true
setting. Since this will be removed by the next patch, handle it on
reopen instead, together with the existing bdrv_refresh_limits.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block.c b/block.c
index b1b7c7efe036..9de50ac7c811 100644
--- a/block.c
+++ b/block.c
@@ -4918,6 +4918,7 @@ static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
qdict_del(bs->options, "backing");
bdrv_refresh_limits(bs, NULL, NULL);
+ bdrv_refresh_total_sectors(bs, bs->total_sectors);
}
/*
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/8] block: remove has_variable_length from BlockDriver
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (2 preceding siblings ...)
2023-04-07 15:32 ` [PATCH 3/8] block: refresh bs->total_sectors on reopen Paolo Bonzini
@ 2023-04-07 15:32 ` Paolo Bonzini
2023-04-07 19:48 ` Eric Blake
2023-04-07 15:33 ` [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result Paolo Bonzini
` (4 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:32 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
Fill in the field in BlockLimits directly for host devices, and
copy it from there for the raw format.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 12 ++++++++----
block/file-win32.c | 2 +-
block/io.c | 2 --
block/raw-format.c | 3 ++-
include/block/block_int-common.h | 2 --
5 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 5b0f2e6d261b..c7b723368e53 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3732,6 +3732,12 @@ static void cdrom_parse_filename(const char *filename, QDict *options,
{
bdrv_parse_filename_strip_prefix(filename, "host_cdrom:", options);
}
+
+static void cdrom_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+ bs->bl.has_variable_length = true;
+ raw_refresh_limits(bs, errp);
+}
#endif
#ifdef __linux__
@@ -3827,14 +3833,13 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_co_flush_to_disk = raw_co_flush_to_disk,
- .bdrv_refresh_limits = raw_refresh_limits,
+ .bdrv_refresh_limits = cdrom_refresh_limits,
.bdrv_co_io_plug = raw_co_io_plug,
.bdrv_co_io_unplug = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
- .has_variable_length = true,
.bdrv_co_get_allocated_file_size = raw_co_get_allocated_file_size,
/* removable device support */
@@ -3956,14 +3961,13 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_co_flush_to_disk = raw_co_flush_to_disk,
- .bdrv_refresh_limits = raw_refresh_limits,
+ .bdrv_refresh_limits = cdrom_refresh_limits,
.bdrv_co_io_plug = raw_co_io_plug,
.bdrv_co_io_unplug = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
- .has_variable_length = true,
.bdrv_co_get_allocated_file_size = raw_co_get_allocated_file_size,
/* removable device support */
diff --git a/block/file-win32.c b/block/file-win32.c
index 1a5f9e6a9b13..48b790d91739 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -836,6 +836,7 @@ static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
{
/* XXX Does Windows support AIO on less than 512-byte alignment? */
bs->bl.request_alignment = 512;
+ bs->bl.has_variable_length = true;
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
@@ -931,7 +932,6 @@ static BlockDriver bdrv_host_device = {
.bdrv_attach_aio_context = raw_attach_aio_context,
.bdrv_co_getlength = raw_co_getlength,
- .has_variable_length = true,
.bdrv_co_get_allocated_file_size = raw_co_get_allocated_file_size,
};
diff --git a/block/io.c b/block/io.c
index c49917c74677..6fa199337472 100644
--- a/block/io.c
+++ b/block/io.c
@@ -182,8 +182,6 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
drv->bdrv_aio_preadv ||
drv->bdrv_co_preadv_part) ? 1 : 512;
- bs->bl.has_variable_length = drv->has_variable_length;
-
/* Take some limits from the children as a default */
have_limits = false;
QLIST_FOREACH(c, &bs->children, next) {
diff --git a/block/raw-format.c b/block/raw-format.c
index 66783ed8e77b..06b8030d9d40 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -377,6 +377,8 @@ raw_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
+ bs->bl.has_variable_length = bs->file->bs->bl.has_variable_length;
+
if (bs->probed) {
/* To make it easier to protect the first sector, any probed
* image is restricted to read-modify-write on sub-sector
@@ -623,7 +625,6 @@ BlockDriver bdrv_raw = {
.bdrv_co_truncate = &raw_co_truncate,
.bdrv_co_getlength = &raw_co_getlength,
.is_format = true,
- .has_variable_length = true,
.bdrv_measure = &raw_measure,
.bdrv_co_get_info = &raw_co_get_info,
.bdrv_refresh_limits = &raw_refresh_limits,
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 95c934589571..013d419444d9 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -158,8 +158,6 @@ struct BlockDriver {
*/
bool supports_backing;
- bool has_variable_length;
-
/*
* Drivers setting this field must be able to work with just a plain
* filename with '<protocol_name>:' as a prefix, and no other options.
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (3 preceding siblings ...)
2023-04-07 15:32 ` [PATCH 4/8] block: remove has_variable_length from BlockDriver Paolo Bonzini
@ 2023-04-07 15:33 ` Paolo Bonzini
2023-04-07 19:49 ` Eric Blake
2023-04-07 15:33 ` [PATCH 6/8] block-backend: inline bdrv_co_get_geometry Paolo Bonzini
` (3 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:33 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
Uses of blk_nb_sectors must check whether the result is negative.
Otherwise, underflow can happen. Fortunately, alloc_aio_bitmap()
and bmds_aio_inflight() both have an alternative way to retrieve the
number of sectors in the file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
migration/block.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/migration/block.c b/migration/block.c
index 426a25bb192e..b2497bbd329c 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -195,7 +195,7 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
- if (sector < blk_nb_sectors(bmds->blk)) {
+ if (sector < bmds->total_sectors) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -229,10 +229,9 @@ static void bmds_set_aio_inflight(BlkMigDevState *bmds, int64_t sector_num,
static void alloc_aio_bitmap(BlkMigDevState *bmds)
{
- BlockBackend *bb = bmds->blk;
int64_t bitmap_size;
- bitmap_size = blk_nb_sectors(bb) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
+ bitmap_size = bmds->total_sectors + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 6/8] block-backend: inline bdrv_co_get_geometry
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (4 preceding siblings ...)
2023-04-07 15:33 ` [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result Paolo Bonzini
@ 2023-04-07 15:33 ` Paolo Bonzini
2023-04-07 19:58 ` Eric Blake
2023-04-07 15:33 ` [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors Paolo Bonzini
` (2 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:33 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
bdrv_co_get_geometry is only used in blk_co_get_geometry. Inline it in
there, to reduce the number of wrappers for bs->total_sectors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 10 ----------
block/block-backend.c | 8 ++++++--
include/block/block-io.h | 3 ---
3 files changed, 6 insertions(+), 15 deletions(-)
diff --git a/block.c b/block.c
index 9de50ac7c811..dbbc8de30c24 100644
--- a/block.c
+++ b/block.c
@@ -5879,16 +5879,6 @@ int64_t coroutine_fn bdrv_co_getlength(BlockDriverState *bs)
return ret * BDRV_SECTOR_SIZE;
}
-/* return 0 as number of sectors if no device present or error */
-void coroutine_fn bdrv_co_get_geometry(BlockDriverState *bs,
- uint64_t *nb_sectors_ptr)
-{
- int64_t nb_sectors = bdrv_co_nb_sectors(bs);
- IO_CODE();
-
- *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
-}
-
bool bdrv_is_sg(BlockDriverState *bs)
{
IO_CODE();
diff --git a/block/block-backend.c b/block/block-backend.c
index 7d331d93ebbc..f159cc51d264 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1627,16 +1627,20 @@ int64_t coroutine_fn blk_co_getlength(BlockBackend *blk)
return bdrv_co_getlength(blk_bs(blk));
}
+/* return 0 as number of sectors if no device present or error */
void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
uint64_t *nb_sectors_ptr)
{
+ BlockDriverState *bs = blk_bs(blk);
+
IO_CODE();
GRAPH_RDLOCK_GUARD();
- if (!blk_bs(blk)) {
+ if (!bs) {
*nb_sectors_ptr = 0;
} else {
- bdrv_co_get_geometry(blk_bs(blk), nb_sectors_ptr);
+ int64_t nb_sectors = bdrv_co_nb_sectors(bs);
+ *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
}
}
diff --git a/include/block/block-io.h b/include/block/block-io.h
index dbc034b7288e..9e2248a295a6 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -90,9 +90,6 @@ int64_t co_wrapper bdrv_get_allocated_file_size(BlockDriverState *bs);
BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
BlockDriverState *in_bs, Error **errp);
-void coroutine_fn GRAPH_RDLOCK
-bdrv_co_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
-
int coroutine_fn GRAPH_RDLOCK
bdrv_co_delete_file(BlockDriverState *bs, Error **errp);
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (5 preceding siblings ...)
2023-04-07 15:33 ` [PATCH 6/8] block-backend: inline bdrv_co_get_geometry Paolo Bonzini
@ 2023-04-07 15:33 ` Paolo Bonzini
2023-04-07 20:00 ` Eric Blake
2023-04-07 15:33 ` [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand Paolo Bonzini
2023-04-11 15:01 ` [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Kevin Wolf
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:33 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
All callers of blk_co_nb_sectors (and blk_nb_sectors) are able to
handle a non-inserted CD-ROM as a zero-length file, they do not need
to raise an error.
Not using blk_co_is_available() aligns the function with
blk_co_get_geometry(), which becomes a simple wrapper for
blk_co_nb_sectors(). It will also make it possible to skip the creation
of a coroutine in the (common) case where bs->bl.has_variable_length
is false.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/block-backend.c | 23 ++++++++---------------
1 file changed, 8 insertions(+), 15 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c
index f159cc51d264..d08e949799ee 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1627,9 +1627,7 @@ int64_t coroutine_fn blk_co_getlength(BlockBackend *blk)
return bdrv_co_getlength(blk_bs(blk));
}
-/* return 0 as number of sectors if no device present or error */
-void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
- uint64_t *nb_sectors_ptr)
+int64_t coroutine_fn blk_co_nb_sectors(BlockBackend *blk)
{
BlockDriverState *bs = blk_bs(blk);
@@ -1637,23 +1635,18 @@ void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
GRAPH_RDLOCK_GUARD();
if (!bs) {
- *nb_sectors_ptr = 0;
+ return -ENOMEDIUM;
} else {
- int64_t nb_sectors = bdrv_co_nb_sectors(bs);
- *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
+ return bdrv_co_nb_sectors(bs);
}
}
-int64_t coroutine_fn blk_co_nb_sectors(BlockBackend *blk)
+/* return 0 as number of sectors if no device present or error */
+void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
+ uint64_t *nb_sectors_ptr)
{
- IO_CODE();
- GRAPH_RDLOCK_GUARD();
-
- if (!blk_co_is_available(blk)) {
- return -ENOMEDIUM;
- }
-
- return bdrv_co_nb_sectors(blk_bs(blk));
+ int64_t ret = blk_co_nb_sectors(blk);
+ *nb_sectors_ptr = ret < 0 ? 0 : ret;
}
BlockAIOCB *blk_aio_preadv(BlockBackend *blk, int64_t offset,
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (6 preceding siblings ...)
2023-04-07 15:33 ` [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors Paolo Bonzini
@ 2023-04-07 15:33 ` Paolo Bonzini
2023-04-07 20:04 ` Eric Blake
2023-04-11 15:01 ` [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Kevin Wolf
8 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 15:33 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha, kwolf, qemu-block, hreitz
The introduction of the graph lock is causing blk_get_geometry, a hot function
used in the I/O path, to create a coroutine. However, the only part that really
needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
which in turn only happens in the rare case of host CD-ROM devices.
So, write by hand the three wrappers on the path from blk_co_get_geometry to
bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 22 ++++++++++++++++++++++
block/block-backend.c | 27 +++++++++++++++++++++++++++
include/sysemu/block-backend-io.h | 5 ++---
4 files changed, 52 insertions(+), 4 deletions(-)
diff --git a/block.c b/block.c
index dbbc8de30c24..3390efd18cf6 100644
--- a/block.c
+++ b/block.c
@@ -5859,6 +5859,28 @@ int64_t coroutine_fn bdrv_co_nb_sectors(BlockDriverState *bs)
return bs->total_sectors;
}
+/*
+ * This wrapper is written by hand because this function is in the hot I/O path,
+ * via blk_get_geometry.
+ */
+int64_t coroutine_mixed_fn bdrv_nb_sectors(BlockDriverState *bs)
+{
+ BlockDriver *drv = bs->drv;
+ IO_CODE();
+
+ if (!drv)
+ return -ENOMEDIUM;
+
+ if (!bs->bl.has_variable_length) {
+ int ret = bdrv_refresh_total_sectors(bs, bs->total_sectors);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ return bs->total_sectors;
+}
+
/**
* Return length in bytes on success, -errno on error.
* The length is always a multiple of BDRV_SECTOR_SIZE.
diff --git a/block/block-backend.c b/block/block-backend.c
index d08e949799ee..a9767466af69 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1641,6 +1641,23 @@ int64_t coroutine_fn blk_co_nb_sectors(BlockBackend *blk)
}
}
+/*
+ * This wrapper is written by hand because this function is in the hot I/O path,
+ * via blk_get_geometry.
+ */
+int64_t coroutine_mixed_fn blk_nb_sectors(BlockBackend *blk)
+{
+ BlockDriverState *bs = blk_bs(blk);
+
+ IO_CODE();
+
+ if (!bs) {
+ return -ENOMEDIUM;
+ } else {
+ return bdrv_nb_sectors(blk_bs(blk));
+ }
+}
+
/* return 0 as number of sectors if no device present or error */
void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
uint64_t *nb_sectors_ptr)
@@ -1649,6 +1666,16 @@ void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
*nb_sectors_ptr = ret < 0 ? 0 : ret;
}
+/*
+ * This wrapper is written by hand because this function is in the hot I/O path.
+ */
+void coroutine_mixed_fn blk_get_geometry(BlockBackend *blk,
+ uint64_t *nb_sectors_ptr)
+{
+ int64_t ret = blk_nb_sectors(blk);
+ *nb_sectors_ptr = ret < 0 ? 0 : ret;
+}
+
BlockAIOCB *blk_aio_preadv(BlockBackend *blk, int64_t offset,
QEMUIOVector *qiov, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
diff --git a/include/block/block-io.h b/include/block/block-io.h
index 9e2248a295a6..5dab88521d02 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -79,7 +79,7 @@ bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
int64_t coroutine_fn GRAPH_RDLOCK bdrv_co_nb_sectors(BlockDriverState *bs);
-int64_t co_wrapper_mixed_bdrv_rdlock bdrv_nb_sectors(BlockDriverState *bs);
+int64_t coroutine_mixed_fn bdrv_nb_sectors(BlockDriverState *bs);
int64_t coroutine_fn GRAPH_RDLOCK bdrv_co_getlength(BlockDriverState *bs);
int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs);
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
index 9736346d7940..851a44de9642 100644
--- a/include/sysemu/block-backend-io.h
+++ b/include/sysemu/block-backend-io.h
@@ -72,11 +72,10 @@ int64_t co_wrapper_mixed blk_getlength(BlockBackend *blk);
void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
uint64_t *nb_sectors_ptr);
-void co_wrapper_mixed blk_get_geometry(BlockBackend *blk,
- uint64_t *nb_sectors_ptr);
+void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr);
int64_t coroutine_fn blk_co_nb_sectors(BlockBackend *blk);
-int64_t co_wrapper_mixed blk_nb_sectors(BlockBackend *blk);
+int64_t blk_nb_sectors(BlockBackend *blk);
void *blk_try_blockalign(BlockBackend *blk, size_t size);
void *blk_blockalign(BlockBackend *blk, size_t size);
--
2.39.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/8] block: move has_variable_length to BlockLimits
2023-04-07 15:32 ` [PATCH 1/8] block: move has_variable_length to BlockLimits Paolo Bonzini
@ 2023-04-07 19:38 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:38 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:32:56PM +0200, Paolo Bonzini wrote:
> At the protocol level, has_variable_length only needs to be true in the
> very special case of host CD-ROM drives, so that they do not need an
> explicit monitor command to read the new size when a disc is loaded
> in the tray.
>
> However, at the format level has_variable_length has to be true for all
> raw blockdevs and for all filters, even though in practice the length
> depends on the underlying file and thus will not change except in the
> case of host CD-ROM drives.
>
> As a first step towards computing an accurate value of has_variable_length,
> add the value into the BlockLimits structure and initialize the field
> from the BlockDriver.
My assumption here is that all other resizes (such as a block device
that can be externally enlarged upon seeing the guest pause due to an
ENOSPC condition on the current size) are NOT expected to have
automatic reaction in qemu, but instead rely on some other external
action (such as resuming after ENOSPC or an explicit monitor command)
as the reason for why qemu eventually learns the new size. If my
assumption is right, then you do sound correct in stating that CDROMs
are special in that the guest OS can request the tray to be loaded and
change the size from 0 to the size of the newly-inserted disc, with no
intervening action or QMP command, and qemu must react to that size
change.
I'm asking because at one point, there was a proposal to extend the
NBD protocol to allow dynamic resizing, somewhat similar to how a
block device can be externally resized; there were questions on how
much of a resize action has to be from client-to-server "please poll
if the server has changed size recently" instead of server-to-client
"by the way, the size just changed". I don't want NBD resize (if
someone ever fleshes out the extension propsal into an actual
implementation) to be hamstrung by an inability to reflect size
changes initiated on the server side, rather than triggered at the
client side. But since I think regular files and block devices have
the same considerations, I don't think it is a show-stopper to this
series.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block.c | 2 +-
> block/io.c | 6 ++++++
> include/block/block_int-common.h | 8 ++++++++
> 3 files changed, 15 insertions(+), 1 deletion(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/8] block: remove has_variable_length from filters
2023-04-07 15:32 ` [PATCH 2/8] block: remove has_variable_length from filters Paolo Bonzini
@ 2023-04-07 19:40 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:40 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:32:57PM +0200, Paolo Bonzini wrote:
> Filters automatically get has_variable_length from their underlying
> BlockDriverState. There is no need to mark them as variable-length
> in the BlockDriver.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/copy-on-read.c | 1 -
> block/filter-compress.c | 1 -
> block/preallocate.c | 1 -
> block/replication.c | 1 -
> 4 files changed, 4 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/8] block: refresh bs->total_sectors on reopen
2023-04-07 15:32 ` [PATCH 3/8] block: refresh bs->total_sectors on reopen Paolo Bonzini
@ 2023-04-07 19:42 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:42 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:32:58PM +0200, Paolo Bonzini wrote:
> After reopening a BlockDriverState, it's possible that the size of the
> underlying file has changed. This for example is covered by test 171.
>
> Right now, this is handled by the raw driver's has_variable_length = true
> setting. Since this will be removed by the next patch, handle it on
> reopen instead, together with the existing bdrv_refresh_limits.
Makes sense. Normally, when exposing a host image to a guest, you
don't expect the image to be changing outside of qemu's control; there
are exceptions (like dealing with ENOSPC), but they are generally
limited to adding more space (and not modifying existing contents or
shrinking the volume). Detecting growth at reopen seems reasonable,
and like it will cut out on some of the frequency of checks we are
currently doing even when most see no change.
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/8] block: remove has_variable_length from BlockDriver
2023-04-07 15:32 ` [PATCH 4/8] block: remove has_variable_length from BlockDriver Paolo Bonzini
@ 2023-04-07 19:48 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:48 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:32:59PM +0200, Paolo Bonzini wrote:
> Fill in the field in BlockLimits directly for host devices, and
> copy it from there for the raw format.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 12 ++++++++----
> block/file-win32.c | 2 +-
> block/io.c | 2 --
> block/raw-format.c | 3 ++-
> include/block/block_int-common.h | 2 --
> 5 files changed, 11 insertions(+), 10 deletions(-)
The change makes sense to me. I'm having a slight doubt on whether it
might cause any regression in the bigger picture where a regular host
file exposed as a guest image grew outside of qemu's control but where
qemu used to see the new size automatically but now won't see it until
an explicit QMP action. But I suspect that since we already do have a
QMP size for telling qemu to do resizes itself, or to at least refresh
its notion of size (for that very case of a third-party adding more
storage and telling qemu it is now safe to use that extra space), the
explicit QMP interaction is probably sufficient, and that any such
corner-case regression I'm worrying about is not a problem in reality.
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result
2023-04-07 15:33 ` [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result Paolo Bonzini
@ 2023-04-07 19:49 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:49 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:33:00PM +0200, Paolo Bonzini wrote:
> Uses of blk_nb_sectors must check whether the result is negative.
> Otherwise, underflow can happen. Fortunately, alloc_aio_bitmap()
> and bmds_aio_inflight() both have an alternative way to retrieve the
> number of sectors in the file.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> migration/block.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/8] block-backend: inline bdrv_co_get_geometry
2023-04-07 15:33 ` [PATCH 6/8] block-backend: inline bdrv_co_get_geometry Paolo Bonzini
@ 2023-04-07 19:58 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 19:58 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:33:01PM +0200, Paolo Bonzini wrote:
> bdrv_co_get_geometry is only used in blk_co_get_geometry. Inline it in
> there, to reduce the number of wrappers for bs->total_sectors.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block.c | 10 ----------
> block/block-backend.c | 8 ++++++--
> include/block/block-io.h | 3 ---
> 3 files changed, 6 insertions(+), 15 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
>
> diff --git a/block.c b/block.c
> index 9de50ac7c811..dbbc8de30c24 100644
> --- a/block.c
> +++ b/block.c
> @@ -5879,16 +5879,6 @@ int64_t coroutine_fn bdrv_co_getlength(BlockDriverState *bs)
> return ret * BDRV_SECTOR_SIZE;
> }
>
> -/* return 0 as number of sectors if no device present or error */
> -void coroutine_fn bdrv_co_get_geometry(BlockDriverState *bs,
> - uint64_t *nb_sectors_ptr)
> -{
> - int64_t nb_sectors = bdrv_co_nb_sectors(bs);
> - IO_CODE();
Pre-patch, we called bdrv_co_nb_sectors() before the IO_CODE guard...
> +/* return 0 as number of sectors if no device present or error */
> void coroutine_fn blk_co_get_geometry(BlockBackend *blk,
> uint64_t *nb_sectors_ptr)
> {
> + BlockDriverState *bs = blk_bs(blk);
> +
> IO_CODE();
> GRAPH_RDLOCK_GUARD();
>
> - if (!blk_bs(blk)) {
> + if (!bs) {
> *nb_sectors_ptr = 0;
> } else {
> - bdrv_co_get_geometry(blk_bs(blk), nb_sectors_ptr);
> + int64_t nb_sectors = bdrv_co_nb_sectors(bs);
...post-patch the order swaps. That actually feels better to me, (the
guard is supposed to do sanity checks to detect coding bugs at the
soonest possible moment; if we have a bug, doing the work and only
later failing the check is not as safe as failing fast) - but probably
no impact to correctly written code.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors
2023-04-07 15:33 ` [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors Paolo Bonzini
@ 2023-04-07 20:00 ` Eric Blake
0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2023-04-07 20:00 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:33:02PM +0200, Paolo Bonzini wrote:
> All callers of blk_co_nb_sectors (and blk_nb_sectors) are able to
> handle a non-inserted CD-ROM as a zero-length file, they do not need
> to raise an error.
>
> Not using blk_co_is_available() aligns the function with
> blk_co_get_geometry(), which becomes a simple wrapper for
> blk_co_nb_sectors(). It will also make it possible to skip the creation
> of a coroutine in the (common) case where bs->bl.has_variable_length
> is false.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/block-backend.c | 23 ++++++++---------------
> 1 file changed, 8 insertions(+), 15 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand
2023-04-07 15:33 ` [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand Paolo Bonzini
@ 2023-04-07 20:04 ` Eric Blake
2023-04-07 20:32 ` Paolo Bonzini
0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2023-04-07 20:04 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, kwolf, qemu-block, hreitz
On Fri, Apr 07, 2023 at 05:33:03PM +0200, Paolo Bonzini wrote:
> The introduction of the graph lock is causing blk_get_geometry, a hot function
> used in the I/O path, to create a coroutine. However, the only part that really
> needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
> which in turn only happens in the rare case of host CD-ROM devices.
>
> So, write by hand the three wrappers on the path from blk_co_get_geometry to
> bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
> if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
>
> Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block.c | 22 ++++++++++++++++++++++
> block/block-backend.c | 27 +++++++++++++++++++++++++++
>
> include/sysemu/block-backend-io.h | 5 ++---
> 4 files changed, 52 insertions(+), 4 deletions(-)
>
> diff --git a/block.c b/block.c
> index dbbc8de30c24..3390efd18cf6 100644
> --- a/block.c
> +++ b/block.c
> @@ -5859,6 +5859,28 @@ int64_t coroutine_fn bdrv_co_nb_sectors(BlockDriverState *bs)
> return bs->total_sectors;
> }
>
> +/*
> + * This wrapper is written by hand because this function is in the hot I/O path,
> + * via blk_get_geometry.
> + */
> +int64_t coroutine_mixed_fn bdrv_nb_sectors(BlockDriverState *bs)
> +{
> + BlockDriver *drv = bs->drv;
> + IO_CODE();
> +
> + if (!drv)
> + return -ENOMEDIUM;
> +
> + if (!bs->bl.has_variable_length) {
> + int ret = bdrv_refresh_total_sectors(bs, bs->total_sectors);
Is this logic backwards? Why are we only refreshing the total sectors
when we don't have variable length?
> + if (ret < 0) {
> + return ret;
> + }
> + }
> +
> + return bs->total_sectors;
> +}
> +
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand
2023-04-07 20:04 ` Eric Blake
@ 2023-04-07 20:32 ` Paolo Bonzini
0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2023-04-07 20:32 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, Hajnoczi, Stefan, Wolf, Kevin,
open list:Block layer core, Hanna Czenczek
[-- Attachment #1: Type: text/plain, Size: 2197 bytes --]
Il ven 7 apr 2023, 22:04 Eric Blake <eblake@redhat.com> ha scritto:
> On Fri, Apr 07, 2023 at 05:33:03PM +0200, Paolo Bonzini wrote:
> > The introduction of the graph lock is causing blk_get_geometry, a hot
> function
> > used in the I/O path, to create a coroutine. However, the only part
> that really
> > needs to run in coroutine context is the call to
> bdrv_co_refresh_total_sectors,
> > which in turn only happens in the rare case of host CD-ROM devices.
> >
> > So, write by hand the three wrappers on the path from
> blk_co_get_geometry to
> > bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only
> created
> > if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
> >
> > Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> > block.c | 22 ++++++++++++++++++++++
> > block/block-backend.c | 27 +++++++++++++++++++++++++++
> >
> > include/sysemu/block-backend-io.h | 5 ++---
> > 4 files changed, 52 insertions(+), 4 deletions(-)
> >
> > diff --git a/block.c b/block.c
> > index dbbc8de30c24..3390efd18cf6 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -5859,6 +5859,28 @@ int64_t coroutine_fn
> bdrv_co_nb_sectors(BlockDriverState *bs)
> > return bs->total_sectors;
> > }
> >
> > +/*
> > + * This wrapper is written by hand because this function is in the hot
> I/O path,
> > + * via blk_get_geometry.
> > + */
> > +int64_t coroutine_mixed_fn bdrv_nb_sectors(BlockDriverState *bs)
> > +{
> > + BlockDriver *drv = bs->drv;
> > + IO_CODE();
> > +
> > + if (!drv)
> > + return -ENOMEDIUM;
> > +
> > + if (!bs->bl.has_variable_length) {
> > + int ret = bdrv_refresh_total_sectors(bs, bs->total_sectors);
>
> Is this logic backwards? Why are we only refreshing the total sectors
> when we don't have variable length?
>
Yes, it is backwards.
Paolo
> > + if (ret < 0) {
> > + return ret;
> > + }
> > + }
> > +
> > + return bs->total_sectors;
> > +}
> > +
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3266
> Virtualization: qemu.org | libvirt.org
>
>
[-- Attachment #2: Type: text/html, Size: 3482 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
` (7 preceding siblings ...)
2023-04-07 15:33 ` [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand Paolo Bonzini
@ 2023-04-11 15:01 ` Kevin Wolf
8 siblings, 0 replies; 19+ messages in thread
From: Kevin Wolf @ 2023-04-11 15:01 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha, qemu-block, hreitz
Am 07.04.2023 um 17:32 hat Paolo Bonzini geschrieben:
> The introduction of the graph lock is causing blk_get_geometry, a hot
> function used in the I/O path, to create a coroutine for the call to
> bdrv_co_refresh_total_sectors.
>
> In theory the call to bdrv_co_refresh_total_sectors should only matter
> in the rare case of host CD-ROM devices, whose size changes when a medium
> is added or removed. However, the call is actually keyed by a field in
> BlockDriver, drv->has_variable_length, and the field is true in the common
> case of the raw driver! This is because the host CD-ROM is usually
> layered below the raw driver.
>
> So, this series starts by moving has_variable_length from BlockDriver to
> BlockLimits. This is patches 1-4, which also include a fix for a small
> latent bug (patch 3).
>
> The second half of the series then cleans up the functions to retrieve
> the BlockDriverState's size (patches 5-7) to limit the amount of duplicated
> code introduced by the hand-written wrappers of patch 8. The final result
> is that blk_get_geometry will not anymore create a coroutine.
>
> This series applies to qemu.git, or to the block-next branch if commit
> d8fbf9aa85ae ("block/export: Fix graph locking in blk_get_geometry()
> call", 2023-03-27) is cherry picked. Commit d8fbf9aa85ae is also where
> bdrv_co_get_geometry() was introduced and with it the performance
> regression. It is quite a recent change, and therefore this is
> probably a regression in 8.0 that had not been detected yet (except by
> Stefan who talked to Kevin and me about it yesterday). I'm not sure how
> we can avoid the regression, if not by disabling completely the graph lock
> (!) or applying this large series.
>
> I'm throwing this out before disappearing for a couple days for Easter;
> I have only tested it with qemu-iotests and "make check-unit".
Thanks, fixed up patch 8 to make the non-coroutine wrappers almost exact
copies of the coroutine version (including fixing the bug that Eric
found), and applied to the block branch.
I'm not sure if the functions actually need to be coroutine_mixed_fn,
because coroutines should already call blk_co_get_geometry(), but we can
clean that up later.
Kevin
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2023-04-11 15:02 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-07 15:32 [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Paolo Bonzini
2023-04-07 15:32 ` [PATCH 1/8] block: move has_variable_length to BlockLimits Paolo Bonzini
2023-04-07 19:38 ` Eric Blake
2023-04-07 15:32 ` [PATCH 2/8] block: remove has_variable_length from filters Paolo Bonzini
2023-04-07 19:40 ` Eric Blake
2023-04-07 15:32 ` [PATCH 3/8] block: refresh bs->total_sectors on reopen Paolo Bonzini
2023-04-07 19:42 ` Eric Blake
2023-04-07 15:32 ` [PATCH 4/8] block: remove has_variable_length from BlockDriver Paolo Bonzini
2023-04-07 19:48 ` Eric Blake
2023-04-07 15:33 ` [PATCH 5/8] migration/block: replace uses of blk_nb_sectors that do not check result Paolo Bonzini
2023-04-07 19:49 ` Eric Blake
2023-04-07 15:33 ` [PATCH 6/8] block-backend: inline bdrv_co_get_geometry Paolo Bonzini
2023-04-07 19:58 ` Eric Blake
2023-04-07 15:33 ` [PATCH 7/8] block-backend: ignore inserted state in blk_co_nb_sectors Paolo Bonzini
2023-04-07 20:00 ` Eric Blake
2023-04-07 15:33 ` [PATCH 8/8] block, block-backend: write some hot coroutine wrappers by hand Paolo Bonzini
2023-04-07 20:04 ` Eric Blake
2023-04-07 20:32 ` Paolo Bonzini
2023-04-11 15:01 ` [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path Kevin Wolf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).