From: John Snow <jsnow@redhat.com>
To: Eric Blake <eblake@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, qemu-block@nongnu.org,
Max Reitz <mreitz@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 02/23] block: Add flag to avoid wasted work in bdrv_is_allocated()
Date: Tue, 26 Sep 2017 14:31:17 -0400
Message-ID: <b5fd50f3-7bd0-115c-bc67-0df49b05cb76@redhat.com>
In-Reply-To: <20170913160333.23622-3-eblake@redhat.com>
On 09/13/2017 12:03 PM, Eric Blake wrote:
> Not all callers care about which BDS owns the mapping for a given
> range of the file. In particular, bdrv_is_allocated() cares more
> about finding the largest run of allocated data from the guest
> perspective, whether or not that data is consecutive from the
> host perspective. Therefore, doing subsequent refinements such
> as checking how much of the format-layer allocation also satisfies
> BDRV_BLOCK_ZERO at the protocol layer is wasted work - in the best
> case, it just costs extra CPU cycles during a single
> bdrv_is_allocated(), but in the worst case, it results in a smaller
> *pnum, and forces callers to iterate through more status probes when
> visiting the entire file for even more extra CPU cycles.
>
> This patch only optimizes the block layer. But subsequent patches
> will tweak the driver callback to be byte-based, and in the process,
> can also pass this hint through to the driver.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
> v4: only context changes
> v3: s/allocation/mapping/ and flip sense of bool
> v2: new patch
> ---
> block/io.c | 52 ++++++++++++++++++++++++++++++++++++++--------------
> 1 file changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/block/io.c b/block/io.c
> index f250029395..6509c804d4 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1709,6 +1709,7 @@ typedef struct BdrvCoGetBlockStatusData {
> int nb_sectors;
> int *pnum;
> int64_t ret;
> + bool mapping;
> bool done;
> } BdrvCoGetBlockStatusData;
>
> @@ -1743,6 +1744,11 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
> * Drivers not implementing the functionality are assumed to not support
> * backing files, hence all their sectors are reported as allocated.
> *
> + * If 'mapping' is true, the caller is querying for mapping purposes,
> + * and the result should include BDRV_BLOCK_OFFSET_VALID where
> + * possible; otherwise, the result may omit that bit particularly if
> + * it allows for a larger value in 'pnum'.
> + *
> * If 'sector_num' is beyond the end of the disk image the return value is
> * BDRV_BLOCK_EOF and 'pnum' is set to 0.
> *
> @@ -1759,6 +1765,7 @@ int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState *bs,
> * is allocated in.
> */
> static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
> + bool mapping,
> int64_t sector_num,
> int nb_sectors, int *pnum,
> BlockDriverState **file)
> @@ -1817,14 +1824,15 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>
> if (ret & BDRV_BLOCK_RAW) {
> assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file);
> - ret = bdrv_co_get_block_status(local_file, ret >> BDRV_SECTOR_BITS,
> + ret = bdrv_co_get_block_status(local_file, mapping,
> + ret >> BDRV_SECTOR_BITS,
> *pnum, pnum, &local_file);
> goto out;
> }
>
> if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
> ret |= BDRV_BLOCK_ALLOCATED;
> - } else {
> + } else if (mapping) {
> if (bdrv_unallocated_blocks_are_zero(bs)) {
> ret |= BDRV_BLOCK_ZERO;
> } else if (bs->backing) {
> @@ -1836,12 +1844,13 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
> }
> }
>
> - if (local_file && local_file != bs &&
> + if (mapping && local_file && local_file != bs &&

Tentatively this looks OK to me, but I have to admit I'm a little shaky
on this portion because I haven't really investigated this function.
I am at the very least convinced that when mapping is true, the
function is equivalent to the old code, so existing callers (which all
pass true) should not see their behavior change.

Benefit of the doubt:

Reviewed-by: John Snow <jsnow@redhat.com>

> (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
> (ret & BDRV_BLOCK_OFFSET_VALID)) {
> int file_pnum;
>
> - ret2 = bdrv_co_get_block_status(local_file, ret >> BDRV_SECTOR_BITS,
> + ret2 = bdrv_co_get_block_status(local_file, mapping,
> + ret >> BDRV_SECTOR_BITS,
> *pnum, &file_pnum, NULL);
> if (ret2 >= 0) {
> /* Ignore errors. This is just providing extra information, it
> @@ -1876,6 +1885,7 @@ out:
>
> static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
> BlockDriverState *base,
> + bool mapping,
> int64_t sector_num,
> int nb_sectors,
> int *pnum,
> @@ -1887,7 +1897,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
>
> assert(bs != base);
> for (p = bs; p != base; p = backing_bs(p)) {
> - ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
> + ret = bdrv_co_get_block_status(p, mapping, sector_num, nb_sectors,
> + pnum, file);
> if (ret < 0) {
> break;
> }
> @@ -1917,6 +1928,7 @@ static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
> BdrvCoGetBlockStatusData *data = opaque;
>
> data->ret = bdrv_co_get_block_status_above(data->bs, data->base,
> + data->mapping,
> data->sector_num,
> data->nb_sectors,
> data->pnum,
> @@ -1929,11 +1941,12 @@ static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
> *
> * See bdrv_co_get_block_status_above() for details.
> */
> -int64_t bdrv_get_block_status_above(BlockDriverState *bs,
> - BlockDriverState *base,
> - int64_t sector_num,
> - int nb_sectors, int *pnum,
> - BlockDriverState **file)
> +static int64_t bdrv_common_block_status_above(BlockDriverState *bs,
> + BlockDriverState *base,
> + bool mapping,
> + int64_t sector_num,
> + int nb_sectors, int *pnum,
> + BlockDriverState **file)
> {
> Coroutine *co;
> BdrvCoGetBlockStatusData data = {
> @@ -1943,6 +1956,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
> .sector_num = sector_num,
> .nb_sectors = nb_sectors,
> .pnum = pnum,
> + .mapping = mapping,
> .done = false,
> };
>
> @@ -1958,6 +1972,16 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
> return data.ret;
> }
>
> +int64_t bdrv_get_block_status_above(BlockDriverState *bs,
> + BlockDriverState *base,
> + int64_t sector_num,
> + int nb_sectors, int *pnum,
> + BlockDriverState **file)
> +{
> + return bdrv_common_block_status_above(bs, base, true, sector_num,
> + nb_sectors, pnum, file);
> +}
> +
> int64_t bdrv_get_block_status(BlockDriverState *bs,
> int64_t sector_num,
> int nb_sectors, int *pnum,
> @@ -1970,15 +1994,15 @@ int64_t bdrv_get_block_status(BlockDriverState *bs,
> int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
> int64_t bytes, int64_t *pnum)
> {
> - int64_t sector_num = offset >> BDRV_SECTOR_BITS;
> - int nb_sectors = bytes >> BDRV_SECTOR_BITS;
> int64_t ret;
> int psectors;
>
> assert(QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE));
> assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE) && bytes < INT_MAX);
> - ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &psectors,
> - NULL);
> + ret = bdrv_common_block_status_above(bs, backing_bs(bs), false,
> + offset >> BDRV_SECTOR_BITS,
> + bytes >> BDRV_SECTOR_BITS, &psectors,
> + NULL);
> if (ret < 0) {
> return ret;
> }
>
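
To make the commit message's "smaller *pnum, more status probes" argument
concrete, here is a minimal toy sketch, not QEMU code. Every toy_* name,
both flag values, and the 8-sector layout are assumptions made up for
illustration. The format layer reports one 8-sector run of data, while the
protocol layer splits that run into data and zeroes at sector 3. With
mapping set, the protocol-layer refinement shrinks pnum and a full walk
needs two probes; with mapping clear, it needs one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; QEMU's BDRV_BLOCK_* bits differ. */
#define TOY_DATA 0x1
#define TOY_ZERO 0x2

/* Toy format layer: sectors [0, 8) are one allocated run of data. */
static int toy_format_status(int64_t sector, int nb, int *pnum)
{
    const int64_t end = 8;

    if (sector >= end) {
        *pnum = nb;
        return 0;
    }
    *pnum = (sector + nb > end) ? (int)(end - sector) : nb;
    return TOY_DATA;
}

/* Toy protocol layer: [0, 3) is data, everything after reads as zeroes. */
static int toy_protocol_status(int64_t sector, int nb, int *pnum)
{
    if (sector < 3) {
        *pnum = (sector + nb > 3) ? (int)(3 - sector) : nb;
        return TOY_DATA;
    }
    *pnum = nb;
    return TOY_DATA | TOY_ZERO;
}

/* Refine with the protocol layer only when the caller asks for mapping. */
static int toy_block_status(bool mapping, int64_t sector, int nb, int *pnum)
{
    int ret = toy_format_status(sector, nb, pnum);

    if (mapping && (ret & TOY_DATA) && !(ret & TOY_ZERO)) {
        int file_pnum;
        int ret2 = toy_protocol_status(sector, *pnum, &file_pnum);

        ret |= ret2 & TOY_ZERO;
        if (file_pnum < *pnum) {
            *pnum = file_pnum;   /* refinement can only shrink the run */
        }
    }
    return ret;
}

/* Count how many status probes it takes to visit sectors [0, total). */
static int toy_count_probes(bool mapping, int64_t total)
{
    int probes = 0;
    int64_t sector = 0;

    while (sector < total) {
        int pnum;

        toy_block_status(mapping, sector, (int)(total - sector), &pnum);
        assert(pnum > 0);        /* guarantee forward progress */
        sector += pnum;
        probes++;
    }
    return probes;
}
```

In this toy layout, toy_count_probes(true, 8) takes two probes and
toy_count_probes(false, 8) takes one. The real function has more cases
(backing files, BDRV_BLOCK_RAW, EOF), which the sketch leaves out.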