All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Cody <jcody@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, qemu-block@nongnu.org,
	jsnow@redhat.com, Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Fam Zheng <famz@redhat.com>,
	Wen Congyang <wencongyang2@huawei.com>,
	Xie Changlong <xiechanglong.d@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v3 20/20] block: Make bdrv_is_allocated_above() byte-based
Date: Fri, 30 Jun 2017 17:36:30 -0400	[thread overview]
Message-ID: <20170630213630.GR4997@localhost.localdomain> (raw)
In-Reply-To: <20170627192458.15519-21-eblake@redhat.com>

On Tue, Jun 27, 2017 at 02:24:58PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  In the common case, allocation is unlikely to ever use
> values that are not naturally sector-aligned, but it is possible
> that byte-based values will let us be more precise about allocation
> at the end of an unaligned file that can do byte-based access.
> 
> Changing the signature of the function to use int64_t *pnum ensures
> that the compiler enforces that all callers are updated.  For now,
> the io.c layer still assert()s that all callers are sector-aligned,
> but that can be relaxed when a later patch implements byte-based
> block status.  Therefore, for the most part this patch is just the
> addition of scaling at the callers followed by inverse scaling at
> bdrv_is_allocated().  But some code, particularly stream_run(),
> gets a lot simpler because it no longer has to mess with sectors.
> 
> For ease of review, bdrv_is_allocated() was tackled separately.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

> ---
> v2: tweak function comments, favor bdrv_getlength() over ->total_sectors
> ---
>  include/block/block.h |  2 +-
>  block/commit.c        | 20 ++++++++------------
>  block/io.c            | 42 ++++++++++++++++++++----------------------
>  block/mirror.c        |  5 ++++-
>  block/replication.c   | 17 ++++++++++++-----
>  block/stream.c        | 21 +++++++++------------
>  qemu-img.c            | 10 +++++++---
>  7 files changed, 61 insertions(+), 56 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 9b9d87b..13022d5 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -428,7 +428,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
>  int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
>                        int64_t *pnum);
>  int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
> -                            int64_t sector_num, int nb_sectors, int *pnum);
> +                            int64_t offset, int64_t bytes, int64_t *pnum);
> 
>  bool bdrv_is_read_only(BlockDriverState *bs);
>  bool bdrv_is_writable(BlockDriverState *bs);
> diff --git a/block/commit.c b/block/commit.c
> index 241aa95..774a8a5 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -146,7 +146,7 @@ static void coroutine_fn commit_run(void *opaque)
>      int64_t offset;
>      uint64_t delay_ns = 0;
>      int ret = 0;
> -    int n = 0; /* sectors */
> +    int64_t n = 0; /* bytes */
>      void *buf = NULL;
>      int bytes_written = 0;
>      int64_t base_len;
> @@ -171,7 +171,7 @@ static void coroutine_fn commit_run(void *opaque)
> 
>      buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
> 
> -    for (offset = 0; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
> +    for (offset = 0; offset < s->common.len; offset += n) {
>          bool copy;
> 
>          /* Note that even when no rate limit is applied we need to yield
> @@ -183,15 +183,12 @@ static void coroutine_fn commit_run(void *opaque)
>          }
>          /* Copy if allocated above the base */
>          ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
> -                                      offset / BDRV_SECTOR_SIZE,
> -                                      COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
> -                                      &n);
> +                                      offset, COMMIT_BUFFER_SIZE, &n);
>          copy = (ret == 1);
> -        trace_commit_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
> +        trace_commit_one_iteration(s, offset, n, ret);
>          if (copy) {
> -            ret = commit_populate(s->top, s->base, offset,
> -                                  n * BDRV_SECTOR_SIZE, buf);
> -            bytes_written += n * BDRV_SECTOR_SIZE;
> +            ret = commit_populate(s->top, s->base, offset, n, buf);
> +            bytes_written += n;
>          }
>          if (ret < 0) {
>              BlockErrorAction action =
> @@ -204,11 +201,10 @@ static void coroutine_fn commit_run(void *opaque)
>              }
>          }
>          /* Publish progress */
> -        s->common.offset += n * BDRV_SECTOR_SIZE;
> +        s->common.offset += n;
> 
>          if (copy && s->common.speed) {
> -            delay_ns = ratelimit_calculate_delay(&s->limit,
> -                                                 n * BDRV_SECTOR_SIZE);
> +            delay_ns = ratelimit_calculate_delay(&s->limit, n);
>          }
>      }
> 
> diff --git a/block/io.c b/block/io.c
> index 5bbf153..061a162 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1903,54 +1903,52 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
>  /*
>   * Given an image chain: ... -> [BASE] -> [INTER1] -> [INTER2] -> [TOP]
>   *
> - * Return true if the given sector is allocated in any image between
> - * BASE and TOP (inclusive).  BASE can be NULL to check if the given
> - * sector is allocated in any image of the chain.  Return false otherwise,
> + * Return true if the (prefix of the) given range is allocated in any image
> + * between BASE and TOP (inclusive).  BASE can be NULL to check if the given
> + * offset is allocated in any image of the chain.  Return false otherwise,
>   * or negative errno on failure.
>   *
> - * 'pnum' is set to the number of sectors (including and immediately following
> - *  the specified sector) that are known to be in the same
> - *  allocated/unallocated state.
> + * 'pnum' is set to the number of bytes (including and immediately
> + * following the specified offset) that are known to be in the same
> + * allocated/unallocated state.  Note that a subsequent call starting
> + * at 'offset + *pnum' may return the same allocation status (in other
> + * words, the result is not necessarily the maximum possible range);
> + * but 'pnum' will only be 0 when end of file is reached.
>   *
>   */
>  int bdrv_is_allocated_above(BlockDriverState *top,
>                              BlockDriverState *base,
> -                            int64_t sector_num,
> -                            int nb_sectors, int *pnum)
> +                            int64_t offset, int64_t bytes, int64_t *pnum)
>  {
>      BlockDriverState *intermediate;
> -    int ret, n = nb_sectors;
> +    int ret;
> +    int64_t n = bytes;
> 
>      intermediate = top;
>      while (intermediate && intermediate != base) {
>          int64_t pnum_inter;
>          int64_t size_inter;
> -        int psectors_inter;
> 
> -        ret = bdrv_is_allocated(intermediate, sector_num * BDRV_SECTOR_SIZE,
> -                                nb_sectors * BDRV_SECTOR_SIZE,
> -                                &pnum_inter);
> +        ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
>          if (ret < 0) {
>              return ret;
>          }
> -        assert(pnum_inter < INT_MAX * BDRV_SECTOR_SIZE);
> -        psectors_inter = pnum_inter >> BDRV_SECTOR_BITS;
>          if (ret) {
> -            *pnum = psectors_inter;
> +            *pnum = pnum_inter;
>              return 1;
>          }
> 
>          /*
> -         * [sector_num, nb_sectors] is unallocated on top but intermediate
> -         * might have [sector_num+x, nb_sectors-x] allocated.
> +         * [offset, bytes] is unallocated on top but intermediate
> +         * might have [offset+x, bytes-x] allocated.
>           */
> -        size_inter = bdrv_nb_sectors(intermediate);
> +        size_inter = bdrv_getlength(intermediate);
>          if (size_inter < 0) {
>              return size_inter;
>          }
> -        if (n > psectors_inter &&
> -            (intermediate == top || sector_num + psectors_inter < size_inter)) {
> -            n = psectors_inter;
> +        if (n > pnum_inter &&
> +            (intermediate == top || offset + pnum_inter < size_inter)) {
> +            n = pnum_inter;
>          }
> 
>          intermediate = backing_bs(intermediate);
> diff --git a/block/mirror.c b/block/mirror.c
> index 0eb2af4..8c3752b 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -621,6 +621,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>      BlockDriverState *bs = s->source;
>      BlockDriverState *target_bs = blk_bs(s->target);
>      int ret, n;
> +    int64_t count;
> 
>      end = s->bdev_length / BDRV_SECTOR_SIZE;
> 
> @@ -670,11 +671,13 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>              return 0;
>          }
> 
> -        ret = bdrv_is_allocated_above(bs, base, sector_num, nb_sectors, &n);
> +        ret = bdrv_is_allocated_above(bs, base, sector_num * BDRV_SECTOR_SIZE,
> +                                      nb_sectors * BDRV_SECTOR_SIZE, &count);
>          if (ret < 0) {
>              return ret;
>          }
> 
> +        n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
>          assert(n > 0);
>          if (ret == 1) {
>              bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
> diff --git a/block/replication.c b/block/replication.c
> index 8f3aba7..bf4462c 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -264,7 +264,8 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
>      BdrvChild *top = bs->file;
>      BdrvChild *base = s->secondary_disk;
>      BdrvChild *target;
> -    int ret, n;
> +    int ret;
> +    int64_t n;
> 
>      ret = replication_get_io_status(s);
>      if (ret < 0) {
> @@ -283,14 +284,20 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
>       */
>      qemu_iovec_init(&hd_qiov, qiov->niov);
>      while (remaining_sectors > 0) {
> -        ret = bdrv_is_allocated_above(top->bs, base->bs, sector_num,
> -                                      remaining_sectors, &n);
> +        int64_t count;
> +
> +        ret = bdrv_is_allocated_above(top->bs, base->bs,
> +                                      sector_num * BDRV_SECTOR_SIZE,
> +                                      remaining_sectors * BDRV_SECTOR_SIZE,
> +                                      &count);
>          if (ret < 0) {
>              goto out1;
>          }
> 
> +        assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
> +        n = count >> BDRV_SECTOR_BITS;
>          qemu_iovec_reset(&hd_qiov);
> -        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, n * BDRV_SECTOR_SIZE);
> +        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, count);
> 
>          target = ret ? top : base;
>          ret = bdrv_co_writev(target, sector_num, n, &hd_qiov);
> @@ -300,7 +307,7 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
> 
>          remaining_sectors -= n;
>          sector_num += n;
> -        bytes_done += n * BDRV_SECTOR_SIZE;
> +        bytes_done += count;
>      }
> 
>  out1:
> diff --git a/block/stream.c b/block/stream.c
> index 0a56c29..eb18557 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -112,7 +112,7 @@ static void coroutine_fn stream_run(void *opaque)
>      uint64_t delay_ns = 0;
>      int error = 0;
>      int ret = 0;
> -    int n = 0; /* sectors */
> +    int64_t n = 0; /* bytes */
>      void *buf;
> 
>      if (!bs->backing) {
> @@ -136,9 +136,8 @@ static void coroutine_fn stream_run(void *opaque)
>          bdrv_enable_copy_on_read(bs);
>      }
> 
> -    for ( ; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
> +    for ( ; offset < s->common.len; offset += n) {
>          bool copy;
> -        int64_t count = 0;
> 
>          /* Note that even when no rate limit is applied we need to yield
>           * with no pending I/O here so that bdrv_drain_all() returns.
> @@ -150,26 +149,25 @@ static void coroutine_fn stream_run(void *opaque)
> 
>          copy = false;
> 
> -        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &count);
> -        n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
> +        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &n);
>          if (ret == 1) {
>              /* Allocated in the top, no need to copy.  */
>          } else if (ret >= 0) {
>              /* Copy if allocated in the intermediate images.  Limit to the
>               * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE).  */
>              ret = bdrv_is_allocated_above(backing_bs(bs), base,
> -                                          offset / BDRV_SECTOR_SIZE, n, &n);
> +                                          offset, n, &n);
> 
>              /* Finish early if end of backing file has been reached */
>              if (ret == 0 && n == 0) {
> -                n = (s->common.len - offset) / BDRV_SECTOR_SIZE;
> +                n = s->common.len - offset;
>              }
> 
>              copy = (ret == 1);
>          }
> -        trace_stream_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
> +        trace_stream_one_iteration(s, offset, n, ret);
>          if (copy) {
> -            ret = stream_populate(blk, offset, n * BDRV_SECTOR_SIZE, buf);
> +            ret = stream_populate(blk, offset, n, buf);
>          }
>          if (ret < 0) {
>              BlockErrorAction action =
> @@ -188,10 +186,9 @@ static void coroutine_fn stream_run(void *opaque)
>          ret = 0;
> 
>          /* Publish progress */
> -        s->common.offset += n * BDRV_SECTOR_SIZE;
> +        s->common.offset += n;
>          if (copy && s->common.speed) {
> -            delay_ns = ratelimit_calculate_delay(&s->limit,
> -                                                 n * BDRV_SECTOR_SIZE);
> +            delay_ns = ratelimit_calculate_delay(&s->limit, n);
>          }
>      }
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index c5709bf..ca984e1 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1516,12 +1516,16 @@ static int img_compare(int argc, char **argv)
>          }
> 
>          for (;;) {
> +            int64_t count;
> +
>              nb_sectors = sectors_to_process(total_sectors_over, sector_num);
>              if (nb_sectors <= 0) {
>                  break;
>              }
> -            ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL, sector_num,
> -                                          nb_sectors, &pnum);
> +            ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL,
> +                                          sector_num * BDRV_SECTOR_SIZE,
> +                                          nb_sectors * BDRV_SECTOR_SIZE,
> +                                          &count);
>              if (ret < 0) {
>                  ret = 3;
>                  error_report("Sector allocation test failed for %s",
> @@ -1529,7 +1533,7 @@ static int img_compare(int argc, char **argv)
>                  goto out;
> 
>              }
> -            nb_sectors = pnum;
> +            nb_sectors = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
>              if (ret) {
>                  ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
>                                            filename_over, buf1, quiet);
> -- 
> 2.9.4
> 

      parent reply	other threads:[~2017-06-30 21:36 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-27 19:24 [Qemu-devel] [PATCH v3 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 01/20] blockjob: Track job ratelimits via bytes, not sectors Eric Blake
2017-06-30 19:42   ` Jeff Cody
2017-07-04 15:28   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 02/20] trace: Show blockjob actions " Eric Blake
2017-06-30 19:56   ` Jeff Cody
2017-07-04 15:28   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 03/20] stream: Switch stream_populate() to byte-based Eric Blake
2017-06-30 19:57   ` Jeff Cody
2017-07-04 15:28   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 04/20] stream: Switch stream_run() " Eric Blake
2017-06-30 20:01   ` Jeff Cody
2017-07-04 15:00   ` Kevin Wolf
2017-07-05 12:13     ` Eric Blake
2017-07-05 13:41       ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 05/20] commit: Switch commit_populate() " Eric Blake
2017-06-30 20:17   ` Jeff Cody
2017-07-04 15:28   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 06/20] commit: Switch commit_run() " Eric Blake
2017-06-30 20:19   ` Jeff Cody
2017-07-04 15:29   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 07/20] mirror: Switch MirrorBlockJob " Eric Blake
2017-06-30 20:20   ` Jeff Cody
2017-07-05 11:42   ` Kevin Wolf
2017-07-05 20:18     ` Eric Blake
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 08/20] mirror: Switch mirror_do_zero_or_discard() " Eric Blake
2017-06-30 20:22   ` Jeff Cody
2017-07-05 16:36   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 09/20] mirror: Update signature of mirror_clip_sectors() Eric Blake
2017-06-30 20:51   ` Jeff Cody
2017-07-05 16:37   ` Kevin Wolf
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 10/20] mirror: Switch mirror_cow_align() to byte-based Eric Blake
2017-06-30 21:03   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 11/20] mirror: Switch mirror_do_read() " Eric Blake
2017-06-30 21:07   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 12/20] mirror: Switch mirror_iteration() " Eric Blake
2017-06-30 21:14   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 13/20] block: Drop unused bdrv_round_sectors_to_clusters() Eric Blake
2017-06-30 21:17   ` [Qemu-devel] [Qemu-block] " Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 14/20] backup: Switch BackupBlockJob to byte-based Eric Blake
2017-06-30 21:18   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 15/20] backup: Switch block_backup.h " Eric Blake
2017-06-28  0:38   ` Xie Changlong
2017-06-30 21:23   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 16/20] backup: Switch backup_do_cow() " Eric Blake
2017-06-30 21:24   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 17/20] backup: Switch backup_run() " Eric Blake
2017-06-30 21:25   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 18/20] block: Make bdrv_is_allocated() byte-based Eric Blake
2017-06-28  9:14   ` Juan Quintela
2017-06-30 21:32   ` Jeff Cody
2017-07-03 18:22   ` Eric Blake
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 19/20] block: Minimize raw use of bds->total_sectors Eric Blake
2017-06-28  8:57   ` [Qemu-devel] [Qemu-block] " Manos Pitsidianakis
2017-06-30 21:34   ` Jeff Cody
2017-06-27 19:24 ` [Qemu-devel] [PATCH v3 20/20] block: Make bdrv_is_allocated_above() byte-based Eric Blake
2017-06-28  0:38   ` Xie Changlong
2017-06-30 21:36   ` Jeff Cody [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170630213630.GR4997@localhost.localdomain \
    --to=jcody@redhat.com \
    --cc=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=wencongyang2@huawei.com \
    --cc=xiechanglong.d@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.