qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
	kvm@vger.kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org,
	pair@us.ibm.com, zwu.kernel@gmail.com, ryanh@us.ibm.com
Subject: Re: [Qemu-devel] [PATCH v8 3/4] block: add block timer and throttling algorithm
Date: Fri, 23 Sep 2011 18:19:27 +0200	[thread overview]
Message-ID: <4E7CB18F.1080402@redhat.com> (raw)
In-Reply-To: <1315476668-19812-4-git-send-email-wuzhy@linux.vnet.ibm.com>

Am 08.09.2011 12:11, schrieb Zhi Yong Wu:
> Note:
>      1.) When bps/iops limits are specified to a small value such as 511 bytes/s, this VM will hang up. We are considering how to handle this senario.
>      2.) When "dd" command is issued in guest, if its option bs is set to a large value such as "bs=1024K", the result speed will slightly bigger than the limits.
> 
> For these problems, if you have nice thought, pls let us know.:)
> 
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> ---
>  block.c |  259 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  block.h |    1 -
>  2 files changed, 248 insertions(+), 12 deletions(-)

One general comment: What about synchronous and/or coroutine I/O
operations? Do you think they are just not important enough to consider
here or were they forgotten?

Also, do I understand correctly that you're always submitting the whole
queue at once? Does this effectively enforce the limit all the time or
will it lead to some peaks and then no requests at all for a while until
the average is right again?

Maybe some documentation on how it all works from a high level
perspective would be helpful.

> diff --git a/block.c b/block.c
> index cd75183..c08fde8 100644
> --- a/block.c
> +++ b/block.c
> @@ -30,6 +30,9 @@
>  #include "qemu-objects.h"
>  #include "qemu-coroutine.h"
>  
> +#include "qemu-timer.h"
> +#include "block/blk-queue.h"
> +
>  #ifdef CONFIG_BSD
>  #include <sys/types.h>
>  #include <sys/stat.h>
> @@ -72,6 +75,13 @@ static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
>                                           QEMUIOVector *iov);
>  static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs);
>  
> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
> +        bool is_write, double elapsed_time, uint64_t *wait);
> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
> +        double elapsed_time, uint64_t *wait);
> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
> +        bool is_write, int64_t *wait);
> +
>  static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
>      QTAILQ_HEAD_INITIALIZER(bdrv_states);
>  
> @@ -745,6 +755,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>              bs->change_cb(bs->change_opaque, CHANGE_MEDIA);
>      }
>  
> +    /* throttling disk I/O limits */
> +    if (bs->io_limits_enabled) {
> +        bdrv_io_limits_enable(bs);
> +    }
> +
>      return 0;
>  
>  unlink_and_fail:
> @@ -783,6 +798,18 @@ void bdrv_close(BlockDriverState *bs)
>          if (bs->change_cb)
>              bs->change_cb(bs->change_opaque, CHANGE_MEDIA);
>      }
> +
> +    /* throttling disk I/O limits */
> +    if (bs->block_queue) {
> +        qemu_del_block_queue(bs->block_queue);
> +        bs->block_queue = NULL;
> +    }
> +
> +    if (bs->block_timer) {
> +        qemu_del_timer(bs->block_timer);
> +        qemu_free_timer(bs->block_timer);
> +        bs->block_timer = NULL;
> +    }

Why not io_limits_disable() instead of copying the code here?

>  }
>  
>  void bdrv_close_all(void)
> @@ -2341,16 +2368,48 @@ BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
>                                   BlockDriverCompletionFunc *cb, void *opaque)
>  {
>      BlockDriver *drv = bs->drv;
> -
> +    BlockDriverAIOCB *ret;
> +    int64_t wait_time = -1;
> +printf("sector_num=%ld, nb_sectors=%d\n", sector_num, nb_sectors);

Debugging leftover (more of them follow, won't comment on each one)

>      trace_bdrv_aio_readv(bs, sector_num, nb_sectors, opaque);
>  
> -    if (!drv)
> -        return NULL;
> -    if (bdrv_check_request(bs, sector_num, nb_sectors))
> +    if (!drv || bdrv_check_request(bs, sector_num, nb_sectors)) {
>          return NULL;
> +    }

This part is unrelated.

> +
> +    /* throttling disk read I/O */
> +    if (bs->io_limits_enabled) {
> +        if (bdrv_exceed_io_limits(bs, nb_sectors, false, &wait_time)) {
> +            ret = qemu_block_queue_enqueue(bs->block_queue, bs, bdrv_aio_readv,
> +                           sector_num, qiov, nb_sectors, cb, opaque);
> +            printf("wait_time=%ld\n", wait_time);
> +            if (wait_time != -1) {
> +                printf("reset block timer\n");
> +                qemu_mod_timer(bs->block_timer,
> +                               wait_time + qemu_get_clock_ns(vm_clock));
> +            }
> +
> +            if (ret) {
> +                printf("ori ret is not null\n");
> +            } else {
> +                printf("ori ret is null\n");
> +            }
> +
> +            return ret;
> +        }
> +    }
>  
> -    return drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
> +    ret =  drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
>                                 cb, opaque);
> +    if (ret) {
> +        if (bs->io_limits_enabled) {
> +            bs->io_disps.bytes[BLOCK_IO_LIMIT_READ] +=
> +                              (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
> +            bs->io_disps.ios[BLOCK_IO_LIMIT_READ]++;
> +        }

I wonder if you can't reuse bs->nr_bytes/nr_ops instead of introducing a
second counting mechanism. Would have the advantage that numbers are
actually consistent (your metric counts slightly differently than the
existing info blockstats one).

> +    }
> +
> +    return ret;
>  }
>  
>  typedef struct BlockCompleteData {
> @@ -2396,15 +2455,14 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
>      BlockDriver *drv = bs->drv;
>      BlockDriverAIOCB *ret;
>      BlockCompleteData *blk_cb_data;
> +    int64_t wait_time = -1;
>  
>      trace_bdrv_aio_writev(bs, sector_num, nb_sectors, opaque);
>  
> -    if (!drv)
> -        return NULL;
> -    if (bs->read_only)
> -        return NULL;
> -    if (bdrv_check_request(bs, sector_num, nb_sectors))
> +    if (!drv || bs->read_only
> +        || bdrv_check_request(bs, sector_num, nb_sectors)) {
>          return NULL;
> +    }

Again, unrelated changes.

>  
>      if (bs->dirty_bitmap) {
>          blk_cb_data = blk_dirty_cb_alloc(bs, sector_num, nb_sectors, cb,
> @@ -2413,13 +2471,32 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
>          opaque = blk_cb_data;
>      }
>  
> +    /* throttling disk write I/O */
> +    if (bs->io_limits_enabled) {
> +        if (bdrv_exceed_io_limits(bs, nb_sectors, true, &wait_time)) {
> +            ret = qemu_block_queue_enqueue(bs->block_queue, bs, bdrv_aio_writev,
> +                                  sector_num, qiov, nb_sectors, cb, opaque);
> +            if (wait_time != -1) {
> +                qemu_mod_timer(bs->block_timer,
> +                               wait_time + qemu_get_clock_ns(vm_clock));
> +            }
> +
> +            return ret;
> +        }
> +    }

This looks very similar to the code in bdrv_aio_readv. Can it be moved
into a common function?

Kevin

  parent reply	other threads:[~2011-09-23 16:16 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-08 10:11 [Qemu-devel] [PATCH v8 0/4] The intro of QEMU block I/O throttling Zhi Yong Wu
2011-09-08 10:11 ` [Qemu-devel] [PATCH v8 1/4] block: add the block queue support Zhi Yong Wu
2011-09-23 15:32   ` Kevin Wolf
2011-09-26  8:01     ` Zhi Yong Wu
2011-10-17 10:17       ` Kevin Wolf
     [not found]         ` <4E9C00D2.1040004@redhat.com>
2011-10-18  7:00           ` Zhi Yong Wu
2011-10-18  8:07         ` Zhi Yong Wu
2011-10-18  8:36           ` Kevin Wolf
2011-10-18  9:29             ` Zhi Yong Wu
2011-10-18  9:56               ` Kevin Wolf
2011-10-18 13:29                 ` Zhi Yong Wu
2011-09-08 10:11 ` [Qemu-devel] [PATCH v8 2/4] block: add the command line support Zhi Yong Wu
2011-09-23 15:54   ` Kevin Wolf
2011-09-26  6:15     ` Zhi Yong Wu
2011-10-17 10:19       ` Kevin Wolf
2011-10-18  8:17         ` Zhi Yong Wu
2011-09-08 10:11 ` [Qemu-devel] [PATCH v8 3/4] block: add block timer and throttling algorithm Zhi Yong Wu
2011-09-09 14:44   ` Marcelo Tosatti
2011-09-13  3:09     ` Zhi Yong Wu
2011-09-14 10:50       ` Marcelo Tosatti
2011-09-19  9:55         ` Zhi Yong Wu
2011-09-20 12:34           ` Marcelo Tosatti
2011-09-21  3:14             ` Zhi Yong Wu
2011-09-21  5:54               ` Zhi Yong Wu
2011-09-21  7:03             ` Zhi Yong Wu
2011-09-26  8:15             ` Zhi Yong Wu
2011-09-23 16:19   ` Kevin Wolf [this message]
2011-09-26  7:24     ` Zhi Yong Wu
2011-10-17 10:26       ` Kevin Wolf
2011-10-17 15:54         ` Stefan Hajnoczi
2011-10-18  8:29           ` Zhi Yong Wu
2011-10-18  8:43         ` Zhi Yong Wu
2011-09-08 10:11 ` [Qemu-devel] [PATCH v8 4/4] qmp/hmp: add block_set_io_throttle Zhi Yong Wu
  -- strict thread matches above, loose matches on Subject: below --
2011-09-07 12:41 [Qemu-devel] [PATCH v8 3/4] block: add block timer and throttling algorithm Zhi Yong Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E7CB18F.1080402@redhat.com \
    --to=kwolf@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pair@us.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=ryanh@us.ibm.com \
    --cc=stefanha@linux.vnet.ibm.com \
    --cc=wuzhy@linux.vnet.ibm.com \
    --cc=zwu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).