* [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling
From: Zhi Yong Wu @ 2011-11-03  8:57 UTC
  To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha

The main goal of this series is to cap the disk I/O rate (bytes per second) or
operation count (I/Os per second) of a single VM. It is still a draft, so it
unavoidably has some drawbacks; if you catch any, please let me know.

The series introduces one block I/O throttling algorithm, one timer, and one
block queue for each drive that has I/O limits enabled. When a block request
comes in, the throttling algorithm checks whether its I/O rate or count exceeds
the limits; if so, the request is enqueued on the block queue, and the timer
later dispatches the queued requests.

The available features are:
(1) global bps limit.
   -drive bps=xxx            in bytes/s
(2) only read bps limit
   -drive bps_rd=xxx         in bytes/s
(3) only write bps limit
   -drive bps_wr=xxx         in bytes/s
(4) global iops limit
   -drive iops=xxx           in ios/s
(5) only read iops limit
   -drive iops_rd=xxx        in ios/s
(6) only write iops limit
   -drive iops_wr=xxx        in ios/s
(7) the combination of some limits.
   -drive bps=xxx,iops=xxx

Known Limitations:
(1) #1 can not coexist with #2, #3
(2) #4 can not coexist with #5, #6

Changes since code V11:
  Made some changes based on Kevin's comments.

 v11: Made some minimal changes based on Stefan's and Ryan's comments.
      Added one perf report for block I/O throttling.

 v10: Greatly simplified the logic and rebased the request queue onto CoQueue
      based on Stefan's comments.

 v9: Made a lot of changes based on Kevin's comments.
     slice_time is dynamically adjusted based on wait_time.
     Rebased onto the latest QEMU upstream.

 v8: Fixed the per-patch build based on Stefan's comments.

 v7: Mainly simplified the block queue.
     Adjusted the code based on Stefan's comments.

 v6: Mainly fixed the aio callback issue for the block queue.
     Adjusted the code based on Ram Pai's comments.

 v5: Added qmp/hmp support.
     Adjusted the code based on Stefan's comments.
     qmp/hmp: add block_set_io_throttle

 v4: Fixed a memory leak based on Ryan's feedback.

 v3: Added code for extending the slice time, and modified the method used to
     compute the wait time for the timer.

 v2: Second version of the code for QEMU disk I/O limits.
     Modified the code mainly based on Stefan's comments.

 v1: Submitted the code for QEMU disk I/O limits.

Zhi Yong Wu (5):
  block: add the blockio limits command line support
  CoQueue: introduce qemu_co_queue_wait_insert_head
  block: add I/O throttling algorithm
  hmp/qmp: add block_set_io_throttle
  block: perf testing report based on block I/O throttling

 10mbps.dat            |  310 ++++++++++++++++++++++++++++++++++++++++++++
 1mbps.dat             |  339 +++++++++++++++++++++++++++++++++++++++++++++++++
 block.c               |  274 +++++++++++++++++++++++++++++++++++++++
 block.h               |    5 +
 block_int.h           |   30 +++++
 blockdev.c            |  103 +++++++++++++++
 blockdev.h            |    2 +
 hmp-commands.hx       |   15 ++
 hmp.c                 |   10 ++
 qapi-schema.json      |   16 ++-
 qemu-config.c         |   24 ++++
 qemu-coroutine-lock.c |    8 +
 qemu-coroutine.h      |    6 +
 qemu-options.hx       |    1 +
 qerror.c              |    4 +
 qerror.h              |    3 +
 qmp-commands.hx       |   53 ++++++++-
 17 files changed, 1201 insertions(+), 2 deletions(-)
 create mode 100644 10mbps.dat
 create mode 100644 1mbps.dat

-- 
1.7.6
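The option rules in the cover letter (a total limit cannot be combined with the
read or write limit of the same kind, while bps and iops limits can be mixed)
can be illustrated with a small standalone sketch. This is not the patch code:
the names io_limits, kind_valid, limits_valid and effective_limit are
hypothetical, and the only assumption taken from the series is that a value of
0 means "no limit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-drive limits; 0 means "no limit". */
enum { LIMIT_READ = 0, LIMIT_WRITE = 1, LIMIT_TOTAL = 2 };

struct io_limits {
    int64_t bps[3];   /* bytes per second */
    int64_t iops[3];  /* operations per second */
};

/* A total limit may not be combined with a read or write limit of the
 * same kind (limitations (1) and (2) in the cover letter). */
static bool kind_valid(const int64_t v[3])
{
    return !(v[LIMIT_TOTAL] != 0 &&
             (v[LIMIT_READ] != 0 || v[LIMIT_WRITE] != 0));
}

/* bps and iops are checked independently, so they can be combined. */
static bool limits_valid(const struct io_limits *l)
{
    assert(l != NULL);
    return kind_valid(l->bps) && kind_valid(l->iops);
}

/* Pick the limit that applies to one request: the total limit if set,
 * otherwise the per-direction one. Returns 0 when the request is unlimited. */
static int64_t effective_limit(const int64_t v[3], bool is_write)
{
    if (v[LIMIT_TOTAL] != 0) {
        return v[LIMIT_TOTAL];
    }
    return v[is_write ? LIMIT_WRITE : LIMIT_READ];
}
```

With these helpers, "bps=1000000,iops_rd=100,iops_wr=200" would be accepted,
while "bps=1000,bps_rd=500" would be rejected.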
* [Qemu-devel] [PATCH v12 1/5] block: add the blockio limits command line support 2011-11-03 8:57 [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling Zhi Yong Wu @ 2011-11-03 8:57 ` Zhi Yong Wu 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 2/5] CoQueue: introduce qemu_co_queue_wait_insert_head Zhi Yong Wu ` (3 subsequent siblings) 4 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-03 8:57 UTC (permalink / raw) To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- block.c | 39 +++++++++++++++++++++++++++++++++++++++ block.h | 4 ++++ block_int.h | 29 +++++++++++++++++++++++++++++ blockdev.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ qemu-config.c | 24 ++++++++++++++++++++++++ qemu-options.hx | 1 + 6 files changed, 141 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 9bb236c..79e7f09 100644 --- a/block.c +++ b/block.c @@ -30,6 +30,7 @@ #include "qjson.h" #include "qemu-coroutine.h" #include "qmp-commands.h" +#include "qemu-timer.h" #ifdef CONFIG_BSD #include <sys/types.h> @@ -105,6 +106,36 @@ int is_windows_drive(const char *filename) } #endif +/* throttling disk I/O limits */ +static void bdrv_block_timer(void *opaque) +{ + BlockDriverState *bs = opaque; + + qemu_co_queue_next(&bs->throttled_reqs); +} + +void bdrv_io_limits_enable(BlockDriverState *bs) +{ + qemu_co_queue_init(&bs->throttled_reqs); + bs->block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs); + bs->slice_time = 5 * BLOCK_IO_SLICE_TIME; + bs->slice_start = qemu_get_clock_ns(vm_clock); + bs->slice_end = bs->slice_start + bs->slice_time; + memset(&bs->io_base, 0, sizeof(bs->io_base)); + bs->io_limits_enabled = true; +} + +bool bdrv_io_limits_enabled(BlockDriverState *bs) +{ + BlockIOLimit *io_limits = &bs->io_limits; + return io_limits->bps[BLOCK_IO_LIMIT_READ] + || io_limits->bps[BLOCK_IO_LIMIT_WRITE] + 
|| io_limits->bps[BLOCK_IO_LIMIT_TOTAL] + || io_limits->iops[BLOCK_IO_LIMIT_READ] + || io_limits->iops[BLOCK_IO_LIMIT_WRITE] + || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; +} + /* check if the path starts with "<protocol>:" */ static int path_has_protocol(const char *path) { @@ -1519,6 +1550,14 @@ void bdrv_get_geometry_hint(BlockDriverState *bs, *psecs = bs->secs; } +/* throttling disk io limits */ +void bdrv_set_io_limits(BlockDriverState *bs, + BlockIOLimit *io_limits) +{ + bs->io_limits = *io_limits; + bs->io_limits_enabled = bdrv_io_limits_enabled(bs); +} + /* Recognize floppy formats */ typedef struct FDFormat { FDriveType drive; diff --git a/block.h b/block.h index 38cd748..bc8315d 100644 --- a/block.h +++ b/block.h @@ -89,6 +89,10 @@ void bdrv_info(Monitor *mon, QObject **ret_data); void bdrv_stats_print(Monitor *mon, const QObject *data); void bdrv_info_stats(Monitor *mon, QObject **ret_data); +/* disk I/O throttling */ +void bdrv_io_limits_enable(BlockDriverState *bs); +bool bdrv_io_limits_enabled(BlockDriverState *bs); + void bdrv_init(void); void bdrv_init_with_whitelist(void); BlockDriver *bdrv_find_protocol(const char *filename); diff --git a/block_int.h b/block_int.h index f4547f6..7315e0d 100644 --- a/block_int.h +++ b/block_int.h @@ -34,6 +34,12 @@ #define BLOCK_FLAG_ENCRYPT 1 #define BLOCK_FLAG_COMPAT6 4 +#define BLOCK_IO_LIMIT_READ 0 +#define BLOCK_IO_LIMIT_WRITE 1 +#define BLOCK_IO_LIMIT_TOTAL 2 + +#define BLOCK_IO_SLICE_TIME 100000000 + #define BLOCK_OPT_SIZE "size" #define BLOCK_OPT_ENCRYPT "encryption" #define BLOCK_OPT_COMPAT6 "compat6" @@ -50,6 +56,16 @@ typedef struct AIOPool { BlockDriverAIOCB *free_aiocb; } AIOPool; +typedef struct BlockIOLimit { + int64_t bps[3]; + int64_t iops[3]; +} BlockIOLimit; + +typedef struct BlockIOBaseValue { + uint64_t bytes[2]; + uint64_t ios[2]; +} BlockIOBaseValue; + struct BlockDriver { const char *format_name; int instance_size; @@ -184,6 +200,16 @@ struct BlockDriverState { void *sync_aiocb; + /* the time 
for latest disk I/O */ + int64_t slice_time; + int64_t slice_start; + int64_t slice_end; + BlockIOLimit io_limits; + BlockIOBaseValue io_base; + CoQueue throttled_reqs; + QEMUTimer *block_timer; + bool io_limits_enabled; + /* I/O stats (display with "info blockstats"). */ uint64_t nr_bytes[BDRV_MAX_IOTYPE]; uint64_t nr_ops[BDRV_MAX_IOTYPE]; @@ -227,6 +253,9 @@ void *qemu_aio_get(AIOPool *pool, BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque); void qemu_aio_release(void *p); +void bdrv_set_io_limits(BlockDriverState *bs, + BlockIOLimit *io_limits); + #ifdef _WIN32 int is_windows_drive(const char *filename); #endif diff --git a/blockdev.c b/blockdev.c index 0827bf7..651828c 100644 --- a/blockdev.c +++ b/blockdev.c @@ -216,6 +216,26 @@ static int parse_block_error_action(const char *buf, int is_read) } } +static bool do_check_io_limits(BlockIOLimit *io_limits) +{ + bool bps_flag; + bool iops_flag; + + assert(io_limits); + + bps_flag = (io_limits->bps[BLOCK_IO_LIMIT_TOTAL] != 0) + && ((io_limits->bps[BLOCK_IO_LIMIT_READ] != 0) + || (io_limits->bps[BLOCK_IO_LIMIT_WRITE] != 0)); + iops_flag = (io_limits->iops[BLOCK_IO_LIMIT_TOTAL] != 0) + && ((io_limits->iops[BLOCK_IO_LIMIT_READ] != 0) + || (io_limits->iops[BLOCK_IO_LIMIT_WRITE] != 0)); + if (bps_flag || iops_flag) { + return false; + } + + return true; +} + DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) { const char *buf; @@ -235,6 +255,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) int on_read_error, on_write_error; const char *devaddr; DriveInfo *dinfo; + BlockIOLimit io_limits; int snapshot = 0; int ret; @@ -353,6 +374,26 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) } } + /* disk I/O throttling */ + io_limits.bps[BLOCK_IO_LIMIT_TOTAL] = + qemu_opt_get_number(opts, "bps", 0); + io_limits.bps[BLOCK_IO_LIMIT_READ] = + qemu_opt_get_number(opts, "bps_rd", 0); + io_limits.bps[BLOCK_IO_LIMIT_WRITE] = + qemu_opt_get_number(opts, "bps_wr", 0); + 
io_limits.iops[BLOCK_IO_LIMIT_TOTAL] = + qemu_opt_get_number(opts, "iops", 0); + io_limits.iops[BLOCK_IO_LIMIT_READ] = + qemu_opt_get_number(opts, "iops_rd", 0); + io_limits.iops[BLOCK_IO_LIMIT_WRITE] = + qemu_opt_get_number(opts, "iops_wr", 0); + + if (!do_check_io_limits(&io_limits)) { + error_report("bps(iops) and bps_rd/bps_wr(iops_rd/iops_wr) " + "cannot be used at the same time"); + return NULL; + } + on_write_error = BLOCK_ERR_STOP_ENOSPC; if ((buf = qemu_opt_get(opts, "werror")) != NULL) { if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type != IF_NONE) { @@ -460,6 +501,9 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error); + /* disk I/O throttling */ + bdrv_set_io_limits(dinfo->bdrv, &io_limits); + switch(type) { case IF_IDE: case IF_SCSI: diff --git a/qemu-config.c b/qemu-config.c index 597d7e1..1aa080f 100644 --- a/qemu-config.c +++ b/qemu-config.c @@ -85,6 +85,30 @@ static QemuOptsList qemu_drive_opts = { .name = "readonly", .type = QEMU_OPT_BOOL, .help = "open drive file as read-only", + },{ + .name = "iops", + .type = QEMU_OPT_NUMBER, + .help = "limit total I/O operations per second", + },{ + .name = "iops_rd", + .type = QEMU_OPT_NUMBER, + .help = "limit read operations per second", + },{ + .name = "iops_wr", + .type = QEMU_OPT_NUMBER, + .help = "limit write operations per second", + },{ + .name = "bps", + .type = QEMU_OPT_NUMBER, + .help = "limit total bytes per second", + },{ + .name = "bps_rd", + .type = QEMU_OPT_NUMBER, + .help = "limit read bytes per second", + },{ + .name = "bps_wr", + .type = QEMU_OPT_NUMBER, + .help = "limit write bytes per second", }, { /* end of list */ } }, diff --git a/qemu-options.hx b/qemu-options.hx index 681eaf1..25a7be7 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -136,6 +136,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive, " [,cache=writethrough|writeback|none|directsync|unsafe][,format=f]\n" " 
[,serial=s][,addr=A][,id=name][,aio=threads|native]\n" " [,readonly=on|off]\n" + " [[,bps=b]|[[,bps_rd=r][,bps_wr=w]]][[,iops=i]|[[,iops_rd=r][,iops_wr=w]]\n" " use 'file' as a drive image\n", QEMU_ARCH_ALL) STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] -- 1.7.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
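The patch above reads the six new -drive suboptions through
qemu_opt_get_number() with a default of 0. As a rough standalone illustration
of that mapping, and not of QEMU's option parser, the sketch below pulls the
same six keys out of a "key=value,key=value" string. The names throttle_opts,
throttle_field and parse_throttle_opts are hypothetical, and the sketch assumes
plain decimal values and no escaped commas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the six throttling suboptions; 0 = unset. */
struct throttle_opts {
    int64_t bps, bps_rd, bps_wr;
    int64_t iops, iops_rd, iops_wr;
};

/* Map an option name to its field, or NULL if it is not a throttling key. */
static int64_t *throttle_field(struct throttle_opts *o, const char *name)
{
    if (strcmp(name, "bps") == 0)     return &o->bps;
    if (strcmp(name, "bps_rd") == 0)  return &o->bps_rd;
    if (strcmp(name, "bps_wr") == 0)  return &o->bps_wr;
    if (strcmp(name, "iops") == 0)    return &o->iops;
    if (strcmp(name, "iops_rd") == 0) return &o->iops_rd;
    if (strcmp(name, "iops_wr") == 0) return &o->iops_wr;
    return NULL;
}

/* Parse "key=value,key=value". Keys that are not throttling options are
 * skipped. Returns false on a malformed or negative number, or if the
 * string does not fit the local buffer. */
static bool parse_throttle_opts(const char *optstr, struct throttle_opts *out)
{
    char buf[256];
    char *tok;

    assert(optstr != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    if (strlen(optstr) >= sizeof(buf)) {
        return false;
    }
    strcpy(buf, optstr);

    for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        char *eq = strchr(tok, '=');
        char *end;
        int64_t *field;
        long long val;

        if (eq == NULL) {
            continue;               /* flag-style option, not ours */
        }
        *eq = '\0';
        field = throttle_field(out, tok);
        if (field == NULL) {
            continue;               /* e.g. file=..., handled elsewhere */
        }
        val = strtoll(eq + 1, &end, 10);
        if (end == eq + 1 || *end != '\0' || val < 0) {
            return false;
        }
        *field = val;
    }
    return true;
}
```

A caller would still need to run the coexistence check from
do_check_io_limits() on the parsed values before applying them.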
* [Qemu-devel] [PATCH v12 2/5] CoQueue: introduce qemu_co_queue_wait_insert_head
From: Zhi Yong Wu @ 2011-11-03  8:57 UTC
  To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 qemu-coroutine-lock.c |    8 ++++++++
 qemu-coroutine.h      |    6 ++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
index 6b58160..9549c07 100644
--- a/qemu-coroutine-lock.c
+++ b/qemu-coroutine-lock.c
@@ -61,6 +61,14 @@ void coroutine_fn qemu_co_queue_wait(CoQueue *queue)
     assert(qemu_in_coroutine());
 }
 
+void coroutine_fn qemu_co_queue_wait_insert_head(CoQueue *queue)
+{
+    Coroutine *self = qemu_coroutine_self();
+    QTAILQ_INSERT_HEAD(&queue->entries, self, co_queue_next);
+    qemu_coroutine_yield();
+    assert(qemu_in_coroutine());
+}
+
 bool qemu_co_queue_next(CoQueue *queue)
 {
     Coroutine *next;
diff --git a/qemu-coroutine.h b/qemu-coroutine.h
index b8fc4f4..8a2e5d2 100644
--- a/qemu-coroutine.h
+++ b/qemu-coroutine.h
@@ -118,6 +118,12 @@ void qemu_co_queue_init(CoQueue *queue);
 void coroutine_fn qemu_co_queue_wait(CoQueue *queue);
 
 /**
+ * Adds the current coroutine to the head of the CoQueue and transfers control to the
+ * caller of the coroutine.
+ */
+void coroutine_fn qemu_co_queue_wait_insert_head(CoQueue *queue);
+
+/**
  * Restarts the next coroutine in the CoQueue and removes it from the queue.
  *
  * Returns true if a coroutine was restarted, false if the queue is empty.
-- 
1.7.6
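Why a head insert is useful can be shown without coroutines. The sketch below
is a plain singly linked wait queue, not the QTAILQ-based CoQueue: all names
(waiter, wait_queue, wq_push_tail, wq_push_head, wq_pop) are hypothetical. It
only demonstrates the ordering property the patch relies on: a waiter that is
woken, finds it still has to wait, and re-queues itself at the head is the next
one to be woken again, so FIFO order among the remaining waiters is preserved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter record; in QEMU this role is played by a Coroutine. */
struct waiter {
    int id;
    struct waiter *next;
};

struct wait_queue {
    struct waiter *head;
    struct waiter *tail;
};

static void wq_init(struct wait_queue *q)
{
    assert(q != NULL);
    q->head = q->tail = NULL;
}

/* Normal wait: join at the back of the queue. */
static void wq_push_tail(struct wait_queue *q, struct waiter *w)
{
    assert(q != NULL && w != NULL);
    w->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = w;
    } else {
        q->head = w;
    }
    q->tail = w;
}

/* Re-queue at the front so that w is the next waiter to be woken. */
static void wq_push_head(struct wait_queue *q, struct waiter *w)
{
    assert(q != NULL && w != NULL);
    w->next = q->head;
    q->head = w;
    if (q->tail == NULL) {
        q->tail = w;
    }
}

/* Remove and return the front waiter, or NULL if the queue is empty. */
static struct waiter *wq_pop(struct wait_queue *q)
{
    struct waiter *w;

    assert(q != NULL);
    w = q->head;
    if (w != NULL) {
        q->head = w->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        w->next = NULL;
    }
    return w;
}
```

If waiter 1 is popped and pushed back with wq_push_tail() instead, it would end
up behind waiters 2 and 3, which is the reordering the new function avoids.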
* [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm 2011-11-03 8:57 [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling Zhi Yong Wu 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 1/5] block: add the blockio limits command line support Zhi Yong Wu 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 2/5] CoQueue: introduce qemu_co_queue_wait_insert_head Zhi Yong Wu @ 2011-11-03 8:57 ` Zhi Yong Wu 2011-11-07 15:18 ` Kevin Wolf 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle Zhi Yong Wu 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling Zhi Yong Wu 4 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-03 8:57 UTC (permalink / raw) To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- block.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ block.h | 1 + block_int.h | 1 + 3 files changed, 222 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 79e7f09..b2af48f 100644 --- a/block.c +++ b/block.c @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, bool is_write); static void coroutine_fn bdrv_co_do_rw(void *opaque); +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, + double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, int64_t *wait); + static QTAILQ_HEAD(, BlockDriverState) bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states); @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) #endif /* throttling disk I/O limits */ +void bdrv_io_limits_disable(BlockDriverState *bs) +{ + bs->io_limits_enabled = false; + + while 
(qemu_co_queue_next(&bs->throttled_reqs)); + + if (bs->block_timer) { + qemu_del_timer(bs->block_timer); + qemu_free_timer(bs->block_timer); + bs->block_timer = NULL; + } + + bs->slice_start = 0; + bs->slice_end = 0; + bs->slice_time = 0; + memset(&bs->io_base, 0, sizeof(bs->io_base)); +} + static void bdrv_block_timer(void *opaque) { BlockDriverState *bs = opaque; @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; } +static void bdrv_io_limits_intercept(BlockDriverState *bs, + bool is_write, int nb_sectors) +{ + int64_t wait_time = -1; + + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { + qemu_co_queue_wait(&bs->throttled_reqs); + } + + /* In fact, we hope to keep each request's timing, in FIFO mode. The next + * throttled requests will not be dequeued until the current request is + * allowed to be serviced. So if the current request still exceeds the + * limits, it will be inserted to the head. All requests followed it will + * be still in throttled_reqs queue. 
+ */ + + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { + qemu_mod_timer(bs->block_timer, + wait_time + qemu_get_clock_ns(vm_clock)); + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); + } + + qemu_co_queue_next(&bs->throttled_reqs); +} + /* check if the path starts with "<protocol>:" */ static int path_has_protocol(const char *path) { @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, bdrv_dev_change_media_cb(bs, true); } + /* throttling disk I/O limits */ + if (bs->io_limits_enabled) { + bdrv_io_limits_enable(bs); + } + return 0; unlink_and_fail: @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) bdrv_dev_change_media_cb(bs, false); } + + /*throttling disk I/O limits*/ + if (bs->io_limits_enabled) { + bdrv_io_limits_disable(bs); + } } void bdrv_close_all(void) @@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, return -EIO; } + /* throttling disk read I/O */ + if (bs->io_limits_enabled) { + bdrv_io_limits_intercept(bs, false, nb_sectors); + } + return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); } @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, return -EIO; } + /* throttling disk write I/O */ + if (bs->io_limits_enabled) { + bdrv_io_limits_intercept(bs, true, nb_sectors); + } + ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); if (bs->dirty_bitmap) { @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) acb->pool->cancel(acb); } +/* block I/O throttling */ +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, double elapsed_time, uint64_t *wait) { + uint64_t bps_limit = 0; + double bytes_limit, bytes_base, bytes_res; + double slice_time, wait_time; + + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { + bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; + } else if (bs->io_limits.bps[is_write]) { + bps_limit = bs->io_limits.bps[is_write]; + } else { 
+ if (wait) { + *wait = 0; + } + + return false; + } + + slice_time = bs->slice_end - bs->slice_start; + slice_time /= (NANOSECONDS_PER_SECOND); + bytes_limit = bps_limit * slice_time; + bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write]; + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { + bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write]; + } + + bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE; + + if (bytes_base + bytes_res <= bytes_limit) { + if (wait) { + *wait = 0; + } + + return false; + } + + /* Calc approx time to dispatch */ + wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time; + + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; + bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; + if (wait) { + *wait = wait_time * BLOCK_IO_SLICE_TIME * 10; + } + + return true; +} + +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, + double elapsed_time, uint64_t *wait) { + uint64_t iops_limit = 0; + double ios_limit, ios_base; + double slice_time, wait_time; + + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { + iops_limit = bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; + } else if (bs->io_limits.iops[is_write]) { + iops_limit = bs->io_limits.iops[is_write]; + } else { + if (wait) { + *wait = 0; + } + + return false; + } + + slice_time = bs->slice_end - bs->slice_start; + slice_time /= (NANOSECONDS_PER_SECOND); + ios_limit = iops_limit * slice_time; + ios_base = bs->nr_ops[is_write] - bs->io_base.ios[is_write]; + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { + ios_base += bs->nr_ops[!is_write] - bs->io_base.ios[!is_write]; + } + + if (ios_base + 1 <= ios_limit) { + if (wait) { + *wait = 0; + } + + return false; + } + + /* Calc approx time to dispatch */ + wait_time = (ios_base + 1) / iops_limit; + if (wait_time > elapsed_time) { + wait_time = wait_time - elapsed_time; + } else { + wait_time = 0; + } + + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; + bs->slice_end += 
bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; + if (wait) { + *wait = wait_time * BLOCK_IO_SLICE_TIME * 10; + } + + return true; +} + +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, int64_t *wait) { + int64_t now, max_wait; + uint64_t bps_wait = 0, iops_wait = 0; + double elapsed_time; + int bps_ret, iops_ret; + + now = qemu_get_clock_ns(vm_clock); + if ((bs->slice_start < now) + && (bs->slice_end > now)) { + bs->slice_end = now + bs->slice_time; + } else { + bs->slice_time = 5 * BLOCK_IO_SLICE_TIME; + bs->slice_start = now; + bs->slice_end = now + bs->slice_time; + + bs->io_base.bytes[is_write] = bs->nr_bytes[is_write]; + bs->io_base.bytes[!is_write] = bs->nr_bytes[!is_write]; + + bs->io_base.ios[is_write] = bs->nr_ops[is_write]; + bs->io_base.ios[!is_write] = bs->nr_ops[!is_write]; + } + + elapsed_time = now - bs->slice_start; + elapsed_time /= (NANOSECONDS_PER_SECOND); + + bps_ret = bdrv_exceed_bps_limits(bs, nb_sectors, + is_write, elapsed_time, &bps_wait); + iops_ret = bdrv_exceed_iops_limits(bs, is_write, + elapsed_time, &iops_wait); + if (bps_ret || iops_ret) { + max_wait = bps_wait > iops_wait ? 
bps_wait : iops_wait; + if (wait) { + *wait = max_wait; + } + + now = qemu_get_clock_ns(vm_clock); + if (bs->slice_end < now + max_wait) { + bs->slice_end = now + max_wait; + } + + return true; + } + + if (wait) { + *wait = 0; + } + + return false; +} /**************************************************************/ /* async block device emulation */ diff --git a/block.h b/block.h index bc8315d..9b5b35f 100644 --- a/block.h +++ b/block.h @@ -91,6 +91,7 @@ void bdrv_info_stats(Monitor *mon, QObject **ret_data); /* disk I/O throttling */ void bdrv_io_limits_enable(BlockDriverState *bs); +void bdrv_io_limits_disable(BlockDriverState *bs); bool bdrv_io_limits_enabled(BlockDriverState *bs); void bdrv_init(void); diff --git a/block_int.h b/block_int.h index 7315e0d..69418fe 100644 --- a/block_int.h +++ b/block_int.h @@ -39,6 +39,7 @@ #define BLOCK_IO_LIMIT_TOTAL 2 #define BLOCK_IO_SLICE_TIME 100000000 +#define NANOSECONDS_PER_SECOND 1000000000.0 #define BLOCK_OPT_SIZE "size" #define BLOCK_OPT_ENCRYPT "encryption" -- 1.7.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
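The byte-rate check in bdrv_exceed_bps_limits() can be restated as a pure
function, which makes the arithmetic easier to follow. This is a minimal
sketch, not the patch code: bps_exceeded and its parameters are hypothetical
names, times are in seconds, and the clamp of a negative wait to zero is an
addition of this sketch (the patch clamps only in the iops path).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * bps_limit   bytes per second, 0 = unlimited
 * slice_s     length of the current slice in seconds
 * elapsed_s   time since the slice started, in seconds
 * bytes_done  bytes already accounted in this slice
 * req_bytes   size of the incoming request
 *
 * Returns true if dispatching the request now would exceed the slice
 * budget, and stores the estimated wait in seconds in *wait_s.
 */
static bool bps_exceeded(uint64_t bps_limit, double slice_s, double elapsed_s,
                         uint64_t bytes_done, uint64_t req_bytes,
                         double *wait_s)
{
    double budget, total, wait;

    assert(wait_s != NULL);
    *wait_s = 0.0;
    if (bps_limit == 0) {
        return false;                   /* no byte-rate limit configured */
    }

    budget = (double)bps_limit * slice_s;   /* bytes allowed in this slice */
    total = (double)bytes_done + (double)req_bytes;
    if (total <= budget) {
        return false;                   /* request fits, dispatch now */
    }

    /* Time at which `total` bytes are allowed at bps_limit, minus the
     * time that has already passed in this slice. */
    wait = total / (double)bps_limit - elapsed_s;
    if (wait < 0.0) {
        wait = 0.0;                     /* clamp added by this sketch */
    }
    *wait_s = wait;
    return true;
}
```

For example, with a 1 MiB/s limit, a 0.5 s slice, 0.25 s elapsed, 0.5 MiB
already done and a 0.25 MiB request, the total of 0.75 MiB is allowed at
t = 0.75 s, so the estimated wait is 0.75 - 0.25 = 0.5 s.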
* Re: [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm Zhi Yong Wu @ 2011-11-07 15:18 ` Kevin Wolf 2011-11-08 4:34 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Kevin Wolf @ 2011-11-07 15:18 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: zwu.kernel, ryanh, qemu-devel, stefanha Am 03.11.2011 09:57, schrieb Zhi Yong Wu: > Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > --- > block.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > block.h | 1 + > block_int.h | 1 + > 3 files changed, 222 insertions(+), 0 deletions(-) > > diff --git a/block.c b/block.c > index 79e7f09..b2af48f 100644 > --- a/block.c > +++ b/block.c > @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, > bool is_write); > static void coroutine_fn bdrv_co_do_rw(void *opaque); > > +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, > + bool is_write, double elapsed_time, uint64_t *wait); > +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, > + double elapsed_time, uint64_t *wait); > +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, > + bool is_write, int64_t *wait); > + > static QTAILQ_HEAD(, BlockDriverState) bdrv_states = > QTAILQ_HEAD_INITIALIZER(bdrv_states); > > @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) > #endif > > /* throttling disk I/O limits */ > +void bdrv_io_limits_disable(BlockDriverState *bs) > +{ > + bs->io_limits_enabled = false; > + > + while (qemu_co_queue_next(&bs->throttled_reqs)); > + > + if (bs->block_timer) { > + qemu_del_timer(bs->block_timer); > + qemu_free_timer(bs->block_timer); > + bs->block_timer = NULL; > + } > + > + bs->slice_start = 0; > + bs->slice_end = 0; > + bs->slice_time = 0; > + memset(&bs->io_base, 0, sizeof(bs->io_base)); > +} > + > static 
void bdrv_block_timer(void *opaque) > { > BlockDriverState *bs = opaque; > @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) > || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; > } > > +static void bdrv_io_limits_intercept(BlockDriverState *bs, > + bool is_write, int nb_sectors) > +{ > + int64_t wait_time = -1; > + > + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { > + qemu_co_queue_wait(&bs->throttled_reqs); > + } > + > + /* In fact, we hope to keep each request's timing, in FIFO mode. The next > + * throttled requests will not be dequeued until the current request is > + * allowed to be serviced. So if the current request still exceeds the > + * limits, it will be inserted to the head. All requests followed it will > + * be still in throttled_reqs queue. > + */ > + > + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { > + qemu_mod_timer(bs->block_timer, > + wait_time + qemu_get_clock_ns(vm_clock)); > + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); > + } > + > + qemu_co_queue_next(&bs->throttled_reqs); > +} > + > /* check if the path starts with "<protocol>:" */ > static int path_has_protocol(const char *path) > { > @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, > bdrv_dev_change_media_cb(bs, true); > } > > + /* throttling disk I/O limits */ > + if (bs->io_limits_enabled) { > + bdrv_io_limits_enable(bs); > + } > + > return 0; > > unlink_and_fail: > @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) > > bdrv_dev_change_media_cb(bs, false); > } > + > + /*throttling disk I/O limits*/ > + if (bs->io_limits_enabled) { > + bdrv_io_limits_disable(bs); > + } > } > > void bdrv_close_all(void) > @@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, > return -EIO; > } > > + /* throttling disk read I/O */ > + if (bs->io_limits_enabled) { > + bdrv_io_limits_intercept(bs, false, nb_sectors); > + } > + > return drv->bdrv_co_readv(bs, sector_num, 
nb_sectors, qiov);
>  }
>
> @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
>          return -EIO;
>      }
>
> +    /* throttling disk write I/O */
> +    if (bs->io_limits_enabled) {
> +        bdrv_io_limits_intercept(bs, true, nb_sectors);
> +    }
> +
>      ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
>
>      if (bs->dirty_bitmap) {
> @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb)
>      acb->pool->cancel(acb);
>  }
>
> +/* block I/O throttling */
> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
> +                 bool is_write, double elapsed_time, uint64_t *wait) {
> +    uint64_t bps_limit = 0;
> +    double bytes_limit, bytes_base, bytes_res;
> +    double slice_time, wait_time;
> +
> +    if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) {
> +        bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL];
> +    } else if (bs->io_limits.bps[is_write]) {
> +        bps_limit = bs->io_limits.bps[is_write];
> +    } else {
> +        if (wait) {
> +            *wait = 0;
> +        }
> +
> +        return false;
> +    }
> +
> +    slice_time = bs->slice_end - bs->slice_start;
> +    slice_time /= (NANOSECONDS_PER_SECOND);
> +    bytes_limit = bps_limit * slice_time;
> +    bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
> +    if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) {
> +        bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write];
> +    }
> +
> +    bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
> +
> +    if (bytes_base + bytes_res <= bytes_limit) {
> +        if (wait) {
> +            *wait = 0;
> +        }
> +
> +        return false;
> +    }
> +
> +    /* Calc approx time to dispatch */
> +    wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time;
> +
> +    bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10;
> +    bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME;
> +    if (wait) {
> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10;
> +    }

I'm not quite sure what bs->slice_end really is and what these
calculations do exactly. Looks like magic. Can you add some comments
that explain why slice_end is increased and how you estimate *wait?

> +
> +    return true;
> +}
> +
> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
> +                             double elapsed_time, uint64_t *wait) {

Coding style requires the brace on its own line.

> +    uint64_t iops_limit = 0;
> +    double ios_limit, ios_base;
> +    double slice_time, wait_time;
> +
> +    if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) {
> +        iops_limit = bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL];
> +    } else if (bs->io_limits.iops[is_write]) {
> +        iops_limit = bs->io_limits.iops[is_write];
> +    } else {
> +        if (wait) {
> +            *wait = 0;
> +        }
> +
> +        return false;
> +    }
> +
> +    slice_time = bs->slice_end - bs->slice_start;
> +    slice_time /= (NANOSECONDS_PER_SECOND);
> +    ios_limit = iops_limit * slice_time;
> +    ios_base = bs->nr_ops[is_write] - bs->io_base.ios[is_write];
> +    if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) {
> +        ios_base += bs->nr_ops[!is_write] - bs->io_base.ios[!is_write];
> +    }
> +
> +    if (ios_base + 1 <= ios_limit) {
> +        if (wait) {
> +            *wait = 0;
> +        }
> +
> +        return false;
> +    }
> +
> +    /* Calc approx time to dispatch */
> +    wait_time = (ios_base + 1) / iops_limit;
> +    if (wait_time > elapsed_time) {
> +        wait_time = wait_time - elapsed_time;
> +    } else {
> +        wait_time = 0;
> +    }
> +
> +    bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10;
> +    bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME;
> +    if (wait) {
> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10;
> +    }
> +
> +    return true;
> +}
> +
> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
> +                           bool is_write, int64_t *wait) {

Same here.

Kevin
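One part of the questioned arithmetic can be read directly from the constants
in the series: BLOCK_IO_SLICE_TIME is defined as 100000000 and the timers use
a nanosecond clock, so "wait_time * BLOCK_IO_SLICE_TIME * 10" multiplies a
value in seconds by 10^9, which is a seconds-to-nanoseconds conversion. The
sketch below spells that out for the iops wait estimate only. It is an
illustration under that reading, not an explanation from the patch author;
seconds_to_ns and iops_wait_ns are hypothetical names, and the slice_end
adjustment is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define SLICE_TIME_NS  100000000LL   /* value of BLOCK_IO_SLICE_TIME in the patch */
#define NS_PER_SECOND  1000000000LL  /* equals SLICE_TIME_NS * 10 */

/* Convert a non-negative wait in seconds to whole nanoseconds. */
static int64_t seconds_to_ns(double seconds)
{
    assert(seconds >= 0.0);
    return (int64_t)(seconds * (double)NS_PER_SECOND);
}

/*
 * Estimated wait, in nanoseconds, before one more operation is allowed.
 * ios_done is the number of operations already accounted in the slice and
 * elapsed_s the time since the slice started. A limit of 0 means unlimited.
 */
static int64_t iops_wait_ns(uint64_t iops_limit, uint64_t ios_done,
                            double elapsed_s)
{
    double wait_s;

    if (iops_limit == 0) {
        return 0;
    }
    /* Time at which (ios_done + 1) operations are allowed at iops_limit. */
    wait_s = (double)(ios_done + 1) / (double)iops_limit;
    /* Subtract the time already spent; never wait a negative amount. */
    wait_s = wait_s > elapsed_s ? wait_s - elapsed_s : 0.0;
    return seconds_to_ns(wait_s);
}
```

Naming the conversion constant, as NS_PER_SECOND does here, is one way to make
the "* BLOCK_IO_SLICE_TIME * 10" expression self-explanatory.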
* Re: [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm 2011-11-07 15:18 ` Kevin Wolf @ 2011-11-08 4:34 ` Zhi Yong Wu 2011-11-08 8:41 ` Kevin Wolf 0 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-08 4:34 UTC (permalink / raw) To: Kevin Wolf; +Cc: ryanh, Zhi Yong Wu, qemu-devel, stefanha On Mon, Nov 7, 2011 at 11:18 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >> --- >> block.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> block.h | 1 + >> block_int.h | 1 + >> 3 files changed, 222 insertions(+), 0 deletions(-) >> >> diff --git a/block.c b/block.c >> index 79e7f09..b2af48f 100644 >> --- a/block.c >> +++ b/block.c >> @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, >> bool is_write); >> static void coroutine_fn bdrv_co_do_rw(void *opaque); >> >> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, double elapsed_time, uint64_t *wait); >> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, >> + double elapsed_time, uint64_t *wait); >> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, int64_t *wait); >> + >> static QTAILQ_HEAD(, BlockDriverState) bdrv_states = >> QTAILQ_HEAD_INITIALIZER(bdrv_states); >> >> @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) >> #endif >> >> /* throttling disk I/O limits */ >> +void bdrv_io_limits_disable(BlockDriverState *bs) >> +{ >> + bs->io_limits_enabled = false; >> + >> + while (qemu_co_queue_next(&bs->throttled_reqs)); >> + >> + if (bs->block_timer) { >> + qemu_del_timer(bs->block_timer); >> + qemu_free_timer(bs->block_timer); >> + bs->block_timer = NULL; >> + } >> + >> + bs->slice_start = 0; >> + bs->slice_end = 0; >> + bs->slice_time = 0; >> + 
memset(&bs->io_base, 0, sizeof(bs->io_base)); >> +} >> + >> static void bdrv_block_timer(void *opaque) >> { >> BlockDriverState *bs = opaque; >> @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) >> || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; >> } >> >> +static void bdrv_io_limits_intercept(BlockDriverState *bs, >> + bool is_write, int nb_sectors) >> +{ >> + int64_t wait_time = -1; >> + >> + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { >> + qemu_co_queue_wait(&bs->throttled_reqs); >> + } >> + >> + /* In fact, we hope to keep each request's timing, in FIFO mode. The next >> + * throttled requests will not be dequeued until the current request is >> + * allowed to be serviced. So if the current request still exceeds the >> + * limits, it will be inserted to the head. All requests followed it will >> + * be still in throttled_reqs queue. >> + */ >> + >> + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { >> + qemu_mod_timer(bs->block_timer, >> + wait_time + qemu_get_clock_ns(vm_clock)); >> + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); >> + } >> + >> + qemu_co_queue_next(&bs->throttled_reqs); >> +} >> + >> /* check if the path starts with "<protocol>:" */ >> static int path_has_protocol(const char *path) >> { >> @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, >> bdrv_dev_change_media_cb(bs, true); >> } >> >> + /* throttling disk I/O limits */ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_enable(bs); >> + } >> + >> return 0; >> >> unlink_and_fail: >> @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) >> >> bdrv_dev_change_media_cb(bs, false); >> } >> + >> + /*throttling disk I/O limits*/ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_disable(bs); >> + } >> } >> >> void bdrv_close_all(void) >> @@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, >> return -EIO; >> } >> >> + /* throttling disk read I/O */ >> + if 
(bs->io_limits_enabled) { >> + bdrv_io_limits_intercept(bs, false, nb_sectors); >> + } >> + >> return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); >> } >> >> @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, >> return -EIO; >> } >> >> + /* throttling disk write I/O */ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_intercept(bs, true, nb_sectors); >> + } >> + >> ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); >> >> if (bs->dirty_bitmap) { >> @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) >> acb->pool->cancel(acb); >> } >> >> +/* block I/O throttling */ >> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, double elapsed_time, uint64_t *wait) { >> + uint64_t bps_limit = 0; >> + double bytes_limit, bytes_base, bytes_res; >> + double slice_time, wait_time; >> + >> + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { >> + bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; >> + } else if (bs->io_limits.bps[is_write]) { >> + bps_limit = bs->io_limits.bps[is_write]; >> + } else { >> + if (wait) { >> + *wait = 0; >> + } >> + >> + return false; >> + } >> + >> + slice_time = bs->slice_end - bs->slice_start; >> + slice_time /= (NANOSECONDS_PER_SECOND); >> + bytes_limit = bps_limit * slice_time; >> + bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write]; >> + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { >> + bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write]; >> + } >> + >> + bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE; >> + >> + if (bytes_base + bytes_res <= bytes_limit) { >> + if (wait) { >> + *wait = 0; >> + } >> + >> + return false; >> + } >> + >> + /* Calc approx time to dispatch */ >> + wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time; >> + >> + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; >> + bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; >> + if (wait) { >> + *wait = 
wait_time * BLOCK_IO_SLICE_TIME * 10;
>> +    }
>
> I'm not quite sure what bs->slice_end really is and what these
> calculations do exactly. Looks like magic. Can you add some comments
> that explain why slice_end is increased?

As you know, when the runtime I/O rate exceeds the limits,
bs->slice_end needs to be extended so that the current statistics
are kept until the timer fires. It is therefore increased, and the
amount was tuned based on experimental results.

> and how you estimate *wait?

The wait time is calculated based on the bps and iops history.

bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time;

1.) bytes_base is the number of bytes that have already been read/written;
    it is obtained from the historical statistics.
2.) bytes_res is the number of remaining bytes that need to be read/written.
3.) (bytes_base + bytes_res) / bps_limit is used to calculate the total
    time needed to finish reading/writing all of the data.

I'm not sure whether this is clear to you.

>
>> +
>> +    return true;
>> +}
>> +
>> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
>> +                             double elapsed_time, uint64_t *wait) {
>
> Coding style requires the brace on its own line.
> >> + uint64_t iops_limit = 0; >> + double ios_limit, ios_base; >> + double slice_time, wait_time; >> + >> + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { >> + iops_limit = bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; >> + } else if (bs->io_limits.iops[is_write]) { >> + iops_limit = bs->io_limits.iops[is_write]; >> + } else { >> + if (wait) { >> + *wait = 0; >> + } >> + >> + return false; >> + } >> + >> + slice_time = bs->slice_end - bs->slice_start; >> + slice_time /= (NANOSECONDS_PER_SECOND); >> + ios_limit = iops_limit * slice_time; >> + ios_base = bs->nr_ops[is_write] - bs->io_base.ios[is_write]; >> + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { >> + ios_base += bs->nr_ops[!is_write] - bs->io_base.ios[!is_write]; >> + } >> + >> + if (ios_base + 1 <= ios_limit) { >> + if (wait) { >> + *wait = 0; >> + } >> + >> + return false; >> + } >> + >> + /* Calc approx time to dispatch */ >> + wait_time = (ios_base + 1) / iops_limit; >> + if (wait_time > elapsed_time) { >> + wait_time = wait_time - elapsed_time; >> + } else { >> + wait_time = 0; >> + } >> + >> + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; >> + bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; >> + if (wait) { >> + *wait = wait_time * BLOCK_IO_SLICE_TIME * 10; >> + } >> + >> + return true; >> +} >> + >> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, int64_t *wait) { > > Same here. > > Kevin > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
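The wait-time estimate described in the reply above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the name bps_wait_seconds and the SECTOR_SIZE constant (assumed to be 512, standing in for BDRV_SECTOR_SIZE) are hypothetical, times are in seconds, and unlike the quoted bps code the sketch clamps a negative result to zero, as the iops variant does.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u    /* assumption: stands in for BDRV_SECTOR_SIZE */

/*
 * Estimate how long (in seconds) a request has to wait before the
 * byte-rate limit allows it. Returns 0.0 when it fits in the budget.
 *
 * bytes_base:   bytes already read/written (the historical statistics)
 * nb_sectors:   size of the incoming request, in sectors
 * bps_limit:    configured limit in bytes per second, must be > 0
 * slice_time:   length of the current slice, in seconds
 * elapsed_time: time since the slice started, in seconds
 */
static double bps_wait_seconds(double bytes_base, unsigned nb_sectors,
                               uint64_t bps_limit, double slice_time,
                               double elapsed_time)
{
    assert(bps_limit > 0);

    /* bytes_res: bytes the current request still needs to transfer. */
    double bytes_res = (double)nb_sectors * SECTOR_SIZE;
    double bytes_limit = (double)bps_limit * slice_time;

    if (bytes_base + bytes_res <= bytes_limit) {
        return 0.0;                 /* within the budget of this slice */
    }

    /* Total time to move all bytes at the limit, minus time already spent. */
    double wait = (bytes_base + bytes_res) / (double)bps_limit - elapsed_time;
    return wait > 0.0 ? wait : 0.0;
}
```

For example, with a 1000000 bytes/s limit, a 0.1 s slice, 90000 bytes already transferred and a 40-sector request arriving 0.05 s into the slice, the estimate is 110480 / 1000000 - 0.05, which is roughly 0.06 s.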
* Re: [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm 2011-11-08 4:34 ` Zhi Yong Wu @ 2011-11-08 8:41 ` Kevin Wolf 2011-11-08 8:57 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Kevin Wolf @ 2011-11-08 8:41 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: ryanh, Zhi Yong Wu, qemu-devel, stefanha Am 08.11.2011 05:34, schrieb Zhi Yong Wu: > On Mon, Nov 7, 2011 at 11:18 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >>> --- >>> block.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> block.h | 1 + >>> block_int.h | 1 + >>> 3 files changed, 222 insertions(+), 0 deletions(-) >>> >>> diff --git a/block.c b/block.c >>> index 79e7f09..b2af48f 100644 >>> --- a/block.c >>> +++ b/block.c >>> @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, >>> bool is_write); >>> static void coroutine_fn bdrv_co_do_rw(void *opaque); >>> >>> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >>> + bool is_write, double elapsed_time, uint64_t *wait); >>> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, >>> + double elapsed_time, uint64_t *wait); >>> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, >>> + bool is_write, int64_t *wait); >>> + >>> static QTAILQ_HEAD(, BlockDriverState) bdrv_states = >>> QTAILQ_HEAD_INITIALIZER(bdrv_states); >>> >>> @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) >>> #endif >>> >>> /* throttling disk I/O limits */ >>> +void bdrv_io_limits_disable(BlockDriverState *bs) >>> +{ >>> + bs->io_limits_enabled = false; >>> + >>> + while (qemu_co_queue_next(&bs->throttled_reqs)); >>> + >>> + if (bs->block_timer) { >>> + qemu_del_timer(bs->block_timer); >>> + qemu_free_timer(bs->block_timer); >>> + bs->block_timer = NULL; >>> + } >>> + >>> + 
bs->slice_start = 0; >>> + bs->slice_end = 0; >>> + bs->slice_time = 0; >>> + memset(&bs->io_base, 0, sizeof(bs->io_base)); >>> +} >>> + >>> static void bdrv_block_timer(void *opaque) >>> { >>> BlockDriverState *bs = opaque; >>> @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) >>> || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; >>> } >>> >>> +static void bdrv_io_limits_intercept(BlockDriverState *bs, >>> + bool is_write, int nb_sectors) >>> +{ >>> + int64_t wait_time = -1; >>> + >>> + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { >>> + qemu_co_queue_wait(&bs->throttled_reqs); >>> + } >>> + >>> + /* In fact, we hope to keep each request's timing, in FIFO mode. The next >>> + * throttled requests will not be dequeued until the current request is >>> + * allowed to be serviced. So if the current request still exceeds the >>> + * limits, it will be inserted to the head. All requests followed it will >>> + * be still in throttled_reqs queue. >>> + */ >>> + >>> + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { >>> + qemu_mod_timer(bs->block_timer, >>> + wait_time + qemu_get_clock_ns(vm_clock)); >>> + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); >>> + } >>> + >>> + qemu_co_queue_next(&bs->throttled_reqs); >>> +} >>> + >>> /* check if the path starts with "<protocol>:" */ >>> static int path_has_protocol(const char *path) >>> { >>> @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, >>> bdrv_dev_change_media_cb(bs, true); >>> } >>> >>> + /* throttling disk I/O limits */ >>> + if (bs->io_limits_enabled) { >>> + bdrv_io_limits_enable(bs); >>> + } >>> + >>> return 0; >>> >>> unlink_and_fail: >>> @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) >>> >>> bdrv_dev_change_media_cb(bs, false); >>> } >>> + >>> + /*throttling disk I/O limits*/ >>> + if (bs->io_limits_enabled) { >>> + bdrv_io_limits_disable(bs); >>> + } >>> } >>> >>> void bdrv_close_all(void) >>> @@ -1291,6 +1351,11 @@ 
static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, >>> return -EIO; >>> } >>> >>> + /* throttling disk read I/O */ >>> + if (bs->io_limits_enabled) { >>> + bdrv_io_limits_intercept(bs, false, nb_sectors); >>> + } >>> + >>> return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); >>> } >>> >>> @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, >>> return -EIO; >>> } >>> >>> + /* throttling disk write I/O */ >>> + if (bs->io_limits_enabled) { >>> + bdrv_io_limits_intercept(bs, true, nb_sectors); >>> + } >>> + >>> ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); >>> >>> if (bs->dirty_bitmap) { >>> @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) >>> acb->pool->cancel(acb); >>> } >>> >>> +/* block I/O throttling */ >>> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >>> + bool is_write, double elapsed_time, uint64_t *wait) { >>> + uint64_t bps_limit = 0; >>> + double bytes_limit, bytes_base, bytes_res; >>> + double slice_time, wait_time; >>> + >>> + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { >>> + bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; >>> + } else if (bs->io_limits.bps[is_write]) { >>> + bps_limit = bs->io_limits.bps[is_write]; >>> + } else { >>> + if (wait) { >>> + *wait = 0; >>> + } >>> + >>> + return false; >>> + } >>> + >>> + slice_time = bs->slice_end - bs->slice_start; >>> + slice_time /= (NANOSECONDS_PER_SECOND); >>> + bytes_limit = bps_limit * slice_time; >>> + bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write]; >>> + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { >>> + bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write]; >>> + } >>> + >>> + bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE; >>> + >>> + if (bytes_base + bytes_res <= bytes_limit) { >>> + if (wait) { >>> + *wait = 0; >>> + } >>> + >>> + return false; >>> + } >>> + >>> + /* Calc approx time to dispatch */ >>> + wait_time = (bytes_base + 
bytes_res) / bps_limit - elapsed_time;
>>> +
>>> +    bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10;
>>> +    bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME;
>>> +    if (wait) {
>>> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10;
>>> +    }
>>
>> I'm not quite sure what bs->slice_end really is and what these
>> calculations do exactly. Looks like magic. Can you add some comments
>> that explain why slice_end is increased?
> As you know, when the runtime I/O rate exceeds the limits,
> bs->slice_end needs to be extended so that the current statistics
> are kept until the timer fires. It is therefore increased, and the
> amount was tuned based on experimental results.
>
>> and how you estimate *wait?
> The wait time is calculated based on the bps and iops history.
>
> bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
> wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time;
>
> 1.) bytes_base is the number of bytes that have already been read/written;
>     it is obtained from the historical statistics.
> 2.) bytes_res is the number of remaining bytes that need to be read/written.
> 3.) (bytes_base + bytes_res) / bps_limit is used to calculate the total
>     time needed to finish reading/writing all of the data.
>
> I'm not sure whether this is clear to you.

Yes, I think this makes sense to me.

However, I don't understand why things like 10 * BLOCK_IO_SLICE_TIME or
3 * BLOCK_IO_SLICE_TIME appear in the code. These numbers are magic for
me. Are they more or less arbitrary values that happen to work well?

Kevin
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm 2011-11-08 8:41 ` Kevin Wolf @ 2011-11-08 8:57 ` Zhi Yong Wu 0 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-08 8:57 UTC (permalink / raw) To: Kevin Wolf; +Cc: ryanh, Zhi Yong Wu, qemu-devel, stefanha On Tue, Nov 8, 2011 at 4:41 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 08.11.2011 05:34, schrieb Zhi Yong Wu: >> On Mon, Nov 7, 2011 at 11:18 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >>>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >>>> --- >>>> block.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>> block.h | 1 + >>>> block_int.h | 1 + >>>> 3 files changed, 222 insertions(+), 0 deletions(-) >>>> >>>> diff --git a/block.c b/block.c >>>> index 79e7f09..b2af48f 100644 >>>> --- a/block.c >>>> +++ b/block.c >>>> @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, >>>> bool is_write); >>>> static void coroutine_fn bdrv_co_do_rw(void *opaque); >>>> >>>> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >>>> + bool is_write, double elapsed_time, uint64_t *wait); >>>> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, >>>> + double elapsed_time, uint64_t *wait); >>>> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, >>>> + bool is_write, int64_t *wait); >>>> + >>>> static QTAILQ_HEAD(, BlockDriverState) bdrv_states = >>>> QTAILQ_HEAD_INITIALIZER(bdrv_states); >>>> >>>> @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) >>>> #endif >>>> >>>> /* throttling disk I/O limits */ >>>> +void bdrv_io_limits_disable(BlockDriverState *bs) >>>> +{ >>>> + bs->io_limits_enabled = false; >>>> + >>>> + while (qemu_co_queue_next(&bs->throttled_reqs)); >>>> + >>>> + if (bs->block_timer) { >>>> + qemu_del_timer(bs->block_timer); >>>> + 
qemu_free_timer(bs->block_timer); >>>> + bs->block_timer = NULL; >>>> + } >>>> + >>>> + bs->slice_start = 0; >>>> + bs->slice_end = 0; >>>> + bs->slice_time = 0; >>>> + memset(&bs->io_base, 0, sizeof(bs->io_base)); >>>> +} >>>> + >>>> static void bdrv_block_timer(void *opaque) >>>> { >>>> BlockDriverState *bs = opaque; >>>> @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) >>>> || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; >>>> } >>>> >>>> +static void bdrv_io_limits_intercept(BlockDriverState *bs, >>>> + bool is_write, int nb_sectors) >>>> +{ >>>> + int64_t wait_time = -1; >>>> + >>>> + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { >>>> + qemu_co_queue_wait(&bs->throttled_reqs); >>>> + } >>>> + >>>> + /* In fact, we hope to keep each request's timing, in FIFO mode. The next >>>> + * throttled requests will not be dequeued until the current request is >>>> + * allowed to be serviced. So if the current request still exceeds the >>>> + * limits, it will be inserted to the head. All requests followed it will >>>> + * be still in throttled_reqs queue. 
>>>> + */ >>>> + >>>> + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { >>>> + qemu_mod_timer(bs->block_timer, >>>> + wait_time + qemu_get_clock_ns(vm_clock)); >>>> + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); >>>> + } >>>> + >>>> + qemu_co_queue_next(&bs->throttled_reqs); >>>> +} >>>> + >>>> /* check if the path starts with "<protocol>:" */ >>>> static int path_has_protocol(const char *path) >>>> { >>>> @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, >>>> bdrv_dev_change_media_cb(bs, true); >>>> } >>>> >>>> + /* throttling disk I/O limits */ >>>> + if (bs->io_limits_enabled) { >>>> + bdrv_io_limits_enable(bs); >>>> + } >>>> + >>>> return 0; >>>> >>>> unlink_and_fail: >>>> @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) >>>> >>>> bdrv_dev_change_media_cb(bs, false); >>>> } >>>> + >>>> + /*throttling disk I/O limits*/ >>>> + if (bs->io_limits_enabled) { >>>> + bdrv_io_limits_disable(bs); >>>> + } >>>> } >>>> >>>> void bdrv_close_all(void) >>>> @@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, >>>> return -EIO; >>>> } >>>> >>>> + /* throttling disk read I/O */ >>>> + if (bs->io_limits_enabled) { >>>> + bdrv_io_limits_intercept(bs, false, nb_sectors); >>>> + } >>>> + >>>> return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); >>>> } >>>> >>>> @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, >>>> return -EIO; >>>> } >>>> >>>> + /* throttling disk write I/O */ >>>> + if (bs->io_limits_enabled) { >>>> + bdrv_io_limits_intercept(bs, true, nb_sectors); >>>> + } >>>> + >>>> ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); >>>> >>>> if (bs->dirty_bitmap) { >>>> @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) >>>> acb->pool->cancel(acb); >>>> } >>>> >>>> +/* block I/O throttling */ >>>> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >>>> + bool is_write, 
double elapsed_time, uint64_t *wait) {
>>>> +    uint64_t bps_limit = 0;
>>>> +    double bytes_limit, bytes_base, bytes_res;
>>>> +    double slice_time, wait_time;
>>>> +
>>>> +    if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) {
>>>> +        bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL];
>>>> +    } else if (bs->io_limits.bps[is_write]) {
>>>> +        bps_limit = bs->io_limits.bps[is_write];
>>>> +    } else {
>>>> +        if (wait) {
>>>> +            *wait = 0;
>>>> +        }
>>>> +
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    slice_time = bs->slice_end - bs->slice_start;
>>>> +    slice_time /= (NANOSECONDS_PER_SECOND);
>>>> +    bytes_limit = bps_limit * slice_time;
>>>> +    bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
>>>> +    if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) {
>>>> +        bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write];
>>>> +    }
>>>> +
>>>> +    bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
>>>> +
>>>> +    if (bytes_base + bytes_res <= bytes_limit) {
>>>> +        if (wait) {
>>>> +            *wait = 0;
>>>> +        }
>>>> +
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    /* Calc approx time to dispatch */
>>>> +    wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time;
>>>> +
>>>> +    bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10;
>>>> +    bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME;
>>>> +    if (wait) {
>>>> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10;
>>>> +    }
>>>
>>> I'm not quite sure what bs->slice_end really is and what these
>>> calculations do exactly. Looks like magic. Can you add some comments
>>> that explain why slice_end is increased?
>> As you know, when the runtime I/O rate exceeds the limits,
>> bs->slice_end needs to be extended so that the current statistics
>> are kept until the timer fires. It is therefore increased, and the
>> amount was tuned based on experimental results.
>>
>>> and how you estimate *wait?
>> The wait time is calculated based on the bps and iops history.
>>
>> bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
>> wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time;
>>
>> 1.) bytes_base is the number of bytes that have already been read/written;
>>     it is obtained from the historical statistics.
>> 2.) bytes_res is the number of remaining bytes that need to be read/written.
>> 3.) (bytes_base + bytes_res) / bps_limit is used to calculate the total
>>     time needed to finish reading/writing all of the data.
>>
>> I'm not sure whether this is clear to you.
>
> Yes, I think this makes sense to me.
>
> However, I don't understand why things like 10 * BLOCK_IO_SLICE_TIME or

10 * BLOCK_IO_SLICE_TIME is used to convert a value in seconds to
nanoseconds; it is actually 1 s.

> 3 * BLOCK_IO_SLICE_TIME appear in the code. These numbers are magic for
> me. Are they more or less arbitrary values that happen to work well?

They are used to define the window size of one slice. The slice
determines how close the calculated runtime rate is to the real runtime
rate, so they are tunable values.

>
> Kevin
>

--
Regards,

Zhi Yong Wu
^ permalink raw reply [flat|nested] 15+ messages in thread
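The unit conversion discussed in this exchange can be made concrete with a short sketch. It assumes BLOCK_IO_SLICE_TIME is 100 ms expressed in nanoseconds, which is inferred from the statement that 10 * BLOCK_IO_SLICE_TIME equals 1 s; the helper names wait_seconds_to_ns and extend_slice_end are hypothetical and only restate the two expressions from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: one base slice of 100 ms, in nanoseconds (inferred from
 * "10 * BLOCK_IO_SLICE_TIME ... is actually 1 s"). */
#define BLOCK_IO_SLICE_TIME     100000000LL
#define NANOSECONDS_PER_SECOND  1000000000LL

/* Convert a wait time in seconds to nanoseconds, written the way the
 * patch writes it: wait_time * BLOCK_IO_SLICE_TIME * 10. */
static int64_t wait_seconds_to_ns(double wait_time)
{
    assert(wait_time >= 0.0);
    return (int64_t)(wait_time * BLOCK_IO_SLICE_TIME * 10);
}

/* Extend the end of the slice as the patch does: add the wait (in ns)
 * and subtract three base slices, the experimentally tuned constant. */
static int64_t extend_slice_end(int64_t slice_end, double wait_time)
{
    int64_t slice_time = wait_seconds_to_ns(wait_time);
    return slice_end + slice_time - 3 * BLOCK_IO_SLICE_TIME;
}
```

Under that assumption, writing the factor as NANOSECONDS_PER_SECOND instead of BLOCK_IO_SLICE_TIME * 10 would express the same conversion without a magic number, which is essentially what the review is asking about.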
* [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle 2011-11-03 8:57 [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling Zhi Yong Wu ` (2 preceding siblings ...) 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm Zhi Yong Wu @ 2011-11-03 8:57 ` Zhi Yong Wu 2011-11-07 15:26 ` Kevin Wolf 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling Zhi Yong Wu 4 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-03 8:57 UTC (permalink / raw) To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- block.c | 15 +++++++++++++ blockdev.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ blockdev.h | 2 + hmp-commands.hx | 15 +++++++++++++ hmp.c | 10 +++++++++ qapi-schema.json | 16 +++++++++++++- qerror.c | 4 +++ qerror.h | 3 ++ qmp-commands.hx | 53 +++++++++++++++++++++++++++++++++++++++++++++++- 9 files changed, 175 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index b2af48f..ed6fe20 100644 --- a/block.c +++ b/block.c @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) info->value->inserted->has_backing_file = true; info->value->inserted->backing_file = g_strdup(bs->backing_file); } + + if (bs->io_limits_enabled) { + info->value->inserted->bps = + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; + info->value->inserted->bps_rd = + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; + info->value->inserted->bps_wr = + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; + info->value->inserted->iops = + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; + info->value->inserted->iops_rd = + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; + info->value->inserted->iops_wr = + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; + } } /* XXX: waiting for the qapi to support GSList */ diff --git a/blockdev.c b/blockdev.c index 651828c..95d1faa 
100644 --- a/blockdev.c +++ b/blockdev.c @@ -757,6 +757,65 @@ int do_change_block(Monitor *mon, const char *device, return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); } +/* throttling disk I/O limits */ +int do_block_set_io_throttle(Monitor *mon, + const QDict *qdict, QObject **ret_data) +{ + BlockIOLimit io_limits; + const char *devname = qdict_get_str(qdict, "device"); + BlockDriverState *bs; + + io_limits.bps[BLOCK_IO_LIMIT_TOTAL] + = qdict_get_try_int(qdict, "bps", -1); + io_limits.bps[BLOCK_IO_LIMIT_READ] + = qdict_get_try_int(qdict, "bps_rd", -1); + io_limits.bps[BLOCK_IO_LIMIT_WRITE] + = qdict_get_try_int(qdict, "bps_wr", -1); + io_limits.iops[BLOCK_IO_LIMIT_TOTAL] + = qdict_get_try_int(qdict, "iops", -1); + io_limits.iops[BLOCK_IO_LIMIT_READ] + = qdict_get_try_int(qdict, "iops_rd", -1); + io_limits.iops[BLOCK_IO_LIMIT_WRITE] + = qdict_get_try_int(qdict, "iops_wr", -1); + + bs = bdrv_find(devname); + if (!bs) { + qerror_report(QERR_DEVICE_NOT_FOUND, devname); + return -1; + } + + if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1) + || (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1) + || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1) + || (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1) + || (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1) + || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) { + qerror_report(QERR_MISSING_PARAMETER, + "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr"); + return -1; + } + + if (!do_check_io_limits(&io_limits)) { + qerror_report(QERR_INVALID_PARAMETER_COMBINATION); + return -1; + } + + bs->io_limits = io_limits; + bs->slice_time = BLOCK_IO_SLICE_TIME; + + if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { + bdrv_io_limits_enable(bs); + } else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { + bdrv_io_limits_disable(bs); + } else { + if (bs->block_timer) { + qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); + } + } + + return 0; +} + int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) { const char *id = 
qdict_get_str(qdict, "id"); diff --git a/blockdev.h b/blockdev.h index 3587786..1b48a75 100644 --- a/blockdev.h +++ b/blockdev.h @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_change_block(Monitor *mon, const char *device, const char *filename, const char *fmt); int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_io_throttle(Monitor *mon, + const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); diff --git a/hmp-commands.hx b/hmp-commands.hx index 089c1ac..48f3c21 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1207,6 +1207,21 @@ ETEXI }, STEXI +@item block_set_io_throttle @var{device} @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} +@findex block_set_io_throttle +Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} +ETEXI + + { + .name = "block_set_io_throttle", + .args_type = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?", + .params = "device [bps] [bps_rd] [bps_wr] [iops] [iops_rd] [iops_wr]", + .help = "change I/O throttle limits for a block drive", + .user_print = monitor_user_noop, + .mhandler.cmd_new = do_block_set_io_throttle, + }, + +STEXI @item block_passwd @var{device} @var{password} @findex block_passwd Set the encrypted device @var{device} password to @var{password} diff --git a/hmp.c b/hmp.c index 443d3a7..dfab7ad 100644 --- a/hmp.c +++ b/hmp.c @@ -216,6 +216,16 @@ void hmp_info_block(Monitor *mon) info->value->inserted->ro, info->value->inserted->drv, info->value->inserted->encrypted); + + monitor_printf(mon, " bps=%" PRId64 " bps_rd=%" PRId64 + " bps_wr=%" PRId64 " iops=%" PRId64 + " iops_rd=%" PRId64 " iops_wr=%" PRId64, + info->value->inserted->bps, + info->value->inserted->bps_rd, + 
info->value->inserted->bps_wr,
+                            info->value->inserted->iops,
+                            info->value->inserted->iops_rd,
+                            info->value->inserted->iops_wr);
         } else {
             monitor_printf(mon, " [not inserted]");
         }
diff --git a/qapi-schema.json b/qapi-schema.json
index cb1ba77..734076b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -370,13 +370,27 @@
 #
 # @encrypted: true if the backing device is encrypted
 #
+# @bps: #optional if total throughput limit in bytes per second is specified
+#
+# @bps_rd: #optional if read throughput limit in bytes per second is specified
+#
+# @bps_wr: #optional if write throughput limit in bytes per second is specified
+#
+# @iops: #optional if total I/O operations per second is specified
+#
+# @iops_rd: #optional if read I/O operations per second is specified
+#
+# @iops_wr: #optional if write I/O operations per second is specified
+#
 # Since: 0.14.0
 #
 # Notes: This interface is only found in @BlockInfo.
 ##
 { 'type': 'BlockDeviceInfo',
   'data': { 'file': 'str', 'ro': 'bool', 'drv': 'str',
-            '*backing_file': 'str', 'encrypted': 'bool' } }
+            '*backing_file': 'str', 'encrypted': 'bool',
+            'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int',
+            'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int'} }

 ##
 # @BlockDeviceIoStatus:
diff --git a/qerror.c b/qerror.c
index 4b48b39..807fb55 100644
--- a/qerror.c
+++ b/qerror.c
@@ -238,6 +238,10 @@ static const QErrorStringTable qerror_table[] = {
         .error_fmt = QERR_QGA_COMMAND_FAILED,
         .desc      = "Guest agent command failed, error was '%(message)'",
     },
+    {
+        .error_fmt = QERR_INVALID_PARAMETER_COMBINATION,
+        .desc      = "Invalid parameter combination",
+    },
     {}
 };

diff --git a/qerror.h b/qerror.h
index d4bfcfd..777a36a 100644
--- a/qerror.h
+++ b/qerror.h
@@ -198,4 +198,7 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_QGA_COMMAND_FAILED \
     "{ 'class': 'QgaCommandFailed', 'data': { 'message': %s } }"

+#define QERR_INVALID_PARAMETER_COMBINATION \
+    "{ 'class': 'InvalidParameterCombination', 'data': {} }"
+
 #endif /* QERROR_H */
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 97975a5..cdc3c18 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -851,6 +851,44 @@ Example:
 EQMP

     {
+        .name       = "block_set_io_throttle",
+        .args_type  = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?",
+        .params     = "device [bps] [bps_rd] [bps_wr] [iops] [iops_rd] [iops_wr]",
+        .help       = "change I/O throttle limits for a block drive",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_block_set_io_throttle,
+    },
+
+SQMP
+block_set_io_throttle
+------------
+
+Change I/O throttle limits for a block drive.
+
+Arguments:
+
+- "device": device name (json-string)
+- "bps": total throughput limit in bytes per second (json-int, optional)
+- "bps_rd": read throughput limit in bytes per second (json-int, optional)
+- "bps_wr": write throughput limit in bytes per second (json-int, optional)
+- "iops": total I/O operations per second (json-int, optional)
+- "iops_rd": read I/O operations per second (json-int, optional)
+- "iops_wr": write I/O operations per second (json-int, optional)
+
+Example:
+
+-> { "execute": "block_set_io_throttle", "arguments": { "device": "virtio0",
+                                               "bps": 1000000,
+                                               "bps_rd": 0,
+                                               "bps_wr": 0,
+                                               "iops": 0,
+                                               "iops_rd": 0,
+                                               "iops_wr": 0 } }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "set_password",
         .args_type  = "protocol:s,password:s,connected:s?",
         .params     = "protocol password action-if-connected",
@@ -1152,6 +1190,13 @@ Each json-object contain the following:
                                 "tftp", "vdi", "vmdk", "vpc", "vvfat"
          - "backing_file": backing file name (json-string, optional)
          - "encrypted": true if encrypted, false otherwise (json-bool)
+         - "bps": limit total bytes per second (json-int)
+         - "bps_rd": limit read bytes per second (json-int)
+         - "bps_wr": limit write bytes per second (json-int)
+         - "iops": limit total I/O operations per second (json-int)
+         - "iops_rd": limit read operations per second (json-int)
+         - "iops_wr": limit write operations per second (json-int)
+
 - "io-status": I/O operation status, only present if the device supports it
                and the VM is configured to stop on errors. It's always reset
                to "ok" when the "cont" command is issued (json_string, optional)
@@ -1171,7 +1216,13 @@ Example:
                "ro":false,
                "drv":"qcow2",
                "encrypted":false,
-               "file":"disks/test.img"
+               "file":"disks/test.img",
+               "bps":1000000,
+               "bps_rd":0,
+               "bps_wr":0,
+               "iops":1000000,
+               "iops_rd":0,
+               "iops_wr":0
             },
             "type":"unknown"
          },
--
1.7.6
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle Zhi Yong Wu @ 2011-11-07 15:26 ` Kevin Wolf 2011-11-08 2:21 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Kevin Wolf @ 2011-11-07 15:26 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: zwu.kernel, ryanh, qemu-devel, stefanha Am 03.11.2011 09:57, schrieb Zhi Yong Wu: > Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > --- > block.c | 15 +++++++++++++ > blockdev.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ > blockdev.h | 2 + > hmp-commands.hx | 15 +++++++++++++ > hmp.c | 10 +++++++++ > qapi-schema.json | 16 +++++++++++++- > qerror.c | 4 +++ > qerror.h | 3 ++ > qmp-commands.hx | 53 +++++++++++++++++++++++++++++++++++++++++++++++- > 9 files changed, 175 insertions(+), 2 deletions(-) > > diff --git a/block.c b/block.c > index b2af48f..ed6fe20 100644 > --- a/block.c > +++ b/block.c > @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) > info->value->inserted->has_backing_file = true; > info->value->inserted->backing_file = g_strdup(bs->backing_file); > } > + > + if (bs->io_limits_enabled) { > + info->value->inserted->bps = > + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; > + info->value->inserted->bps_rd = > + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; > + info->value->inserted->bps_wr = > + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; > + info->value->inserted->iops = > + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; > + info->value->inserted->iops_rd = > + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; > + info->value->inserted->iops_wr = > + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; > + } > } > > /* XXX: waiting for the qapi to support GSList */ > diff --git a/blockdev.c b/blockdev.c > index 651828c..95d1faa 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -757,6 +757,65 @@ int do_change_block(Monitor *mon, const char 
*device,
>      return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
>  }
>
> +/* throttling disk I/O limits */
> +int do_block_set_io_throttle(Monitor *mon,
> +        const QDict *qdict, QObject **ret_data)
> +{
> +    BlockIOLimit io_limits;
> +    const char *devname = qdict_get_str(qdict, "device");
> +    BlockDriverState *bs;
> +
> +    io_limits.bps[BLOCK_IO_LIMIT_TOTAL]
> +        = qdict_get_try_int(qdict, "bps", -1);
> +    io_limits.bps[BLOCK_IO_LIMIT_READ]
> +        = qdict_get_try_int(qdict, "bps_rd", -1);
> +    io_limits.bps[BLOCK_IO_LIMIT_WRITE]
> +        = qdict_get_try_int(qdict, "bps_wr", -1);
> +    io_limits.iops[BLOCK_IO_LIMIT_TOTAL]
> +        = qdict_get_try_int(qdict, "iops", -1);
> +    io_limits.iops[BLOCK_IO_LIMIT_READ]
> +        = qdict_get_try_int(qdict, "iops_rd", -1);
> +    io_limits.iops[BLOCK_IO_LIMIT_WRITE]
> +        = qdict_get_try_int(qdict, "iops_wr", -1);
> +
> +    bs = bdrv_find(devname);
> +    if (!bs) {
> +        qerror_report(QERR_DEVICE_NOT_FOUND, devname);
> +        return -1;
> +    }
> +
> +    if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1)
> +        || (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1)
> +        || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1)
> +        || (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1)
> +        || (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1)
> +        || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) {
> +        qerror_report(QERR_MISSING_PARAMETER,
> +                      "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr");
> +        return -1;
> +    }

Here you require that all parameters are set...

> + > + if (!do_check_io_limits(&io_limits)) { > + qerror_report(QERR_INVALID_PARAMETER_COMBINATION); > + return -1; > + } > + > + bs->io_limits = io_limits; > + bs->slice_time = BLOCK_IO_SLICE_TIME; > + > + if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { > + bdrv_io_limits_enable(bs); > + } else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { > + bdrv_io_limits_disable(bs); > + } else { > + if (bs->block_timer) { > + qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); > + } > + } > + > + return 0; > +} > + > int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) > { > const char *id = qdict_get_str(qdict, "id"); > diff --git a/blockdev.h b/blockdev.h > index 3587786..1b48a75 100644 > --- a/blockdev.h > +++ b/blockdev.h > @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data); > int do_change_block(Monitor *mon, const char *device, > const char *filename, const char *fmt); > int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); > +int do_block_set_io_throttle(Monitor *mon, > + const QDict *qdict, QObject **ret_data); > int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); > int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); > > diff --git a/hmp-commands.hx b/hmp-commands.hx > index 089c1ac..48f3c21 100644 > --- a/hmp-commands.hx > +++ b/hmp-commands.hx > @@ -1207,6 +1207,21 @@ ETEXI > }, > > STEXI > +@item block_set_io_throttle @var{device} @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} > +@findex block_set_io_throttle > +Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} > +ETEXI > + > + { > + .name = "block_set_io_throttle", > + .args_type = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?", > + .params = "device [bps] [bps_rd] [bps_wr] [iops] [iops_rd] [iops_wr]", > + .help = "change I/O 
throttle limits for a block drive",
> +        .user_print = monitor_user_noop,
> +        .mhandler.cmd_new = do_block_set_io_throttle,
> +    },
> +

...but here....

> +STEXI
>  @item block_passwd @var{device} @var{password}
>  @findex block_passwd
>  Set the encrypted device @var{device} password to @var{password}
> diff --git a/hmp.c b/hmp.c
> index 443d3a7..dfab7ad 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -216,6 +216,16 @@ void hmp_info_block(Monitor *mon)
>                         info->value->inserted->ro,
>                         info->value->inserted->drv,
>                         info->value->inserted->encrypted);
> +
> +            monitor_printf(mon, " bps=%" PRId64 " bps_rd=%" PRId64
> +                            " bps_wr=%" PRId64 " iops=%" PRId64
> +                            " iops_rd=%" PRId64 " iops_wr=%" PRId64,
> +                            info->value->inserted->bps,
> +                            info->value->inserted->bps_rd,
> +                            info->value->inserted->bps_wr,
> +                            info->value->inserted->iops,
> +                            info->value->inserted->iops_rd,
> +                            info->value->inserted->iops_wr);
>          } else {
>              monitor_printf(mon, " [not inserted]");
>          }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index cb1ba77..734076b 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -370,13 +370,27 @@
>  #
>  # @encrypted: true if the backing device is encrypted
>  #
> +# @bps: #optional if total throughput limit in bytes per second is specified
> +#
> +# @bps_rd: #optional if read throughput limit in bytes per second is specified
> +#
> +# @bps_wr: #optional if write throughput limit in bytes per second is specified
> +#
> +# @iops: #optional if total I/O operations per second is specified
> +#
> +# @iops_rd: #optional if read I/O operations per second is specified
> +#
> +# @iops_wr: #optional if write I/O operations per second is specified
> +#
>  # Since: 0.14.0
>  #
>  # Notes: This interface is only found in @BlockInfo.
> ## > { 'type': 'BlockDeviceInfo', > 'data': { 'file': 'str', 'ro': 'bool', 'drv': 'str', > - '*backing_file': 'str', 'encrypted': 'bool' } } > + '*backing_file': 'str', 'encrypted': 'bool', > + 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int', > + 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int'} } > > ## > # @BlockDeviceIoStatus: > diff --git a/qerror.c b/qerror.c > index 4b48b39..807fb55 100644 > --- a/qerror.c > +++ b/qerror.c > @@ -238,6 +238,10 @@ static const QErrorStringTable qerror_table[] = { > .error_fmt = QERR_QGA_COMMAND_FAILED, > .desc = "Guest agent command failed, error was '%(message)'", > }, > + { > + .error_fmt = QERR_INVALID_PARAMETER_COMBINATION, > + .desc = "Invalid paramter combination", > + }, > {} > }; > > diff --git a/qerror.h b/qerror.h > index d4bfcfd..777a36a 100644 > --- a/qerror.h > +++ b/qerror.h > @@ -198,4 +198,7 @@ QError *qobject_to_qerror(const QObject *obj); > #define QERR_QGA_COMMAND_FAILED \ > "{ 'class': 'QgaCommandFailed', 'data': { 'message': %s } }" > > +#define QERR_INVALID_PARAMETER_COMBINATION \ > + "{ 'class': 'InvalidParameterCombination', 'data': {} }" > + > #endif /* QERROR_H */ > diff --git a/qmp-commands.hx b/qmp-commands.hx > index 97975a5..cdc3c18 100644 > --- a/qmp-commands.hx > +++ b/qmp-commands.hx > @@ -851,6 +851,44 @@ Example: > EQMP > > { > + .name = "block_set_io_throttle", > + .args_type = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?", > + .params = "device [bps] [bps_rd] [bps_wr] [iops] [iops_rd] [iops_wr]", > + .help = "change I/O throttle limits for a block drive", > + .user_print = monitor_user_noop, > + .mhandler.cmd_new = do_block_set_io_throttle, > + }, > + > +SQMP > +block_set_io_throttle > +------------ > + > +Change I/O throttle limits for a block drive. 
> +
> +Arguments:
> +
> +- "device": device name (json-string)
> +- "bps": total throughput limit in bytes per second(json-int, optional)
> +- "bps_rd": read throughput limit in bytes per second(json-int, optional)
> +- "bps_wr": read throughput limit in bytes per second(json-int, optional)
> +- "iops": total I/O operations per second(json-int, optional)
> +- "iops_rd": read I/O operations per second(json-int, optional)
> +- "iops_wr": write I/O operations per second(json-int, optional)

...and here they are described as optional. One part is wrong, though I
don't know which one.

Kevin

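A minimal sketch of the two checks under discussion, assuming the rule stated in the cover letter (a total limit cannot coexist with a read or write limit of the same kind) and the handler's use of -1 for an argument that was not supplied. IoLimits, check_limits() and the LIMIT_* indices are hypothetical stand-ins, not QEMU's BlockIOLimit or do_check_io_limits(), whose body is not shown in this thread. Treating 0 as "no limit" is also an assumption, taken from the QMP example in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's BLOCK_IO_LIMIT_* indices. */
enum { LIMIT_TOTAL, LIMIT_READ, LIMIT_WRITE, LIMIT_COUNT };

/* Simplified stand-in for BlockIOLimit. -1 means "argument not supplied";
 * 0 is assumed to mean "no limit". */
typedef struct {
    int64_t bps[LIMIT_COUNT];
    int64_t iops[LIMIT_COUNT];
} IoLimits;

typedef enum {
    LIMITS_OK,
    LIMITS_MISSING_PARAMETER,
    LIMITS_INVALID_COMBINATION
} LimitsStatus;

/* True if any of the three values still holds the -1 sentinel. */
static bool any_missing(const int64_t v[LIMIT_COUNT])
{
    for (size_t i = 0; i < LIMIT_COUNT; i++) {
        if (v[i] == -1) {
            return true;
        }
    }
    return false;
}

/* True if a total limit is combined with a read or write limit. */
static bool total_conflicts(const int64_t v[LIMIT_COUNT])
{
    return v[LIMIT_TOTAL] != 0 &&
           (v[LIMIT_READ] != 0 || v[LIMIT_WRITE] != 0);
}

/* Same order as the quoted handler: presence first, then combination. */
static LimitsStatus check_limits(const IoLimits *l)
{
    assert(l != NULL);
    if (any_missing(l->bps) || any_missing(l->iops)) {
        return LIMITS_MISSING_PARAMETER;
    }
    if (total_conflicts(l->bps) || total_conflicts(l->iops)) {
        return LIMITS_INVALID_COMBINATION;
    }
    return LIMITS_OK;
}
```

Under these assumptions, a request that sets only bps and passes 0 for the other five values is accepted, while leaving out any one argument is rejected as a missing parameter. That rejection is what conflicts with the "i?" markers and the "optional" wording in the documentation hunks.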
* Re: [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle 2011-11-07 15:26 ` Kevin Wolf @ 2011-11-08 2:21 ` Zhi Yong Wu 0 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-08 2:21 UTC (permalink / raw) To: Kevin Wolf; +Cc: ryanh, Zhi Yong Wu, qemu-devel, stefanha On Mon, Nov 7, 2011 at 11:26 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >> --- >> block.c | 15 +++++++++++++ >> blockdev.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> blockdev.h | 2 + >> hmp-commands.hx | 15 +++++++++++++ >> hmp.c | 10 +++++++++ >> qapi-schema.json | 16 +++++++++++++- >> qerror.c | 4 +++ >> qerror.h | 3 ++ >> qmp-commands.hx | 53 +++++++++++++++++++++++++++++++++++++++++++++++- >> 9 files changed, 175 insertions(+), 2 deletions(-) >> >> diff --git a/block.c b/block.c >> index b2af48f..ed6fe20 100644 >> --- a/block.c >> +++ b/block.c >> @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) >> info->value->inserted->has_backing_file = true; >> info->value->inserted->backing_file = g_strdup(bs->backing_file); >> } >> + >> + if (bs->io_limits_enabled) { >> + info->value->inserted->bps = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; >> + info->value->inserted->bps_rd = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; >> + info->value->inserted->bps_wr = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; >> + info->value->inserted->iops = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; >> + info->value->inserted->iops_rd = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; >> + info->value->inserted->iops_wr = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; >> + } >> } >> >> /* XXX: waiting for the qapi to support GSList */ >> diff --git a/blockdev.c b/blockdev.c >> index 651828c..95d1faa 100644 >> --- a/blockdev.c >> +++ b/blockdev.c >> @@ -757,6 +757,65 @@ int do_change_block(Monitor 
*mon, const char *device,
>>      return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
>>  }
>>
>> +/* throttling disk I/O limits */
>> +int do_block_set_io_throttle(Monitor *mon,
>> +        const QDict *qdict, QObject **ret_data)
>> +{
>> +    BlockIOLimit io_limits;
>> +    const char *devname = qdict_get_str(qdict, "device");
>> +    BlockDriverState *bs;
>> +
>> +    io_limits.bps[BLOCK_IO_LIMIT_TOTAL]
>> +        = qdict_get_try_int(qdict, "bps", -1);
>> +    io_limits.bps[BLOCK_IO_LIMIT_READ]
>> +        = qdict_get_try_int(qdict, "bps_rd", -1);
>> +    io_limits.bps[BLOCK_IO_LIMIT_WRITE]
>> +        = qdict_get_try_int(qdict, "bps_wr", -1);
>> +    io_limits.iops[BLOCK_IO_LIMIT_TOTAL]
>> +        = qdict_get_try_int(qdict, "iops", -1);
>> +    io_limits.iops[BLOCK_IO_LIMIT_READ]
>> +        = qdict_get_try_int(qdict, "iops_rd", -1);
>> +    io_limits.iops[BLOCK_IO_LIMIT_WRITE]
>> +        = qdict_get_try_int(qdict, "iops_wr", -1);
>> +
>> +    bs = bdrv_find(devname);
>> +    if (!bs) {
>> +        qerror_report(QERR_DEVICE_NOT_FOUND, devname);
>> +        return -1;
>> +    }
>> +
>> +    if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1)
>> +        || (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1)
>> +        || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1)
>> +        || (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1)
>> +        || (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1)
>> +        || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) {
>> +        qerror_report(QERR_MISSING_PARAMETER,
>> +                      "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr");
>> +        return -1;
>> +    }
>
> Here you require that all parameters are set...
This is what I want.
> >> + >> + if (!do_check_io_limits(&io_limits)) { >> + qerror_report(QERR_INVALID_PARAMETER_COMBINATION); >> + return -1; >> + } >> + >> + bs->io_limits = io_limits; >> + bs->slice_time = BLOCK_IO_SLICE_TIME; >> + >> + if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { >> + bdrv_io_limits_enable(bs); >> + } else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { >> + bdrv_io_limits_disable(bs); >> + } else { >> + if (bs->block_timer) { >> + qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); >> + } >> + } >> + >> + return 0; >> +} >> + >> int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) >> { >> const char *id = qdict_get_str(qdict, "id"); >> diff --git a/blockdev.h b/blockdev.h >> index 3587786..1b48a75 100644 >> --- a/blockdev.h >> +++ b/blockdev.h >> @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data); >> int do_change_block(Monitor *mon, const char *device, >> const char *filename, const char *fmt); >> int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); >> +int do_block_set_io_throttle(Monitor *mon, >> + const QDict *qdict, QObject **ret_data); >> int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); >> int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); >> >> diff --git a/hmp-commands.hx b/hmp-commands.hx >> index 089c1ac..48f3c21 100644 >> --- a/hmp-commands.hx >> +++ b/hmp-commands.hx >> @@ -1207,6 +1207,21 @@ ETEXI >> }, >> >> STEXI >> +@item block_set_io_throttle @var{device} @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} >> +@findex block_set_io_throttle >> +Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr} >> +ETEXI >> + >> + { >> + .name = "block_set_io_throttle", >> + .args_type = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?", >> + .params = "device [bps] [bps_rd] [bps_wr] 
[iops] [iops_rd] [iops_wr]",
>> +        .help = "change I/O throttle limits for a block drive",
>> +        .user_print = monitor_user_noop,
>> +        .mhandler.cmd_new = do_block_set_io_throttle,
>> +    },
>> +
>
> ...but here....
Sorry, I will update this.
>
>> +STEXI
>>  @item block_passwd @var{device} @var{password}
>>  @findex block_passwd
>>  Set the encrypted device @var{device} password to @var{password}
>> diff --git a/hmp.c b/hmp.c
>> index 443d3a7..dfab7ad 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -216,6 +216,16 @@ void hmp_info_block(Monitor *mon)
>>                         info->value->inserted->ro,
>>                         info->value->inserted->drv,
>>                         info->value->inserted->encrypted);
>> +
>> +            monitor_printf(mon, " bps=%" PRId64 " bps_rd=%" PRId64
>> +                            " bps_wr=%" PRId64 " iops=%" PRId64
>> +                            " iops_rd=%" PRId64 " iops_wr=%" PRId64,
>> +                            info->value->inserted->bps,
>> +                            info->value->inserted->bps_rd,
>> +                            info->value->inserted->bps_wr,
>> +                            info->value->inserted->iops,
>> +                            info->value->inserted->iops_rd,
>> +                            info->value->inserted->iops_wr);
>>          } else {
>>              monitor_printf(mon, " [not inserted]");
>>          }
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index cb1ba77..734076b 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -370,13 +370,27 @@
>>  #
>>  # @encrypted: true if the backing device is encrypted
>>  #
>> +# @bps: #optional if total throughput limit in bytes per second is specified
>> +#
>> +# @bps_rd: #optional if read throughput limit in bytes per second is specified
>> +#
>> +# @bps_wr: #optional if write throughput limit in bytes per second is specified
>> +#
>> +# @iops: #optional if total I/O operations per second is specified
>> +#
>> +# @iops_rd: #optional if read I/O operations per second is specified
>> +#
>> +# @iops_wr: #optional if write I/O operations per second is specified
>> +#
>>  # Since: 0.14.0
>>  #
>>  # Notes: This interface is only found in @BlockInfo.
>> ## >> { 'type': 'BlockDeviceInfo', >> 'data': { 'file': 'str', 'ro': 'bool', 'drv': 'str', >> - '*backing_file': 'str', 'encrypted': 'bool' } } >> + '*backing_file': 'str', 'encrypted': 'bool', >> + 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int', >> + 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int'} } >> >> ## >> # @BlockDeviceIoStatus: >> diff --git a/qerror.c b/qerror.c >> index 4b48b39..807fb55 100644 >> --- a/qerror.c >> +++ b/qerror.c >> @@ -238,6 +238,10 @@ static const QErrorStringTable qerror_table[] = { >> .error_fmt = QERR_QGA_COMMAND_FAILED, >> .desc = "Guest agent command failed, error was '%(message)'", >> }, >> + { >> + .error_fmt = QERR_INVALID_PARAMETER_COMBINATION, >> + .desc = "Invalid paramter combination", >> + }, >> {} >> }; >> >> diff --git a/qerror.h b/qerror.h >> index d4bfcfd..777a36a 100644 >> --- a/qerror.h >> +++ b/qerror.h >> @@ -198,4 +198,7 @@ QError *qobject_to_qerror(const QObject *obj); >> #define QERR_QGA_COMMAND_FAILED \ >> "{ 'class': 'QgaCommandFailed', 'data': { 'message': %s } }" >> >> +#define QERR_INVALID_PARAMETER_COMBINATION \ >> + "{ 'class': 'InvalidParameterCombination', 'data': {} }" >> + >> #endif /* QERROR_H */ >> diff --git a/qmp-commands.hx b/qmp-commands.hx >> index 97975a5..cdc3c18 100644 >> --- a/qmp-commands.hx >> +++ b/qmp-commands.hx >> @@ -851,6 +851,44 @@ Example: >> EQMP >> >> { >> + .name = "block_set_io_throttle", >> + .args_type = "device:B,bps:i?,bps_rd:i?,bps_wr:i?,iops:i?,iops_rd:i?,iops_wr:i?", >> + .params = "device [bps] [bps_rd] [bps_wr] [iops] [iops_rd] [iops_wr]", >> + .help = "change I/O throttle limits for a block drive", >> + .user_print = monitor_user_noop, >> + .mhandler.cmd_new = do_block_set_io_throttle, >> + }, >> + >> +SQMP >> +block_set_io_throttle >> +------------ >> + >> +Change I/O throttle limits for a block drive. 
>> +
>> +Arguments:
>> +
>> +- "device": device name (json-string)
>> +- "bps": total throughput limit in bytes per second(json-int, optional)
>> +- "bps_rd": read throughput limit in bytes per second(json-int, optional)
>> +- "bps_wr": read throughput limit in bytes per second(json-int, optional)
>> +- "iops": total I/O operations per second(json-int, optional)
>> +- "iops_rd": read I/O operations per second(json-int, optional)
>> +- "iops_wr": write I/O operations per second(json-int, optional)
>
> ...and here they are described as optional. One part is wrong, though I
> don't know which one.
This will be updated.
>
> Kevin
>

--
Regards,

Zhi Yong Wu

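For readers following the quoted do_block_set_io_throttle() hunk, the tail of the handler picks one of three actions after storing the new limits. The sketch below restates that branch as a pure function. ThrottleAction and pick_action() are hypothetical names; was_enabled stands for bs->io_limits_enabled, has_limits for the result of bdrv_io_limits_enabled(bs), and has_timer for bs->block_timer being non-NULL. What bdrv_io_limits_enabled() tests internally is not shown in this thread, so it is treated here as an opaque boolean.

```c
#include <stdbool.h>

/* Hypothetical labels for the outcomes of the quoted if/else chain. */
typedef enum {
    THROTTLE_ENABLE,      /* bdrv_io_limits_enable(bs)                     */
    THROTTLE_DISABLE,     /* bdrv_io_limits_disable(bs)                    */
    THROTTLE_REARM_TIMER, /* qemu_mod_timer(bs->block_timer, <now>)        */
    THROTTLE_NOTHING      /* state unchanged and no timer to re-arm        */
} ThrottleAction;

/* Restates the branch order of the quoted handler without side effects. */
static ThrottleAction pick_action(bool was_enabled, bool has_limits,
                                  bool has_timer)
{
    if (!was_enabled && has_limits) {
        return THROTTLE_ENABLE;
    }
    if (was_enabled && !has_limits) {
        return THROTTLE_DISABLE;
    }
    /* Enabled state did not change: only the timer may need a kick. */
    return has_timer ? THROTTLE_REARM_TIMER : THROTTLE_NOTHING;
}
```

In the quoted code the re-arm case sets the timer to the current vm_clock time, which presumably lets already queued requests be re-evaluated against the new limits right away rather than at the end of the old slice.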
* [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling 2011-11-03 8:57 [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling Zhi Yong Wu ` (3 preceding siblings ...) 2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle Zhi Yong Wu @ 2011-11-03 8:57 ` Zhi Yong Wu 2011-11-07 15:27 ` Kevin Wolf 4 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2011-11-03 8:57 UTC (permalink / raw) To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha The file 1mbps.dat is based on bps=1024*1024 I/O throttling; and the file 10mbps.dat is based on bps=10*1024*1024 I/O throttling. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> --- 10mbps.dat | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1mbps.dat | 339 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 649 insertions(+), 0 deletions(-) create mode 100644 10mbps.dat create mode 100644 1mbps.dat diff --git a/10mbps.dat b/10mbps.dat new file mode 100644 index 0000000..2ef419b --- /dev/null +++ b/10mbps.dat @@ -0,0 +1,310 @@ +test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/1,618K /s] [0/3K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3552 + write: io=51,200KB, bw=1,739KB/s, iops=3,478, runt= 29441msec + slat (usec): min=14, max=9,047, avg=22.07, stdev=66.92 + clat (usec): min=1, max=120K, avg=262.82, stdev=555.45 + lat (usec): min=213, max=120K, avg=285.49, stdev=571.78 + bw (KB/s) : min= 1223, max= 1847, per=100.11%, avg=1740.90, stdev=120.92 + cpu : usr=1.45%, sys=9.67%, ctx=102648, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/102400, short=0/0 + lat (usec): 2=0.01%, 4=0.01%, 
50=0.01%, 100=0.05%, 250=78.05% + lat (usec): 500=20.75%, 750=0.19%, 1000=0.36% + lat (msec): 2=0.53%, 4=0.02%, 10=0.02%, 20=0.01%, 50=0.02% + lat (msec): 250=0.01% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=1,739KB/s, minb=1,780KB/s, maxb=1,780KB/s, mint=29441msec, maxt=29441msec + +Disk stats (read/write): + dm-0: ios=0/102409, merge=0/0, ticks=0/30908, in_queue=30908, util=88.72%, aggrios=0/102435, aggrmerge=0/37, aggrticks=0/29427, aggrin_queue=29420, aggrutil=88.58% + vda: ios=0/102435, merge=0/37, ticks=0/29427, in_queue=29420, util=88.58% +test: (g=0): rw=write, bs=1K-1K/1K-1K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/3,289K /s] [0/3K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3560 + write: io=51,200KB, bw=3,482KB/s, iops=3,481, runt= 14705msec + slat (usec): min=14, max=16,837, avg=22.62, stdev=92.91 + clat (usec): min=1, max=36,828, avg=261.91, stdev=418.40 + lat (usec): min=214, max=38,143, avg=285.13, stdev=454.17 + bw (KB/s) : min= 3038, max= 3722, per=100.00%, avg=3481.03, stdev=219.52 + cpu : usr=1.86%, sys=9.47%, ctx=51325, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/51200, short=0/0 + lat (usec): 2=0.01%, 4=0.02%, 100=0.06%, 250=78.20%, 500=20.52% + lat (usec): 750=0.23%, 1000=0.27% + lat (msec): 2=0.61%, 4=0.01%, 10=0.04%, 20=0.01%, 50=0.02% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=3,481KB/s, minb=3,565KB/s, maxb=3,565KB/s, mint=14705msec, maxt=14705msec + +Disk stats (read/write): + dm-0: ios=0/51037, merge=0/0, ticks=0/13460, in_queue=13460, util=88.34%, aggrios=0/51210, aggrmerge=0/17, aggrticks=0/13221, aggrin_queue=13219, aggrutil=88.14% + vda: ios=0/51210, merge=0/17, ticks=0/13221, in_queue=13219, util=88.14% +test: (g=0): 
rw=write, bs=2K-2K/2K-2K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/7,353K /s] [0/4K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3567 + write: io=51,200KB, bw=6,847KB/s, iops=3,423, runt= 7478msec + slat (usec): min=14, max=16,906, avg=23.00, stdev=108.13 + clat (usec): min=1, max=34,159, avg=266.41, stdev=442.61 + lat (usec): min=217, max=37,680, avg=290.00, stdev=484.19 + bw (KB/s) : min= 5948, max= 7280, per=99.65%, avg=6821.93, stdev=441.31 + cpu : usr=1.78%, sys=9.44%, ctx=25657, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/25600, short=0/0 + lat (usec): 2=0.02%, 4=0.01%, 100=0.10%, 250=73.29%, 500=25.28% + lat (usec): 750=0.30%, 1000=0.29% + lat (msec): 2=0.60%, 4=0.02%, 10=0.06%, 50=0.02% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=6,846KB/s, minb=7,011KB/s, maxb=7,011KB/s, mint=7478msec, maxt=7478msec + +Disk stats (read/write): + dm-0: ios=0/25300, merge=0/0, ticks=0/6863, in_queue=6863, util=87.37%, aggrios=0/25604, aggrmerge=0/15, aggrticks=0/6678, aggrin_queue=6677, aggrutil=87.10% + vda: ios=0/25604, merge=0/15, ticks=0/6678, in_queue=6677, util=87.10% +test: (g=0): rw=write, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/12M /s] [0/3K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3574 + write: io=51,200KB, bw=10,445KB/s, iops=2,611, runt= 4902msec + slat (usec): min=14, max=1,084, avg=22.49, stdev=21.67 + clat (usec): min=1, max=501K, avg=357.64, stdev=6282.79 + lat (usec): min=220, max=501K, avg=380.76, stdev=6282.83 + bw (KB/s) : min= 4852, max=14016, per=104.91%, avg=10956.50, stdev=3750.77 + cpu : usr=1.20%, sys=7.45%, ctx=12836, majf=0, minf=24 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 
8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/12800, short=0/0 + lat (usec): 2=0.01%, 4=0.01%, 100=0.07%, 250=62.34%, 500=35.22% + lat (usec): 750=1.32%, 1000=0.28% + lat (msec): 2=0.63%, 4=0.02%, 10=0.05%, 50=0.03%, 750=0.02% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,444KB/s, minb=10,695KB/s, maxb=10,695KB/s, mint=4902msec, maxt=4902msec + +Disk stats (read/write): + dm-0: ios=0/12763, merge=0/0, ticks=0/4717, in_queue=4717, util=88.39%, aggrios=0/12805, aggrmerge=0/14, aggrticks=0/4504, aggrin_queue=4499, aggrutil=87.96% + vda: ios=0/12805, merge=0/14, ticks=0/4504, in_queue=4499, util=87.96% +test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/8,535K /s] [0/1K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3582 + write: io=51,200KB, bw=10,980KB/s, iops=1,372, runt= 4663msec + slat (usec): min=14, max=68,995, avg=34.01, stdev=862.56 + clat (usec): min=2, max=502K, avg=691.64, stdev=14005.68 + lat (usec): min=225, max=502K, avg=726.29, stdev=14044.20 + bw (KB/s) : min=10228, max=10250, per=93.29%, avg=10243.20, stdev= 9.96 + cpu : usr=0.82%, sys=3.82%, ctx=6412, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/6400, short=0/0 + lat (usec): 4=0.05%, 250=47.50%, 500=48.36%, 750=2.45%, 1000=0.72% + lat (msec): 2=0.55%, 4=0.11%, 10=0.16%, 20=0.02%, 50=0.02% + lat (msec): 750=0.08% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,980KB/s, minb=11,243KB/s, maxb=11,243KB/s, mint=4663msec, maxt=4663msec + +Disk stats (read/write): + dm-0: ios=0/6323, merge=0/0, ticks=0/9761, 
in_queue=9761, util=92.43%, aggrios=0/6405, aggrmerge=0/14, aggrticks=0/5332, aggrin_queue=5332, aggrutil=92.12% + vda: ios=0/6405, merge=0/14, ticks=0/5332, in_queue=5332, util=92.12% +test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [83.3% done] [0K/6,416K /s] [0/48 iops] [eta 00m:01s] +test: (groupid=0, jobs=1): err= 0: pid=3596 + write: io=51,200KB, bw=10,415KB/s, iops=81, runt= 4916msec + slat (usec): min=23, max=41,668, avg=139.95, stdev=2081.64 + clat (usec): min=699, max=543K, avg=12143.84, stdev=73600.48 + lat (usec): min=728, max=544K, avg=12284.44, stdev=73657.92 + bw (KB/s) : min=10153, max=10257, per=98.27%, avg=10234.12, stdev=34.61 + cpu : usr=0.12%, sys=0.26%, ctx=403, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/400, short=0/0 + lat (usec): 750=30.50%, 1000=18.00% + lat (msec): 2=44.25%, 4=1.00%, 10=1.50%, 20=1.00%, 50=1.75% + lat (msec): 750=2.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,414KB/s, minb=10,664KB/s, maxb=10,664KB/s, mint=4916msec, maxt=4916msec + +Disk stats (read/write): + dm-0: ios=0/410, merge=0/0, ticks=0/5494, in_queue=6013, util=97.73%, aggrios=0/405, aggrmerge=0/14, aggrticks=0/5105, aggrin_queue=5105, aggrutil=97.34% + vda: ios=0/405, merge=0/14, ticks=0/5105, in_queue=5105, util=97.34% +test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/14M /s] [0/55 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3603 + write: io=51,200KB, bw=10,919KB/s, iops=42, runt= 4689msec + slat (usec): min=32, max=9,026, avg=92.83, stdev=635.17 + clat (msec): min=1, max=592, avg=23.34, stdev=102.35 + lat (msec): min=1, max=592, avg=23.44, stdev=102.36 + bw 
(KB/s) : min=10158, max=10256, per=93.73%, avg=10234.29, stdev=35.87 + cpu : usr=0.02%, sys=0.23%, ctx=201, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/200, short=0/0 + + lat (msec): 2=75.00%, 4=9.50%, 10=3.50%, 20=2.00%, 50=6.50% + lat (msec): 750=3.50% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,919KB/s, minb=11,181KB/s, maxb=11,181KB/s, mint=4689msec, maxt=4689msec + +Disk stats (read/write): + dm-0: ios=0/202, merge=0/0, ticks=0/4249, in_queue=4841, util=97.78%, aggrios=0/204, aggrmerge=0/14, aggrticks=0/4755, aggrin_queue=4755, aggrutil=97.48% + vda: ios=0/204, merge=0/14, ticks=0/4755, in_queue=4755, util=97.48% +test: (g=0): rw=write, bs=512K-512K/512K-512K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/17M /s] [0/32 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3610 + write: io=51,200KB, bw=10,614KB/s, iops=20, runt= 4824msec + slat (usec): min=48, max=8,414, avg=155.42, stdev=834.45 + clat (msec): min=2, max=681, avg=48.08, stdev=150.68 + lat (msec): min=2, max=681, avg=48.23, stdev=150.67 + bw (KB/s) : min=10168, max=10257, per=96.36%, avg=10227.00, stdev=34.44 + cpu : usr=0.10%, sys=0.10%, ctx=101, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/100, short=0/0 + + lat (msec): 4=66.00%, 10=10.00%, 20=6.00%, 50=11.00%, 750=7.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,613KB/s, minb=10,868KB/s, maxb=10,868KB/s, mint=4824msec, maxt=4824msec + +Disk stats (read/write): + dm-0: ios=0/114, merge=0/0, ticks=0/4606, in_queue=5111, 
util=97.84%, aggrios=0/105, aggrmerge=0/14, aggrticks=0/5132, aggrin_queue=5132, aggrutil=97.67% + vda: ios=0/105, merge=0/14, ticks=0/5132, in_queue=5132, util=97.67% +test: (g=0): rw=write, bs=1M-1M/1M-1M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/9,427K /s] [0/8 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3617 + write: io=51,200KB, bw=10,910KB/s, iops=10, runt= 4693msec + slat (usec): min=88, max=8,609, avg=279.00, stdev=1202.17 + clat (msec): min=4, max=757, avg=93.56, stdev=212.43 + lat (msec): min=4, max=758, avg=93.84, stdev=212.38 + bw (KB/s) : min=10155, max=10254, per=93.76%, avg=10228.17, stdev=36.28 + cpu : usr=0.06%, sys=0.09%, ctx=51, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/50, short=0/0 + + lat (msec): 10=44.00%, 20=16.00%, 50=26.00%, 100=2.00%, 750=8.00% + lat (msec): 1000=4.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,909KB/s, minb=11,171KB/s, maxb=11,171KB/s, mint=4693msec, maxt=4693msec + +Disk stats (read/write): + dm-0: ios=0/112, merge=0/0, ticks=0/13254, in_queue=13303, util=97.89%, aggrios=0/105, aggrmerge=0/14, aggrticks=0/10142, aggrin_queue=10142, aggrutil=97.51% + vda: ios=0/105, merge=0/14, ticks=0/10142, in_queue=10142, util=97.51% +test: (g=0): rw=write, bs=2M-2M/2M-2M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/12M /s] [0/5 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3624 + write: io=51,200KB, bw=10,779KB/s, iops=5, runt= 4750msec + slat (usec): min=152, max=277, avg=193.96, stdev=36.43 + clat (msec): min=9, max=1,066, avg=189.77, stdev=330.87 + lat (msec): min=9, max=1,066, avg=189.97, stdev=330.85 + bw (KB/s) : min= 9652, max=10281, per=93.63%, avg=10091.80, stdev=260.97 + 
cpu : usr=0.04%, sys=0.11%, ctx=25, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/25, short=0/0 + + lat (msec): 10=8.00%, 20=12.00%, 50=44.00%, 100=16.00%, 750=8.00% + lat (msec): 1000=8.00%, 2000=4.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,778KB/s, minb=11,037KB/s, maxb=11,037KB/s, mint=4750msec, maxt=4750msec + +Disk stats (read/write): + dm-0: ios=0/108, merge=0/0, ticks=0/23308, in_queue=27376, util=97.80%, aggrios=0/103, aggrmerge=0/14, aggrticks=0/19660, aggrin_queue=19660, aggrutil=97.45% + vda: ios=0/103, merge=0/14, ticks=0/19660, in_queue=19660, util=97.45% +test: (g=0): rw=write, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/16M /s] [0/3 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3631 + write: io=53,248KB, bw=11,466KB/s, iops=2, runt= 4644msec + slat (usec): min=290, max=648, avg=405.77, stdev=95.74 + clat (msec): min=57, max=1,138, avg=356.70, stdev=444.78 + lat (msec): min=57, max=1,138, avg=357.14, stdev=444.76 + bw (KB/s) : min= 9869, max=12073, per=93.31%, avg=10698.50, stdev=983.37 + cpu : usr=0.00%, sys=0.13%, ctx=14, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/13, short=0/0 + + lat (msec): 100=69.23%, 1000=15.38%, 2000=15.38% + +Run status group 0 (all jobs): + WRITE: io=53,248KB, aggrb=11,465KB/s, minb=11,741KB/s, maxb=11,741KB/s, mint=4644msec, maxt=4644msec + +Disk stats (read/write): + dm-0: ios=0/122, merge=0/0, ticks=0/36036, in_queue=36537, util=97.69%, aggrios=0/108, aggrmerge=0/15, aggrticks=0/35504, 
aggrin_queue=35504, aggrutil=97.34% + vda: ios=0/108, merge=0/15, ticks=0/35504, in_queue=35504, util=97.34% +test: (g=0): rw=write, bs=10M-10M/10M-10M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [83.3% done] [0K/10M /s] [0/0 iops] [eta 00m:01s] +test: (groupid=0, jobs=1): err= 0: pid=3638 + write: io=51,200KB, bw=10,335KB/s, iops=1, runt= 4954msec + slat (usec): min=783, max=1,129, avg=949.80, stdev=131.70 + clat (msec): min=240, max=1,638, avg=989.72, stdev=505.16 + lat (msec): min=241, max=1,639, avg=990.68, stdev=505.19 + bw (KB/s) : min= 8835, max=11441, per=99.73%, avg=10306.75, stdev=1134.04 + cpu : usr=0.04%, sys=0.08%, ctx=5, majf=0, minf=23 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/5, short=0/0 + + lat (msec): 250=20.00%, 1000=20.00%, 2000=60.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=10,335KB/s, minb=10,583KB/s, maxb=10,583KB/s, mint=4954msec, maxt=4954msec + +Disk stats (read/write): + dm-0: ios=0/118, merge=0/0, ticks=0/77727, in_queue=100870, util=97.51%, aggrios=0/105, aggrmerge=0/14, aggrticks=0/94326, aggrin_queue=94326, aggrutil=97.27% + vda: ios=0/105, merge=0/14, ticks=0/94326, in_queue=94326, util=97.27% +test: (g=0): rw=write, bs=100M-100M/100M-100M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:05Jobs: 1 (f=0): [W] [100.0% done] [0K/9,373K /s] [0/0 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=3645 + write: io=100MB, bw=9,593KB/s, iops=0, runt= 10675msec + slat (usec): min=2,870K, max=2,870K, avg=2869613.00, stdev= 0.00 + clat (usec): min=7,402K, max=7,402K, avg=7401876.00, stdev= 0.00 + lat (usec): min=10,272K, max=10,272K, avg=10271526.00, stdev= 0.00 + bw (KB/s) : min= 9601, max= 9601, per=100.09%, 
avg=9601.00, stdev= 0.00 + cpu : usr=0.01%, sys=0.13%, ctx=81, majf=77, minf=1732 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/1, short=0/0 + + lat (msec): >=2000=100.00% + +Run status group 0 (all jobs): + WRITE: io=100MB, aggrb=9,592KB/s, minb=9,822KB/s, maxb=9,822KB/s, mint=10675msec, maxt=10675msec + +Disk stats (read/write): + dm-0: ios=1/71, merge=0/0, ticks=577/5370, in_queue=9436, util=7.23%, aggrios=78/426, aggrmerge=509/1616, aggrticks=3427/467317, aggrin_queue=470744, aggrutil=98.32% + vda: ios=78/426, merge=509/1616, ticks=3427/467317, in_queue=470744, util=98.32% +[root@f14 ~]# diff --git a/1mbps.dat b/1mbps.dat new file mode 100644 index 0000000..fc0d419 --- /dev/null +++ b/1mbps.dat @@ -0,0 +1,339 @@ +test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/1,503K /s] [0/3K iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=2577 + write: io=51,200KB, bw=943KB/s, iops=1,886, runt= 54271msec + slat (usec): min=19, max=515K, avg=35.74, stdev=1612.30 + clat (usec): min=1, max=998K, avg=421.89, stdev=8697.91 + lat (usec): min=215, max=998K, avg=458.32, stdev=8852.21 + bw (KB/s) : min= 0, max= 1729, per=126.35%, avg=1191.52, stdev=513.17 + cpu : usr=0.68%, sys=7.31%, ctx=102762, majf=0, minf=26 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/102400, short=0/0 + lat (usec): 2=0.02%, 4=0.01%, 100=0.07%, 250=63.47%, 500=35.06% + lat (usec): 750=0.34%, 1000=0.06% + lat (msec): 2=0.82%, 4=0.07%, 10=0.01%, 20=0.01%, 50=0.03% + lat (msec): 100=0.01%, 750=0.03%, 1000=0.01% + +Run status 
group 0 (all jobs): + WRITE: io=51,200KB, aggrb=943KB/s, minb=966KB/s, maxb=966KB/s, mint=54271msec, maxt=54271msec + +Disk stats (read/write): + dm-0: ios=13/101913, merge=0/0, ticks=4017/228247, in_queue=232254, util=89.99%, aggrios=13/102499, aggrmerge=0/90, aggrticks=4017/166813, aggrin_queue=170639, aggrutil=89.13% + vda: ios=13/102499, merge=0/90, ticks=4017/166813, in_queue=170639, util=89.13% +test: (g=0): rw=write, bs=1K-1K/1K-1K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [96.5% done] [0K/3,148K /s] [0/3K iops] [eta 00m:02s] +test: (groupid=0, jobs=1): err= 0: pid=2585 + write: io=51,200KB, bw=932KB/s, iops=932, runt= 54935msec + slat (usec): min=14, max=8,332, avg=23.73, stdev=105.25 + clat (usec): min=1, max=3,705K, avg=1035.27, stdev=37439.37 + lat (usec): min=216, max=3,705K, avg=1059.63, stdev=37439.88 + bw (KB/s) : min= 0, max= 3678, per=143.56%, avg=1337.98, stdev=958.76 + cpu : usr=0.53%, sys=2.55%, ctx=51352, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/51200, short=0/0 + lat (usec): 2=0.02%, 4=0.01%, 100=0.04%, 250=70.43%, 500=28.20% + lat (usec): 750=0.30%, 1000=0.28% + lat (msec): 2=0.52%, 4=0.03%, 10=0.01%, 20=0.04%, 50=0.05% + lat (msec): 100=0.01%, 750=0.08%, >=2000=0.01% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=932KB/s, minb=954KB/s, maxb=954KB/s, mint=54935msec, maxt=54935msec + +Disk stats (read/write): + dm-0: ios=0/50983, merge=0/0, ticks=0/213890, in_queue=213890, util=97.13%, aggrios=0/51271, aggrmerge=0/48, aggrticks=0/182123, aggrin_queue=182117, aggrutil=97.05% + vda: ios=0/51271, merge=0/48, ticks=0/182123, in_queue=182117, util=97.05% +test: (g=0): rw=write, bs=2K-2K/2K-2K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [94.9% done] [0K/405K /s] 
[0/197 iops] [eta 00m:03s] +test: (groupid=0, jobs=1): err= 0: pid=2593 + write: io=51,200KB, bw=920KB/s, iops=459, runt= 55678msec + slat (usec): min=14, max=33,442, avg=28.90, stdev=398.16 + clat (usec): min=1, max=3,903K, avg=1959.71, stdev=53216.81 + lat (usec): min=216, max=3,903K, avg=1989.27, stdev=53220.37 + bw (KB/s) : min= 61, max= 6460, per=133.31%, avg=1225.10, stdev=1134.07 + cpu : usr=0.24%, sys=1.32%, ctx=25668, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/25600, short=0/0 + lat (usec): 2=0.01%, 4=0.02%, 100=0.01%, 250=66.88%, 500=31.79% + lat (usec): 750=0.29%, 1000=0.20% + lat (msec): 2=0.47%, 4=0.02%, 10=0.01%, 20=0.02%, 50=0.06% + lat (msec): 100=0.01%, 750=0.21%, >=2000=0.02% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=919KB/s, minb=941KB/s, maxb=941KB/s, mint=55678msec, maxt=55678msec + +Disk stats (read/write): + dm-0: ios=0/25151, merge=0/0, ticks=0/282089, in_queue=282089, util=98.29%, aggrios=0/25659, aggrmerge=0/39, aggrticks=0/210880, aggrin_queue=210878, aggrutil=98.20% + vda: ios=0/25659, merge=0/39, ticks=0/210880, in_queue=210878, util=98.20% +test: (g=0): rw=write, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [91.7% done] [0K/0K /s] [0/0 iops] [eta 00m:05s] +test: (groupid=0, jobs=1): err= 0: pid=2600 + write: io=51,200KB, bw=941KB/s, iops=235, runt= 54417msec + slat (usec): min=13, max=6,612K, avg=552.76, stdev=58448.78 + clat (usec): min=1, max=5,042K, avg=3693.60, stdev=74145.30 + lat (usec): min=207, max=6,638K, avg=4247.00, stdev=94533.46 + bw (KB/s) : min= 0, max= 1033, per=104.01%, avg=977.65, stdev=193.31 + cpu : usr=0.17%, sys=0.62%, ctx=12836, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 
4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/12800, short=0/0 + lat (usec): 2=0.01%, 100=0.02%, 250=53.02%, 500=44.40%, 750=1.26% + lat (usec): 1000=0.12% + lat (msec): 2=0.40%, 4=0.03%, 20=0.12%, 50=0.16%, 750=0.42% + lat (msec): 2000=0.02%, >=2000=0.02% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=940KB/s, minb=963KB/s, maxb=963KB/s, mint=54417msec, maxt=54417msec + +Disk stats (read/write): + dm-0: ios=0/12078, merge=0/0, ticks=0/251022, in_queue=251022, util=99.21%, aggrios=0/12862, aggrmerge=0/43, aggrticks=0/222267, aggrin_queue=222264, aggrutil=99.14% + vda: ios=0/12862, merge=0/43, ticks=0/222267, in_queue=222264, util=99.14% +test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [93.2% done] [0K/0K /s] [0/0 iops] [eta 00m:04s] +test: (groupid=0, jobs=1): err= 0: pid=2607 + write: io=51,200KB, bw=941KB/s, iops=117, runt= 54416msec + slat (usec): min=15, max=33,439, avg=43.10, stdev=698.99 + clat (usec): min=2, max=4,040K, avg=7580.60, stdev=105355.37 + lat (usec): min=226, max=4,040K, avg=7624.37, stdev=105360.30 + bw (KB/s) : min= 2, max= 1032, per=101.86%, avg=957.53, stdev=213.27 + cpu : usr=0.09%, sys=0.33%, ctx=6421, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/6400, short=0/0 + lat (usec): 4=0.02%, 100=0.03%, 250=29.34%, 500=66.12%, 750=2.36% + lat (usec): 1000=0.36% + lat (msec): 2=0.33%, 4=0.03%, 10=0.02%, 20=0.28%, 50=0.22% + lat (msec): 100=0.02%, 750=0.75%, 2000=0.08%, >=2000=0.05% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=940KB/s, minb=963KB/s, maxb=963KB/s, mint=54416msec, maxt=54416msec + +Disk stats (read/write): + dm-0: ios=1/5969, 
merge=0/0, ticks=467/237432, in_queue=241910, util=99.71%, aggrios=1/6460, aggrmerge=0/43, aggrticks=467/222411, aggrin_queue=222878, aggrutil=99.64% + vda: ios=1/6460, merge=0/43, ticks=467/222411, in_queue=222878, util=99.64% +test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [98.3% done] [0K/0K /s] [0/0 iops] [eta 00m:01s] +test: (groupid=0, jobs=1): err= 0: pid=2615 + write: io=51,200KB, bw=902KB/s, iops=14, runt= 56779msec + slat (usec): min=19, max=7,124K, avg=9080.96, stdev=251884.21 + clat (usec): min=308, max=5,645K, avg=61883.27, stdev=394928.06 + lat (usec): min=449, max=7,150K, avg=70965.13, stdev=467716.04 + bw (KB/s) : min= 8, max= 1025, per=107.75%, avg=970.81, stdev=204.00 + cpu : usr=0.01%, sys=0.05%, ctx=811, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/800, short=0/0 + lat (usec): 500=54.87%, 750=35.88%, 1000=0.88% + lat (msec): 2=0.25%, 4=0.38%, 10=0.25%, 20=1.12%, 50=1.88% + lat (msec): 100=0.12%, 750=1.12%, 1000=1.00%, 2000=1.88%, >=2000=0.38% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=901KB/s, minb=923KB/s, maxb=923KB/s, mint=56779msec, maxt=56779msec + +Disk stats (read/write): + dm-0: ios=0/866, merge=0/0, ticks=0/223017, in_queue=228660, util=99.93%, aggrios=0/862, aggrmerge=0/41, aggrticks=0/203954, aggrin_queue=203954, aggrutil=99.91% + vda: ios=0/862, merge=0/41, ticks=0/203954, in_queue=203954, util=99.91% +test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [98.3% done] [0K/0K /s] [0/0 iops] [eta 00m:01s] +test: (groupid=0, jobs=1): err= 0: pid=2622 + write: io=51,200KB, bw=898KB/s, iops=7, runt= 57016msec + slat (usec): min=23, max=8,082K, avg=20524.64, stdev=404090.96 + clat 
(usec): min=582, max=5,767K, avg=121981.91, stdev=585587.60 + lat (usec): min=725, max=8,200K, avg=142507.73, stdev=711337.27 + bw (KB/s) : min= 15, max= 1026, per=109.47%, avg=981.96, stdev=193.76 + cpu : usr=0.01%, sys=0.03%, ctx=412, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/400, short=0/0 + lat (usec): 750=42.75%, 1000=24.00% + lat (msec): 2=20.75%, 10=0.50%, 20=0.75%, 50=4.50%, 250=0.25% + lat (msec): 750=0.75%, 1000=1.25%, 2000=3.50%, >=2000=1.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=897KB/s, minb=919KB/s, maxb=919KB/s, mint=57016msec, maxt=57016msec + +Disk stats (read/write): + dm-0: ios=0/492, merge=0/0, ticks=0/172695, in_queue=172695, util=99.91%, aggrios=0/470, aggrmerge=0/36, aggrticks=0/166070, aggrin_queue=166070, aggrutil=99.89% + vda: ios=0/470, merge=0/36, ticks=0/166070, in_queue=166070, util=99.89% +test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [96.6% done] [0K/0K /s] [0/0 iops] [eta 00m:02s] +test: (groupid=0, jobs=1): err= 0: pid=2626 + write: io=51,200KB, bw=907KB/s, iops=7, runt= 56462msec + slat (usec): min=22, max=8,079K, avg=20360.73, stdev=403935.41 + clat (usec): min=589, max=5,632K, avg=120760.03, stdev=536735.73 + lat (usec): min=712, max=8,187K, avg=141121.76, stdev=671347.90 + bw (KB/s) : min= 15, max= 1025, per=108.69%, avg=984.75, stdev=190.25 + cpu : usr=0.00%, sys=0.04%, ctx=411, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/400, short=0/0 + lat (usec): 750=52.00%, 1000=23.25% + lat (msec): 2=10.75%, 
4=1.50%, 10=0.25%, 20=1.25%, 50=3.75% + lat (msec): 250=0.25%, 500=0.25%, 750=0.75%, 1000=1.00%, 2000=3.50% + lat (msec): >=2000=1.50% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=906KB/s, minb=928KB/s, maxb=928KB/s, mint=56462msec, maxt=56462msec + +Disk stats (read/write): + dm-0: ios=0/466, merge=0/0, ticks=0/199397, in_queue=199801, util=99.93%, aggrios=0/462, aggrmerge=0/28, aggrticks=0/186071, aggrin_queue=186071, aggrutil=99.90% + vda: ios=0/462, merge=0/28, ticks=0/186071, in_queue=186071, util=99.90% +test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [93.8% done] [0K/0K /s] [0/0 iops] [eta 00m:03s] +test: (groupid=0, jobs=1): err= 0: pid=2633 + write: io=51,200KB, bw=1,135KB/s, iops=4, runt= 45100msec + slat (usec): min=29, max=7,058K, avg=35513.48, stdev=499044.93 + clat (msec): min=1, max=6,554, avg=189.94, stdev=1050.37 + lat (msec): min=1, max=7,098, avg=225.46, stdev=1158.32 + bw (KB/s) : min= 36, max= 2503, per=96.02%, avg=1089.86, stdev=722.70 + cpu : usr=0.01%, sys=0.02%, ctx=208, majf=0, minf=24 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/200, short=0/0 + + lat (msec): 2=58.00%, 4=22.00%, 10=1.50%, 20=4.00%, 50=11.00% + lat (msec): 100=0.50%, >=2000=3.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=1,135KB/s, minb=1,162KB/s, maxb=1,162KB/s, mint=45100msec, maxt=45100msec + +Disk stats (read/write): + dm-0: ios=0/266, merge=0/0, ticks=0/349073, in_queue=410345, util=99.90%, aggrios=0/254, aggrmerge=0/39, aggrticks=0/300675, aggrin_queue=300675, aggrutil=99.87% + vda: ios=0/254, merge=0/39, ticks=0/300675, in_queue=300675, util=99.87% +test: (g=0): rw=write, bs=512K-512K/512K-512K, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] 
[96.2% done] [0K/0K /s] [0/0 iops] [eta 00m:02s] +test: (groupid=0, jobs=1): err= 0: pid=2642 + write: io=51,200KB, bw=1,040KB/s, iops=2, runt= 49253msec + slat (usec): min=48, max=3,032, avg=114.21, stdev=305.30 + clat (msec): min=2, max=8,847, avg=492.32, stdev=1923.15 + lat (msec): min=2, max=8,847, avg=492.43, stdev=1923.27 + bw (KB/s) : min= 59, max= 2003, per=98.41%, avg=1022.50, stdev=614.79 + cpu : usr=0.00%, sys=0.02%, ctx=106, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/100, short=0/0 + + lat (msec): 4=51.00%, 10=18.00%, 20=7.00%, 50=15.00%, 100=3.00% + lat (msec): >=2000=6.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=1,039KB/s, minb=1,064KB/s, maxb=1,064KB/s, mint=49253msec, maxt=49253msec + +Disk stats (read/write): + dm-0: ios=0/178, merge=0/0, ticks=0/204659, in_queue=224540, util=99.91%, aggrios=0/151, aggrmerge=0/38, aggrticks=0/108535, aggrin_queue=108535, aggrutil=99.87% + vda: ios=0/151, merge=0/38, ticks=0/108535, in_queue=108535, util=99.87% +test: (g=0): rw=write, bs=1M-1M/1M-1M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] [0K/138K /s] [0/0 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=2649 + write: io=51,200KB, bw=1,194KB/s, iops=1, runt= 42883msec + slat (usec): min=79, max=94,291, avg=2387.48, stdev=13366.42 + clat (msec): min=4, max=11,278, avg=854.56, stdev=2850.11 + lat (msec): min=4, max=11,278, avg=856.95, stdev=2849.49 + bw (KB/s) : min= 882, max= 1839, per=99.73%, avg=1189.75, stdev=437.63 + cpu : usr=0.00%, sys=0.01%, ctx=57, majf=0, minf=25 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
64=0.0%, >=64=0.0% + issued r/w: total=0/50, short=0/0 + + lat (msec): 10=38.00%, 20=18.00%, 50=20.00%, 100=14.00%, 250=2.00% + lat (msec): >=2000=8.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=1,193KB/s, minb=1,222KB/s, maxb=1,222KB/s, mint=42883msec, maxt=42883msec + +Disk stats (read/write): + dm-0: ios=0/176, merge=0/0, ticks=0/434584, in_queue=522186, util=84.59%, aggrios=0/145, aggrmerge=0/38, aggrticks=0/483688, aggrin_queue=483688, aggrutil=99.84% + vda: ios=0/145, merge=0/38, ticks=0/483688, in_queue=483688, util=99.84% +test: (g=0): rw=write, bs=2M-2M/2M-2M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [93.4% done] [0K/0K /s] [0/0 iops] [eta 00m:04s] +test: (groupid=0, jobs=1): err= 0: pid=2656 + write: io=51,200KB, bw=907KB/s, iops=0, runt= 56425msec + slat (usec): min=149, max=8,062K, avg=324358.12, stdev=1612031.89 + clat (msec): min=8, max=7,963, avg=1931.41, stdev=3084.90 + lat (msec): min=9, max=8,134, avg=2255.77, stdev=3295.44 + bw (KB/s) : min= 251, max= 1028, per=99.85%, avg=905.67, stdev=259.71 + cpu : usr=0.00%, sys=0.01%, ctx=32, majf=0, minf=24 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/25, short=0/0 + + lat (msec): 10=8.00%, 20=8.00%, 50=24.00%, 100=28.00%, >=2000=32.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=907KB/s, minb=929KB/s, maxb=929KB/s, mint=56425msec, maxt=56425msec + +Disk stats (read/write): + dm-0: ios=4/181, merge=0/0, ticks=6902/544633, in_queue=551859, util=99.91%, aggrios=4/151, aggrmerge=0/37, aggrticks=6980/422988, aggrin_queue=429968, aggrutil=99.92% + vda: ios=4/151, merge=0/37, ticks=6980/422988, in_queue=429968, util=99.92% +test: (g=0): rw=write, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] 
[0/0 iops] [eta 1158050441d:07h:00m:13Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:13Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:13Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:13Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:12Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:12Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:12Jobs: 1 (f=1): [W] [inf% done] [0K/0K /s] [0/0 iops] [eta 1158050441d:07h:00m:12Jobs: 1 (f=1): [W] [8.1% done] [0K/990K /s] [0/0 iops] [eta 00m:57s] Jobs: 1 (f=1): [W] [88.9% done] [0K/0K /s] [0/0 iops] [eta 00m:07s] +test: (groupid=0, jobs=1): err= 0: pid=2694 + write: io=53,248KB, bw=951KB/s, iops=0, runt= 55981msec + slat (usec): min=308, max=99,165, avg=9423.77, stdev=27118.23 + clat (msec): min=46, max=12,004, avg=4293.90, stdev=5115.17 + lat (msec): min=47, max=12,103, avg=4303.33, stdev=5126.37 + bw (KB/s) : min= 338, max= 1444, per=97.48%, avg=927.00, stdev=365.90 + cpu : usr=0.00%, sys=0.01%, ctx=19, majf=0, minf=24 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/13, short=0/0 + + lat (msec): 50=7.69%, 100=23.08%, 250=23.08%, >=2000=46.15% + +Run status group 0 (all jobs): + WRITE: io=53,248KB, aggrb=951KB/s, minb=974KB/s, maxb=974KB/s, mint=55981msec, maxt=55981msec + +Disk stats (read/write): + dm-0: ios=0/179, merge=0/0, ticks=0/857023, in_queue=954515, util=99.90%, aggrios=0/158, aggrmerge=0/45, aggrticks=0/799098, aggrin_queue=799098, aggrutil=99.87% + vda: ios=0/158, merge=0/45, ticks=0/799098, in_queue=799098, util=99.87% +test: (g=0): rw=write, bs=10M-10M/10M-10M, ioengine=libaio, iodepth=1 +Starting 1 process +Jobs: 1 (f=1): [W] [100.0% done] 
[0K/1,115K /s] [0/0 iops] [eta 00m:00s] +test: (groupid=0, jobs=1): err= 0: pid=2701 + write: io=51,200KB, bw=1,025KB/s, iops=0, runt= 49928msec + slat (usec): min=902, max=185K, avg=37668.00, stdev=82084.39 + clat (msec): min=203, max=17,994, avg=9947.50, stdev=8967.29 + lat (msec): min=204, max=18,179, avg=9985.18, stdev=9008.75 + bw (KB/s) : min= 563, max= 2083, per=105.63%, avg=1082.67, stdev=866.53 + cpu : usr=0.00%, sys=0.01%, ctx=9, majf=0, minf=24 + IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% + submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% + issued r/w: total=0/5, short=0/0 + + lat (msec): 250=20.00%, 500=20.00%, >=2000=60.00% + +Run status group 0 (all jobs): + WRITE: io=51,200KB, aggrb=1,025KB/s, minb=1,050KB/s, maxb=1,050KB/s, mint=49928msec, maxt=49928msec + +Disk stats (read/write): + dm-0: ios=0/158, merge=0/0, ticks=0/883525, in_queue=890761, util=64.33%, aggrios=2/135, aggrmerge=0/34, aggrticks=10798/1123673, aggrin_queue=1134490, aggrutil=99.80% + vda: ios=2/135, merge=0/34, ticks=10798/1123673, in_queue=1134490, util=99.80% -- 1.7.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling
  2011-11-03 8:57 ` [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling Zhi Yong Wu
@ 2011-11-07 15:27 ` Kevin Wolf
  2011-11-08 2:15 ` Zhi Yong Wu
  0 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2011-11-07 15:27 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: zwu.kernel, ryanh, qemu-devel, stefanha

On 03.11.2011 09:57, Zhi Yong Wu wrote:
> The file 1mbps.dat is based on bps=1024*1024 I/O throttling; and the file 10mbps.dat is based on bps=10*1024*1024 I/O throttling.
>
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> ---
> 10mbps.dat | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1mbps.dat | 339 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 649 insertions(+), 0 deletions(-)
> create mode 100644 10mbps.dat
> create mode 100644 1mbps.dat

This is just for information and not supposed to be merged, right?

Kevin

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling
  2011-11-07 15:27 ` Kevin Wolf
@ 2011-11-08 2:15 ` Zhi Yong Wu
  0 siblings, 0 replies; 15+ messages in thread
From: Zhi Yong Wu @ 2011-11-08 2:15 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: ryanh, Zhi Yong Wu, qemu-devel, stefanha

On Mon, Nov 7, 2011 at 11:27 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 03.11.2011 09:57, Zhi Yong Wu wrote:
>> The file 1mbps.dat is based on bps=1024*1024 I/O throttling; and the file 10mbps.dat is based on bps=10*1024*1024 I/O throttling.
>>
>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>> ---
>> 10mbps.dat | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1mbps.dat | 339 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 649 insertions(+), 0 deletions(-)
>> create mode 100644 10mbps.dat
>> create mode 100644 1mbps.dat
>
> This is just for information and not supposed to be merged, right?

Yes, it is only meant as evidence that block I/O throttling is effective.

>
> Kevin
>

--
Regards,

Zhi Yong Wu

^ permalink raw reply [flat|nested] 15+ messages in thread
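One way to sanity-check the claim in the reply above is to compare the aggregate bandwidth that fio reports (the aggrb field in the .dat files) with the configured cap. The following is only a minimal sketch in C and is not part of the patch series: the names cap_deviation_percent and exceeds_cap are hypothetical, and it assumes that fio's KB/s means 1024 bytes per second, so that bps=1024*1024 corresponds to 1024 KB/s and bps=10*1024*1024 to 10240 KB/s.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: how far a measured
 * bandwidth is from the configured cap, in percent of the cap.
 * Negative values mean the guest stayed below the cap. */
static double cap_deviation_percent(double measured_kb_s, double limit_kb_s)
{
    assert(limit_kb_s > 0.0); /* this sketch only handles a configured cap */
    return (measured_kb_s - limit_kb_s) / limit_kb_s * 100.0;
}

/* Hypothetical acceptance check: true if the measured bandwidth
 * overshoots the cap by more than tolerance_percent. */
static bool exceeds_cap(double measured_kb_s, double limit_kb_s,
                        double tolerance_percent)
{
    return cap_deviation_percent(measured_kb_s, limit_kb_s)
           > tolerance_percent;
}
```

Applied to figures from the report: with the 1 MB/s cap, the 512-byte run (aggrb=943KB/s) gives cap_deviation_percent(943, 1024), roughly 8% under the cap, while the 1M run (aggrb=1,193KB/s) overshoots by about 16.5%. With the 10 MB/s cap, the 4M run (aggrb=11,465KB/s) lands about 12% above 10240 KB/s. What tolerance counts as "effective" is a judgment call that the thread does not specify.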
* [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm
@ 2011-11-08 5:00 Zhi Yong Wu
  0 siblings, 0 replies; 15+ messages in thread
From: Zhi Yong Wu @ 2011-11-08 5:00 UTC (permalink / raw)
  To: kwolf; +Cc: zwu.kernel, ryanh, Zhi Yong Wu, qemu-devel, stefanha

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block.c | 234 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block.h | 1 +
 block_int.h | 1 +
 3 files changed, 236 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 79e7f09..3d0ec23 100644
--- a/block.c
+++ b/block.c
@@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                                bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 
+static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
+        bool is_write, double elapsed_time, uint64_t *wait);
+static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
+        double elapsed_time, uint64_t *wait);
+static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
+        bool is_write, int64_t *wait);
+
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
     QTAILQ_HEAD_INITIALIZER(bdrv_states);
 
@@ -107,6 +114,24 @@ int is_windows_drive(const char *filename)
 #endif
 
 /* throttling disk I/O limits */
+void bdrv_io_limits_disable(BlockDriverState *bs)
+{
+    bs->io_limits_enabled = false;
+
+    while (qemu_co_queue_next(&bs->throttled_reqs));
+
+    if (bs->block_timer) {
+        qemu_del_timer(bs->block_timer);
+        qemu_free_timer(bs->block_timer);
+        bs->block_timer = NULL;
+    }
+
+    bs->slice_start = 0;
+    bs->slice_end = 0;
+    bs->slice_time = 0;
+    memset(&bs->io_base, 0, sizeof(bs->io_base));
+}
+
 static void bdrv_block_timer(void *opaque)
 {
     BlockDriverState *bs = opaque;
@@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs)
          || io_limits->iops[BLOCK_IO_LIMIT_TOTAL];
 }
 
+static void bdrv_io_limits_intercept(BlockDriverState *bs,
+                                     bool is_write, int
nb_sectors)
+{
+    int64_t wait_time = -1;
+
+    if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
+        qemu_co_queue_wait(&bs->throttled_reqs);
+    }
+
+    /* In fact, we hope to keep each request's timing, in FIFO mode. The next
+     * throttled requests will not be dequeued until the current request is
+     * allowed to be serviced. So if the current request still exceeds the
+     * limits, it will be inserted at the head. All requests following it
+     * will remain in the throttled_reqs queue.
+     */
+
+    while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) {
+        qemu_mod_timer(bs->block_timer,
+                       wait_time + qemu_get_clock_ns(vm_clock));
+        qemu_co_queue_wait_insert_head(&bs->throttled_reqs);
+    }
+
+    qemu_co_queue_next(&bs->throttled_reqs);
+}
+
 /* check if the path starts with "<protocol>:" */
 static int path_has_protocol(const char *path)
 {
@@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
         bdrv_dev_change_media_cb(bs, true);
     }
 
+    /* throttling disk I/O limits */
+    if (bs->io_limits_enabled) {
+        bdrv_io_limits_enable(bs);
+    }
+
     return 0;
 
 unlink_and_fail:
@@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs)
 
         bdrv_dev_change_media_cb(bs, false);
     }
+
+    /* throttling disk I/O limits */
+    if (bs->io_limits_enabled) {
+        bdrv_io_limits_disable(bs);
+    }
 }
 
 void bdrv_close_all(void)
@@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
         return -EIO;
     }
 
+    /* throttling disk read I/O */
+    if (bs->io_limits_enabled) {
+        bdrv_io_limits_intercept(bs, false, nb_sectors);
+    }
+
     return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 }
 
@@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
         return -EIO;
     }
 
+    /* throttling disk write I/O */
+    if (bs->io_limits_enabled) {
+        bdrv_io_limits_intercept(bs, true, nb_sectors);
+    }
+
     ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
 
     if (bs->dirty_bitmap) {
@@ -2512,6 +2582,170 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb)
acb->pool->cancel(acb); } +/* block I/O throttling */ +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, double elapsed_time, uint64_t *wait) +{ + uint64_t bps_limit = 0; + double bytes_limit, bytes_base, bytes_res; + double slice_time, wait_time; + + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { + bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; + } else if (bs->io_limits.bps[is_write]) { + bps_limit = bs->io_limits.bps[is_write]; + } else { + if (wait) { + *wait = 0; + } + + return false; + } + + slice_time = bs->slice_end - bs->slice_start; + slice_time /= (NANOSECONDS_PER_SECOND); + bytes_limit = bps_limit * slice_time; + bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write]; + if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { + bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write]; + } + + /* bytes_base: the bytes of data which have been read/written; and + * it is obtained from the history statistic info. + * bytes_res: the remaining bytes of data which need to be read/written. + * (bytes_base + bytes_res) / bps_limit: used to calcuate + * the total time for completing reading/writting all data. + */ + bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE; + + if (bytes_base + bytes_res <= bytes_limit) { + if (wait) { + *wait = 0; + } + + return false; + } + + /* Calc approx time to dispatch */ + wait_time = (bytes_base + bytes_res) / bps_limit - elapsed_time; + + /* When the I/O rate at runtime exceeds the limits, + * bs->slice_end need to be extended in order that the current statistic + * info can be kept until the timer fire, so it is increased and tuned + * based on the result of experiment. 
+ */ + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; + bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; + if (wait) { + *wait = wait_time * BLOCK_IO_SLICE_TIME * 10; + } + + return true; +} + +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, + double elapsed_time, uint64_t *wait) +{ + uint64_t iops_limit = 0; + double ios_limit, ios_base; + double slice_time, wait_time; + + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { + iops_limit = bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; + } else if (bs->io_limits.iops[is_write]) { + iops_limit = bs->io_limits.iops[is_write]; + } else { + if (wait) { + *wait = 0; + } + + return false; + } + + slice_time = bs->slice_end - bs->slice_start; + slice_time /= (NANOSECONDS_PER_SECOND); + ios_limit = iops_limit * slice_time; + ios_base = bs->nr_ops[is_write] - bs->io_base.ios[is_write]; + if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) { + ios_base += bs->nr_ops[!is_write] - bs->io_base.ios[!is_write]; + } + + if (ios_base + 1 <= ios_limit) { + if (wait) { + *wait = 0; + } + + return false; + } + + /* Calc approx time to dispatch */ + wait_time = (ios_base + 1) / iops_limit; + if (wait_time > elapsed_time) { + wait_time = wait_time - elapsed_time; + } else { + wait_time = 0; + } + + bs->slice_time = wait_time * BLOCK_IO_SLICE_TIME * 10; + bs->slice_end += bs->slice_time - 3 * BLOCK_IO_SLICE_TIME; + if (wait) { + *wait = wait_time * BLOCK_IO_SLICE_TIME * 10; + } + + return true; +} + +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, int64_t *wait) +{ + int64_t now, max_wait; + uint64_t bps_wait = 0, iops_wait = 0; + double elapsed_time; + int bps_ret, iops_ret; + + now = qemu_get_clock_ns(vm_clock); + if ((bs->slice_start < now) + && (bs->slice_end > now)) { + bs->slice_end = now + bs->slice_time; + } else { + bs->slice_time = 5 * BLOCK_IO_SLICE_TIME; + bs->slice_start = now; + bs->slice_end = now + bs->slice_time; + + bs->io_base.bytes[is_write] = 
bs->nr_bytes[is_write]; + bs->io_base.bytes[!is_write] = bs->nr_bytes[!is_write]; + + bs->io_base.ios[is_write] = bs->nr_ops[is_write]; + bs->io_base.ios[!is_write] = bs->nr_ops[!is_write]; + } + + elapsed_time = now - bs->slice_start; + elapsed_time /= (NANOSECONDS_PER_SECOND); + + bps_ret = bdrv_exceed_bps_limits(bs, nb_sectors, + is_write, elapsed_time, &bps_wait); + iops_ret = bdrv_exceed_iops_limits(bs, is_write, + elapsed_time, &iops_wait); + if (bps_ret || iops_ret) { + max_wait = bps_wait > iops_wait ? bps_wait : iops_wait; + if (wait) { + *wait = max_wait; + } + + now = qemu_get_clock_ns(vm_clock); + if (bs->slice_end < now + max_wait) { + bs->slice_end = now + max_wait; + } + + return true; + } + + if (wait) { + *wait = 0; + } + + return false; +} /**************************************************************/ /* async block device emulation */ diff --git a/block.h b/block.h index bc8315d..9b5b35f 100644 --- a/block.h +++ b/block.h @@ -91,6 +91,7 @@ void bdrv_info_stats(Monitor *mon, QObject **ret_data); /* disk I/O throttling */ void bdrv_io_limits_enable(BlockDriverState *bs); +void bdrv_io_limits_disable(BlockDriverState *bs); bool bdrv_io_limits_enabled(BlockDriverState *bs); void bdrv_init(void); diff --git a/block_int.h b/block_int.h index 7315e0d..69418fe 100644 --- a/block_int.h +++ b/block_int.h @@ -39,6 +39,7 @@ #define BLOCK_IO_LIMIT_TOTAL 2 #define BLOCK_IO_SLICE_TIME 100000000 +#define NANOSECONDS_PER_SECOND 1000000000.0 #define BLOCK_OPT_SIZE "size" #define BLOCK_OPT_ENCRYPT "encryption" -- 1.7.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
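The bytes-per-second check in bdrv_exceed_bps_limits() may be easier to follow in isolation. What follows is only a minimal standalone sketch of that arithmetic, not the patch's code: `struct throttle_slice` and `bps_exceeded()` are hypothetical names, times are kept in seconds rather than vm_clock nanoseconds, and the wait is clamped at zero (the patch does that clamp only in the iops path).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the slice state the patch keeps in
 * BlockDriverState (slice_start, slice_end, io_base, nr_bytes). */
struct throttle_slice {
    double   slice_len_s;  /* slice_end - slice_start, in seconds */
    double   elapsed_s;    /* now - slice_start, in seconds */
    uint64_t bytes_done;   /* bytes accounted since slice_start (bytes_base) */
};

/* Returns true if dispatching req_bytes now would push the slice over
 * bps_limit; *wait_s then holds the approximate delay in seconds.
 * Returns false and sets *wait_s to 0 if the request fits the budget. */
static bool bps_exceeded(const struct throttle_slice *s, uint64_t bps_limit,
                         uint64_t req_bytes, double *wait_s)
{
    assert(bps_limit > 0);  /* the patch skips the check when no limit is set */

    double budget = (double)bps_limit * s->slice_len_s;       /* bytes_limit */
    double total  = (double)s->bytes_done + (double)req_bytes;

    if (total <= budget) {
        *wait_s = 0.0;
        return false;
    }

    /* Time needed to move `total` bytes at the limit, minus time already
     * spent in this slice. Clamping at zero is this sketch's assumption. */
    double wait = total / (double)bps_limit - s->elapsed_s;
    *wait_s = wait > 0.0 ? wait : 0.0;
    return true;
}
```

As a worked case: with a limit of 1000000 bytes/s and the default slice of 5 * BLOCK_IO_SLICE_TIME (0.5 s), the budget is 500000 bytes. If 250000 bytes are already accounted 0.25 s into the slice and a 500000-byte request arrives, the total of 750000 bytes exceeds the budget and the wait comes out as 750000 / 1000000 - 0.25 = 0.5 s. The patch then converts seconds to nanoseconds by multiplying with BLOCK_IO_SLICE_TIME * 10, which equals 1000000000.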
end of thread, other threads:[~2011-11-08  8:57 UTC | newest]

Thread overview: 15+ messages
2011-11-03  8:57 [Qemu-devel] [PATCH v12 0/5] The intro to QEMU block I/O throttling Zhi Yong Wu
2011-11-03  8:57 ` [Qemu-devel] [PATCH v12 1/5] block: add the blockio limits command line support Zhi Yong Wu
2011-11-03  8:57 ` [Qemu-devel] [PATCH v12 2/5] CoQueue: introduce qemu_co_queue_wait_insert_head Zhi Yong Wu
2011-11-03  8:57 ` [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm Zhi Yong Wu
2011-11-07 15:18 ` Kevin Wolf
2011-11-08  4:34 ` Zhi Yong Wu
2011-11-08  8:41 ` Kevin Wolf
2011-11-08  8:57 ` Zhi Yong Wu
2011-11-03  8:57 ` [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle Zhi Yong Wu
2011-11-07 15:26 ` Kevin Wolf
2011-11-08  2:21 ` Zhi Yong Wu
2011-11-03  8:57 ` [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling Zhi Yong Wu
2011-11-07 15:27 ` Kevin Wolf
2011-11-08  2:15 ` Zhi Yong Wu

-- strict thread matches above, loose matches on Subject: below --
2011-11-08  5:00 [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm Zhi Yong Wu