* [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support
@ 2014-12-05 11:50 Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests Peter Lieven
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Peter Lieven @ 2014-12-05 11:50 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, famz, benoit, ming.lei, Peter Lieven, armbru, mreitz,
stefanha, pbonzini
this series adds the long missing multiread support to virtio-blk.
some remarks:
- i introduced rd_merged and wr_merged block accounting stats to
blockstats as a generic interface which can be set from any
driver that will introduce multirequst merging in the future.
- the knob to disable request merging is not yet there. I would
add it to the device properties also as a generic interface
to have the same switch on for any driver that might introduce
request merging in the future
- there is cleanup and iotest adjustion missing.
RFC v1->v2:
- completed Patch 1 by the example in qmp-commands.hx [Eric]
- used bool for merge in Patch 4 [Max]
- fixed a few typos in the commit msg of Patch 4 [Max]
- do not start merging and directly pass req[0]->qiov in case of num_reqs == 1
- avoid allocating memory for the multireq [Kevin]
- do not import block_int.h and add appropiate iface to block-backend [Kevin]
- removed debug output and added trace event for multireq [Kevin]
- fixed alloc hint for the merge qiov [Kevin]
- currently did not split virtio_submit_multireq into rw code since
the redundant code would now be much bigger part than in the original patch.
- added a merge_qiov to VirtioBlockRequest. Abusing the qiov was not possible
because it is initialized externally with guest memory [Kevin]
- added a pointer to VirtioBlockRequest to create a linked list
of VirtioBlockBlockRequests. This list is used to complete all
requests belonging to a multireq [Kevin]
Peter Lieven (4):
block: add accounting for merged requests
hw/virtio-blk: add a constant for max number of merged requests
block-backend: expose bs->bl.max_transfer_length
virtio-blk: introduce multiread
block.c | 2 +
block/accounting.c | 7 ++
block/block-backend.c | 5 +
block/qapi.c | 2 +
hmp.c | 6 +-
hw/block/dataplane/virtio-blk.c | 10 +-
hw/block/virtio-blk.c | 219 ++++++++++++++++++++++-----------------
include/block/accounting.h | 3 +
include/hw/virtio/virtio-blk.h | 23 ++--
include/sysemu/block-backend.h | 1 +
qapi/block-core.json | 9 +-
qmp-commands.hx | 22 +++-
trace-events | 1 +
13 files changed, 197 insertions(+), 113 deletions(-)
--
1.7.9.5
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests
2014-12-05 11:50 [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support Peter Lieven
@ 2014-12-05 11:50 ` Peter Lieven
2014-12-05 14:58 ` Eric Blake
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of " Peter Lieven
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Peter Lieven @ 2014-12-05 11:50 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, famz, benoit, ming.lei, Peter Lieven, armbru, mreitz,
stefanha, pbonzini
Signed-off-by: Peter Lieven <pl@kamp.de>
---
block.c | 2 ++
block/accounting.c | 7 +++++++
block/qapi.c | 2 ++
hmp.c | 6 +++++-
include/block/accounting.h | 3 +++
qapi/block-core.json | 9 ++++++++-
qmp-commands.hx | 22 ++++++++++++++++++----
7 files changed, 45 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
index a612594..75450e3 100644
--- a/block.c
+++ b/block.c
@@ -4508,6 +4508,8 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
}
}
+ block_acct_merge_done(&bs->stats, BLOCK_ACCT_WRITE, num_reqs - outidx - 1);
+
return outidx + 1;
}
diff --git a/block/accounting.c b/block/accounting.c
index edbb1cc..d1afcd9 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -52,3 +52,10 @@ void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}
+
+void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
+ int num_requests)
+{
+ assert(type < BLOCK_MAX_IOTYPE);
+ stats->merged[type] += num_requests;
+}
diff --git a/block/qapi.c b/block/qapi.c
index a87a34a..ec28240 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -316,6 +316,8 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs)
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
+ s->stats->rd_merged = bs->stats.merged[BLOCK_ACCT_READ];
+ s->stats->wr_merged = bs->stats.merged[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
diff --git a/hmp.c b/hmp.c
index 63d7686..baaa9e5 100644
--- a/hmp.c
+++ b/hmp.c
@@ -419,6 +419,8 @@ void hmp_info_blockstats(Monitor *mon, const QDict *qdict)
" wr_total_time_ns=%" PRId64
" rd_total_time_ns=%" PRId64
" flush_total_time_ns=%" PRId64
+ " rd_merged=%" PRId64
+ " wr_merged=%" PRId64
"\n",
stats->value->stats->rd_bytes,
stats->value->stats->wr_bytes,
@@ -427,7 +429,9 @@ void hmp_info_blockstats(Monitor *mon, const QDict *qdict)
stats->value->stats->flush_operations,
stats->value->stats->wr_total_time_ns,
stats->value->stats->rd_total_time_ns,
- stats->value->stats->flush_total_time_ns);
+ stats->value->stats->flush_total_time_ns,
+ stats->value->stats->rd_merged,
+ stats->value->stats->wr_merged);
}
qapi_free_BlockStatsList(stats_list);
diff --git a/include/block/accounting.h b/include/block/accounting.h
index 50b42b3..4c406cf 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -39,6 +39,7 @@ typedef struct BlockAcctStats {
uint64_t nr_bytes[BLOCK_MAX_IOTYPE];
uint64_t nr_ops[BLOCK_MAX_IOTYPE];
uint64_t total_time_ns[BLOCK_MAX_IOTYPE];
+ uint64_t merged[BLOCK_MAX_IOTYPE];
uint64_t wr_highest_sector;
} BlockAcctStats;
@@ -53,5 +54,7 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie);
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors);
+void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
+ int num_requests);
#endif
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a14e6ab..ead6d5b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -389,13 +389,20 @@
# growable sparse files (like qcow2) that are used on top
# of a physical device.
#
+# @rd_merged: Number of read requests that have been merged into another
+# request (Since 2.3).
+#
+# @wr_merged: Number of write requests that have been merged into another
+# request (Since 2.3).
+#
# Since: 0.14.0
##
{ 'type': 'BlockDeviceStats',
'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'rd_operations': 'int',
'wr_operations': 'int', 'flush_operations': 'int',
'flush_total_time_ns': 'int', 'wr_total_time_ns': 'int',
- 'rd_total_time_ns': 'int', 'wr_highest_offset': 'int' } }
+ 'rd_total_time_ns': 'int', 'wr_highest_offset': 'int',
+ 'rd_merged': 'int', 'wr_merged': 'int' } }
##
# @BlockStats:
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 718dd92..42f1e59 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2261,6 +2261,10 @@ Each json-object contain the following:
- "flush_total_time_ns": total time spend on cache flushes in nano-seconds (json-int)
- "wr_highest_offset": Highest offset of a sector written since the
BlockDriverState has been opened (json-int)
+ - "rd_merged": number of read requests that have been merged into
+ another request (json-int)
+ - "wr_merged": number of write requests that have been merged into
+ another request (json-int)
- "parent": Contains recursively the statistics of the underlying
protocol (e.g. the host file for a qcow2 image). If there is
no underlying protocol, this field is omitted
@@ -2284,6 +2288,8 @@ Example:
"rd_total_times_ns":3465673657
"flush_total_times_ns":49653
"flush_operations":61,
+ "rd_merged":0,
+ "wr_merged":0
}
},
"stats":{
@@ -2295,7 +2301,9 @@ Example:
"flush_operations":51,
"wr_total_times_ns":313253456
"rd_total_times_ns":3465673657
- "flush_total_times_ns":49653
+ "flush_total_times_ns":49653,
+ "rd_merged":0,
+ "wr_merged":0
}
},
{
@@ -2309,7 +2317,9 @@ Example:
"flush_operations":0,
"wr_total_times_ns":0
"rd_total_times_ns":0
- "flush_total_times_ns":0
+ "flush_total_times_ns":0,
+ "rd_merged":0,
+ "wr_merged":0
}
},
{
@@ -2323,7 +2333,9 @@ Example:
"flush_operations":0,
"wr_total_times_ns":0
"rd_total_times_ns":0
- "flush_total_times_ns":0
+ "flush_total_times_ns":0,
+ "rd_merged":0,
+ "wr_merged":0
}
},
{
@@ -2337,7 +2349,9 @@ Example:
"flush_operations":0,
"wr_total_times_ns":0
"rd_total_times_ns":0
- "flush_total_times_ns":0
+ "flush_total_times_ns":0,
+ "rd_merged":0,
+ "wr_merged":0
}
}
]
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of merged requests
2014-12-05 11:50 [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests Peter Lieven
@ 2014-12-05 11:50 ` Peter Lieven
2014-12-05 14:59 ` Eric Blake
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 3/4] block-backend: expose bs->bl.max_transfer_length Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread Peter Lieven
3 siblings, 1 reply; 10+ messages in thread
From: Peter Lieven @ 2014-12-05 11:50 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, famz, benoit, ming.lei, Peter Lieven, armbru, mreitz,
stefanha, pbonzini
As it was not obvious (at least for me) where the 32 comes from;
add a constant for it.
Signed-off-by: Peter Lieven <pl@kamp.de>
---
hw/block/virtio-blk.c | 2 +-
include/hw/virtio/virtio-blk.h | 4 +++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index b19b102..490f961 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -326,7 +326,7 @@ static void virtio_blk_handle_write(VirtIOBlockReq *req, MultiReqBuffer *mrb)
block_acct_start(blk_get_stats(req->dev->blk), &req->acct, req->qiov.size,
BLOCK_ACCT_WRITE);
- if (mrb->num_writes == 32) {
+ if (mrb->num_writes == VIRTIO_BLK_MAX_MERGE_REQS) {
virtio_submit_multiwrite(req->dev->blk, mrb);
}
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 3979dc4..3f2652f 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -134,8 +134,10 @@ typedef struct VirtIOBlock {
struct VirtIOBlockDataPlane *dataplane;
} VirtIOBlock;
+#define VIRTIO_BLK_MAX_MERGE_REQS 32
+
typedef struct MultiReqBuffer {
- BlockRequest blkreq[32];
+ BlockRequest blkreq[VIRTIO_BLK_MAX_MERGE_REQS];
unsigned int num_writes;
} MultiReqBuffer;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [RFC PATCH V2 3/4] block-backend: expose bs->bl.max_transfer_length
2014-12-05 11:50 [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of " Peter Lieven
@ 2014-12-05 11:50 ` Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread Peter Lieven
3 siblings, 0 replies; 10+ messages in thread
From: Peter Lieven @ 2014-12-05 11:50 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, famz, benoit, ming.lei, Peter Lieven, armbru, mreitz,
stefanha, pbonzini
Signed-off-by: Peter Lieven <pl@kamp.de>
---
block/block-backend.c | 5 +++++
include/sysemu/block-backend.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/block/block-backend.c b/block/block-backend.c
index d0692b1..545580f 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -569,6 +569,11 @@ int blk_get_flags(BlockBackend *blk)
return bdrv_get_flags(blk->bs);
}
+int blk_get_max_transfer_length(BlockBackend *blk)
+{
+ return blk->bs->bl.max_transfer_length;
+}
+
void blk_set_guest_block_size(BlockBackend *blk, int align)
{
bdrv_set_guest_block_size(blk->bs, align);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 52d13c1..b3243cb 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -124,6 +124,7 @@ int blk_is_inserted(BlockBackend *blk);
void blk_lock_medium(BlockBackend *blk, bool locked);
void blk_eject(BlockBackend *blk, bool eject_flag);
int blk_get_flags(BlockBackend *blk);
+int blk_get_max_transfer_length(BlockBackend *blk);
void blk_set_guest_block_size(BlockBackend *blk, int align);
void *blk_blockalign(BlockBackend *blk, size_t size);
bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread
2014-12-05 11:50 [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support Peter Lieven
` (2 preceding siblings ...)
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 3/4] block-backend: expose bs->bl.max_transfer_length Peter Lieven
@ 2014-12-05 11:50 ` Peter Lieven
2014-12-05 15:05 ` Eric Blake
3 siblings, 1 reply; 10+ messages in thread
From: Peter Lieven @ 2014-12-05 11:50 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, famz, benoit, ming.lei, Peter Lieven, armbru, mreitz,
stefanha, pbonzini
this patch finally introduce multiread support to virtio-blk while
multiwrite support was there for a long time read support was missing.
To achieve this the patch does serveral things which might need futher
explaination:
- the whole merge and multireq logic is moved from block.c into
virtio-blk. This is move is a preparation for directly creating a
coroutine out of virtio-blk.
- requests are only merged if they are strictly sequential and no
longer sorted. This simplification decreases overhead and reduces
latency. It will also merge some requests which were unmergable before.
The old algorithm took up to 32 requests sorted them and tried to merge
them. The outcome was anything between 1 and 32 requests. In case of
32 requests there were 31 requests unnecessarily delayed.
On the other hand lets imagine e.g. 16 unmergeable requests followed
by 32 mergable requests. The latter 32 requests would have been split
into two 16 byte requests.
Last the simplified logic allows for a fast path if we have only a
single request in the multirequest. In this case the request is sent as
ordinary request without mulltireq callbacks.
As a first benchmark I installed Ubuntu 14.04.1 on a ramdisk. The number of
merged requests is in the same order while the latency is slightly decreased.
One should not stick too much to the numbers because the number of wr_requests
are highly fluctuant. I hope the numbers show that this patch is at least
not causing too big harm:
Cmdline:
qemu-system-x86_64 -m 1024 -smp 2 -enable-kvm -cdrom ubuntu-14.04.1-server-amd64.iso \
-drive if=virtio,file=/tmp/ubuntu.raw -monitor stdio
Before:
virtio0: rd_bytes=150979072 wr_bytes=2591138816 rd_operations=18475 wr_operations=69216
flush_operations=15343 wr_total_time_ns=26969283701 rd_total_time_ns=1018449432
flush_total_time_ns=562596819 rd_merged=0 wr_merged=10453
After:
virtio0: rd_bytes=148112896 wr_bytes=2845834240 rd_operations=17535 wr_operations=72197
flush_operations=15760 wr_total_time_ns=26104971623 rd_total_time_ns=1012926283
flush_total_time_ns=564414752 rd_merged=517 wr_merged=9859
Some first numbers of improved read performance while booting:
The Ubuntu 14.04.1 vServer from above:
virtio0: rd_bytes=86158336 wr_bytes=688128 rd_operations=5851 wr_operations=70
flush_operations=4 wr_total_time_ns=10886705 rd_total_time_ns=416688595
flush_total_time_ns=288776 rd_merged=1297 wr_merged=2
Windows 2012R2:
virtio0: rd_bytes=176559104 wr_bytes=61859840 rd_operations=7200 wr_operations=360
flush_operations=68 wr_total_time_ns=34344992718 rd_total_time_ns=134386844669
flush_total_time_ns=18115517 rd_merged=641 wr_merged=216
Signed-off-by: Peter Lieven <pl@kamp.de>
---
hw/block/dataplane/virtio-blk.c | 10 +-
hw/block/virtio-blk.c | 219 ++++++++++++++++++++++-----------------
include/hw/virtio/virtio-blk.h | 25 +++--
trace-events | 1 +
4 files changed, 146 insertions(+), 109 deletions(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 1222a37..aa4ad91 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -96,9 +96,8 @@ static void handle_notify(EventNotifier *e)
event_notifier_test_and_clear(&s->host_notifier);
blk_io_plug(s->conf->conf.blk);
for (;;) {
- MultiReqBuffer mrb = {
- .num_writes = 0,
- };
+ MultiReqBuffer mrb_rd = {};
+ MultiReqBuffer mrb_wr = {.is_write = 1};
int ret;
/* Disable guest->host notifies to avoid unnecessary vmexits */
@@ -117,10 +116,11 @@ static void handle_notify(EventNotifier *e)
req->elem.in_num,
req->elem.index);
- virtio_blk_handle_request(req, &mrb);
+ virtio_blk_handle_request(req, &mrb_wr, &mrb_rd);
}
- virtio_submit_multiwrite(s->conf->conf.blk, &mrb);
+ virtio_submit_multireq(s->conf->conf.blk, &mrb_wr);
+ virtio_submit_multireq(s->conf->conf.blk, &mrb_rd);
if (likely(ret == -EAGAIN)) { /* vring emptied */
/* Re-enable guest->host notifies and stop processing the vring.
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 490f961..7d8a835 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -34,6 +34,8 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
req->dev = s;
req->qiov.size = 0;
req->next = NULL;
+ req->mr_next = NULL;
+ req->mr_qiov.nalloc = 0;
return req;
}
@@ -84,20 +86,29 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
static void virtio_blk_rw_complete(void *opaque, int ret)
{
- VirtIOBlockReq *req = opaque;
+ VirtIOBlockReq *next = opaque;
- trace_virtio_blk_rw_complete(req, ret);
+ while (next) {
+ VirtIOBlockReq *req = next;
+ next = req->mr_next;
+ trace_virtio_blk_rw_complete(req, ret);
- if (ret) {
- int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
- bool is_read = !(p & VIRTIO_BLK_T_OUT);
- if (virtio_blk_handle_rw_error(req, -ret, is_read))
- return;
- }
+ if (req->mr_qiov.nalloc) {
+ qemu_iovec_destroy(&req->mr_qiov);
+ }
- virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
- block_acct_done(blk_get_stats(req->dev->blk), &req->acct);
- virtio_blk_free_request(req);
+ if (ret) {
+ int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
+ bool is_read = !(p & VIRTIO_BLK_T_OUT);
+ if (virtio_blk_handle_rw_error(req, -ret, is_read)) {
+ continue;
+ }
+ }
+
+ virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
+ block_acct_done(blk_get_stats(req->dev->blk), &req->acct);
+ virtio_blk_free_request(req);
+ }
}
static void virtio_blk_flush_complete(void *opaque, int ret)
@@ -257,24 +268,46 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
virtio_blk_free_request(req);
}
-void virtio_submit_multiwrite(BlockBackend *blk, MultiReqBuffer *mrb)
+void virtio_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
{
- int i, ret;
+ QEMUIOVector *qiov = &mrb->reqs[0]->qiov;
- if (!mrb->num_writes) {
+ if (!mrb->num_reqs) {
return;
}
- ret = blk_aio_multiwrite(blk, mrb->blkreq, mrb->num_writes);
- if (ret != 0) {
- for (i = 0; i < mrb->num_writes; i++) {
- if (mrb->blkreq[i].error) {
- virtio_blk_rw_complete(mrb->blkreq[i].opaque, -EIO);
+ if (mrb->num_reqs > 1) {
+ int i;
+
+ trace_virtio_blk_submit_multireq(mrb, mrb->num_reqs, mrb->sector_num, mrb->nb_sectors, mrb->is_write);
+
+ qiov = &mrb->reqs[0]->mr_qiov;
+ qemu_iovec_init(qiov, mrb->niov);
+
+ for (i = 0; i < mrb->num_reqs; i++) {
+ qemu_iovec_concat(qiov, &mrb->reqs[i]->qiov, 0, mrb->reqs[i]->qiov.size);
+ if (i) {
+ mrb->reqs[i - 1]->mr_next = mrb->reqs[i];
}
}
+ assert(mrb->nb_sectors == qiov->size / BDRV_SECTOR_SIZE);
+
+ block_acct_merge_done(blk_get_stats(blk),
+ mrb->is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ,
+ mrb->num_reqs - 1);
}
- mrb->num_writes = 0;
+ if (mrb->is_write) {
+ blk_aio_writev(blk, mrb->sector_num, qiov, mrb->nb_sectors,
+ virtio_blk_rw_complete, mrb->reqs[0]);
+ } else {
+ blk_aio_readv(blk, mrb->sector_num, qiov, mrb->nb_sectors,
+ virtio_blk_rw_complete, mrb->reqs[0]);
+ }
+
+ mrb->num_reqs = 0;
+ mrb->nb_sectors = 0;
+ mrb->niov = 0;
}
static void virtio_blk_handle_flush(VirtIOBlockReq *req, MultiReqBuffer *mrb)
@@ -283,9 +316,9 @@ static void virtio_blk_handle_flush(VirtIOBlockReq *req, MultiReqBuffer *mrb)
BLOCK_ACCT_FLUSH);
/*
- * Make sure all outstanding writes are posted to the backing device.
+ * Make sure all outstanding requests are posted to the backing device.
*/
- virtio_submit_multiwrite(req->dev->blk, mrb);
+ virtio_submit_multireq(req->dev->blk, mrb);
blk_aio_flush(req->dev->blk, virtio_blk_flush_complete, req);
}
@@ -308,61 +341,8 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
return true;
}
-static void virtio_blk_handle_write(VirtIOBlockReq *req, MultiReqBuffer *mrb)
-{
- BlockRequest *blkreq;
- uint64_t sector;
-
- sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out.sector);
-
- trace_virtio_blk_handle_write(req, sector, req->qiov.size / 512);
-
- if (!virtio_blk_sect_range_ok(req->dev, sector, req->qiov.size)) {
- virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
- virtio_blk_free_request(req);
- return;
- }
-
- block_acct_start(blk_get_stats(req->dev->blk), &req->acct, req->qiov.size,
- BLOCK_ACCT_WRITE);
-
- if (mrb->num_writes == VIRTIO_BLK_MAX_MERGE_REQS) {
- virtio_submit_multiwrite(req->dev->blk, mrb);
- }
-
- blkreq = &mrb->blkreq[mrb->num_writes];
- blkreq->sector = sector;
- blkreq->nb_sectors = req->qiov.size / BDRV_SECTOR_SIZE;
- blkreq->qiov = &req->qiov;
- blkreq->cb = virtio_blk_rw_complete;
- blkreq->opaque = req;
- blkreq->error = 0;
-
- mrb->num_writes++;
-}
-
-static void virtio_blk_handle_read(VirtIOBlockReq *req)
-{
- uint64_t sector;
-
- sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out.sector);
-
- trace_virtio_blk_handle_read(req, sector, req->qiov.size / 512);
-
- if (!virtio_blk_sect_range_ok(req->dev, sector, req->qiov.size)) {
- virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
- virtio_blk_free_request(req);
- return;
- }
-
- block_acct_start(blk_get_stats(req->dev->blk), &req->acct, req->qiov.size,
- BLOCK_ACCT_READ);
- blk_aio_readv(req->dev->blk, sector, &req->qiov,
- req->qiov.size / BDRV_SECTOR_SIZE,
- virtio_blk_rw_complete, req);
-}
-
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
+void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb_wr,
+ MultiReqBuffer *mrb_rd)
{
uint32_t type;
struct iovec *in_iov = req->elem.in_sg;
@@ -397,7 +377,7 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
if (type & VIRTIO_BLK_T_FLUSH) {
- virtio_blk_handle_flush(req, mrb);
+ virtio_blk_handle_flush(req, mrb_wr);
} else if (type & VIRTIO_BLK_T_SCSI_CMD) {
virtio_blk_handle_scsi(req);
} else if (type & VIRTIO_BLK_T_GET_ID) {
@@ -414,13 +394,62 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
iov_from_buf(in_iov, in_num, 0, serial, size);
virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
virtio_blk_free_request(req);
- } else if (type & VIRTIO_BLK_T_OUT) {
- qemu_iovec_init_external(&req->qiov, iov, out_num);
- virtio_blk_handle_write(req, mrb);
- } else if (type == VIRTIO_BLK_T_IN || type == VIRTIO_BLK_T_BARRIER) {
- /* VIRTIO_BLK_T_IN is 0, so we can't just & it. */
- qemu_iovec_init_external(&req->qiov, in_iov, in_num);
- virtio_blk_handle_read(req);
+ } else if (type & VIRTIO_BLK_T_OUT || type == VIRTIO_BLK_T_IN || type == VIRTIO_BLK_T_BARRIER) {
+ bool is_write = type & VIRTIO_BLK_T_OUT;
+ MultiReqBuffer *mrb = is_write ? mrb_wr : mrb_rd;
+ int64_t sector_num = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out.sector);
+ int max_transfer_length = blk_get_max_transfer_length(req->dev->blk);
+ int nb_sectors = 0;
+ bool merge = true;
+
+ if (!virtio_blk_sect_range_ok(req->dev, sector_num, req->qiov.size)) {
+ virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
+ virtio_blk_free_request(req);
+ return;
+ }
+
+ if (is_write) {
+ qemu_iovec_init_external(&req->qiov, iov, out_num);
+ trace_virtio_blk_handle_write(req, sector_num, req->qiov.size / BDRV_SECTOR_SIZE);
+ } else {
+ qemu_iovec_init_external(&req->qiov, in_iov, in_num);
+ trace_virtio_blk_handle_read(req, sector_num, req->qiov.size / BDRV_SECTOR_SIZE);
+ }
+
+ nb_sectors = req->qiov.size / BDRV_SECTOR_SIZE;
+
+ block_acct_start(blk_get_stats(req->dev->blk), &req->acct, req->qiov.size,
+ is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
+
+ /* merge would exceed maximum number of requests or IOVs */
+ if (mrb->num_reqs == MAX_MERGE_REQS ||
+ mrb->niov + req->qiov.niov + 1 > IOV_MAX) {
+ merge = false;
+ }
+
+ /* merge would exceed maximum transfer length of backend device */
+ if (max_transfer_length &&
+ mrb->nb_sectors + nb_sectors > max_transfer_length) {
+ merge = false;
+ }
+
+ /* requests are not sequential */
+ if (mrb->num_reqs && mrb->sector_num + mrb->nb_sectors != sector_num) {
+ merge = false;
+ }
+
+ if (!merge) {
+ virtio_submit_multireq(req->dev->blk, mrb);
+ }
+
+ if (mrb->num_reqs == 0) {
+ mrb->sector_num = sector_num;
+ }
+
+ mrb->nb_sectors += req->qiov.size / BDRV_SECTOR_SIZE;
+ mrb->reqs[mrb->num_reqs] = req;
+ mrb->niov += req->qiov.niov;
+ mrb->num_reqs++;
} else {
virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
virtio_blk_free_request(req);
@@ -431,9 +460,8 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIOBlock *s = VIRTIO_BLK(vdev);
VirtIOBlockReq *req;
- MultiReqBuffer mrb = {
- .num_writes = 0,
- };
+ MultiReqBuffer mrb_rd = {};
+ MultiReqBuffer mrb_wr = {.is_write = 1};
/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
* dataplane here instead of waiting for .set_status().
@@ -444,10 +472,11 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
}
while ((req = virtio_blk_get_request(s))) {
- virtio_blk_handle_request(req, &mrb);
+ virtio_blk_handle_request(req, &mrb_wr, &mrb_rd);
}
- virtio_submit_multiwrite(s->blk, &mrb);
+ virtio_submit_multireq(s->blk, &mrb_wr);
+ virtio_submit_multireq(s->blk, &mrb_rd);
/*
* FIXME: Want to check for completions before returning to guest mode,
@@ -460,9 +489,8 @@ static void virtio_blk_dma_restart_bh(void *opaque)
{
VirtIOBlock *s = opaque;
VirtIOBlockReq *req = s->rq;
- MultiReqBuffer mrb = {
- .num_writes = 0,
- };
+ MultiReqBuffer mrb_rd = {};
+ MultiReqBuffer mrb_wr = {.is_write = 1};
qemu_bh_delete(s->bh);
s->bh = NULL;
@@ -471,11 +499,12 @@ static void virtio_blk_dma_restart_bh(void *opaque)
while (req) {
VirtIOBlockReq *next = req->next;
- virtio_blk_handle_request(req, &mrb);
+ virtio_blk_handle_request(req, &mrb_wr, &mrb_rd);
req = next;
}
- virtio_submit_multiwrite(s->blk, &mrb);
+ virtio_submit_multireq(s->blk, &mrb_wr);
+ virtio_submit_multireq(s->blk, &mrb_rd);
}
static void virtio_blk_dma_restart_cb(void *opaque, int running,
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 3f2652f..98d3d83 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -134,13 +134,6 @@ typedef struct VirtIOBlock {
struct VirtIOBlockDataPlane *dataplane;
} VirtIOBlock;
-#define VIRTIO_BLK_MAX_MERGE_REQS 32
-
-typedef struct MultiReqBuffer {
- BlockRequest blkreq[VIRTIO_BLK_MAX_MERGE_REQS];
- unsigned int num_writes;
-} MultiReqBuffer;
-
typedef struct VirtIOBlockReq {
VirtIOBlock *dev;
VirtQueueElement elem;
@@ -149,8 +142,21 @@ typedef struct VirtIOBlockReq {
QEMUIOVector qiov;
struct VirtIOBlockReq *next;
BlockAcctCookie acct;
+ QEMUIOVector mr_qiov;
+ struct VirtIOBlockReq *mr_next;
} VirtIOBlockReq;
+#define MAX_MERGE_REQS 32
+
+typedef struct MultiReqBuffer {
+ VirtIOBlockReq *reqs[MAX_MERGE_REQS];
+ unsigned int num_reqs;
+ bool is_write;
+ int niov;
+ int64_t sector_num;
+ int nb_sectors;
+} MultiReqBuffer;
+
VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s);
void virtio_blk_free_request(VirtIOBlockReq *req);
@@ -158,8 +164,9 @@ void virtio_blk_free_request(VirtIOBlockReq *req);
int virtio_blk_handle_scsi_req(VirtIOBlock *blk,
VirtQueueElement *elem);
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
+void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb_wr,
+ MultiReqBuffer *mrb_rd);
-void virtio_submit_multiwrite(BlockBackend *blk, MultiReqBuffer *mrb);
+void virtio_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb);
#endif
diff --git a/trace-events b/trace-events
index b5722ea..1b2fbf3 100644
--- a/trace-events
+++ b/trace-events
@@ -116,6 +116,7 @@ virtio_blk_req_complete(void *req, int status) "req %p status %d"
virtio_blk_rw_complete(void *req, int ret) "req %p ret %d"
virtio_blk_handle_write(void *req, uint64_t sector, size_t nsectors) "req %p sector %"PRIu64" nsectors %zu"
virtio_blk_handle_read(void *req, uint64_t sector, size_t nsectors) "req %p sector %"PRIu64" nsectors %zu"
+virtio_blk_submit_multireq(void *mrb, int num_reqs, uint64_t sector, size_t nsectors, bool is_write) "mrb %p num_reqs %d sector %"PRIu64" nsectors %zu is_write %d"
# hw/block/dataplane/virtio-blk.c
virtio_blk_data_plane_start(void *s) "dataplane %p"
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests Peter Lieven
@ 2014-12-05 14:58 ` Eric Blake
2014-12-09 15:43 ` Peter Lieven
0 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2014-12-05 14:58 UTC (permalink / raw)
To: Peter Lieven, qemu-devel
Cc: kwolf, famz, benoit, ming.lei, armbru, mreitz, stefanha, pbonzini
[-- Attachment #1: Type: text/plain, Size: 976 bytes --]
On 12/05/2014 04:50 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
> block.c | 2 ++
> block/accounting.c | 7 +++++++
> block/qapi.c | 2 ++
> hmp.c | 6 +++++-
> include/block/accounting.h | 3 +++
> qapi/block-core.json | 9 ++++++++-
> qmp-commands.hx | 22 ++++++++++++++++++----
> 7 files changed, 45 insertions(+), 6 deletions(-)
> @@ -2284,6 +2288,8 @@ Example:
> "rd_total_times_ns":3465673657
> "flush_total_times_ns":49653
> "flush_operations":61,
> + "rd_merged":0,
> + "wr_merged":0
Oh my - we previously had invalid JSON in our example. Yay that this
fixes it :)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of merged requests
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of " Peter Lieven
@ 2014-12-05 14:59 ` Eric Blake
0 siblings, 0 replies; 10+ messages in thread
From: Eric Blake @ 2014-12-05 14:59 UTC (permalink / raw)
To: Peter Lieven, qemu-devel
Cc: kwolf, famz, benoit, ming.lei, armbru, mreitz, stefanha, pbonzini
[-- Attachment #1: Type: text/plain, Size: 497 bytes --]
On 12/05/2014 04:50 AM, Peter Lieven wrote:
> As it was not obvious (at least for me) where the 32 comes from;
> add a constant for it.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
> hw/block/virtio-blk.c | 2 +-
> include/hw/virtio/virtio-blk.h | 4 +++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread Peter Lieven
@ 2014-12-05 15:05 ` Eric Blake
2014-12-09 15:44 ` Peter Lieven
0 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2014-12-05 15:05 UTC (permalink / raw)
To: Peter Lieven, qemu-devel
Cc: kwolf, famz, benoit, ming.lei, armbru, mreitz, stefanha, pbonzini
[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]
On 12/05/2014 04:50 AM, Peter Lieven wrote:
> this patch finally introduce multiread support to virtio-blk while
s/introduce/introduces/
s/virtio-blk while/virtio-blk. While/
> multiwrite support was there for a long time read support was missing.
s/time/time,/
>
> To achieve this the patch does serveral things which might need futher
s/serveral/several/
s/futher/further/
> explaination:
s/explaination/explanation/
>
> - the whole merge and multireq logic is moved from block.c into
> virtio-blk. This is move is a preparation for directly creating a
> coroutine out of virtio-blk.
Can this move be done as a separate prerequisite patch? Mixing code
motion and new features in the same patch is harder to review.
>
> - requests are only merged if they are strictly sequential and no
s/sequential/sequential,/
> longer sorted. This simplification decreases overhead and reduces
> latency. It will also merge some requests which were unmergable before.
>
> The old algorithm took up to 32 requests sorted them and tried to merge
s/requests/requests,/
> them. The outcome was anything between 1 and 32 requests. In case of
> 32 requests there were 31 requests unnecessarily delayed.
>
> On the other hand lets imagine e.g. 16 unmergeable requests followed
s/lets/let's/
> by 32 mergable requests. The latter 32 requests would have been split
> into two 16 byte requests.
>
> Last the simplified logic allows for a fast path if we have only a
> single request in the multirequest. In this case the request is sent as
> ordinary request without mulltireq callbacks.
s/mulltireq/multireq/
>
> As a first benchmark I installed Ubuntu 14.04.1 on a ramdisk. The number of
> merged requests is in the same order while the latency is slightly decreased.
> One should not stick too much to the numbers because the number of wr_requests
> are highly fluctuant. I hope the numbers show that this patch is at least
> not causing too big harm:
>
I'll leave the actual patch review to developers more knowledgeable
about block behavior.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests
2014-12-05 14:58 ` Eric Blake
@ 2014-12-09 15:43 ` Peter Lieven
0 siblings, 0 replies; 10+ messages in thread
From: Peter Lieven @ 2014-12-09 15:43 UTC (permalink / raw)
To: Eric Blake, qemu-devel
Cc: kwolf, famz, benoit, ming.lei, armbru, mreitz, stefanha, pbonzini
On 05.12.2014 15:58, Eric Blake wrote:
> On 12/05/2014 04:50 AM, Peter Lieven wrote:
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> block.c | 2 ++
>> block/accounting.c | 7 +++++++
>> block/qapi.c | 2 ++
>> hmp.c | 6 +++++-
>> include/block/accounting.h | 3 +++
>> qapi/block-core.json | 9 ++++++++-
>> qmp-commands.hx | 22 ++++++++++++++++++----
>> 7 files changed, 45 insertions(+), 6 deletions(-)
>> @@ -2284,6 +2288,8 @@ Example:
>> "rd_total_times_ns":3465673657
>> "flush_total_times_ns":49653
>> "flush_operations":61,
>> + "rd_merged":0,
>> + "wr_merged":0
> Oh my - we previously had invalid JSON in our example. Yay that this
> fixes it :)
There is actually more invalid JSON, but I would fix that in a separate
patch apart from this series.
Peter
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread
2014-12-05 15:05 ` Eric Blake
@ 2014-12-09 15:44 ` Peter Lieven
0 siblings, 0 replies; 10+ messages in thread
From: Peter Lieven @ 2014-12-09 15:44 UTC (permalink / raw)
To: Eric Blake, qemu-devel
Cc: kwolf, famz, benoit, ming.lei, armbru, mreitz, stefanha, pbonzini
On 05.12.2014 16:05, Eric Blake wrote:
> On 12/05/2014 04:50 AM, Peter Lieven wrote:
>> this patch finally introduce multiread support to virtio-blk while
> s/introduce/introduces/
> s/virtio-blk while/virtio-blk. While/
>
>> multiwrite support was there for a long time read support was missing.
> s/time/time,/
>
>> To achieve this the patch does serveral things which might need futher
> s/serveral/several/
> s/futher/further/
>
>> explaination:
> s/explaination/explanation/
>
>> - the whole merge and multireq logic is moved from block.c into
>> virtio-blk. This is move is a preparation for directly creating a
>> coroutine out of virtio-blk.
> Can this move be done as a separate prerequisite patch? Mixing code
> motion and new features in the same patch is harder to review.
The issue is that the code is not movable. I could artifically use
the new code in a new patch only for write requests and add
another patch to add read merging. But I doubt that it would make
it easier to review but even harder.
Peter
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-12-09 15:45 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-05 11:50 [Qemu-devel] [RFC PATCH V2 0/4] virtio-blk: add multiread support Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 1/4] block: add accounting for merged requests Peter Lieven
2014-12-05 14:58 ` Eric Blake
2014-12-09 15:43 ` Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 2/4] hw/virtio-blk: add a constant for max number of " Peter Lieven
2014-12-05 14:59 ` Eric Blake
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 3/4] block-backend: expose bs->bl.max_transfer_length Peter Lieven
2014-12-05 11:50 ` [Qemu-devel] [RFC PATCH V2 4/4] virtio-blk: introduce multiread Peter Lieven
2014-12-05 15:05 ` Eric Blake
2014-12-09 15:44 ` Peter Lieven
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).