* [PATCH 00/11] kvm tools: Support native vectored AIO
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch set adds support for native vectored AIO ops. Instead of queueing
io ops on our own thread pool, we now submit them to the kernel.
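For reference, here is a minimal, self-contained sketch of the submit/reap cycle
this series builds on (libaio plus an eventfd for completion notification;
"disk.img" is a placeholder, error handling is omitted, link with -laio):

	#include <libaio.h>
	#include <sys/eventfd.h>
	#include <sys/uio.h>
	#include <fcntl.h>
	#include <stdint.h>
	#include <unistd.h>

	int main(void)
	{
		io_context_t ctx = 0;
		struct iocb iocb, *ios[1] = { &iocb };
		struct io_event event;
		char buf[4096];
		struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
		int fd = open("disk.img", O_RDONLY);
		int evt = eventfd(0, 0);
		uint64_t pending;

		io_setup(32, &ctx);			/* kernel-side AIO context */

		io_prep_preadv(&iocb, fd, &iov, 1, 0);	/* vectored read at offset 0 */
		io_set_eventfd(&iocb, evt);		/* signal evt when it completes */
		io_submit(ctx, 1, ios);			/* queue it, don't block */

		read(evt, &pending, sizeof(pending));	/* wait for completion(s) */
		io_getevents(ctx, 1, 1, &event, NULL);	/* reap the result */

		io_destroy(ctx);
		close(evt);
		close(fd);
		return 0;
	}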
Numbers - short story: overall throughput increase of at least 10%, with
sequential ops benefiting the most.
Numbers - long story:
Before:
Version 1.96 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 2 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
512 1G 550 97 26638 7 25050 4 2942 91 339188 35 1711 68
Latency 40602us 12269ms 2755ms 27333us 119ms 77816us
Version 1.96 ------Sequential Create------ --------Random Create--------
512 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 12172 22 +++++ +++ 23771 34 20209 36 +++++ +++ 22181 33
Latency 153ms 566us 571us 204us 56us 624us
1.96,1.96,512,2,1319996851,1G,,550,97,26638,7,25050,4,2942,91,339188,35,1711,68,16,,,,,12172,22,+++++,+++,23771,34,20209,36,+++++,+++,22181,33,40602us,12269ms,2755ms,27333us,119ms,77816us,153ms,566us,571us,204us,56us,624us
After:
Version 1.96 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 2 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
tux 512M 565 96 49307 19 33464 6 3270 99 923539 57 2599 64
Latency 15285us 240ms 2496ms 7478us 5586us 2878ms
Version 1.96 ------Sequential Create------ --------Random Create--------
tux -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 26992 48 +++++ +++ 27018 38 25618 45 +++++ +++ 24272 34
Latency 1145us 1311us 1708us 408us 186us 283us
1.96,1.96,tux,2,1319997274,512M,,565,96,49307,19,33464,6,3270,99,923539,57,2599,64,16,,,,,26992,48,+++++,+++,27018,38,25618,45,+++++,+++,24272,34,15285us,240ms,2496ms,7478us,5586us,2878ms,1145us,1311us,1708us,408us,186us,283us
I'd be happy if someone could run more benchmarks, as these look very
optimistic, but I kept getting similar results over and over.
Sasha Levin (11):
kvm tools: Switch to using an enum for disk image types
kvm tools: Hold a copy of ops struct inside disk_image
kvm tools: Remove the non-iov interface from disk image ops
kvm tools: Modify disk ops usage
kvm tools: Modify behaviour on missing ops ptr
kvm tools: Add optional callback on disk op completion
kvm tools: Remove qcow nowrite function
kvm tools: Split io request from completion
kvm tools: Hook virtio-blk completion to disk op completion
kvm tools: Add aio read write functions
kvm tools: Use native vectored AIO in virtio-blk
tools/kvm/Makefile | 1 +
tools/kvm/disk/core.c | 116 +++++++++++++++++++---------------
tools/kvm/disk/qcow.c | 58 ++++++++++++++---
tools/kvm/disk/raw.c | 87 +++++++++++++++++--------
tools/kvm/include/kvm/disk-image.h | 50 ++++++++-------
tools/kvm/include/kvm/read-write.h | 6 ++
tools/kvm/include/kvm/virtio-blk.h | 1 +
tools/kvm/read-write.c | 24 +++++++
tools/kvm/virtio/blk.c | 122 ++++++++++++++++++++++++------------
9 files changed, 312 insertions(+), 153 deletions(-)
--
1.7.7.1
* [PATCH 01/11] kvm tools: Switch to using an enum for disk image types
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/qcow.c | 8 ++++----
tools/kvm/disk/raw.c | 2 +-
tools/kvm/include/kvm/disk-image.h | 7 +++++--
3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index 167e97e..bdf6bd8 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -1107,9 +1107,9 @@ static struct disk_image *qcow2_probe(int fd, bool readonly)
* Do not use mmap use read/write instead
*/
if (readonly)
- disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_NOMMAP);
+ disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_REGULAR);
else
- disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_NOMMAP);
+ disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_REGULAR);
if (!disk_image)
goto free_refcount_table;
@@ -1238,9 +1238,9 @@ static struct disk_image *qcow1_probe(int fd, bool readonly)
* Do not use mmap use read/write instead
*/
if (readonly)
- disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_NOMMAP);
+ disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_REGULAR);
else
- disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_NOMMAP);
+ disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_REGULAR);
if (!disk_image)
goto free_l1_table;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 7f3f8db..66c52be 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -78,5 +78,5 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
/*
* Use read/write instead of mmap
*/
- return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_NOMMAP);
+ return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_REGULAR);
}
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 75b54f9..6acde53 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -21,8 +21,11 @@
#define SECTOR_SHIFT 9
#define SECTOR_SIZE (1UL << SECTOR_SHIFT)
-#define DISK_IMAGE_MMAP 0
-#define DISK_IMAGE_NOMMAP 1
+enum {
+ DISK_IMAGE_REGULAR,
+ DISK_IMAGE_MMAP,
+};
+
#define MAX_DISK_IMAGES 4
struct disk_image;
--
1.7.7.1
* [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This makes passing different ops structures easier, since the caller no longer
has to keep the ops structure alive somewhere else after initializing the
disk_image.
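For illustration, a hedged sketch of what this enables (hypothetical caller in
a probe function; fd and size are assumed to be in scope):

	struct disk_image_operations tmp_ops = {
		.read_sector_iov	= raw_image__read_sector_iov,
		.write_sector_iov	= raw_image__write_sector_iov,
	};
	struct disk_image *disk;

	disk = disk_image__new(fd, size, &tmp_ops, DISK_IMAGE_REGULAR);
	/* tmp_ops may go out of scope now; disk->ops holds its own copy */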
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 26 +++++++++++++-------------
tools/kvm/include/kvm/disk-image.h | 2 +-
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 20e1990..a914fef7 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -13,7 +13,7 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
disk->fd = fd;
disk->size = size;
- disk->ops = ops;
+ disk->ops = *ops;
if (use_mmap == DISK_IMAGE_MMAP) {
/*
@@ -96,8 +96,8 @@ error:
int disk_image__flush(struct disk_image *disk)
{
- if (disk->ops->flush)
- return disk->ops->flush(disk);
+ if (disk->ops.flush)
+ return disk->ops.flush(disk);
return fsync(disk->fd);
}
@@ -108,8 +108,8 @@ int disk_image__close(struct disk_image *disk)
if (!disk)
return 0;
- if (disk->ops->close)
- return disk->ops->close(disk);
+ if (disk->ops.close)
+ return disk->ops.close(disk);
if (close(disk->fd) < 0)
pr_warning("close() failed");
@@ -139,21 +139,21 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
if (debug_iodelay)
msleep(debug_iodelay);
- if (disk->ops->read_sector_iov) {
+ if (disk->ops.read_sector_iov) {
/*
* Try mulitple buffer based operation first
*/
- total = disk->ops->read_sector_iov(disk, sector, iov, iovcount);
+ total = disk->ops.read_sector_iov(disk, sector, iov, iovcount);
if (total < 0) {
pr_info("disk_image__read error: total=%ld\n", (long)total);
return -1;
}
- } else if (disk->ops->read_sector) {
+ } else if (disk->ops.read_sector) {
/*
* Fallback to single buffer based operation
*/
while (iovcount--) {
- nr = disk->ops->read_sector(disk, sector, iov->iov_base, iov->iov_len);
+ nr = disk->ops.read_sector(disk, sector, iov->iov_base, iov->iov_len);
if (nr != (ssize_t)iov->iov_len) {
pr_info("disk_image__read error: nr = %ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
return -1;
@@ -180,21 +180,21 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
if (debug_iodelay)
msleep(debug_iodelay);
- if (disk->ops->write_sector_iov) {
+ if (disk->ops.write_sector_iov) {
/*
* Try writev based operation first
*/
- total = disk->ops->write_sector_iov(disk, sector, iov, iovcount);
+ total = disk->ops.write_sector_iov(disk, sector, iov, iovcount);
if (total < 0) {
pr_info("disk_image__write error: total=%ld\n", (long)total);
return -1;
}
- } else if (disk->ops->write_sector) {
+ } else if (disk->ops.write_sector) {
/*
* Fallback to single buffer based operation
*/
while (iovcount--) {
- nr = disk->ops->write_sector(disk, sector, iov->iov_base, iov->iov_len);
+ nr = disk->ops.write_sector(disk, sector, iov->iov_base, iov->iov_len);
if (nr != (ssize_t)iov->iov_len) {
pr_info("disk_image__write error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
return -1;
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 6acde53..bb97cdf 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -50,7 +50,7 @@ struct disk_image_operations {
struct disk_image {
int fd;
u64 size;
- struct disk_image_operations *ops;
+ struct disk_image_operations ops;
void *priv;
};
--
1.7.7.1
* [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 48 ++++++----------------------------
tools/kvm/disk/qcow.c | 47 +++++++++++++++++++++++++++++++--
tools/kvm/disk/raw.c | 50 +++++++++++++++++++++---------------
tools/kvm/include/kvm/disk-image.h | 26 +++++++-----------
4 files changed, 91 insertions(+), 80 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index a914fef7..38aa99b 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -134,36 +134,19 @@ void disk_image__close_all(struct disk_image **disks, int count)
ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
ssize_t total = 0;
- ssize_t nr;
if (debug_iodelay)
msleep(debug_iodelay);
- if (disk->ops.read_sector_iov) {
- /*
- * Try mulitple buffer based operation first
- */
- total = disk->ops.read_sector_iov(disk, sector, iov, iovcount);
+ if (disk->ops.read_sector) {
+ total = disk->ops.read_sector(disk, sector, iov, iovcount);
if (total < 0) {
pr_info("disk_image__read error: total=%ld\n", (long)total);
return -1;
}
- } else if (disk->ops.read_sector) {
- /*
- * Fallback to single buffer based operation
- */
- while (iovcount--) {
- nr = disk->ops.read_sector(disk, sector, iov->iov_base, iov->iov_len);
- if (nr != (ssize_t)iov->iov_len) {
- pr_info("disk_image__read error: nr = %ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
- return -1;
- }
- sector += iov->iov_len >> SECTOR_SHIFT;
- iov++;
- total += nr;
- }
- } else
+ } else {
die("No disk image operation for read\n");
+ }
return total;
}
@@ -175,37 +158,22 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
ssize_t total = 0;
- ssize_t nr;
if (debug_iodelay)
msleep(debug_iodelay);
- if (disk->ops.write_sector_iov) {
+ if (disk->ops.write_sector) {
/*
* Try writev based operation first
*/
- total = disk->ops.write_sector_iov(disk, sector, iov, iovcount);
+ total = disk->ops.write_sector(disk, sector, iov, iovcount);
if (total < 0) {
pr_info("disk_image__write error: total=%ld\n", (long)total);
return -1;
}
- } else if (disk->ops.write_sector) {
- /*
- * Fallback to single buffer based operation
- */
- while (iovcount--) {
- nr = disk->ops.write_sector(disk, sector, iov->iov_base, iov->iov_len);
- if (nr != (ssize_t)iov->iov_len) {
- pr_info("disk_image__write error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
- return -1;
- }
-
- sector += iov->iov_len >> SECTOR_SHIFT;
- iov++;
- total += nr;
- }
- } else
+ } else {
die("No disk image operation for read\n");
+ }
return total;
}
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index bdf6bd8..288782e 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -452,7 +452,7 @@ out_error:
return -1;
}
-static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
+static ssize_t qcow_read_sector_single(struct disk_image *disk, u64 sector,
void *dst, u32 dst_len)
{
struct qcow *q = disk->priv;
@@ -488,6 +488,26 @@ static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
return dst_len;
}
+static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount)
+{
+ ssize_t nr, total = 0;
+
+ while (iovcount--) {
+ nr = qcow_read_sector_single(disk, sector, iov->iov_base, iov->iov_len);
+ if (nr != (ssize_t)iov->iov_len) {
+ pr_info("qcow_read_sector error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
+ return -1;
+ }
+
+ sector += iov->iov_len >> SECTOR_SHIFT;
+ iov++;
+ total += nr;
+ }
+
+ return total;
+}
+
static inline u64 file_size(int fd)
{
struct stat st;
@@ -867,7 +887,7 @@ error:
return -1;
}
-static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+static ssize_t qcow_write_sector_single(struct disk_image *disk, u64 sector, void *src, u32 src_len)
{
struct qcow *q = disk->priv;
struct qcow_header *header = q->header;
@@ -896,7 +916,28 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector, void *src,
return nr_written;
}
-static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount)
+{
+ ssize_t nr, total = 0;
+
+ while (iovcount--) {
+ nr = qcow_write_sector_single(disk, sector, iov->iov_base, iov->iov_len);
+ if (nr != (ssize_t)iov->iov_len) {
+ pr_info("qcow_write_sector error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
+ return -1;
+ }
+
+ sector += iov->iov_len >> SECTOR_SHIFT;
+ iov++;
+ total += nr;
+ }
+
+ return total;
+}
+
+static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount)
{
/* I/O error */
pr_info("%s: no write support\n", __func__);
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 66c52be..1c6a985 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,41 +1,49 @@
#include "kvm/disk-image.h"
-ssize_t raw_image__read_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
return preadv_in_full(disk->fd, iov, iovcount, offset);
}
-ssize_t raw_image__write_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
return pwritev_in_full(disk->fd, iov, iovcount, offset);
}
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, void *dst, u32 dst_len)
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
+ ssize_t nr, total = 0;
- if (offset + dst_len > disk->size)
- return -1;
+ while (iovcount--) {
+ memcpy(iov->iov_base, disk->priv + offset, iov->iov_len);
- memcpy(dst, disk->priv + offset, dst_len);
+ sector += iov->iov_len >> SECTOR_SHIFT;
+ iov++;
+ total += nr;
+ }
- return dst_len;
+ return total;
}
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
+ ssize_t nr, total = 0;
- if (offset + src_len > disk->size)
- return -1;
+ while (iovcount--) {
+ memcpy(disk->priv + offset, iov->iov_base, iov->iov_len);
- memcpy(disk->priv + offset, src, src_len);
+ sector += iov->iov_len >> SECTOR_SHIFT;
+ iov++;
+ total += nr;
+ }
- return src_len;
+ return total;
}
int raw_image__close(struct disk_image *disk)
@@ -51,17 +59,17 @@ int raw_image__close(struct disk_image *disk)
/*
* multiple buffer based disk image operations
*/
-static struct disk_image_operations raw_image_iov_ops = {
- .read_sector_iov = raw_image__read_sector_iov,
- .write_sector_iov = raw_image__write_sector_iov,
+static struct disk_image_operations raw_image_regular_ops = {
+ .read_sector = raw_image__read_sector,
+ .write_sector = raw_image__write_sector,
};
/*
- * single buffer based disk image operations
+ * mmap() based disk image operations
*/
-static struct disk_image_operations raw_image_ops = {
- .read_sector = raw_image__read_sector,
- .write_sector = raw_image__write_sector,
+static struct disk_image_operations raw_image_mmap_ops = {
+ .read_sector = raw_image__read_sector_mmap,
+ .write_sector = raw_image__write_sector_mmap,
.close = raw_image__close,
};
@@ -73,10 +81,10 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
* Use mmap's MAP_PRIVATE to implement non-persistent write
* FIXME: This does not work on 32-bit host.
*/
- return disk_image__new(fd, st->st_size, &raw_image_ops, DISK_IMAGE_MMAP);
+ return disk_image__new(fd, st->st_size, &raw_image_mmap_ops, DISK_IMAGE_MMAP);
else
/*
* Use read/write instead of mmap
*/
- return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_REGULAR);
+ return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
}
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bb97cdf..5087b9e 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -31,18 +31,8 @@ enum {
struct disk_image;
struct disk_image_operations {
- /*
- * The following two are used for reading or writing with a single buffer.
- * The implentation can use readv/writev or memcpy.
- */
- ssize_t (*read_sector)(struct disk_image *disk, u64 sector, void *dst, u32 dst_len);
- ssize_t (*write_sector)(struct disk_image *disk, u64 sector, void *src, u32 src_len);
- /*
- * The following two are used for reading or writing with multiple buffers.
- * The implentation can use readv/writev or memcpy.
- */
- ssize_t (*read_sector_iov)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
- ssize_t (*write_sector_iov)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+ ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+ ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
int (*flush)(struct disk_image *disk);
int (*close)(struct disk_image *disk);
};
@@ -67,10 +57,14 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly);
struct disk_image *blkdev__probe(const char *filename, struct stat *st);
-ssize_t raw_image__read_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t raw_image__write_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, void *dst, u32 dst_len);
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len);
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount);
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount);
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount);
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
+ const struct iovec *iov, int iovcount);
int raw_image__close(struct disk_image *disk);
#endif /* KVM__DISK_IMAGE_H */
--
1.7.7.1
* [PATCH 04/11] kvm tools: Modify disk ops usage
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch modifies the definition and usage of ops for read-only, mmap and
regular IO.
There is no longer a mix between iov and mmap, and read-only no longer implies
mmap (although it is still tried first).
This allows more flexibility in defining different ops for different
scenarios.
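For illustration, a hedged sketch of the fallback this makes possible
(fallback_ops is a placeholder; the real code is in the raw.c hunk below):
disk_image__new() now returns NULL instead of dying when mmap() fails, so a
probe can retry with regular read/write ops:

	disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_MMAP);
	if (!disk) {
		/* mmap() failed (e.g. image too large for a 32-bit host):
		 * fall back to read/write based ops */
		disk = disk_image__new(fd, st->st_size, &fallback_ops, DISK_IMAGE_REGULAR);
	}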
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 6 ++++--
tools/kvm/disk/raw.c | 40 ++++++++++++++++++++++++----------------
2 files changed, 28 insertions(+), 18 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 38aa99b..6048871 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -20,8 +20,10 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
* The write to disk image will be discarded
*/
disk->priv = mmap(NULL, size, PROT_RW, MAP_PRIVATE | MAP_NORESERVE, fd, 0);
- if (disk->priv == MAP_FAILED)
- die("mmap() failed");
+ if (disk->priv == MAP_FAILED) {
+ free(disk);
+ disk = NULL;
+ }
}
return disk;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 1c6a985..2ed1c07 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -17,14 +17,15 @@ ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struc
ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
- ssize_t nr, total = 0;
+ ssize_t total = 0;
while (iovcount--) {
memcpy(iov->iov_base, disk->priv + offset, iov->iov_len);
sector += iov->iov_len >> SECTOR_SHIFT;
+ offset += iov->iov_len;
+ total += iov->iov_len;
iov++;
- total += nr;
}
return total;
@@ -33,14 +34,15 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const s
ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
{
u64 offset = sector << SECTOR_SHIFT;
- ssize_t nr, total = 0;
+ ssize_t total = 0;
while (iovcount--) {
memcpy(disk->priv + offset, iov->iov_base, iov->iov_len);
sector += iov->iov_len >> SECTOR_SHIFT;
+ offset += iov->iov_len;
+ total += iov->iov_len;
iov++;
- total += nr;
}
return total;
@@ -64,27 +66,33 @@ static struct disk_image_operations raw_image_regular_ops = {
.write_sector = raw_image__write_sector,
};
-/*
- * mmap() based disk image operations
- */
-static struct disk_image_operations raw_image_mmap_ops = {
- .read_sector = raw_image__read_sector_mmap,
- .write_sector = raw_image__write_sector_mmap,
- .close = raw_image__close,
-};
-
struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
{
- if (readonly)
+ if (readonly) {
/*
* Use mmap's MAP_PRIVATE to implement non-persistent write
* FIXME: This does not work on 32-bit host.
*/
- return disk_image__new(fd, st->st_size, &raw_image_mmap_ops, DISK_IMAGE_MMAP);
- else
+ struct disk_image *disk;
+ struct disk_image_operations ro_ops = {
+ .read_sector = raw_image__read_sector_mmap,
+ .write_sector = raw_image__write_sector_mmap,
+ .close = raw_image__close,
+ };
+
+ disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_MMAP);
+ if (disk == NULL) {
+ ro_ops = raw_image_regular_ops;
+ ro_ops.write_sector = NULL;
+ disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_REGULAR);
+ }
+
+ return disk;
+ } else {
/*
* Use read/write instead of mmap
*/
return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+ }
}
--
1.7.7.1
* [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
If a read or write op pointer is missing, simply ignore the request instead of
failing fatally. This provides an easier way to prevent reads or writes in
specific scenarios (for example, a read-only disk can simply leave write_sector
unset, as sketched below).
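A minimal sketch of what this looks like in practice (mirroring how the raw
read-only path uses it; not a complete ops definition):

	static struct disk_image_operations ro_ops = {
		.read_sector	= raw_image__read_sector,
		/* .write_sector left NULL: writes are now silently dropped */
	};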
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 6048871..d76bd55 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -147,7 +147,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
return -1;
}
} else {
- die("No disk image operation for read\n");
+ /* Do nothing */
}
return total;
@@ -174,7 +174,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
return -1;
}
} else {
- die("No disk image operation for read\n");
+ /* Do nothing */
}
return total;
--
1.7.7.1
* [PATCH 06/11] kvm tools: Add optional callback on disk op completion
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch adds an optional callback to be called when a disk op completes.
Currently there's not much use for it, but it is the infrastructure for adding
AIO support.
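A hedged usage sketch (my_request, my_io_done and my_submit are hypothetical;
only disk_image__set_callback() and disk_image__read() come from this patch):

	struct my_request {
		u64		sector;
		struct iovec	*iov;
		int		iovcount;
	};

	static void my_io_done(void *param)
	{
		struct my_request *req = param;

		/* the disk op that carried 'req' has finished; complete it here */
	}

	static void my_submit(struct disk_image *disk, struct my_request *req)
	{
		disk_image__set_callback(disk, my_io_done);
		disk_image__read(disk, req->sector, req->iov, req->iovcount, req);
	}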
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 21 +++++++++++++++++----
tools/kvm/disk/qcow.c | 6 +++---
tools/kvm/disk/raw.c | 12 ++++++++----
tools/kvm/include/kvm/disk-image.h | 24 +++++++++++++++---------
tools/kvm/virtio/blk.c | 6 ++++--
5 files changed, 47 insertions(+), 22 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index d76bd55..842fef4 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -133,7 +133,8 @@ void disk_image__close_all(struct disk_image **disks, int count)
* Fill iov with disk data, starting from sector 'sector'.
* Return amount of bytes read.
*/
-ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
ssize_t total = 0;
@@ -141,7 +142,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
msleep(debug_iodelay);
if (disk->ops.read_sector) {
- total = disk->ops.read_sector(disk, sector, iov, iovcount);
+ total = disk->ops.read_sector(disk, sector, iov, iovcount, param);
if (total < 0) {
pr_info("disk_image__read error: total=%ld\n", (long)total);
return -1;
@@ -150,6 +151,9 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
/* Do nothing */
}
+ if (disk->disk_req_cb)
+ disk->disk_req_cb(param);
+
return total;
}
@@ -157,7 +161,8 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
* Write iov to disk, starting from sector 'sector'.
* Return amount of bytes written.
*/
-ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
ssize_t total = 0;
@@ -168,7 +173,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
/*
* Try writev based operation first
*/
- total = disk->ops.write_sector(disk, sector, iov, iovcount);
+ total = disk->ops.write_sector(disk, sector, iov, iovcount, param);
if (total < 0) {
pr_info("disk_image__write error: total=%ld\n", (long)total);
return -1;
@@ -177,6 +182,9 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
/* Do nothing */
}
+ if (disk->disk_req_cb)
+ disk->disk_req_cb(param);
+
return total;
}
@@ -190,3 +198,8 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
*len = snprintf(buffer, *len, "%llu%llu%llu", (u64)st.st_dev, (u64)st.st_rdev, (u64)st.st_ino);
return *len;
}
+
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
+{
+ disk->disk_req_cb = disk_req_cb;
+}
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index 288782e..ba65ab6 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -489,7 +489,7 @@ static ssize_t qcow_read_sector_single(struct disk_image *disk, u64 sector,
}
static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount)
+ const struct iovec *iov, int iovcount, void *param)
{
ssize_t nr, total = 0;
@@ -917,7 +917,7 @@ static ssize_t qcow_write_sector_single(struct disk_image *disk, u64 sector, voi
}
static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount)
+ const struct iovec *iov, int iovcount, void *param)
{
ssize_t nr, total = 0;
@@ -937,7 +937,7 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
}
static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount)
+ const struct iovec *iov, int iovcount, void *param)
{
/* I/O error */
pr_info("%s: no write support\n", __func__);
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 2ed1c07..fa2e013 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,20 +1,23 @@
#include "kvm/disk-image.h"
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
u64 offset = sector << SECTOR_SHIFT;
return preadv_in_full(disk->fd, iov, iovcount, offset);
}
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
u64 offset = sector << SECTOR_SHIFT;
return pwritev_in_full(disk->fd, iov, iovcount, offset);
}
-ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
u64 offset = sector << SECTOR_SHIFT;
ssize_t total = 0;
@@ -31,7 +34,8 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const s
return total;
}
-ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param)
{
u64 offset = sector << SECTOR_SHIFT;
ssize_t total = 0;
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 5087b9e..bae9744 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -31,8 +31,10 @@ enum {
struct disk_image;
struct disk_image_operations {
- ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
- ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+ ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param);
+ ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param);
int (*flush)(struct disk_image *disk);
int (*close)(struct disk_image *disk);
};
@@ -42,6 +44,8 @@ struct disk_image {
u64 size;
struct disk_image_operations ops;
void *priv;
+ void *disk_req_cb_param;
+ void (*disk_req_cb)(void *param);
};
struct disk_image *disk_image__open(const char *filename, bool readonly);
@@ -50,21 +54,23 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
int disk_image__close(struct disk_image *disk);
void disk_image__close_all(struct disk_image **disks, int count);
int disk_image__flush(struct disk_image *disk);
-ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param);
+ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov,
+ int iovcount, void *param);
ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *len);
struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly);
struct disk_image *blkdev__probe(const char *filename, struct stat *st);
ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount);
+ const struct iovec *iov, int iovcount, void *param);
ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount);
+ const struct iovec *iov, int iovcount, void *param);
ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount);
+ const struct iovec *iov, int iovcount, void *param);
ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount);
+ const struct iovec *iov, int iovcount, void *param);
int raw_image__close(struct disk_image *disk);
-
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
#endif /* KVM__DISK_IMAGE_H */
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 6ffa753..9db293e 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -77,10 +77,12 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
switch (req->type) {
case VIRTIO_BLK_T_IN:
- block_cnt = disk_image__read(bdev->disk, req->sector, iov + 1, in + out - 2);
+ block_cnt = disk_image__read(bdev->disk, req->sector, iov + 1,
+ in + out - 2, NULL);
break;
case VIRTIO_BLK_T_OUT:
- block_cnt = disk_image__write(bdev->disk, req->sector, iov + 1, in + out - 2);
+ block_cnt = disk_image__write(bdev->disk, req->sector, iov + 1,
+ in + out - 2, NULL);
break;
case VIRTIO_BLK_T_FLUSH:
block_cnt = disk_image__flush(bdev->disk);
--
1.7.7.1
* [PATCH 07/11] kvm tools: Remove qcow nowrite function
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
It is no longer needed due to previous changes.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/qcow.c | 9 ---------
1 files changed, 0 insertions(+), 9 deletions(-)
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index ba65ab6..f9598de 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -936,14 +936,6 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
return total;
}
-static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
- const struct iovec *iov, int iovcount, void *param)
-{
- /* I/O error */
- pr_info("%s: no write support\n", __func__);
- return -1;
-}
-
static int qcow_disk_flush(struct disk_image *disk)
{
struct qcow *q = disk->priv;
@@ -1013,7 +1005,6 @@ static int qcow_disk_close(struct disk_image *disk)
static struct disk_image_operations qcow_disk_readonly_ops = {
.read_sector = qcow_read_sector,
- .write_sector = qcow_nowrite_sector,
.close = qcow_disk_close,
};
--
1.7.7.1
* [PATCH 08/11] kvm tools: Split io request from completion
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch splits IO request processing from completion notification.
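For context, a hedged sketch of a completion handler using the new
(param, len) signature; len is the byte count on success and negative on error.
Once the AIO backend lands later in the series this may run on a different
thread than the submitter, which is why the request pool below is protected by
req_mutex:

	/* my_request and its fields are hypothetical */
	static void my_blk_complete(void *param, long len)
	{
		struct my_request *req = param;

		req->error = (len < 0);
		/* update the used ring, signal the guest, return req to its pool */
	}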
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 6 +-
tools/kvm/include/kvm/disk-image.h | 4 +-
tools/kvm/include/kvm/virtio-blk.h | 1 +
tools/kvm/virtio/blk.c | 109 ++++++++++++++++++++++++------------
4 files changed, 79 insertions(+), 41 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 842fef4..557a1a2 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -152,7 +152,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
}
if (disk->disk_req_cb)
- disk->disk_req_cb(param);
+ disk->disk_req_cb(param, total);
return total;
}
@@ -183,7 +183,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
}
if (disk->disk_req_cb)
- disk->disk_req_cb(param);
+ disk->disk_req_cb(param, total);
return total;
}
@@ -199,7 +199,7 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
return *len;
}
-void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len))
{
disk->disk_req_cb = disk_req_cb;
}
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bae9744..bcb3396 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -45,7 +45,7 @@ struct disk_image {
struct disk_image_operations ops;
void *priv;
void *disk_req_cb_param;
- void (*disk_req_cb)(void *param);
+ void (*disk_req_cb)(void *param, long len);
};
struct disk_image *disk_image__open(const char *filename, bool readonly);
@@ -72,5 +72,5 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
const struct iovec *iov, int iovcount, void *param);
int raw_image__close(struct disk_image *disk);
-void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len));
#endif /* KVM__DISK_IMAGE_H */
diff --git a/tools/kvm/include/kvm/virtio-blk.h b/tools/kvm/include/kvm/virtio-blk.h
index 8c4fb91..63731a9 100644
--- a/tools/kvm/include/kvm/virtio-blk.h
+++ b/tools/kvm/include/kvm/virtio-blk.h
@@ -8,5 +8,6 @@ struct kvm;
void virtio_blk__init(struct kvm *kvm, struct disk_image *disk);
void virtio_blk__init_all(struct kvm *kvm);
void virtio_blk__delete_all(struct kvm *kvm);
+void virtio_blk_complete(void *param, long len);
#endif /* KVM__BLK_VIRTIO_H */
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 9db293e..0688f82 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -29,17 +29,21 @@
*/
#define DISK_SEG_MAX (VIRTIO_BLK_QUEUE_SIZE - 2)
-struct blk_dev_job {
+struct blk_dev_req {
+ struct list_head list;
struct virt_queue *vq;
struct blk_dev *bdev;
struct iovec iov[VIRTIO_BLK_QUEUE_SIZE];
u16 out, in, head;
- struct thread_pool__job job_id;
+ struct kvm *kvm;
};
struct blk_dev {
pthread_mutex_t mutex;
+ pthread_mutex_t req_mutex;
+
struct list_head list;
+ struct list_head req_list;
struct virtio_pci vpci;
struct virtio_blk_config blk_config;
@@ -47,41 +51,76 @@ struct blk_dev {
u32 features;
struct virt_queue vqs[NUM_VIRT_QUEUES];
- struct blk_dev_job jobs[VIRTIO_BLK_QUEUE_SIZE];
- u16 job_idx;
+ struct blk_dev_req reqs[VIRTIO_BLK_QUEUE_SIZE];
};
static LIST_HEAD(bdevs);
static int compat_id;
-static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
+static struct blk_dev_req *virtio_blk_req_pop(struct blk_dev *bdev)
+{
+ struct blk_dev_req *req;
+
+ mutex_lock(&bdev->req_mutex);
+ req = list_first_entry(&bdev->req_list, struct blk_dev_req, list);
+ list_del_init(&req->list);
+ mutex_unlock(&bdev->req_mutex);
+
+ return req;
+}
+
+static void virtio_blk_req_push(struct blk_dev *bdev, struct blk_dev_req *req)
{
- struct virtio_blk_outhdr *req;
+ mutex_lock(&bdev->req_mutex);
+ list_add(&req->list, &bdev->req_list);
+ mutex_unlock(&bdev->req_mutex);
+}
+
+void virtio_blk_complete(void *param, long len)
+{
+ struct blk_dev_req *req = param;
+ struct blk_dev *bdev = req->bdev;
+ int queueid = req->vq - bdev->vqs;
u8 *status;
+
+ /* status */
+ status = req->iov[req->out + req->in - 1].iov_base;
+ *status = (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+
+ mutex_lock(&bdev->mutex);
+ virt_queue__set_used_elem(req->vq, req->head, len);
+ mutex_unlock(&bdev->mutex);
+
+ virtio_pci__signal_vq(req->kvm, &bdev->vpci, queueid);
+
+ virtio_blk_req_push(req->bdev, req);
+}
+
+static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
+{
+ struct virtio_blk_outhdr *req_hdr;
ssize_t block_cnt;
- struct blk_dev_job *job;
struct blk_dev *bdev;
struct virt_queue *queue;
struct iovec *iov;
u16 out, in, head;
block_cnt = -1;
- job = param;
- bdev = job->bdev;
- queue = job->vq;
- iov = job->iov;
- out = job->out;
- in = job->in;
- head = job->head;
- req = iov[0].iov_base;
-
- switch (req->type) {
+ bdev = req->bdev;
+ queue = req->vq;
+ iov = req->iov;
+ out = req->out;
+ in = req->in;
+ head = req->head;
+ req_hdr = iov[0].iov_base;
+
+ switch (req_hdr->type) {
case VIRTIO_BLK_T_IN:
- block_cnt = disk_image__read(bdev->disk, req->sector, iov + 1,
+ block_cnt = disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
in + out - 2, NULL);
break;
case VIRTIO_BLK_T_OUT:
- block_cnt = disk_image__write(bdev->disk, req->sector, iov + 1,
+ block_cnt = disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
in + out - 2, NULL);
break;
case VIRTIO_BLK_T_FLUSH:
@@ -92,35 +131,27 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
break;
default:
- pr_warning("request type %d", req->type);
+ pr_warning("request type %d", req_hdr->type);
block_cnt = -1;
break;
}
- /* status */
- status = iov[out + in - 1].iov_base;
- *status = (block_cnt < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
-
- mutex_lock(&bdev->mutex);
- virt_queue__set_used_elem(queue, head, block_cnt);
- mutex_unlock(&bdev->mutex);
-
- virtio_pci__signal_vq(kvm, &bdev->vpci, queue - bdev->vqs);
+ virtio_blk_complete(req, block_cnt);
}
static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
{
while (virt_queue__available(vq)) {
- struct blk_dev_job *job = &bdev->jobs[bdev->job_idx++ % VIRTIO_BLK_QUEUE_SIZE];
+ struct blk_dev_req *req = virtio_blk_req_pop(bdev);
- *job = (struct blk_dev_job) {
- .vq = vq,
- .bdev = bdev,
+ *req = (struct blk_dev_req) {
+ .vq = vq,
+ .bdev = bdev,
+ .kvm = kvm,
};
- job->head = virt_queue__get_iov(vq, job->iov, &job->out, &job->in, kvm);
+ req->head = virt_queue__get_iov(vq, req->iov, &req->out, &req->in, kvm);
- thread_pool__init_job(&job->job_id, kvm, virtio_blk_do_io_request, job);
- thread_pool__do_job(&job->job_id);
+ virtio_blk_do_io_request(kvm, req);
}
}
@@ -191,6 +222,7 @@ static int get_size_vq(struct kvm *kvm, void *dev, u32 vq)
void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
{
struct blk_dev *bdev;
+ size_t i;
if (!disk)
return;
@@ -201,6 +233,7 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
*bdev = (struct blk_dev) {
.mutex = PTHREAD_MUTEX_INITIALIZER,
+ .req_mutex = PTHREAD_MUTEX_INITIALIZER,
.disk = disk,
.blk_config = (struct virtio_blk_config) {
.capacity = disk->size / SECTOR_SIZE,
@@ -222,6 +255,10 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
list_add_tail(&bdev->list, &bdevs);
+ INIT_LIST_HEAD(&bdev->req_list);
+ for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
+ list_add(&bdev->reqs[i].list, &bdev->req_list);
+
if (compat_id != -1)
compat_id = compat__add_message("virtio-blk device was not detected",
"While you have requested a virtio-blk device, "
--
1.7.7.1
* [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch connects the completion processing in virtio-blk to the completion
notification coming from the disk image layer.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/virtio/blk.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 0688f82..c167dc9 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -117,26 +117,26 @@ static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
switch (req_hdr->type) {
case VIRTIO_BLK_T_IN:
block_cnt = disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
- in + out - 2, NULL);
+ in + out - 2, req);
break;
case VIRTIO_BLK_T_OUT:
block_cnt = disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
- in + out - 2, NULL);
+ in + out - 2, req);
break;
case VIRTIO_BLK_T_FLUSH:
block_cnt = disk_image__flush(bdev->disk);
+ virtio_blk_complete(req, block_cnt);
break;
case VIRTIO_BLK_T_GET_ID:
block_cnt = VIRTIO_BLK_ID_BYTES;
disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
+ virtio_blk_complete(req, block_cnt);
break;
default:
pr_warning("request type %d", req_hdr->type);
block_cnt = -1;
break;
}
-
- virtio_blk_complete(req, block_cnt);
}
static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
@@ -259,6 +259,8 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
list_add(&bdev->reqs[i].list, &bdev->req_list);
+ disk_image__set_callback(bdev->disk, virtio_blk_complete);
+
if (compat_id != -1)
compat_id = compat__add_message("virtio-blk device was not detected",
"While you have requested a virtio-blk device, "
--
1.7.7.1
* [PATCH 10/11] kvm tools: Add aio read write functions
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch adds basic native vectored AIO functions.
These functions should later be optimized to process multiple IO
requests at once.
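A hedged usage sketch of the new helpers (ctx, fd, evtfd, buf, len, offset,
my_param and handle_completion are placeholders; the kernel copies the iocb at
submit time, and the param pointer comes back in event.data):

	struct iocb iocb;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct io_event event;

	if (aio_preadv(ctx, &iocb, fd, &iov, 1, offset, evtfd, my_param) < 0)
		pr_warning("io_submit() failed");

	/* later, once evtfd becomes readable: */
	if (io_getevents(ctx, 1, 1, &event, NULL) == 1)
		handle_completion(event.data, event.res);	/* event.data == my_param */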
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/Makefile | 1 +
tools/kvm/include/kvm/read-write.h | 6 ++++++
tools/kvm/read-write.c | 24 ++++++++++++++++++++++++
3 files changed, 31 insertions(+), 0 deletions(-)
diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index d9baa69..6095440 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -124,6 +124,7 @@ OBJS += bios/bios-rom.o
LIBS += -lrt
LIBS += -lpthread
LIBS += -lutil
+LIBS += -laio
# Additional ARCH settings for x86
ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
diff --git a/tools/kvm/include/kvm/read-write.h b/tools/kvm/include/kvm/read-write.h
index 3351103..89c1590 100644
--- a/tools/kvm/include/kvm/read-write.h
+++ b/tools/kvm/include/kvm/read-write.h
@@ -4,6 +4,7 @@
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>
+#include <libaio.h>
ssize_t xread(int fd, void *buf, size_t count);
ssize_t xwrite(int fd, const void *buf, size_t count);
@@ -29,4 +30,9 @@ ssize_t xpwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);
ssize_t preadv_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset);
ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+ off_t offset, int ev, void *param);
+int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+ off_t offset, int ev, void *param);
+
#endif /* KVM_READ_WRITE_H */
diff --git a/tools/kvm/read-write.c b/tools/kvm/read-write.c
index 737fb26..be5a4fa 100644
--- a/tools/kvm/read-write.c
+++ b/tools/kvm/read-write.c
@@ -316,3 +316,27 @@ ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offse
return total;
}
+
+int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+ off_t offset, int ev, void *param)
+{
+ struct iocb *ios[1] = { iocb };
+
+ io_prep_pwritev(iocb, fd, iov, iovcnt, offset);
+ io_set_eventfd(iocb, ev);
+ iocb->data = param;
+
+ return io_submit(ctx, 1, ios);
+}
+
+int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+ off_t offset, int ev, void *param)
+{
+ struct iocb *ios[1] = { iocb };
+
+ io_prep_preadv(iocb, fd, iov, iovcnt, offset);
+ io_set_eventfd(iocb, ev);
+ iocb->data = param;
+
+ return io_submit(ctx, 1, ios);
+}
--
1.7.7.1
* [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
This patch hooks AIO support into virtio-blk, allowing for faster IO.
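For clarity, a hedged sketch of the consumer-side contract after this patch
(virtio-blk is the consumer; req is its per-request object):

	/* With disk->async set, disk_image__read()/__write() only submit the
	 * request. The callback registered via disk_image__set_callback() runs
	 * later, from the per-disk completion thread, with the submission param
	 * and the byte count (negative on error). */
	disk_image__set_callback(bdev->disk, virtio_blk_complete);

	disk_image__read(bdev->disk, req_hdr->sector, iov + 1, in + out - 2, req);
	/* ...virtio_blk_complete(req, len) fires when the AIO actually finishes */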
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
tools/kvm/disk/core.c | 35 +++++++++++++++++++++++++++++++++--
tools/kvm/disk/qcow.c | 4 ++++
tools/kvm/disk/raw.c | 19 +++++++++++++++----
tools/kvm/include/kvm/disk-image.h | 3 +++
tools/kvm/virtio/blk.c | 11 +++++------
5 files changed, 60 insertions(+), 12 deletions(-)
diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 557a1a2..17fe548 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -1,8 +1,30 @@
#include "kvm/disk-image.h"
#include "kvm/qcow.h"
+#include "kvm/virtio-blk.h"
+
+#include <sys/eventfd.h>
+#include <sys/poll.h>
+
+#define AIO_MAX 32
int debug_iodelay;
+static void *disk_image__thread(void *param)
+{
+ struct disk_image *disk = param;
+ u64 dummy;
+
+ while (read(disk->evt, &dummy, sizeof(dummy)) > 0) {
+ struct io_event event;
+ struct timespec notime = {0};
+
+ while (io_getevents(disk->ctx, 1, 1, &event, &notime) > 0)
+ disk->disk_req_cb(event.data, event.res);
+ }
+
+ return NULL;
+}
+
struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int use_mmap)
{
struct disk_image *disk;
@@ -26,6 +48,14 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
}
}
+ if (disk) {
+ pthread_t thread;
+
+ disk->evt = eventfd(0, 0);
+ io_setup(AIO_MAX, &disk->ctx);
+ if (pthread_create(&thread, NULL, disk_image__thread, disk) != 0)
+ die("Failed starting IO řthread");
+ }
return disk;
}
@@ -87,6 +117,7 @@ struct disk_image **disk_image__open_all(const char **filenames, bool *readonly,
goto error;
}
}
+
return disks;
error:
for (i = 0; i < count; i++)
@@ -151,7 +182,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
/* Do nothing */
}
- if (disk->disk_req_cb)
+ if (!disk->async && disk->disk_req_cb)
disk->disk_req_cb(param, total);
return total;
@@ -182,7 +213,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
/* Do nothing */
}
- if (disk->disk_req_cb)
+ if (!disk->async && disk->disk_req_cb)
disk->disk_req_cb(param, total);
return total;
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index f9598de..87385df 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -1145,6 +1145,8 @@ static struct disk_image *qcow2_probe(int fd, bool readonly)
if (!disk_image)
goto free_refcount_table;
+
+ disk_image->async = 1;
disk_image->priv = q;
return disk_image;
@@ -1276,6 +1278,8 @@ static struct disk_image *qcow1_probe(int fd, bool readonly)
if (!disk_image)
goto free_l1_table;
+
+ disk_image->async = 1;
disk_image->priv = q;
return disk_image;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index fa2e013..3612623 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,19 +1,25 @@
#include "kvm/disk-image.h"
+#include <libaio.h>
+
ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
int iovcount, void *param)
{
+ struct iocb iocb;
u64 offset = sector << SECTOR_SHIFT;
- return preadv_in_full(disk->fd, iov, iovcount, offset);
+ return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, offset,
+ disk->evt, param);
}
ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
int iovcount, void *param)
{
+ struct iocb iocb;
u64 offset = sector << SECTOR_SHIFT;
- return pwritev_in_full(disk->fd, iov, iovcount, offset);
+ return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, offset,
+ disk->evt, param);
}
ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
@@ -72,13 +78,13 @@ static struct disk_image_operations raw_image_regular_ops = {
struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
{
+ struct disk_image *disk;
if (readonly) {
/*
* Use mmap's MAP_PRIVATE to implement non-persistent write
* FIXME: This does not work on 32-bit host.
*/
- struct disk_image *disk;
struct disk_image_operations ro_ops = {
.read_sector = raw_image__read_sector_mmap,
.write_sector = raw_image__write_sector_mmap,
@@ -90,6 +96,8 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
ro_ops = raw_image_regular_ops;
ro_ops.write_sector = NULL;
disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_REGULAR);
+ if (disk)
+ disk->async = 1;
}
return disk;
@@ -97,6 +105,9 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
/*
* Use read/write instead of mmap
*/
- return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+ disk = disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+ if (disk)
+ disk->async = 1;
+ return disk;
}
}
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bcb3396..5a14eab 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -46,6 +46,9 @@ struct disk_image {
void *priv;
void *disk_req_cb_param;
void (*disk_req_cb)(void *param, long len);
+ bool async;
+ int evt;
+ io_context_t ctx;
};
struct disk_image *disk_image__open(const char *filename, bool readonly);
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index c167dc9..3d31228d 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -2,7 +2,6 @@
#include "kvm/virtio-pci-dev.h"
#include "kvm/disk-image.h"
-#include "kvm/virtio.h"
#include "kvm/mutex.h"
#include "kvm/util.h"
#include "kvm/kvm.h"
@@ -11,23 +10,23 @@
#include "kvm/ioeventfd.h"
#include "kvm/guest_compat.h"
#include "kvm/virtio-pci.h"
+#include "kvm/virtio.h"
#include <linux/virtio_ring.h>
#include <linux/virtio_blk.h>
-
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/types.h>
#include <pthread.h>
#define VIRTIO_BLK_MAX_DEV 4
-#define NUM_VIRT_QUEUES 1
-#define VIRTIO_BLK_QUEUE_SIZE 128
/*
* the header and status consume too entries
*/
#define DISK_SEG_MAX (VIRTIO_BLK_QUEUE_SIZE - 2)
+#define VIRTIO_BLK_QUEUE_SIZE 128
+#define NUM_VIRT_QUEUES 1
struct blk_dev_req {
struct list_head list;
@@ -84,8 +83,8 @@ void virtio_blk_complete(void *param, long len)
u8 *status;
/* status */
- status = req->iov[req->out + req->in - 1].iov_base;
- *status = (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+ status = req->iov[req->out + req->in - 1].iov_base;
+ *status = (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
mutex_lock(&bdev->mutex);
virt_queue__set_used_elem(req->vq, req->head, len);
--
1.7.7.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
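For context: the aio_preadv()/aio_pwritev() helpers that raw.c calls above are
introduced in PATCH 10/11, which is not quoted in full in this thread view.
Going only by the call sites - (ctx, iocb, fd, iov, iovcount, offset, eventfd,
param) - a minimal libaio-based sketch of such a helper could look like the
following; the real tools/kvm/read-write.c code may differ in naming, error
handling and return value:

#include <libaio.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Sketch of a vectored AIO read submission: prepare the iocb, attach the
 * completion eventfd and the caller's cookie, then hand the request to the
 * kernel. A write variant would only differ in using io_prep_pwritev().
 */
static ssize_t aio_preadv_sketch(io_context_t ctx, struct iocb *iocb, int fd,
				 const struct iovec *iov, int iovcnt,
				 off_t offset, int ev, void *param)
{
	struct iocb *ios[1] = { iocb };

	io_prep_preadv(iocb, fd, iov, iovcnt, offset);
	io_set_eventfd(iocb, ev);	/* kernel signals 'ev' when the op completes */
	iocb->data = param;		/* comes back as event.data from io_getevents() */

	return io_submit(ctx, 1, ios) == 1 ? 1 : -1;
}

Completions are then reaped by disk_image__thread() in the core.c hunk above:
the eventfd read wakes the thread, io_getevents() hands back event.data (the
per-request cookie) and event.res (the byte count or a negative error), and
both are forwarded to disk->disk_req_cb(), i.e. virtio_blk_complete().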
* Re: [PATCH 08/11] kvm tools: Split io request from completion
2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
@ 2011-10-31 10:42 ` Asias He
0 siblings, 0 replies; 18+ messages in thread
From: Asias He @ 2011-10-31 10:42 UTC (permalink / raw)
To: Sasha Levin; +Cc: penberg, kvm, mingo, gorcunov
On 10/31/2011 01:35 AM, Sasha Levin wrote:
> This patch splits IO request processing from completion notification.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
> tools/kvm/disk/core.c | 6 +-
> tools/kvm/include/kvm/disk-image.h | 4 +-
> tools/kvm/include/kvm/virtio-blk.h | 1 +
> tools/kvm/virtio/blk.c | 109 ++++++++++++++++++++++++------------
> 4 files changed, 79 insertions(+), 41 deletions(-)
>
> diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
> index 842fef4..557a1a2 100644
> --- a/tools/kvm/disk/core.c
> +++ b/tools/kvm/disk/core.c
> @@ -152,7 +152,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
> }
>
> if (disk->disk_req_cb)
> - disk->disk_req_cb(param);
> + disk->disk_req_cb(param, total);
>
> return total;
> }
> @@ -183,7 +183,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
> }
>
> if (disk->disk_req_cb)
> - disk->disk_req_cb(param);
> + disk->disk_req_cb(param, total);
>
> return total;
> }
> @@ -199,7 +199,7 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
> return *len;
> }
>
> -void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
> +void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len))
> {
> disk->disk_req_cb = disk_req_cb;
> }
> diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
> index bae9744..bcb3396 100644
> --- a/tools/kvm/include/kvm/disk-image.h
> +++ b/tools/kvm/include/kvm/disk-image.h
> @@ -45,7 +45,7 @@ struct disk_image {
> struct disk_image_operations ops;
> void *priv;
> void *disk_req_cb_param;
> - void (*disk_req_cb)(void *param);
> + void (*disk_req_cb)(void *param, long len);
> };
>
> struct disk_image *disk_image__open(const char *filename, bool readonly);
> @@ -72,5 +72,5 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
> ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
> const struct iovec *iov, int iovcount, void *param);
> int raw_image__close(struct disk_image *disk);
> -void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
> +void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len));
> #endif /* KVM__DISK_IMAGE_H */
> diff --git a/tools/kvm/include/kvm/virtio-blk.h b/tools/kvm/include/kvm/virtio-blk.h
> index 8c4fb91..63731a9 100644
> --- a/tools/kvm/include/kvm/virtio-blk.h
> +++ b/tools/kvm/include/kvm/virtio-blk.h
> @@ -8,5 +8,6 @@ struct kvm;
> void virtio_blk__init(struct kvm *kvm, struct disk_image *disk);
> void virtio_blk__init_all(struct kvm *kvm);
> void virtio_blk__delete_all(struct kvm *kvm);
> +void virtio_blk_complete(void *param, long len);
>
> #endif /* KVM__BLK_VIRTIO_H */
> diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
> index 9db293e..0688f82 100644
> --- a/tools/kvm/virtio/blk.c
> +++ b/tools/kvm/virtio/blk.c
> @@ -29,17 +29,21 @@
> */
> #define DISK_SEG_MAX (VIRTIO_BLK_QUEUE_SIZE - 2)
>
> -struct blk_dev_job {
> +struct blk_dev_req {
> + struct list_head list;
> struct virt_queue *vq;
> struct blk_dev *bdev;
> struct iovec iov[VIRTIO_BLK_QUEUE_SIZE];
> u16 out, in, head;
> - struct thread_pool__job job_id;
> + struct kvm *kvm;
> };
>
> struct blk_dev {
> pthread_mutex_t mutex;
> + pthread_mutex_t req_mutex;
> +
> struct list_head list;
> + struct list_head req_list;
>
> struct virtio_pci vpci;
> struct virtio_blk_config blk_config;
> @@ -47,41 +51,76 @@ struct blk_dev {
> u32 features;
>
> struct virt_queue vqs[NUM_VIRT_QUEUES];
> - struct blk_dev_job jobs[VIRTIO_BLK_QUEUE_SIZE];
> - u16 job_idx;
> + struct blk_dev_req reqs[VIRTIO_BLK_QUEUE_SIZE];
> };
>
> static LIST_HEAD(bdevs);
> static int compat_id;
>
> -static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
> +static struct blk_dev_req *virtio_blk_req_pop(struct blk_dev *bdev)
> +{
> + struct blk_dev_req *req;
> +
> + mutex_lock(&bdev->req_mutex);
> + req = list_first_entry(&bdev->req_list, struct blk_dev_req, list);
> + list_del_init(&req->list);
> + mutex_unlock(&bdev->req_mutex);
> +
> + return req;
> +}
> +
> +static void virtio_blk_req_push(struct blk_dev *bdev, struct blk_dev_req *req)
> {
> - struct virtio_blk_outhdr *req;
> + mutex_lock(&bdev->req_mutex);
> + list_add(&req->list, &bdev->req_list);
> + mutex_unlock(&bdev->req_mutex);
> +}
> +
> +void virtio_blk_complete(void *param, long len)
> +{
> + struct blk_dev_req *req = param;
> + struct blk_dev *bdev = req->bdev;
> + int queueid = req->vq - bdev->vqs;
> u8 *status;
> +
> + /* status */
> + status = req->iov[req->out + req->in - 1].iov_base;
> + *status = (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> +
> + mutex_lock(&bdev->mutex);
> + virt_queue__set_used_elem(req->vq, req->head, len);
> + mutex_unlock(&bdev->mutex);
> +
> + virtio_pci__signal_vq(req->kvm, &bdev->vpci, queueid);
> +
> + virtio_blk_req_push(req->bdev, req);
> +}
> +
> +static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
> +{
> + struct virtio_blk_outhdr *req_hdr;
> ssize_t block_cnt;
> - struct blk_dev_job *job;
> struct blk_dev *bdev;
> struct virt_queue *queue;
> struct iovec *iov;
> u16 out, in, head;
>
> block_cnt = -1;
> - job = param;
> - bdev = job->bdev;
> - queue = job->vq;
> - iov = job->iov;
> - out = job->out;
> - in = job->in;
> - head = job->head;
> - req = iov[0].iov_base;
> -
> - switch (req->type) {
> + bdev = req->bdev;
> + queue = req->vq;
> + iov = req->iov;
> + out = req->out;
> + in = req->in;
> + head = req->head;
Hi, Sasha
Please drop these queue and head local variables.
virtio/blk.c: In function ‘virtio_blk_do_io_request’:
virtio/blk.c:106:15: error: variable ‘head’ set but not used
[-Werror=unused-but-set-variable]
virtio/blk.c:104:21: error: variable ‘queue’ set but not used
[-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
> + req_hdr = iov[0].iov_base;
> +
> + switch (req_hdr->type) {
> case VIRTIO_BLK_T_IN:
> - block_cnt = disk_image__read(bdev->disk, req->sector, iov + 1,
> + block_cnt = disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
> in + out - 2, NULL);
> break;
> case VIRTIO_BLK_T_OUT:
> - block_cnt = disk_image__write(bdev->disk, req->sector, iov + 1,
> + block_cnt = disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
> in + out - 2, NULL);
> break;
> case VIRTIO_BLK_T_FLUSH:
> @@ -92,35 +131,27 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
> disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
> break;
> default:
> - pr_warning("request type %d", req->type);
> + pr_warning("request type %d", req_hdr->type);
> block_cnt = -1;
> break;
> }
>
> - /* status */
> - status = iov[out + in - 1].iov_base;
> - *status = (block_cnt < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> -
> - mutex_lock(&bdev->mutex);
> - virt_queue__set_used_elem(queue, head, block_cnt);
> - mutex_unlock(&bdev->mutex);
> -
> - virtio_pci__signal_vq(kvm, &bdev->vpci, queue - bdev->vqs);
> + virtio_blk_complete(req, block_cnt);
> }
>
> static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
> {
> while (virt_queue__available(vq)) {
> - struct blk_dev_job *job = &bdev->jobs[bdev->job_idx++ % VIRTIO_BLK_QUEUE_SIZE];
> + struct blk_dev_req *req = virtio_blk_req_pop(bdev);
>
> - *job = (struct blk_dev_job) {
> - .vq = vq,
> - .bdev = bdev,
> + *req = (struct blk_dev_req) {
> + .vq = vq,
> + .bdev = bdev,
> + .kvm = kvm,
> };
> - job->head = virt_queue__get_iov(vq, job->iov, &job->out, &job->in, kvm);
> + req->head = virt_queue__get_iov(vq, req->iov, &req->out, &req->in, kvm);
>
> - thread_pool__init_job(&job->job_id, kvm, virtio_blk_do_io_request, job);
> - thread_pool__do_job(&job->job_id);
> + virtio_blk_do_io_request(kvm, req);
> }
> }
>
> @@ -191,6 +222,7 @@ static int get_size_vq(struct kvm *kvm, void *dev, u32 vq)
> void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
> {
> struct blk_dev *bdev;
> + size_t i;
>
> if (!disk)
> return;
> @@ -201,6 +233,7 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
>
> *bdev = (struct blk_dev) {
> .mutex = PTHREAD_MUTEX_INITIALIZER,
> + .req_mutex = PTHREAD_MUTEX_INITIALIZER,
> .disk = disk,
> .blk_config = (struct virtio_blk_config) {
> .capacity = disk->size / SECTOR_SIZE,
> @@ -222,6 +255,10 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
>
> list_add_tail(&bdev->list, &bdevs);
>
> + INIT_LIST_HEAD(&bdev->req_list);
> + for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
> + list_add(&bdev->reqs[i].list, &bdev->req_list);
> +
> if (compat_id != -1)
> compat_id = compat__add_message("virtio-blk device was not detected",
> "While you have requested a virtio-blk device, "
--
Asias He
^ permalink raw reply [flat|nested] 18+ messages in thread
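The fix being asked for here is mechanical: once completion moves into
virtio_blk_complete(), which derives the queue index and head from the request
itself, the queue and head locals in virtio_blk_do_io_request() are written
but never read, so they can simply go. A sketch of just the affected opening
of the function (the switch on req_hdr->type and the final
virtio_blk_complete() call stay exactly as in the quoted patch):

static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
{
	struct virtio_blk_outhdr *req_hdr;
	ssize_t block_cnt;
	struct blk_dev *bdev;
	struct iovec *iov;
	u16 out, in;

	block_cnt	= -1;
	bdev		= req->bdev;
	iov		= req->iov;
	out		= req->out;
	in		= req->in;
	req_hdr		= iov[0].iov_base;

	/* ... switch (req_hdr->type) exactly as in the patch above ... */

	virtio_blk_complete(req, block_cnt);
}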
* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
@ 2011-11-01 6:44 ` Pekka Enberg
2011-11-01 11:16 ` Sasha Levin
0 siblings, 1 reply; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01 6:44 UTC (permalink / raw)
To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov
On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> This makes passing different ops structures easier since you don't have
> to keep them somewhere else after initializing disk_image.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Why do we want to do this? Why would you ever want to allocate ops via
malloc() or on the stack?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
2011-11-01 6:44 ` Pekka Enberg
@ 2011-11-01 11:16 ` Sasha Levin
2011-11-01 11:30 ` Pekka Enberg
0 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-11-01 11:16 UTC (permalink / raw)
To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov
On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
> On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > This makes passing different ops structures easier since you don't have
> > to keep them somewhere else after initializing disk_image.
> >
> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>
> Why do we want to do this? Why would you ever want to allocate ops via
> malloc() or on the stack?
It's mostly there to avoid having to either keep ops in a global struct
or to malloc() them - instead, it just holds them as part of disk_image.
It's useful with how we handle read-only ops now; you can see how we use
a RO image now (private mmap() with a fallback to AIO with no write op).
--
Sasha.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
2011-11-01 11:16 ` Sasha Levin
@ 2011-11-01 11:30 ` Pekka Enberg
2011-11-01 11:43 ` Sasha Levin
0 siblings, 1 reply; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01 11:30 UTC (permalink / raw)
To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov
Hi Sasha,
On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> > This makes passing different ops structures easier since you don't have
>> > to keep them somewhere else after initializing disk_image.
>> >
>> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
>> Why do we want to do this? Why would you ever want to allocate ops via
>> malloc() or on the stack?
On Tue, Nov 1, 2011 at 1:16 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> It's mostly there to avoid having to either keep ops in a global struct
> or to malloc() them - instead, it just holds them as part of disk_image.
>
> It's useful with how we handle read-only ops now; you can see how we use
> a RO image now (private mmap() with a fallback to AIO with no write op).
It's not useful, it's obfuscation. The whole point of an "operations"
struct is to provide an immutable and stateless virtual function
dispatch table. It's absolutely pointless to use malloc() to hold them,
and it seems to miss the whole point of keeping data and function pointers
separate.
Pekka
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
2011-11-01 11:30 ` Pekka Enberg
@ 2011-11-01 11:43 ` Sasha Levin
2011-11-01 11:52 ` Pekka Enberg
0 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-11-01 11:43 UTC (permalink / raw)
To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov
On Tue, 2011-11-01 at 13:30 +0200, Pekka Enberg wrote:
> Hi Sasha,
>
> On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> >> > This makes passing different ops structures easier since you don't have
> >> > to keep them somewhere else after initializing disk_image.
> >> >
> >> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>
> On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
> >> Why do we want to do this? Why would you ever want to allocate ops via
> >> malloc() or on the stack?
>
> On Tue, Nov 1, 2011 at 1:16 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > It's mostly there to avoid having to either keep ops in a global struct
> > or to malloc() them - instead, it just holds them as part of disk_image.
> >
> > It's useful with how we handle read-only ops now; you can see how we use
> > a RO image now (private mmap() with a fallback to AIO with no write op).
>
> It's not useful, it's obfuscation. The whole point of an "operations"
> struct is to provide an immutable and stateless virtual function
> dispatch table. It's absolutely pointless to use malloc() to hold them,
> and it seems to miss the whole point of keeping data and function pointers
> separate.
It does provide a stateless function dispatch table; the only difference
is how the function ptrs are stored in disk_image.
How would you implement the fallback we have in RO images now?
Declare all those ops as global variables?
--
Sasha.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
2011-11-01 11:43 ` Sasha Levin
@ 2011-11-01 11:52 ` Pekka Enberg
0 siblings, 0 replies; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01 11:52 UTC (permalink / raw)
To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov
On Tue, Nov 1, 2011 at 1:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> How would you implement the fallback we have in RO images now?
> Declare all those ops as global variables?
Yes.
^ permalink raw reply [flat|nested] 18+ messages in thread
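For concreteness, the alternative being argued for here keeps every ops table
as immutable file-scope data and only ever stores a pointer to the chosen
table in disk_image, rather than a copy. A hypothetical sketch of the raw
read-only case under that scheme (names and the simplified fallback are
illustrative only, not the code as merged):

/* Both dispatch tables are immutable, file-scope data. */
static struct disk_image_operations raw_image_regular_ops = {
	.read_sector	= raw_image__read_sector,
	.write_sector	= raw_image__write_sector,
};

static struct disk_image_operations raw_image_ro_ops = {
	.read_sector	= raw_image__read_sector,
	.write_sector	= NULL,		/* writes are simply refused */
};

struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
{
	/*
	 * disk_image__new() would store only the pointer to the chosen table,
	 * so nothing is copied or malloc()ed and the tables stay shared.
	 */
	return disk_image__new(fd, st->st_size,
			       readonly ? &raw_image_ro_ops : &raw_image_regular_ops,
			       DISK_IMAGE_REGULAR);
}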
end of thread, other threads:[~2011-11-01 11:52 UTC | newest]
Thread overview: 18+ messages
2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
2011-10-30 17:35 ` [PATCH 01/11] kvm tools: Switch to using an enum for disk image types Sasha Levin
2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
2011-11-01 6:44 ` Pekka Enberg
2011-11-01 11:16 ` Sasha Levin
2011-11-01 11:30 ` Pekka Enberg
2011-11-01 11:43 ` Sasha Levin
2011-11-01 11:52 ` Pekka Enberg
2011-10-30 17:35 ` [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops Sasha Levin
2011-10-30 17:35 ` [PATCH 04/11] kvm tools: Modify disk ops usage Sasha Levin
2011-10-30 17:35 ` [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr Sasha Levin
2011-10-30 17:35 ` [PATCH 06/11] kvm tools: Add optional callback on disk op completion Sasha Levin
2011-10-30 17:35 ` [PATCH 07/11] kvm tools: Remove qcow nowrite function Sasha Levin
2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
2011-10-31 10:42 ` Asias He
2011-10-30 17:35 ` [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion Sasha Levin
2011-10-30 17:35 ` [PATCH 10/11] kvm tools: Add aio read write functions Sasha Levin
2011-10-30 17:35 ` [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk Sasha Levin