* [PATCH 00/11] kvm tools: Support native vectored AIO
@ 2011-10-30 17:35 Sasha Levin
  2011-10-30 17:35 ` [PATCH 01/11] kvm tools: Switch to using an enum for disk image types Sasha Levin
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch set adds support for native vectored AIO ops. Instead of queueing
I/O ops on our own thread pool, we now submit them to the kernel.
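
For readers unfamiliar with the kernel-native interface, here is a minimal,
self-contained sketch of the mechanism the series builds on: vectored
submission through libaio with completion notification over an eventfd. The
file name, buffer sizes and missing error handling are illustrative only; the
actual wiring lives in the patches below. Build with -laio (the same
dependency patch 10 adds to the Makefile).

#include <libaio.h>
#include <sys/eventfd.h>
#include <sys/uio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	io_context_t ctx = 0;
	struct iocb iocb, *ios[1] = { &iocb };
	struct io_event event;
	struct iovec iov[2];
	char a[512], b[512];
	uint64_t done;
	int fd, evt;

	fd  = open("disk.img", O_RDONLY);	/* illustrative path */
	evt = eventfd(0, 0);
	io_setup(32, &ctx);			/* allow up to 32 in-flight requests */

	iov[0] = (struct iovec) { .iov_base = a, .iov_len = sizeof(a) };
	iov[1] = (struct iovec) { .iov_base = b, .iov_len = sizeof(b) };

	/* Vectored read of both buffers starting at offset 0, one submission */
	io_prep_preadv(&iocb, fd, iov, 2, 0);
	io_set_eventfd(&iocb, evt);		/* kernel signals evt on completion */
	iocb.data = iov;			/* per-request cookie */
	io_submit(ctx, 1, ios);

	/* Completion side: block on the eventfd, then reap the finished event */
	read(evt, &done, sizeof(done));
	io_getevents(ctx, 1, 1, &event, NULL);
	printf("cookie=%p bytes=%ld\n", event.data, (long)event.res);

	io_destroy(ctx);
	close(evt);
	close(fd);
	return 0;
}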

Numbers - short story: Overall increase of at least 10%, with sequential
ops benefiting the most.

Numbers - long story:

Before:
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   2     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
512              1G   550  97 26638   7 25050   4  2942  91 339188  35  1711  68
Latency             40602us   12269ms    2755ms   27333us     119ms   77816us
Version  1.96       ------Sequential Create------ --------Random Create--------
512                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 12172  22 +++++ +++ 23771  34 20209  36 +++++ +++ 22181  33
Latency               153ms     566us     571us     204us      56us     624us
1.96,1.96,512,2,1319996851,1G,,550,97,26638,7,25050,4,2942,91,339188,35,1711,68,16,,,,,12172,22,+++++,+++,23771,34,20209,36,+++++,+++,22181,33,40602us,12269ms,2755ms,27333us,119ms,77816us,153ms,566us,571us,204us,56us,624us

After:
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   2     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
tux            512M   565  96 49307  19 33464   6  3270  99 923539  57  2599  64
Latency             15285us     240ms    2496ms    7478us    5586us    2878ms
Version  1.96       ------Sequential Create------ --------Random Create--------
tux                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 26992  48 +++++ +++ 27018  38 25618  45 +++++ +++ 24272  34
Latency              1145us    1311us    1708us     408us     186us     283us
1.96,1.96,tux,2,1319997274,512M,,565,96,49307,19,33464,6,3270,99,923539,57,2599,64,16,,,,,26992,48,+++++,+++,27018,38,25618,45,+++++,+++,24272,34,15285us,240ms,2496ms,7478us,5586us,2878ms,1145us,1311us,1708us,408us,186us,283us

I'd be happy if someone could run more benchmarks, as these look very
optimistic, but I kept getting similar results over and over.


Sasha Levin (11):
  kvm tools: Switch to using an enum for disk image types
  kvm tools: Hold a copy of ops struct inside disk_image
  kvm tools: Remove the non-iov interface from disk image ops
  kvm tools: Modify disk ops usage
  kvm tools: Modify behaviour on missing ops ptr
  kvm tools: Add optional callback on disk op completion
  kvm tools: Remove qcow nowrite function
  kvm tools: Split io request from completion
  kvm tools: Hook virtio-blk completion to disk op completion
  kvm tools: Add aio read write functions
  kvm tools: Use native vectored AIO in virtio-blk

 tools/kvm/Makefile                 |    1 +
 tools/kvm/disk/core.c              |  116 +++++++++++++++++++---------------
 tools/kvm/disk/qcow.c              |   58 ++++++++++++++---
 tools/kvm/disk/raw.c               |   87 +++++++++++++++++--------
 tools/kvm/include/kvm/disk-image.h |   50 ++++++++-------
 tools/kvm/include/kvm/read-write.h |    6 ++
 tools/kvm/include/kvm/virtio-blk.h |    1 +
 tools/kvm/read-write.c             |   24 +++++++
 tools/kvm/virtio/blk.c             |  122 ++++++++++++++++++++++++------------
 9 files changed, 312 insertions(+), 153 deletions(-)

-- 
1.7.7.1



* [PATCH 01/11] kvm tools: Switch to using an enum for disk image types
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/qcow.c              |    8 ++++----
 tools/kvm/disk/raw.c               |    2 +-
 tools/kvm/include/kvm/disk-image.h |    7 +++++--
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index 167e97e..bdf6bd8 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -1107,9 +1107,9 @@ static struct disk_image *qcow2_probe(int fd, bool readonly)
 	 * Do not use mmap use read/write instead
 	 */
 	if (readonly)
-		disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_NOMMAP);
+		disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_REGULAR);
 	else
-		disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_NOMMAP);
+		disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_REGULAR);
 
 	if (!disk_image)
 		goto free_refcount_table;
@@ -1238,9 +1238,9 @@ static struct disk_image *qcow1_probe(int fd, bool readonly)
 	 * Do not use mmap use read/write instead
 	 */
 	if (readonly)
-		disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_NOMMAP);
+		disk_image = disk_image__new(fd, h->size, &qcow_disk_readonly_ops, DISK_IMAGE_REGULAR);
 	else
-		disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_NOMMAP);
+		disk_image = disk_image__new(fd, h->size, &qcow_disk_ops, DISK_IMAGE_REGULAR);
 
 	if (!disk_image)
 		goto free_l1_table;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 7f3f8db..66c52be 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -78,5 +78,5 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 		/*
 		 * Use read/write instead of mmap
 		 */
-		return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_NOMMAP);
+		return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_REGULAR);
 }
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 75b54f9..6acde53 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -21,8 +21,11 @@
 #define SECTOR_SHIFT		9
 #define SECTOR_SIZE		(1UL << SECTOR_SHIFT)
 
-#define DISK_IMAGE_MMAP		0
-#define DISK_IMAGE_NOMMAP	1
+enum {
+	DISK_IMAGE_REGULAR,
+	DISK_IMAGE_MMAP,
+};
+
 #define MAX_DISK_IMAGES         4
 
 struct disk_image;
-- 
1.7.7.1



* [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
  2011-10-30 17:35 ` [PATCH 01/11] kvm tools: Switch to using an enum for disk image types Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-11-01  6:44   ` Pekka Enberg
  2011-10-30 17:35 ` [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops Sasha Levin
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This makes passing different ops structures easier, since they no longer have
to be kept alive elsewhere after disk_image has been initialized.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c              |   26 +++++++++++++-------------
 tools/kvm/include/kvm/disk-image.h |    2 +-
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 20e1990..a914fef7 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -13,7 +13,7 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
 
 	disk->fd	= fd;
 	disk->size	= size;
-	disk->ops	= ops;
+	disk->ops	= *ops;
 
 	if (use_mmap == DISK_IMAGE_MMAP) {
 		/*
@@ -96,8 +96,8 @@ error:
 
 int disk_image__flush(struct disk_image *disk)
 {
-	if (disk->ops->flush)
-		return disk->ops->flush(disk);
+	if (disk->ops.flush)
+		return disk->ops.flush(disk);
 
 	return fsync(disk->fd);
 }
@@ -108,8 +108,8 @@ int disk_image__close(struct disk_image *disk)
 	if (!disk)
 		return 0;
 
-	if (disk->ops->close)
-		return disk->ops->close(disk);
+	if (disk->ops.close)
+		return disk->ops.close(disk);
 
 	if (close(disk->fd) < 0)
 		pr_warning("close() failed");
@@ -139,21 +139,21 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 	if (debug_iodelay)
 		msleep(debug_iodelay);
 
-	if (disk->ops->read_sector_iov) {
+	if (disk->ops.read_sector_iov) {
 		/*
 		 * Try mulitple buffer based operation first
 		 */
-		total		= disk->ops->read_sector_iov(disk, sector, iov, iovcount);
+		total		= disk->ops.read_sector_iov(disk, sector, iov, iovcount);
 		if (total < 0) {
 			pr_info("disk_image__read error: total=%ld\n", (long)total);
 			return -1;
 		}
-	} else if (disk->ops->read_sector) {
+	} else if (disk->ops.read_sector) {
 		/*
 		 * Fallback to single buffer based operation
 		 */
 		while (iovcount--) {
-			nr 	= disk->ops->read_sector(disk, sector, iov->iov_base, iov->iov_len);
+			nr 	= disk->ops.read_sector(disk, sector, iov->iov_base, iov->iov_len);
 			if (nr != (ssize_t)iov->iov_len) {
 				pr_info("disk_image__read error: nr = %ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
 				return -1;
@@ -180,21 +180,21 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 	if (debug_iodelay)
 		msleep(debug_iodelay);
 
-	if (disk->ops->write_sector_iov) {
+	if (disk->ops.write_sector_iov) {
 		/*
 		 * Try writev based operation first
 		 */
-		total = disk->ops->write_sector_iov(disk, sector, iov, iovcount);
+		total = disk->ops.write_sector_iov(disk, sector, iov, iovcount);
 		if (total < 0) {
 			pr_info("disk_image__write error: total=%ld\n", (long)total);
 			return -1;
 		}
-	} else if (disk->ops->write_sector) {
+	} else if (disk->ops.write_sector) {
 		/*
 		 * Fallback to single buffer based operation
 		 */
 		while (iovcount--) {
-			nr	 = disk->ops->write_sector(disk, sector, iov->iov_base, iov->iov_len);
+			nr	 = disk->ops.write_sector(disk, sector, iov->iov_base, iov->iov_len);
 			if (nr != (ssize_t)iov->iov_len) {
 				pr_info("disk_image__write error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
 				return -1;
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 6acde53..bb97cdf 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -50,7 +50,7 @@ struct disk_image_operations {
 struct disk_image {
 	int				fd;
 	u64				size;
-	struct disk_image_operations	*ops;
+	struct disk_image_operations	ops;
 	void				*priv;
 };
 
-- 
1.7.7.1



* [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
  2011-10-30 17:35 ` [PATCH 01/11] kvm tools: Switch to using an enum for disk image types Sasha Levin
  2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 04/11] kvm tools: Modify disk ops usage Sasha Levin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c              |   48 ++++++----------------------------
 tools/kvm/disk/qcow.c              |   47 +++++++++++++++++++++++++++++++--
 tools/kvm/disk/raw.c               |   50 +++++++++++++++++++++---------------
 tools/kvm/include/kvm/disk-image.h |   26 +++++++-----------
 4 files changed, 91 insertions(+), 80 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index a914fef7..38aa99b 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -134,36 +134,19 @@ void disk_image__close_all(struct disk_image **disks, int count)
 ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	ssize_t total = 0;
-	ssize_t nr;
 
 	if (debug_iodelay)
 		msleep(debug_iodelay);
 
-	if (disk->ops.read_sector_iov) {
-		/*
-		 * Try mulitple buffer based operation first
-		 */
-		total		= disk->ops.read_sector_iov(disk, sector, iov, iovcount);
+	if (disk->ops.read_sector) {
+		total = disk->ops.read_sector(disk, sector, iov, iovcount);
 		if (total < 0) {
 			pr_info("disk_image__read error: total=%ld\n", (long)total);
 			return -1;
 		}
-	} else if (disk->ops.read_sector) {
-		/*
-		 * Fallback to single buffer based operation
-		 */
-		while (iovcount--) {
-			nr 	= disk->ops.read_sector(disk, sector, iov->iov_base, iov->iov_len);
-			if (nr != (ssize_t)iov->iov_len) {
-				pr_info("disk_image__read error: nr = %ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
-				return -1;
-			}
-			sector	+= iov->iov_len >> SECTOR_SHIFT;
-			iov++;
-			total 	+= nr;
-		}
-	} else
+	} else {
 		die("No disk image operation for read\n");
+	}
 
 	return total;
 }
@@ -175,37 +158,22 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	ssize_t total = 0;
-	ssize_t nr;
 
 	if (debug_iodelay)
 		msleep(debug_iodelay);
 
-	if (disk->ops.write_sector_iov) {
+	if (disk->ops.write_sector) {
 		/*
 		 * Try writev based operation first
 		 */
-		total = disk->ops.write_sector_iov(disk, sector, iov, iovcount);
+		total = disk->ops.write_sector(disk, sector, iov, iovcount);
 		if (total < 0) {
 			pr_info("disk_image__write error: total=%ld\n", (long)total);
 			return -1;
 		}
-	} else if (disk->ops.write_sector) {
-		/*
-		 * Fallback to single buffer based operation
-		 */
-		while (iovcount--) {
-			nr	 = disk->ops.write_sector(disk, sector, iov->iov_base, iov->iov_len);
-			if (nr != (ssize_t)iov->iov_len) {
-				pr_info("disk_image__write error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
-				return -1;
-			}
-
-			sector	+= iov->iov_len >> SECTOR_SHIFT;
-			iov++;
-			total	+= nr;
-		}
-	} else
+	} else {
 		die("No disk image operation for read\n");
+	}
 
 	return total;
 }
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index bdf6bd8..288782e 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -452,7 +452,7 @@ out_error:
 	return -1;
 }
 
-static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
+static ssize_t qcow_read_sector_single(struct disk_image *disk, u64 sector,
 	void *dst, u32 dst_len)
 {
 	struct qcow *q = disk->priv;
@@ -488,6 +488,26 @@ static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
 	return dst_len;
 }
 
+static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount)
+{
+	ssize_t nr, total = 0;
+
+	while (iovcount--) {
+		nr = qcow_read_sector_single(disk, sector, iov->iov_base, iov->iov_len);
+		if (nr != (ssize_t)iov->iov_len) {
+			pr_info("qcow_read_sector error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
+			return -1;
+		}
+
+		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		iov++;
+		total	+= nr;
+	}
+
+	return total;
+}
+
 static inline u64 file_size(int fd)
 {
 	struct stat st;
@@ -867,7 +887,7 @@ error:
 	return -1;
 }
 
-static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+static ssize_t qcow_write_sector_single(struct disk_image *disk, u64 sector, void *src, u32 src_len)
 {
 	struct qcow *q = disk->priv;
 	struct qcow_header *header = q->header;
@@ -896,7 +916,28 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector, void *src,
 	return nr_written;
 }
 
-static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount)
+{
+	ssize_t nr, total = 0;
+
+	while (iovcount--) {
+		nr = qcow_write_sector_single(disk, sector, iov->iov_base, iov->iov_len);
+		if (nr != (ssize_t)iov->iov_len) {
+			pr_info("qcow_write_sector error: nr=%ld iov_len=%ld\n", (long)nr, (long)iov->iov_len);
+			return -1;
+		}
+
+		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		iov++;
+		total	+= nr;
+	}
+
+	return total;
+}
+
+static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount)
 {
 	/* I/O error */
 	pr_info("%s: no write support\n", __func__);
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 66c52be..1c6a985 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,41 +1,49 @@
 #include "kvm/disk-image.h"
 
-ssize_t raw_image__read_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 
 	return preadv_in_full(disk->fd, iov, iovcount, offset);
 }
 
-ssize_t raw_image__write_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 
 	return pwritev_in_full(disk->fd, iov, iovcount, offset);
 }
 
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, void *dst, u32 dst_len)
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
+	ssize_t nr, total = 0;
 
-	if (offset + dst_len > disk->size)
-		return -1;
+	while (iovcount--) {
+		memcpy(iov->iov_base, disk->priv + offset, iov->iov_len);
 
-	memcpy(dst, disk->priv + offset, dst_len);
+		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		iov++;
+		total	+= nr;
+	}
 
-	return dst_len;
+	return total;
 }
 
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len)
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
+	ssize_t nr, total = 0;
 
-	if (offset + src_len > disk->size)
-		return -1;
+	while (iovcount--) {
+		memcpy(disk->priv + offset, iov->iov_base, iov->iov_len);
 
-	memcpy(disk->priv + offset, src, src_len);
+		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		iov++;
+		total	+= nr;
+	}
 
-	return src_len;
+	return total;
 }
 
 int raw_image__close(struct disk_image *disk)
@@ -51,17 +59,17 @@ int raw_image__close(struct disk_image *disk)
 /*
  * multiple buffer based disk image operations
  */
-static struct disk_image_operations raw_image_iov_ops = {
-	.read_sector_iov	= raw_image__read_sector_iov,
-	.write_sector_iov	= raw_image__write_sector_iov,
+static struct disk_image_operations raw_image_regular_ops = {
+	.read_sector		= raw_image__read_sector,
+	.write_sector		= raw_image__write_sector,
 };
 
 /*
- * single buffer based disk image operations
+ * mmap() based disk image operations
  */
-static struct disk_image_operations raw_image_ops = {
-	.read_sector		= raw_image__read_sector,
-	.write_sector		= raw_image__write_sector,
+static struct disk_image_operations raw_image_mmap_ops = {
+	.read_sector		= raw_image__read_sector_mmap,
+	.write_sector		= raw_image__write_sector_mmap,
 	.close			= raw_image__close,
 };
 
@@ -73,10 +81,10 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 		 * Use mmap's MAP_PRIVATE to implement non-persistent write
 		 * FIXME: This does not work on 32-bit host.
 		 */
-		return disk_image__new(fd, st->st_size, &raw_image_ops, DISK_IMAGE_MMAP);
+		return disk_image__new(fd, st->st_size, &raw_image_mmap_ops, DISK_IMAGE_MMAP);
 	else
 		/*
 		 * Use read/write instead of mmap
 		 */
-		return disk_image__new(fd, st->st_size, &raw_image_iov_ops, DISK_IMAGE_REGULAR);
+		return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
 }
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bb97cdf..5087b9e 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -31,18 +31,8 @@ enum {
 struct disk_image;
 
 struct disk_image_operations {
-	/*
-	 * The following two are used for reading or writing with a single buffer.
-	 * The implentation can use readv/writev or memcpy.
-	 */
-	ssize_t (*read_sector)(struct disk_image *disk, u64 sector, void *dst, u32 dst_len);
-	ssize_t (*write_sector)(struct disk_image *disk, u64 sector, void *src, u32 src_len);
-	/*
-	 * The following two are used for reading or writing with multiple buffers.
-	 * The implentation can use readv/writev or memcpy.
-	 */
-	ssize_t (*read_sector_iov)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-	ssize_t (*write_sector_iov)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+	ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+	ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
 	int (*flush)(struct disk_image *disk);
 	int (*close)(struct disk_image *disk);
 };
@@ -67,10 +57,14 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
 struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly);
 struct disk_image *blkdev__probe(const char *filename, struct stat *st);
 
-ssize_t raw_image__read_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t raw_image__write_sector_iov(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, void *dst, u32 dst_len);
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, void *src, u32 src_len);
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount);
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount);
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount);
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
+				const struct iovec *iov, int iovcount);
 int raw_image__close(struct disk_image *disk);
 
 #endif /* KVM__DISK_IMAGE_H */
-- 
1.7.7.1



* [PATCH 04/11] kvm tools: Modify disk ops usage
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (2 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr Sasha Levin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch modifies the definition and usage of ops for read-only, mmap and
regular I/O.

There is no longer a mix between iov and mmap ops, and read-only no longer
implies mmap (although mmap is still tried first).

This allows more flexibility in defining different ops for different
scenarios.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c |    6 ++++--
 tools/kvm/disk/raw.c  |   40 ++++++++++++++++++++++++----------------
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 38aa99b..6048871 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -20,8 +20,10 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
 		 * The write to disk image will be discarded
 		 */
 		disk->priv = mmap(NULL, size, PROT_RW, MAP_PRIVATE | MAP_NORESERVE, fd, 0);
-		if (disk->priv == MAP_FAILED)
-			die("mmap() failed");
+		if (disk->priv == MAP_FAILED) {
+			free(disk);
+			disk = NULL;
+		}
 	}
 
 	return disk;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 1c6a985..2ed1c07 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -17,14 +17,15 @@ ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struc
 ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
-	ssize_t nr, total = 0;
+	ssize_t total = 0;
 
 	while (iovcount--) {
 		memcpy(iov->iov_base, disk->priv + offset, iov->iov_len);
 
 		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		offset	+= iov->iov_len;
+		total	+= iov->iov_len;
 		iov++;
-		total	+= nr;
 	}
 
 	return total;
@@ -33,14 +34,15 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const s
 ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
 {
 	u64 offset = sector << SECTOR_SHIFT;
-	ssize_t nr, total = 0;
+	ssize_t total = 0;
 
 	while (iovcount--) {
 		memcpy(disk->priv + offset, iov->iov_base, iov->iov_len);
 
 		sector	+= iov->iov_len >> SECTOR_SHIFT;
+		offset	+= iov->iov_len;
+		total	+= iov->iov_len;
 		iov++;
-		total	+= nr;
 	}
 
 	return total;
@@ -64,27 +66,33 @@ static struct disk_image_operations raw_image_regular_ops = {
 	.write_sector		= raw_image__write_sector,
 };
 
-/*
- * mmap() based disk image operations
- */
-static struct disk_image_operations raw_image_mmap_ops = {
-	.read_sector		= raw_image__read_sector_mmap,
-	.write_sector		= raw_image__write_sector_mmap,
-	.close			= raw_image__close,
-};
-
 struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 {
 
-	if (readonly)
+	if (readonly) {
 		/*
 		 * Use mmap's MAP_PRIVATE to implement non-persistent write
 		 * FIXME: This does not work on 32-bit host.
 		 */
-		return disk_image__new(fd, st->st_size, &raw_image_mmap_ops, DISK_IMAGE_MMAP);
-	else
+		struct disk_image *disk;
+		struct disk_image_operations ro_ops = {
+			.read_sector	= raw_image__read_sector_mmap,
+			.write_sector	= raw_image__write_sector_mmap,
+			.close		= raw_image__close,
+		};
+
+		disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_MMAP);
+		if (disk == NULL) {
+			ro_ops = raw_image_regular_ops;
+			ro_ops.write_sector = NULL;
+			disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_REGULAR);
+		}
+
+		return disk;
+	} else {
 		/*
 		 * Use read/write instead of mmap
 		 */
 		return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+	}
 }
-- 
1.7.7.1



* [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (3 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 04/11] kvm tools: Modify disk ops usage Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 06/11] kvm tools: Add optional callback on disk op completion Sasha Levin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

If a read or write op pointer is missing, simply ignore the request instead of
failing critically. This provides an easier way to prevent reads or writes in
specific scenarios.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 6048871..d76bd55 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -147,7 +147,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 			return -1;
 		}
 	} else {
-		die("No disk image operation for read\n");
+		/* Do nothing */
 	}
 
 	return total;
@@ -174,7 +174,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 			return -1;
 		}
 	} else {
-		die("No disk image operation for read\n");
+		/* Do nothing */
 	}
 
 	return total;
-- 
1.7.7.1



* [PATCH 06/11] kvm tools: Add optional callback on disk op completion
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (4 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 07/11] kvm tools: Remove qcow nowrite function Sasha Levin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch adds an optional callback to be called when a disk op completes.

Currently there's not much use for it, but it is the infrastructure for adding
AIO support.
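
A hedged usage sketch of the new hook (not part of this patch): a consumer
registers a completion callback and threads a per-request cookie through the
new 'param' argument of disk_image__read()/disk_image__write(). The names
my_request, my_complete and submit_read are hypothetical, and pr_info() is
assumed to come from kvm/util.h.

#include "kvm/disk-image.h"
#include "kvm/util.h"

struct my_request {
	int id;
};

static void my_complete(void *param)
{
	struct my_request *req = param;

	/* Invoked once the disk op for 'req' has completed */
	pr_info("request %d done", req->id);
}

static ssize_t submit_read(struct disk_image *disk, u64 sector,
			   const struct iovec *iov, int iovcount,
			   struct my_request *req)
{
	disk_image__set_callback(disk, my_complete);

	/* 'req' comes back as 'param' in my_complete() */
	return disk_image__read(disk, sector, iov, iovcount, req);
}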

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c              |   21 +++++++++++++++++----
 tools/kvm/disk/qcow.c              |    6 +++---
 tools/kvm/disk/raw.c               |   12 ++++++++----
 tools/kvm/include/kvm/disk-image.h |   24 +++++++++++++++---------
 tools/kvm/virtio/blk.c             |    6 ++++--
 5 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index d76bd55..842fef4 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -133,7 +133,8 @@ void disk_image__close_all(struct disk_image **disks, int count)
  * Fill iov with disk data, starting from sector 'sector'.
  * Return amount of bytes read.
  */
-ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	ssize_t total = 0;
 
@@ -141,7 +142,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 		msleep(debug_iodelay);
 
 	if (disk->ops.read_sector) {
-		total = disk->ops.read_sector(disk, sector, iov, iovcount);
+		total = disk->ops.read_sector(disk, sector, iov, iovcount, param);
 		if (total < 0) {
 			pr_info("disk_image__read error: total=%ld\n", (long)total);
 			return -1;
@@ -150,6 +151,9 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 		/* Do nothing */
 	}
 
+	if (disk->disk_req_cb)
+		disk->disk_req_cb(param);
+
 	return total;
 }
 
@@ -157,7 +161,8 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
  * Write iov to disk, starting from sector 'sector'.
  * Return amount of bytes written.
  */
-ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	ssize_t total = 0;
 
@@ -168,7 +173,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 		/*
 		 * Try writev based operation first
 		 */
-		total = disk->ops.write_sector(disk, sector, iov, iovcount);
+		total = disk->ops.write_sector(disk, sector, iov, iovcount, param);
 		if (total < 0) {
 			pr_info("disk_image__write error: total=%ld\n", (long)total);
 			return -1;
@@ -177,6 +182,9 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 		/* Do nothing */
 	}
 
+	if (disk->disk_req_cb)
+		disk->disk_req_cb(param);
+
 	return total;
 }
 
@@ -190,3 +198,8 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
 	*len = snprintf(buffer, *len, "%llu%llu%llu", (u64)st.st_dev, (u64)st.st_rdev, (u64)st.st_ino);
 	return *len;
 }
+
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
+{
+	disk->disk_req_cb = disk_req_cb;
+}
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index 288782e..ba65ab6 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -489,7 +489,7 @@ static ssize_t qcow_read_sector_single(struct disk_image *disk, u64 sector,
 }
 
 static ssize_t qcow_read_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount)
+				const struct iovec *iov, int iovcount, void *param)
 {
 	ssize_t nr, total = 0;
 
@@ -917,7 +917,7 @@ static ssize_t qcow_write_sector_single(struct disk_image *disk, u64 sector, voi
 }
 
 static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount)
+				const struct iovec *iov, int iovcount, void *param)
 {
 	ssize_t nr, total = 0;
 
@@ -937,7 +937,7 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
 }
 
 static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount)
+				const struct iovec *iov, int iovcount, void *param)
 {
 	/* I/O error */
 	pr_info("%s: no write support\n", __func__);
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index 2ed1c07..fa2e013 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,20 +1,23 @@
 #include "kvm/disk-image.h"
 
-ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 
 	return preadv_in_full(disk->fd, iov, iovcount, offset);
 }
 
-ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 
 	return pwritev_in_full(disk->fd, iov, iovcount, offset);
 }
 
-ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 	ssize_t total = 0;
@@ -31,7 +34,8 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const s
 	return total;
 }
 
-ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount)
+ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param)
 {
 	u64 offset = sector << SECTOR_SHIFT;
 	ssize_t total = 0;
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 5087b9e..bae9744 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -31,8 +31,10 @@ enum {
 struct disk_image;
 
 struct disk_image_operations {
-	ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-	ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+	ssize_t (*read_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param);
+	ssize_t (*write_sector)(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param);
 	int (*flush)(struct disk_image *disk);
 	int (*close)(struct disk_image *disk);
 };
@@ -42,6 +44,8 @@ struct disk_image {
 	u64				size;
 	struct disk_image_operations	ops;
 	void				*priv;
+	void				*disk_req_cb_param;
+	void				(*disk_req_cb)(void *param);
 };
 
 struct disk_image *disk_image__open(const char *filename, bool readonly);
@@ -50,21 +54,23 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
 int disk_image__close(struct disk_image *disk);
 void disk_image__close_all(struct disk_image **disks, int count);
 int disk_image__flush(struct disk_image *disk);
-ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
-ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount);
+ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param);
+ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov,
+				int iovcount, void *param);
 ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *len);
 
 struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly);
 struct disk_image *blkdev__probe(const char *filename, struct stat *st);
 
 ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount);
+				const struct iovec *iov, int iovcount, void *param);
 ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount);
+				const struct iovec *iov, int iovcount, void *param);
 ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount);
+				const struct iovec *iov, int iovcount, void *param);
 ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount);
+				const struct iovec *iov, int iovcount, void *param);
 int raw_image__close(struct disk_image *disk);
-
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
 #endif /* KVM__DISK_IMAGE_H */
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 6ffa753..9db293e 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -77,10 +77,12 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
 
 	switch (req->type) {
 	case VIRTIO_BLK_T_IN:
-		block_cnt	= disk_image__read(bdev->disk, req->sector, iov + 1, in + out - 2);
+		block_cnt	= disk_image__read(bdev->disk, req->sector, iov + 1,
+					in + out - 2, NULL);
 		break;
 	case VIRTIO_BLK_T_OUT:
-		block_cnt	= disk_image__write(bdev->disk, req->sector, iov + 1, in + out - 2);
+		block_cnt	= disk_image__write(bdev->disk, req->sector, iov + 1,
+					in + out - 2, NULL);
 		break;
 	case VIRTIO_BLK_T_FLUSH:
 		block_cnt       = disk_image__flush(bdev->disk);
-- 
1.7.7.1



* [PATCH 07/11] kvm tools: Remove qcow nowrite function
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (5 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 06/11] kvm tools: Add optional callback on disk op completion Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

It is no longer needed due to previous changes.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/qcow.c |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index ba65ab6..f9598de 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -936,14 +936,6 @@ static ssize_t qcow_write_sector(struct disk_image *disk, u64 sector,
 	return total;
 }
 
-static ssize_t qcow_nowrite_sector(struct disk_image *disk, u64 sector,
-				const struct iovec *iov, int iovcount, void *param)
-{
-	/* I/O error */
-	pr_info("%s: no write support\n", __func__);
-	return -1;
-}
-
 static int qcow_disk_flush(struct disk_image *disk)
 {
 	struct qcow *q = disk->priv;
@@ -1013,7 +1005,6 @@ static int qcow_disk_close(struct disk_image *disk)
 
 static struct disk_image_operations qcow_disk_readonly_ops = {
 	.read_sector		= qcow_read_sector,
-	.write_sector		= qcow_nowrite_sector,
 	.close			= qcow_disk_close,
 };
 
-- 
1.7.7.1



* [PATCH 08/11] kvm tools: Split io request from completion
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (6 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 07/11] kvm tools: Remove qcow nowrite function Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-31 10:42   ` Asias He
  2011-10-30 17:35 ` [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion Sasha Levin
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch splits IO request processing from completion notification.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c              |    6 +-
 tools/kvm/include/kvm/disk-image.h |    4 +-
 tools/kvm/include/kvm/virtio-blk.h |    1 +
 tools/kvm/virtio/blk.c             |  109 ++++++++++++++++++++++++------------
 4 files changed, 79 insertions(+), 41 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 842fef4..557a1a2 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -152,7 +152,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 	}
 
 	if (disk->disk_req_cb)
-		disk->disk_req_cb(param);
+		disk->disk_req_cb(param, total);
 
 	return total;
 }
@@ -183,7 +183,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 	}
 
 	if (disk->disk_req_cb)
-		disk->disk_req_cb(param);
+		disk->disk_req_cb(param, total);
 
 	return total;
 }
@@ -199,7 +199,7 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
 	return *len;
 }
 
-void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len))
 {
 	disk->disk_req_cb = disk_req_cb;
 }
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bae9744..bcb3396 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -45,7 +45,7 @@ struct disk_image {
 	struct disk_image_operations	ops;
 	void				*priv;
 	void				*disk_req_cb_param;
-	void				(*disk_req_cb)(void *param);
+	void				(*disk_req_cb)(void *param, long len);
 };
 
 struct disk_image *disk_image__open(const char *filename, bool readonly);
@@ -72,5 +72,5 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
 ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
 				const struct iovec *iov, int iovcount, void *param);
 int raw_image__close(struct disk_image *disk);
-void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
+void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len));
 #endif /* KVM__DISK_IMAGE_H */
diff --git a/tools/kvm/include/kvm/virtio-blk.h b/tools/kvm/include/kvm/virtio-blk.h
index 8c4fb91..63731a9 100644
--- a/tools/kvm/include/kvm/virtio-blk.h
+++ b/tools/kvm/include/kvm/virtio-blk.h
@@ -8,5 +8,6 @@ struct kvm;
 void virtio_blk__init(struct kvm *kvm, struct disk_image *disk);
 void virtio_blk__init_all(struct kvm *kvm);
 void virtio_blk__delete_all(struct kvm *kvm);
+void virtio_blk_complete(void *param, long len);
 
 #endif /* KVM__BLK_VIRTIO_H */
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 9db293e..0688f82 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -29,17 +29,21 @@
  */
 #define DISK_SEG_MAX			(VIRTIO_BLK_QUEUE_SIZE - 2)
 
-struct blk_dev_job {
+struct blk_dev_req {
+	struct list_head		list;
 	struct virt_queue		*vq;
 	struct blk_dev			*bdev;
 	struct iovec			iov[VIRTIO_BLK_QUEUE_SIZE];
 	u16				out, in, head;
-	struct thread_pool__job		job_id;
+	struct kvm			*kvm;
 };
 
 struct blk_dev {
 	pthread_mutex_t			mutex;
+	pthread_mutex_t			req_mutex;
+
 	struct list_head		list;
+	struct list_head		req_list;
 
 	struct virtio_pci		vpci;
 	struct virtio_blk_config	blk_config;
@@ -47,41 +51,76 @@ struct blk_dev {
 	u32				features;
 
 	struct virt_queue		vqs[NUM_VIRT_QUEUES];
-	struct blk_dev_job		jobs[VIRTIO_BLK_QUEUE_SIZE];
-	u16				job_idx;
+	struct blk_dev_req		reqs[VIRTIO_BLK_QUEUE_SIZE];
 };
 
 static LIST_HEAD(bdevs);
 static int compat_id;
 
-static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
+static struct blk_dev_req *virtio_blk_req_pop(struct blk_dev *bdev)
+{
+	struct blk_dev_req *req;
+
+	mutex_lock(&bdev->req_mutex);
+	req = list_first_entry(&bdev->req_list, struct blk_dev_req, list);
+	list_del_init(&req->list);
+	mutex_unlock(&bdev->req_mutex);
+
+	return req;
+}
+
+static void virtio_blk_req_push(struct blk_dev *bdev, struct blk_dev_req *req)
 {
-	struct virtio_blk_outhdr *req;
+	mutex_lock(&bdev->req_mutex);
+	list_add(&req->list, &bdev->req_list);
+	mutex_unlock(&bdev->req_mutex);
+}
+
+void virtio_blk_complete(void *param, long len)
+{
+	struct blk_dev_req *req = param;
+	struct blk_dev *bdev = req->bdev;
+	int queueid = req->vq - bdev->vqs;
 	u8 *status;
+
+	/* status */
+	status			= req->iov[req->out + req->in - 1].iov_base;
+	*status			= (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+
+	mutex_lock(&bdev->mutex);
+	virt_queue__set_used_elem(req->vq, req->head, len);
+	mutex_unlock(&bdev->mutex);
+
+	virtio_pci__signal_vq(req->kvm, &bdev->vpci, queueid);
+
+	virtio_blk_req_push(req->bdev, req);
+}
+
+static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
+{
+	struct virtio_blk_outhdr *req_hdr;
 	ssize_t block_cnt;
-	struct blk_dev_job *job;
 	struct blk_dev *bdev;
 	struct virt_queue *queue;
 	struct iovec *iov;
 	u16 out, in, head;
 
 	block_cnt	= -1;
-	job		= param;
-	bdev		= job->bdev;
-	queue		= job->vq;
-	iov		= job->iov;
-	out		= job->out;
-	in		= job->in;
-	head		= job->head;
-	req		= iov[0].iov_base;
-
-	switch (req->type) {
+	bdev		= req->bdev;
+	queue		= req->vq;
+	iov		= req->iov;
+	out		= req->out;
+	in		= req->in;
+	head		= req->head;
+	req_hdr		= iov[0].iov_base;
+
+	switch (req_hdr->type) {
 	case VIRTIO_BLK_T_IN:
-		block_cnt	= disk_image__read(bdev->disk, req->sector, iov + 1,
+		block_cnt	= disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
 					in + out - 2, NULL);
 		break;
 	case VIRTIO_BLK_T_OUT:
-		block_cnt	= disk_image__write(bdev->disk, req->sector, iov + 1,
+		block_cnt	= disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
 					in + out - 2, NULL);
 		break;
 	case VIRTIO_BLK_T_FLUSH:
@@ -92,35 +131,27 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
 		disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
 		break;
 	default:
-		pr_warning("request type %d", req->type);
+		pr_warning("request type %d", req_hdr->type);
 		block_cnt	= -1;
 		break;
 	}
 
-	/* status */
-	status			= iov[out + in - 1].iov_base;
-	*status			= (block_cnt < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
-
-	mutex_lock(&bdev->mutex);
-	virt_queue__set_used_elem(queue, head, block_cnt);
-	mutex_unlock(&bdev->mutex);
-
-	virtio_pci__signal_vq(kvm, &bdev->vpci, queue - bdev->vqs);
+	virtio_blk_complete(req, block_cnt);
 }
 
 static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
 {
 	while (virt_queue__available(vq)) {
-		struct blk_dev_job *job = &bdev->jobs[bdev->job_idx++ % VIRTIO_BLK_QUEUE_SIZE];
+		struct blk_dev_req *req = virtio_blk_req_pop(bdev);
 
-		*job			= (struct blk_dev_job) {
-			.vq			= vq,
-			.bdev			= bdev,
+		*req		= (struct blk_dev_req) {
+			.vq	= vq,
+			.bdev	= bdev,
+			.kvm	= kvm,
 		};
-		job->head = virt_queue__get_iov(vq, job->iov, &job->out, &job->in, kvm);
+		req->head = virt_queue__get_iov(vq, req->iov, &req->out, &req->in, kvm);
 
-		thread_pool__init_job(&job->job_id, kvm, virtio_blk_do_io_request, job);
-		thread_pool__do_job(&job->job_id);
+		virtio_blk_do_io_request(kvm, req);
 	}
 }
 
@@ -191,6 +222,7 @@ static int get_size_vq(struct kvm *kvm, void *dev, u32 vq)
 void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
 {
 	struct blk_dev *bdev;
+	size_t i;
 
 	if (!disk)
 		return;
@@ -201,6 +233,7 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
 
 	*bdev = (struct blk_dev) {
 		.mutex			= PTHREAD_MUTEX_INITIALIZER,
+		.req_mutex		= PTHREAD_MUTEX_INITIALIZER,
 		.disk			= disk,
 		.blk_config		= (struct virtio_blk_config) {
 			.capacity	= disk->size / SECTOR_SIZE,
@@ -222,6 +255,10 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
 
 	list_add_tail(&bdev->list, &bdevs);
 
+	INIT_LIST_HEAD(&bdev->req_list);
+	for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
+		list_add(&bdev->reqs[i].list, &bdev->req_list);
+
 	if (compat_id != -1)
 		compat_id = compat__add_message("virtio-blk device was not detected",
 						"While you have requested a virtio-blk device, "
-- 
1.7.7.1



* [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (7 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 10/11] kvm tools: Add aio read write functions Sasha Levin
  2011-10-30 17:35 ` [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk Sasha Levin
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch connects the completion processing in virtio-blk to the completion
notification coming from the disk image.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/virtio/blk.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 0688f82..c167dc9 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -117,26 +117,26 @@ static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
 	switch (req_hdr->type) {
 	case VIRTIO_BLK_T_IN:
 		block_cnt	= disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
-					in + out - 2, NULL);
+					in + out - 2, req);
 		break;
 	case VIRTIO_BLK_T_OUT:
 		block_cnt	= disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
-					in + out - 2, NULL);
+					in + out - 2, req);
 		break;
 	case VIRTIO_BLK_T_FLUSH:
 		block_cnt       = disk_image__flush(bdev->disk);
+		virtio_blk_complete(req, block_cnt);
 		break;
 	case VIRTIO_BLK_T_GET_ID:
 		block_cnt	= VIRTIO_BLK_ID_BYTES;
 		disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
+		virtio_blk_complete(req, block_cnt);
 		break;
 	default:
 		pr_warning("request type %d", req_hdr->type);
 		block_cnt	= -1;
 		break;
 	}
-
-	virtio_blk_complete(req, block_cnt);
 }
 
 static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
@@ -259,6 +259,8 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
 	for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
 		list_add(&bdev->reqs[i].list, &bdev->req_list);
 
+	disk_image__set_callback(bdev->disk, virtio_blk_complete);
+
 	if (compat_id != -1)
 		compat_id = compat__add_message("virtio-blk device was not detected",
 						"While you have requested a virtio-blk device, "
-- 
1.7.7.1



* [PATCH 10/11] kvm tools: Add aio read write functions
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (8 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  2011-10-30 17:35 ` [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk Sasha Levin
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch adds basic native vectored AIO functions.

These functions could later be optimized to submit multiple I/O requests at
once.
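
As a hedged illustration of that possible optimization (not part of this
patch): queue several iocbs and hand them to a single io_submit() call instead
of one syscall per request. The aio_batch structure and AIO_BATCH_MAX are
hypothetical names introduced here for the sketch.

#include <libaio.h>
#include <sys/types.h>
#include <sys/uio.h>

#define AIO_BATCH_MAX	32

struct aio_batch {
	struct iocb	 iocbs[AIO_BATCH_MAX];
	struct iocb	*ios[AIO_BATCH_MAX];
	int		 nr;
};

static int aio_batch_add_preadv(struct aio_batch *b, int fd, const struct iovec *iov,
				int iovcnt, off_t offset, int ev, void *param)
{
	struct iocb *iocb;

	if (b->nr >= AIO_BATCH_MAX)
		return -1;

	iocb = &b->iocbs[b->nr];
	io_prep_preadv(iocb, fd, iov, iovcnt, offset);
	io_set_eventfd(iocb, ev);
	iocb->data = param;
	b->ios[b->nr++] = iocb;

	return 0;
}

static int aio_batch_submit(io_context_t ctx, struct aio_batch *b)
{
	int nr = b->nr;

	b->nr = 0;
	/* One io_submit() for the whole batch instead of one per request */
	return io_submit(ctx, nr, b->ios);
}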

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/Makefile                 |    1 +
 tools/kvm/include/kvm/read-write.h |    6 ++++++
 tools/kvm/read-write.c             |   24 ++++++++++++++++++++++++
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index d9baa69..6095440 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -124,6 +124,7 @@ OBJS	+= bios/bios-rom.o
 LIBS	+= -lrt
 LIBS	+= -lpthread
 LIBS	+= -lutil
+LIBS	+= -laio
 
 # Additional ARCH settings for x86
 ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
diff --git a/tools/kvm/include/kvm/read-write.h b/tools/kvm/include/kvm/read-write.h
index 3351103..89c1590 100644
--- a/tools/kvm/include/kvm/read-write.h
+++ b/tools/kvm/include/kvm/read-write.h
@@ -4,6 +4,7 @@
 #include <sys/types.h>
 #include <sys/uio.h>
 #include <unistd.h>
+#include <libaio.h>
 
 ssize_t xread(int fd, void *buf, size_t count);
 ssize_t xwrite(int fd, const void *buf, size_t count);
@@ -29,4 +30,9 @@ ssize_t xpwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);
 ssize_t preadv_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset);
 ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset);
 
+int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+		off_t offset, int ev, void *param);
+int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+		off_t offset, int ev, void *param);
+
 #endif /* KVM_READ_WRITE_H */
diff --git a/tools/kvm/read-write.c b/tools/kvm/read-write.c
index 737fb26..be5a4fa 100644
--- a/tools/kvm/read-write.c
+++ b/tools/kvm/read-write.c
@@ -316,3 +316,27 @@ ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offse
 
 	return total;
 }
+
+int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+		off_t offset, int ev, void *param)
+{
+	struct iocb *ios[1] = { iocb };
+
+	io_prep_pwritev(iocb, fd, iov, iovcnt, offset);
+	io_set_eventfd(iocb, ev);
+	iocb->data = param;
+
+	return io_submit(ctx, 1, ios);
+}
+
+int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt,
+		off_t offset, int ev, void *param)
+{
+	struct iocb *ios[1] = { iocb };
+
+	io_prep_preadv(iocb, fd, iov, iovcnt, offset);
+	io_set_eventfd(iocb, ev);
+	iocb->data = param;
+
+	return io_submit(ctx, 1, ios);
+}
-- 
1.7.7.1



* [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk
  2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
                   ` (9 preceding siblings ...)
  2011-10-30 17:35 ` [PATCH 10/11] kvm tools: Add aio read write functions Sasha Levin
@ 2011-10-30 17:35 ` Sasha Levin
  10 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2011-10-30 17:35 UTC (permalink / raw)
  To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin

This patch hooks AIO support into virtio-blk, allowing for faster IO.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/disk/core.c              |   35 +++++++++++++++++++++++++++++++++--
 tools/kvm/disk/qcow.c              |    4 ++++
 tools/kvm/disk/raw.c               |   19 +++++++++++++++----
 tools/kvm/include/kvm/disk-image.h |    3 +++
 tools/kvm/virtio/blk.c             |   11 +++++------
 5 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
index 557a1a2..17fe548 100644
--- a/tools/kvm/disk/core.c
+++ b/tools/kvm/disk/core.c
@@ -1,8 +1,30 @@
 #include "kvm/disk-image.h"
 #include "kvm/qcow.h"
+#include "kvm/virtio-blk.h"
+
+#include <sys/eventfd.h>
+#include <sys/poll.h>
+
+#define AIO_MAX 32
 
 int debug_iodelay;
 
+static void *disk_image__thread(void *param)
+{
+	struct disk_image *disk = param;
+	u64 dummy;
+
+	while (read(disk->evt, &dummy, sizeof(dummy)) > 0) {
+		struct io_event event;
+		struct timespec notime = {0};
+
+		while (io_getevents(disk->ctx, 1, 1, &event, &notime) > 0)
+			disk->disk_req_cb(event.data, event.res);
+	}
+
+	return NULL;
+}
+
 struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int use_mmap)
 {
 	struct disk_image *disk;
@@ -26,6 +48,14 @@ struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operation
 		}
 	}
 
+	if (disk) {
+		pthread_t thread;
+
+		disk->evt = eventfd(0, 0);
+		io_setup(AIO_MAX, &disk->ctx);
+		if (pthread_create(&thread, NULL, disk_image__thread, disk) != 0)
+			die("Failed starting IO thread");
+	}
 	return disk;
 }
 
@@ -87,6 +117,7 @@ struct disk_image **disk_image__open_all(const char **filenames, bool *readonly,
 			goto error;
 		}
 	}
+
 	return disks;
 error:
 	for (i = 0; i < count; i++)
@@ -151,7 +182,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 		/* Do nothing */
 	}
 
-	if (disk->disk_req_cb)
+	if (!disk->async && disk->disk_req_cb)
 		disk->disk_req_cb(param, total);
 
 	return total;
@@ -182,7 +213,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
 		/* Do nothing */
 	}
 
-	if (disk->disk_req_cb)
+	if (!disk->async && disk->disk_req_cb)
 		disk->disk_req_cb(param, total);
 
 	return total;
diff --git a/tools/kvm/disk/qcow.c b/tools/kvm/disk/qcow.c
index f9598de..87385df 100644
--- a/tools/kvm/disk/qcow.c
+++ b/tools/kvm/disk/qcow.c
@@ -1145,6 +1145,8 @@ static struct disk_image *qcow2_probe(int fd, bool readonly)
 
 	if (!disk_image)
 		goto free_refcount_table;
+
+	disk_image->async = 1;
 	disk_image->priv = q;
 
 	return disk_image;
@@ -1276,6 +1278,8 @@ static struct disk_image *qcow1_probe(int fd, bool readonly)
 
 	if (!disk_image)
 		goto free_l1_table;
+
+	disk_image->async = 1;
 	disk_image->priv = q;
 
 	return disk_image;
diff --git a/tools/kvm/disk/raw.c b/tools/kvm/disk/raw.c
index fa2e013..3612623 100644
--- a/tools/kvm/disk/raw.c
+++ b/tools/kvm/disk/raw.c
@@ -1,19 +1,25 @@
 #include "kvm/disk-image.h"
 
+#include <libaio.h>
+
 ssize_t raw_image__read_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
 				int iovcount, void *param)
 {
+	struct iocb iocb;
 	u64 offset = sector << SECTOR_SHIFT;
 
-	return preadv_in_full(disk->fd, iov, iovcount, offset);
+	return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, offset,
+				disk->evt, param);
 }
 
 ssize_t raw_image__write_sector(struct disk_image *disk, u64 sector, const struct iovec *iov,
 				int iovcount, void *param)
 {
+	struct iocb iocb;
 	u64 offset = sector << SECTOR_SHIFT;
 
-	return pwritev_in_full(disk->fd, iov, iovcount, offset);
+	return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, offset,
+				disk->evt, param);
 }
 
 ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov,
@@ -72,13 +78,13 @@ static struct disk_image_operations raw_image_regular_ops = {
 
 struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 {
+	struct disk_image *disk;
 
 	if (readonly) {
 		/*
 		 * Use mmap's MAP_PRIVATE to implement non-persistent write
 		 * FIXME: This does not work on 32-bit host.
 		 */
-		struct disk_image *disk;
 		struct disk_image_operations ro_ops = {
 			.read_sector	= raw_image__read_sector_mmap,
 			.write_sector	= raw_image__write_sector_mmap,
@@ -90,6 +96,8 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 			ro_ops = raw_image_regular_ops;
 			ro_ops.write_sector = NULL;
 			disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_REGULAR);
+			if (disk)
+				disk->async = 1;
 		}
 
 		return disk;
@@ -97,6 +105,9 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly)
 		/*
 		 * Use read/write instead of mmap
 		 */
-		return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+		disk = disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR);
+		if (disk)
+			disk->async = 1;
+		return disk;
 	}
 }
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index bcb3396..5a14eab 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -46,6 +46,9 @@ struct disk_image {
 	void				*priv;
 	void				*disk_req_cb_param;
 	void				(*disk_req_cb)(void *param, long len);
+	bool				async;
+	int				evt;
+	io_context_t			ctx;
 };
 
 struct disk_image *disk_image__open(const char *filename, bool readonly);
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index c167dc9..3d31228d 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -2,7 +2,6 @@
 
 #include "kvm/virtio-pci-dev.h"
 #include "kvm/disk-image.h"
-#include "kvm/virtio.h"
 #include "kvm/mutex.h"
 #include "kvm/util.h"
 #include "kvm/kvm.h"
@@ -11,23 +10,23 @@
 #include "kvm/ioeventfd.h"
 #include "kvm/guest_compat.h"
 #include "kvm/virtio-pci.h"
+#include "kvm/virtio.h"
 
 #include <linux/virtio_ring.h>
 #include <linux/virtio_blk.h>
-
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/types.h>
 #include <pthread.h>
 
 #define VIRTIO_BLK_MAX_DEV		4
-#define NUM_VIRT_QUEUES			1
 
-#define VIRTIO_BLK_QUEUE_SIZE		128
 /*
  * the header and status consume too entries
  */
 #define DISK_SEG_MAX			(VIRTIO_BLK_QUEUE_SIZE - 2)
+#define VIRTIO_BLK_QUEUE_SIZE		128
+#define NUM_VIRT_QUEUES			1
 
 struct blk_dev_req {
 	struct list_head		list;
@@ -84,8 +83,8 @@ void virtio_blk_complete(void *param, long len)
 	u8 *status;
 
 	/* status */
-	status			= req->iov[req->out + req->in - 1].iov_base;
-	*status			= (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+	status	= req->iov[req->out + req->in - 1].iov_base;
+	*status	= (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
 
 	mutex_lock(&bdev->mutex);
 	virt_queue__set_used_elem(req->vq, req->head, len);
-- 
1.7.7.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 08/11] kvm tools: Split io request from completion
  2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
@ 2011-10-31 10:42   ` Asias He
  0 siblings, 0 replies; 18+ messages in thread
From: Asias He @ 2011-10-31 10:42 UTC (permalink / raw)
  To: Sasha Levin; +Cc: penberg, kvm, mingo, gorcunov

On 10/31/2011 01:35 AM, Sasha Levin wrote:
> This patch splits IO request processing from completion notification.
> 
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
>  tools/kvm/disk/core.c              |    6 +-
>  tools/kvm/include/kvm/disk-image.h |    4 +-
>  tools/kvm/include/kvm/virtio-blk.h |    1 +
>  tools/kvm/virtio/blk.c             |  109 ++++++++++++++++++++++++------------
>  4 files changed, 79 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/kvm/disk/core.c b/tools/kvm/disk/core.c
> index 842fef4..557a1a2 100644
> --- a/tools/kvm/disk/core.c
> +++ b/tools/kvm/disk/core.c
> @@ -152,7 +152,7 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
>  	}
>  
>  	if (disk->disk_req_cb)
> -		disk->disk_req_cb(param);
> +		disk->disk_req_cb(param, total);
>  
>  	return total;
>  }
> @@ -183,7 +183,7 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iove
>  	}
>  
>  	if (disk->disk_req_cb)
> -		disk->disk_req_cb(param);
> +		disk->disk_req_cb(param, total);
>  
>  	return total;
>  }
> @@ -199,7 +199,7 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l
>  	return *len;
>  }
>  
> -void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param))
> +void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len))
>  {
>  	disk->disk_req_cb = disk_req_cb;
>  }
> diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
> index bae9744..bcb3396 100644
> --- a/tools/kvm/include/kvm/disk-image.h
> +++ b/tools/kvm/include/kvm/disk-image.h
> @@ -45,7 +45,7 @@ struct disk_image {
>  	struct disk_image_operations	ops;
>  	void				*priv;
>  	void				*disk_req_cb_param;
> -	void				(*disk_req_cb)(void *param);
> +	void				(*disk_req_cb)(void *param, long len);
>  };
>  
>  struct disk_image *disk_image__open(const char *filename, bool readonly);
> @@ -72,5 +72,5 @@ ssize_t raw_image__read_sector_mmap(struct disk_image *disk, u64 sector,
>  ssize_t raw_image__write_sector_mmap(struct disk_image *disk, u64 sector,
>  				const struct iovec *iov, int iovcount, void *param);
>  int raw_image__close(struct disk_image *disk);
> -void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param));
> +void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len));
>  #endif /* KVM__DISK_IMAGE_H */
> diff --git a/tools/kvm/include/kvm/virtio-blk.h b/tools/kvm/include/kvm/virtio-blk.h
> index 8c4fb91..63731a9 100644
> --- a/tools/kvm/include/kvm/virtio-blk.h
> +++ b/tools/kvm/include/kvm/virtio-blk.h
> @@ -8,5 +8,6 @@ struct kvm;
>  void virtio_blk__init(struct kvm *kvm, struct disk_image *disk);
>  void virtio_blk__init_all(struct kvm *kvm);
>  void virtio_blk__delete_all(struct kvm *kvm);
> +void virtio_blk_complete(void *param, long len);
>  
>  #endif /* KVM__BLK_VIRTIO_H */
> diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
> index 9db293e..0688f82 100644
> --- a/tools/kvm/virtio/blk.c
> +++ b/tools/kvm/virtio/blk.c
> @@ -29,17 +29,21 @@
>   */
>  #define DISK_SEG_MAX			(VIRTIO_BLK_QUEUE_SIZE - 2)
>  
> -struct blk_dev_job {
> +struct blk_dev_req {
> +	struct list_head		list;
>  	struct virt_queue		*vq;
>  	struct blk_dev			*bdev;
>  	struct iovec			iov[VIRTIO_BLK_QUEUE_SIZE];
>  	u16				out, in, head;
> -	struct thread_pool__job		job_id;
> +	struct kvm			*kvm;
>  };
>  
>  struct blk_dev {
>  	pthread_mutex_t			mutex;
> +	pthread_mutex_t			req_mutex;
> +
>  	struct list_head		list;
> +	struct list_head		req_list;
>  
>  	struct virtio_pci		vpci;
>  	struct virtio_blk_config	blk_config;
> @@ -47,41 +51,76 @@ struct blk_dev {
>  	u32				features;
>  
>  	struct virt_queue		vqs[NUM_VIRT_QUEUES];
> -	struct blk_dev_job		jobs[VIRTIO_BLK_QUEUE_SIZE];
> -	u16				job_idx;
> +	struct blk_dev_req		reqs[VIRTIO_BLK_QUEUE_SIZE];
>  };
>  
>  static LIST_HEAD(bdevs);
>  static int compat_id;
>  
> -static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
> +static struct blk_dev_req *virtio_blk_req_pop(struct blk_dev *bdev)
> +{
> +	struct blk_dev_req *req;
> +
> +	mutex_lock(&bdev->req_mutex);
> +	req = list_first_entry(&bdev->req_list, struct blk_dev_req, list);
> +	list_del_init(&req->list);
> +	mutex_unlock(&bdev->req_mutex);
> +
> +	return req;
> +}
> +
> +static void virtio_blk_req_push(struct blk_dev *bdev, struct blk_dev_req *req)
>  {
> -	struct virtio_blk_outhdr *req;
> +	mutex_lock(&bdev->req_mutex);
> +	list_add(&req->list, &bdev->req_list);
> +	mutex_unlock(&bdev->req_mutex);
> +}
> +
> +void virtio_blk_complete(void *param, long len)
> +{
> +	struct blk_dev_req *req = param;
> +	struct blk_dev *bdev = req->bdev;
> +	int queueid = req->vq - bdev->vqs;
>  	u8 *status;
> +
> +	/* status */
> +	status			= req->iov[req->out + req->in - 1].iov_base;
> +	*status			= (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> +
> +	mutex_lock(&bdev->mutex);
> +	virt_queue__set_used_elem(req->vq, req->head, len);
> +	mutex_unlock(&bdev->mutex);
> +
> +	virtio_pci__signal_vq(req->kvm, &bdev->vpci, queueid);
> +
> +	virtio_blk_req_push(req->bdev, req);
> +}
> +
> +static void virtio_blk_do_io_request(struct kvm *kvm, struct blk_dev_req *req)
> +{
> +	struct virtio_blk_outhdr *req_hdr;
>  	ssize_t block_cnt;
> -	struct blk_dev_job *job;
>  	struct blk_dev *bdev;
>  	struct virt_queue *queue;
>  	struct iovec *iov;
>  	u16 out, in, head;
>  
>  	block_cnt	= -1;
> -	job		= param;
> -	bdev		= job->bdev;
> -	queue		= job->vq;
> -	iov		= job->iov;
> -	out		= job->out;
> -	in		= job->in;
> -	head		= job->head;
> -	req		= iov[0].iov_base;
> -
> -	switch (req->type) {
> +	bdev		= req->bdev;
> +	queue		= req->vq;
> +	iov		= req->iov;
> +	out		= req->out;
> +	in		= req->in;
> +	head		= req->head;

Hi, Sasha

Please drop these queue and head local variables.

virtio/blk.c: In function ‘virtio_blk_do_io_request’:
virtio/blk.c:106:15: error: variable ‘head’ set but not used
[-Werror=unused-but-set-variable]
virtio/blk.c:104:21: error: variable ‘queue’ set but not used
[-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors


> +	req_hdr		= iov[0].iov_base;
> +
> +	switch (req_hdr->type) {
>  	case VIRTIO_BLK_T_IN:
> -		block_cnt	= disk_image__read(bdev->disk, req->sector, iov + 1,
> +		block_cnt	= disk_image__read(bdev->disk, req_hdr->sector, iov + 1,
>  					in + out - 2, NULL);
>  		break;
>  	case VIRTIO_BLK_T_OUT:
> -		block_cnt	= disk_image__write(bdev->disk, req->sector, iov + 1,
> +		block_cnt	= disk_image__write(bdev->disk, req_hdr->sector, iov + 1,
>  					in + out - 2, NULL);
>  		break;
>  	case VIRTIO_BLK_T_FLUSH:
> @@ -92,35 +131,27 @@ static void virtio_blk_do_io_request(struct kvm *kvm, void *param)
>  		disk_image__get_serial(bdev->disk, (iov + 1)->iov_base, &block_cnt);
>  		break;
>  	default:
> -		pr_warning("request type %d", req->type);
> +		pr_warning("request type %d", req_hdr->type);
>  		block_cnt	= -1;
>  		break;
>  	}
>  
> -	/* status */
> -	status			= iov[out + in - 1].iov_base;
> -	*status			= (block_cnt < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> -
> -	mutex_lock(&bdev->mutex);
> -	virt_queue__set_used_elem(queue, head, block_cnt);
> -	mutex_unlock(&bdev->mutex);
> -
> -	virtio_pci__signal_vq(kvm, &bdev->vpci, queue - bdev->vqs);
> +	virtio_blk_complete(req, block_cnt);
>  }
>  
>  static void virtio_blk_do_io(struct kvm *kvm, struct virt_queue *vq, struct blk_dev *bdev)
>  {
>  	while (virt_queue__available(vq)) {
> -		struct blk_dev_job *job = &bdev->jobs[bdev->job_idx++ % VIRTIO_BLK_QUEUE_SIZE];
> +		struct blk_dev_req *req = virtio_blk_req_pop(bdev);
>  
> -		*job			= (struct blk_dev_job) {
> -			.vq			= vq,
> -			.bdev			= bdev,
> +		*req		= (struct blk_dev_req) {
> +			.vq	= vq,
> +			.bdev	= bdev,
> +			.kvm	= kvm,
>  		};
> -		job->head = virt_queue__get_iov(vq, job->iov, &job->out, &job->in, kvm);
> +		req->head = virt_queue__get_iov(vq, req->iov, &req->out, &req->in, kvm);
>  
> -		thread_pool__init_job(&job->job_id, kvm, virtio_blk_do_io_request, job);
> -		thread_pool__do_job(&job->job_id);
> +		virtio_blk_do_io_request(kvm, req);
>  	}
>  }
>  
> @@ -191,6 +222,7 @@ static int get_size_vq(struct kvm *kvm, void *dev, u32 vq)
>  void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
>  {
>  	struct blk_dev *bdev;
> +	size_t i;
>  
>  	if (!disk)
>  		return;
> @@ -201,6 +233,7 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
>  
>  	*bdev = (struct blk_dev) {
>  		.mutex			= PTHREAD_MUTEX_INITIALIZER,
> +		.req_mutex		= PTHREAD_MUTEX_INITIALIZER,
>  		.disk			= disk,
>  		.blk_config		= (struct virtio_blk_config) {
>  			.capacity	= disk->size / SECTOR_SIZE,
> @@ -222,6 +255,10 @@ void virtio_blk__init(struct kvm *kvm, struct disk_image *disk)
>  
>  	list_add_tail(&bdev->list, &bdevs);
>  
> +	INIT_LIST_HEAD(&bdev->req_list);
> +	for (i = 0; i < ARRAY_SIZE(bdev->reqs); i++)
> +		list_add(&bdev->reqs[i].list, &bdev->req_list);
> +
>  	if (compat_id != -1)
>  		compat_id = compat__add_message("virtio-blk device was not detected",
>  						"While you have requested a virtio-blk device, "


-- 
Asias He

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
@ 2011-11-01  6:44   ` Pekka Enberg
  2011-11-01 11:16     ` Sasha Levin
  0 siblings, 1 reply; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01  6:44 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov

On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> This makes passing different ops structures easier since you don't have
> to keep them somewhere else after initializing disk_image.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

Why do we want to do this? Why would you ever want to allocate ops via
malloc() or on the stack?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-11-01  6:44   ` Pekka Enberg
@ 2011-11-01 11:16     ` Sasha Levin
  2011-11-01 11:30       ` Pekka Enberg
  0 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-11-01 11:16 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov

On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
> On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > This makes passing different ops structures easier since you don't have
> > to keep them somewhere else after initializing disk_image.
> >
> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> 
> Why do we want to do this? Why would you ever want to allocate ops via
> malloc() or on the stack?

It's mostly there to avoid having to either keep the ops in a global
struct or malloc() them; instead, a copy is simply held as part of
disk_image.

It's useful for how we handle read-only ops now - you can see how an RO
image is set up (private mmap(), with a fallback to AIO with no write
op).
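
Condensed from the raw.c hunk in patch 11, the read-only fallback boils
down to tweaking a stack-local copy of an existing table (sketch, error
handling dropped):

	struct disk_image_operations ro_ops = raw_image_regular_ops;

	ro_ops.write_sector = NULL;	/* reject writes on the RO image */
	disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_REGULAR);
	/* disk->ops holds its own copy, so ro_ops may go out of scope */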

-- 

Sasha.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-11-01 11:16     ` Sasha Levin
@ 2011-11-01 11:30       ` Pekka Enberg
  2011-11-01 11:43         ` Sasha Levin
  0 siblings, 1 reply; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01 11:30 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov

Hi Sasha,

On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> > This makes passing different ops structures easier since you don't have
>> > to keep them somewhere else after initializing disk_image.
>> >
>> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
>> Why do we want to do this? Why would you ever want to allocate ops via
>> malloc() or on the stack?

On Tue, Nov 1, 2011 at 1:16 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> It's mostly there to avoid having to either keep ops in a global struct
> or to malloc() them, instead - it just holds them as part of disk_image;
>
> It's useful with how we handle read-only ops now, you can see how we use
> RO image now (private mmap() with fallback to AIO with no write op).

It's not useful, it's obfuscation. The whole point of an "operations"
struct is to provide an immutable and stateless virtual function
dispatch table. It's absolutely pointless to hold it via malloc(), and
doing so misses the whole point of keeping data and function pointers
separate.
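
Roughly, the alternative would be one more table per variant, never copied
(sketch; raw_image_ro_ops is a made-up name, not in the tree):

	/* a conceptually-immutable table next to raw_image_regular_ops */
	static struct disk_image_operations raw_image_ro_ops = {
		.read_sector	= raw_image__read_sector,
		/* no .write_sector: writes are simply rejected */
	};

	disk = disk_image__new(fd, st->st_size, &raw_image_ro_ops, DISK_IMAGE_REGULAR);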

                        Pekka

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-11-01 11:30       ` Pekka Enberg
@ 2011-11-01 11:43         ` Sasha Levin
  2011-11-01 11:52           ` Pekka Enberg
  0 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2011-11-01 11:43 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov

On Tue, 2011-11-01 at 13:30 +0200, Pekka Enberg wrote:
> Hi Sasha,
> 
> On Sun, Oct 30, 2011 at 7:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> >> > This makes passing different ops structures easier since you don't have
> >> > to keep them somewhere else after initializing disk_image.
> >> >
> >> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> 
> On Tue, 2011-11-01 at 08:44 +0200, Pekka Enberg wrote:
> >> Why do we want to do this? Why would you ever want to allocate ops via
> >> malloc() or on the stack?
> 
> On Tue, Nov 1, 2011 at 1:16 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > It's mostly there to avoid having to either keep ops in a global struct
> > or to malloc() them, instead - it just holds them as part of disk_image;
> >
> > It's useful with how we handle read-only ops now, you can see how we use
> > RO image now (private mmap() with fallback to AIO with no write op).
> 
> It's not useful, it's obfuscation. The whole point of "operations"
> struct is to provide an immutable and stateless virtual function
> dispatch table. It's absolutely pointless to use malloc() to hold them
> and seems to miss the whole point of keeping data and function pointer
> separate.

It does provide a stateless function dispatch table; the only difference
is how the function pointers are stored in disk_image.

How would you implement the fallback we have for RO images now?
Declare all of those ops as global variables?

-- 

Sasha.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image
  2011-11-01 11:43         ` Sasha Levin
@ 2011-11-01 11:52           ` Pekka Enberg
  0 siblings, 0 replies; 18+ messages in thread
From: Pekka Enberg @ 2011-11-01 11:52 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov

On Tue, Nov 1, 2011 at 1:43 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> How would you implement the fallback thing we have in RO images now?
> declare all those ops as global variables?

Yes.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2011-11-01 11:52 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-30 17:35 [PATCH 00/11] kvm tools: Support native vectored AIO Sasha Levin
2011-10-30 17:35 ` [PATCH 01/11] kvm tools: Switch to using an enum for disk image types Sasha Levin
2011-10-30 17:35 ` [PATCH 02/11] kvm tools: Hold a copy of ops struct inside disk_image Sasha Levin
2011-11-01  6:44   ` Pekka Enberg
2011-11-01 11:16     ` Sasha Levin
2011-11-01 11:30       ` Pekka Enberg
2011-11-01 11:43         ` Sasha Levin
2011-11-01 11:52           ` Pekka Enberg
2011-10-30 17:35 ` [PATCH 03/11] kvm tools: Remove the non-iov interface from disk image ops Sasha Levin
2011-10-30 17:35 ` [PATCH 04/11] kvm tools: Modify disk ops usage Sasha Levin
2011-10-30 17:35 ` [PATCH 05/11] kvm tools: Modify behaviour on missing ops ptr Sasha Levin
2011-10-30 17:35 ` [PATCH 06/11] kvm tools: Add optional callback on disk op completion Sasha Levin
2011-10-30 17:35 ` [PATCH 07/11] kvm tools: Remove qcow nowrite function Sasha Levin
2011-10-30 17:35 ` [PATCH 08/11] kvm tools: Split io request from completion Sasha Levin
2011-10-31 10:42   ` Asias He
2011-10-30 17:35 ` [PATCH 09/11] kvm tools: Hook virtio-blk completion to disk op completion Sasha Levin
2011-10-30 17:35 ` [PATCH 10/11] kvm tools: Add aio read write functions Sasha Levin
2011-10-30 17:35 ` [PATCH 11/11] kvm tools: Use native vectored AIO in virtio-blk Sasha Levin

This is a public inbox; see the mirroring instructions for how to clone
and mirror all data and code used for this inbox, as well as URLs for
NNTP newsgroup(s).