Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH 0000/0003] drivers: hv: util
From: gregkh @ 2012-05-12 23:16 UTC (permalink / raw)
  To: KY Srinivasan
  Cc: linux-kernel@vger.kernel.org, devel@linuxdriverproject.org,
	virtualization@lists.osdl.org, ohering@suse.com
In-Reply-To: <426367E2313C2449837CD2DE46E7EAF90305A56E@SN2PRD0310MB382.namprd03.prod.outlook.com>

On Sat, May 12, 2012 at 08:41:06PM +0000, KY Srinivasan wrote:
> 
> 
> > -----Original Message-----
> > From: K. Y. Srinivasan [mailto:kys@microsoft.com]
> > Sent: Saturday, May 12, 2012 4:44 PM
> > To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> > devel@linuxdriverproject.org; virtualization@lists.osdl.org; ohering@suse.com
> > Cc: KY Srinivasan
> > Subject: [PATCH 0000/0003] drivers: hv: util
> 
> There are only two patches following this mail; sorry for the typo in the subject line.

What's with the 0000 stuff as well?  You do know that git send-mail can
handle all of that for you properly, right?

^ permalink raw reply

* [PATCH 2/2] Drivers: hv: util: Properly handle version negotiations.
From: K. Y. Srinivasan @ 2012-05-12 20:44 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering
In-Reply-To: <1336855498-3981-1-git-send-email-kys@microsoft.com>

The current version negotiation code is not "future proof". Fix this
by allowing each service the flexibility to either specify the highest
version it can support or it can support the highest version number
the host is offering.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/hv/channel_mgmt.c |   54 +++++++++++++++++++++++++++++++-------------
 drivers/hv/hv_kvp.c       |    3 +-
 drivers/hv/hv_util.c      |    9 +++++--
 include/linux/hyperv.h    |    4 ++-
 4 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 6c8c4d3..2b8b8d4 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -46,37 +46,59 @@ struct vmbus_channel_message_table_entry {
  *
  * @icmsghdrp is of type &struct icmsg_hdr.
  * @negop is of type &struct icmsg_negotiate.
- * Set up and fill in default negotiate response message. This response can
- * come from both the vmbus driver and the hv_utils driver. The current api
- * will respond properly to both Windows 2008 and Windows 2008-R2 operating
- * systems.
+ * Set up and fill in default negotiate response message.
+ *
+ * The max_fw_version specifies the maximum framework version that
+ * we can support and max _srv_version specifies the maximum service
+ * version we can support. A special value MAX_SRV_VER can be
+ * specified to indicate that we can handle the maximum version
+ * exposed by the host.
  *
  * Mainly used by Hyper-V drivers.
  */
 void vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp,
-			       struct icmsg_negotiate *negop, u8 *buf)
+				struct icmsg_negotiate *negop, u8 *buf,
+				int max_fw_version, int max_srv_version)
 {
+	int icframe_vercnt;
+	int icmsg_vercnt;
+	int i;
+
 	icmsghdrp->icmsgsize = 0x10;
 
 	negop = (struct icmsg_negotiate *)&buf[
 		sizeof(struct vmbuspipe_hdr) +
 		sizeof(struct icmsg_hdr)];
 
-	if (negop->icframe_vercnt == 2 &&
-	   negop->icversion_data[1].major == 3) {
-		negop->icversion_data[0].major = 3;
-		negop->icversion_data[0].minor = 0;
-		negop->icversion_data[1].major = 3;
-		negop->icversion_data[1].minor = 0;
-	} else {
-		negop->icversion_data[0].major = 1;
-		negop->icversion_data[0].minor = 0;
-		negop->icversion_data[1].major = 1;
-		negop->icversion_data[1].minor = 0;
+	icframe_vercnt = negop->icframe_vercnt;
+	icmsg_vercnt = negop->icmsg_vercnt;
+
+	/*
+	 * Select the framework version number we will
+	 * support.
+	 */
+
+	for (i = 0; i < negop->icframe_vercnt; i++) {
+		if (negop->icversion_data[i].major <= max_fw_version)
+			icframe_vercnt = negop->icversion_data[i].major;
+	}
+
+	for (i = negop->icframe_vercnt;
+		 (i < negop->icframe_vercnt + negop->icmsg_vercnt); i++) {
+		if (negop->icversion_data[i].major <= max_srv_version)
+			icmsg_vercnt = negop->icversion_data[i].major;
 	}
 
+	/*
+	 * Respond with the maximum framework and service
+	 * version numbers we can support.
+	 */
 	negop->icframe_vercnt = 1;
 	negop->icmsg_vercnt = 1;
+	negop->icversion_data[0].major = icframe_vercnt;
+	negop->icversion_data[0].minor = 0;
+	negop->icversion_data[1].major = icmsg_vercnt;
+	negop->icversion_data[1].minor = 0;
 }
 
 EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp);
diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index 6186025..0012eed 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -394,7 +394,8 @@ void hv_kvp_onchannelcallback(void *context)
 			sizeof(struct vmbuspipe_hdr)];
 
 		if (icmsghdrp->icmsgtype == ICMSGTYPE_NEGOTIATE) {
-			vmbus_prep_negotiate_resp(icmsghdrp, negop, recv_buffer);
+			vmbus_prep_negotiate_resp(icmsghdrp, negop,
+				 recv_buffer, MAX_SRV_VER, MAX_SRV_VER);
 		} else {
 			kvp_msg = (struct hv_kvp_msg *)&recv_buffer[
 				sizeof(struct vmbuspipe_hdr) +
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index dbb8b8e..d3ac6a4 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -70,7 +70,8 @@ static void shutdown_onchannelcallback(void *context)
 			sizeof(struct vmbuspipe_hdr)];
 
 		if (icmsghdrp->icmsgtype == ICMSGTYPE_NEGOTIATE) {
-			vmbus_prep_negotiate_resp(icmsghdrp, negop, shut_txf_buf);
+			vmbus_prep_negotiate_resp(icmsghdrp, negop,
+					shut_txf_buf, MAX_SRV_VER, MAX_SRV_VER);
 		} else {
 			shutdown_msg =
 				(struct shutdown_msg_data *)&shut_txf_buf[
@@ -195,7 +196,8 @@ static void timesync_onchannelcallback(void *context)
 				sizeof(struct vmbuspipe_hdr)];
 
 		if (icmsghdrp->icmsgtype == ICMSGTYPE_NEGOTIATE) {
-			vmbus_prep_negotiate_resp(icmsghdrp, NULL, time_txf_buf);
+			vmbus_prep_negotiate_resp(icmsghdrp, NULL, time_txf_buf,
+						MAX_SRV_VER, MAX_SRV_VER);
 		} else {
 			timedatap = (struct ictimesync_data *)&time_txf_buf[
 				sizeof(struct vmbuspipe_hdr) +
@@ -234,7 +236,8 @@ static void heartbeat_onchannelcallback(void *context)
 				sizeof(struct vmbuspipe_hdr)];
 
 		if (icmsghdrp->icmsgtype == ICMSGTYPE_NEGOTIATE) {
-			vmbus_prep_negotiate_resp(icmsghdrp, NULL, hbeat_txf_buf);
+			vmbus_prep_negotiate_resp(icmsghdrp, NULL,
+				hbeat_txf_buf, MAX_SRV_VER, MAX_SRV_VER);
 		} else {
 			heartbeat_msg =
 				(struct heartbeat_msg_data *)&hbeat_txf_buf[
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6af8738..68ed7f7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1062,8 +1062,10 @@ struct hyperv_service_callback {
 	void (*callback) (void *context);
 };
 
+#define MAX_SRV_VER	0x7ffffff
 extern void vmbus_prep_negotiate_resp(struct icmsg_hdr *,
-				      struct icmsg_negotiate *, u8 *);
+					struct icmsg_negotiate *, u8 *, int,
+					int);
 
 int hv_kvp_init(struct hv_util_service *);
 void hv_kvp_deinit(void);
-- 
1.7.4.1

^ permalink raw reply related

* [PATCH 1/2] Drivers: hv: Get rid of an unnecessary check in vmbus_prep_negotiate_resp()
From: K. Y. Srinivasan @ 2012-05-12 20:44 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering
In-Reply-To: <1336855466-3942-1-git-send-email-kys@microsoft.com>

The vmbus_prep_negotiate_resp() is only invoked when we are negotiating
the version; so the current check in vmbus_prep_negotiate_resp()
is unnecessary. Get rid of it.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/hv/channel_mgmt.c |   39 +++++++++++++++++++--------------------
 1 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 9ffbfc5..6c8c4d3 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -56,30 +56,29 @@ struct vmbus_channel_message_table_entry {
 void vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp,
 			       struct icmsg_negotiate *negop, u8 *buf)
 {
-	if (icmsghdrp->icmsgtype == ICMSGTYPE_NEGOTIATE) {
-		icmsghdrp->icmsgsize = 0x10;
-
-		negop = (struct icmsg_negotiate *)&buf[
-			sizeof(struct vmbuspipe_hdr) +
-			sizeof(struct icmsg_hdr)];
-
-		if (negop->icframe_vercnt == 2 &&
-		   negop->icversion_data[1].major == 3) {
-			negop->icversion_data[0].major = 3;
-			negop->icversion_data[0].minor = 0;
-			negop->icversion_data[1].major = 3;
-			negop->icversion_data[1].minor = 0;
-		} else {
-			negop->icversion_data[0].major = 1;
-			negop->icversion_data[0].minor = 0;
-			negop->icversion_data[1].major = 1;
-			negop->icversion_data[1].minor = 0;
-		}
-
-		negop->icframe_vercnt = 1;
-		negop->icmsg_vercnt = 1;
+	icmsghdrp->icmsgsize = 0x10;
+
+	negop = (struct icmsg_negotiate *)&buf[
+		sizeof(struct vmbuspipe_hdr) +
+		sizeof(struct icmsg_hdr)];
+
+	if (negop->icframe_vercnt == 2 &&
+	   negop->icversion_data[1].major == 3) {
+		negop->icversion_data[0].major = 3;
+		negop->icversion_data[0].minor = 0;
+		negop->icversion_data[1].major = 3;
+		negop->icversion_data[1].minor = 0;
+	} else {
+		negop->icversion_data[0].major = 1;
+		negop->icversion_data[0].minor = 0;
+		negop->icversion_data[1].major = 1;
+		negop->icversion_data[1].minor = 0;
 	}
+
+	negop->icframe_vercnt = 1;
+	negop->icmsg_vercnt = 1;
 }
+
 EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp);
 
 /*
-- 
1.7.4.1

^ permalink raw reply related

* [PATCH 0000/0003] drivers: hv: util
From: K. Y. Srinivasan @ 2012-05-12 20:44 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering

Fix-up the version negotiation code for the util drivers.

Regards,

K. Y

^ permalink raw reply

* RE: [PATCH 0000/0003] drivers: hv: util
From: KY Srinivasan @ 2012-05-12 20:41 UTC (permalink / raw)
  To: KY Srinivasan, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, devel@linuxdriverproject.org,
	virtualization@lists.osdl.org, ohering@suse.com
In-Reply-To: <1336855466-3942-1-git-send-email-kys@microsoft.com>



> -----Original Message-----
> From: K. Y. Srinivasan [mailto:kys@microsoft.com]
> Sent: Saturday, May 12, 2012 4:44 PM
> To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; virtualization@lists.osdl.org; ohering@suse.com
> Cc: KY Srinivasan
> Subject: [PATCH 0000/0003] drivers: hv: util

There are only two patches following this mail; sorry for the typo in the subject line.

Regards,

K. Y

^ permalink raw reply

* Re: virtio 3.4 patches
From: Michael S. Tsirkin @ 2012-05-10 14:46 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, kvm, virtualization
In-Reply-To: <874nrre84d.fsf@rustcorp.com.au>

On Tue, May 08, 2012 at 11:39:38AM +0930, Rusty Russell wrote:
> On Mon, 7 May 2012 08:30:21 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Hi Rusty, please also pick two fixes from for_linus tag on my tree
> > I think they should be sent to Linus for 3.4:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/mst/vhost.git;a=tag;h=refs/tags/for_linus
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git refs/tags/for_linus
> 
> Done!
> 
> Thanks so much for handling virtio while I was away; knowing it was
> safely in your hands allowed me to relax without reservation.  My wife
> thanks you, too :)
> 
> Cheers,
> Rusty.

Do I get to ask for a favor in return ;) ?
It would help immensely if there was a git branch in your tree with
patches queued for the next kernel.
Patches are much less handy for collaboration.

Thanks,

-- 
MST

^ permalink raw reply

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Michael S. Tsirkin @ 2012-05-10  6:02 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, Paolo Bonzini, linux-kernel, virtualization
In-Reply-To: <877gwkddxq.fsf@rustcorp.com.au>

On Thu, May 10, 2012 at 10:56:09AM +0930, Rusty Russell wrote:
> I think that making vring_virtqueue public looks like the way to go,
> too.

Then it's easy: transports can include vring_virtqueue
at end of transport structure.

^ permalink raw reply

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Rusty Russell @ 2012-05-10  1:26 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <20120508101839.GA15598@redhat.com>

On Tue, 8 May 2012 13:18:39 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Tue, May 08, 2012 at 11:45:55AM +0200, Paolo Bonzini wrote:
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -76,8 +76,6 @@
> >  
> >  struct vring_virtqueue
> >  {
> > -	struct virtqueue vq;
> > -
> >  	/* Actual memory layout for this queue */
> >  	struct vring vring;
> >  
> > @@ -106,6 +104,9 @@ struct vring_virtqueue
> >  	/* How to notify other side. FIXME: commonalize hcalls! */
> >  	void (*notify)(struct virtqueue *vq);
> >  
> > +	/* Tokens for callbacks. */
> > +	void **data;
> > +
> >  #ifdef DEBUG
> >  	/* They're supposed to lock for us. */
> >  	unsigned int in_use;
> > @@ -115,8 +116,9 @@ struct vring_virtqueue
> >  	ktime_t last_add_time;
> >  #endif
> >  
> > -	/* Tokens for callbacks. */
> > -	void *data[];
> > +	struct virtqueue vq;
> > +
> > +	/* Bus-specific virtqueue data goes here.  */
> >  };
> >  
> >  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
> 
> 
> This moves the data out of line and the bus specific stuff inline.
> But bus accesses only happen on io and interrupt which are already
> slow, while data accesses happen on fast path.
> 
> >From that POV this looks like a wrong thing to do.

(Resend to all)

Most of it was on a different cacheline anyway though, so it's probably
not measurable.  We avoid the pci_vq->vq indirection, but add the
vq->data indirection.

I share your discomfort with the offset arg method, too.

So unless benchmarks show otherwise, let's do it the vanilla way?

I think that making vring_virtqueue public looks like the way to go,
too.

Of course, a nice series would be great as well :)

Thanks!
Rusty.

^ permalink raw reply

* Re: [PATCH] virtio-mmio: Devices parameter parsing
From: Rusty Russell @ 2012-05-10  0:44 UTC (permalink / raw)
  Cc: linux-kernel, Pawel Moll, virtualization
In-Reply-To: <1336584616-7802-1-git-send-email-pawel.moll@arm.com>

On Wed,  9 May 2012 18:30:16 +0100, Pawel Moll <pawel.moll@arm.com> wrote:
> This patch adds an option to instantiate guest virtio-mmio devices
> basing on a kernel command line (or module) parameter, for example:
> 
> 	virtio_mmio.devices=0x100@0x100b0000:48
> 
> Signed-off-by: Pawel Moll <pawel.moll@arm.com>
> ---
> 
> Hi Rusty,
> 
> As the param changes are in now, could you queue this
> long-time-forgotten patch for 3.5?

Finally, thanks!

> PS. Congratulations once again and see you in HK.

Thanks, and looking forward to it!

Cheers,
Rusty. 

^ permalink raw reply

* [PATCH] virtio-mmio: Devices parameter parsing
From: Pawel Moll @ 2012-05-09 17:30 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, Pawel Moll, virtualization

This patch adds an option to instantiate guest virtio-mmio devices
basing on a kernel command line (or module) parameter, for example:

	virtio_mmio.devices=0x100@0x100b0000:48

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
---

Hi Rusty,

As the param changes are in now, could you queue this
long-time-forgotten patch for 3.5?

Cheers!

Pawel

PS. Congratulations once again and see you in HK.

 Documentation/kernel-parameters.txt |   17 ++++
 drivers/virtio/Kconfig              |   11 +++
 drivers/virtio/virtio_mmio.c        |  163 +++++++++++++++++++++++++++++++++++
 3 files changed, 191 insertions(+), 0 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c1601e5..8b35051 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -110,6 +110,7 @@ parameter is applicable:
 	USB	USB support is enabled.
 	USBHID	USB Human Interface Device support is enabled.
 	V4L	Video For Linux support is enabled.
+	VMMIO   Driver for memory mapped virtio devices is enabled.
 	VGA	The VGA console has been enabled.
 	VT	Virtual terminal support is enabled.
 	WDT	Watchdog support is enabled.
@@ -2847,6 +2848,22 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	video=		[FB] Frame buffer configuration
 			See Documentation/fb/modedb.txt.
 
+	virtio_mmio.device=
+			[VMMIO] Memory mapped virtio (platform) device.
+
+				<size>@<baseaddr>:<irq>[:<id>]
+			where:
+				<size>     := size (can use standard suffixes
+						like K, M and G)
+				<baseaddr> := physical base address
+				<irq>      := interrupt number (as passed to
+						request_irq())
+				<id>       := (optional) platform device id
+			example:
+				virtio_mmio.device=1K@0x100b0000:48:7
+
+			Can be used multiple times for multiple devices.
+
 	vga=		[BOOT,X86-32] Select a particular video mode
 			See Documentation/x86/boot.txt and
 			Documentation/svga.txt.
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 1a61939..f38b17a 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -46,4 +46,15 @@ config VIRTIO_BALLOON
 
  	 If unsure, say N.
 
+config VIRTIO_MMIO_CMDLINE_DEVICES
+	bool "Memory mapped virtio devices parameter parsing"
+	depends on VIRTIO_MMIO
+	---help---
+	 Allow virtio-mmio devices instantiation via the kernel command line
+	 or module parameters. Be aware that using incorrect parameters (base
+	 address in particular) can crash your system - you have been warned.
+	 See Documentation/kernel-parameters.txt for details.
+
+	 If unsure, say 'N'.
+
 endmenu
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 01d6dc2..453db0c 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -6,6 +6,50 @@
  * This module allows virtio devices to be used over a virtual, memory mapped
  * platform device.
  *
+ * The guest device(s) may be instantiated in one of three equivalent ways:
+ *
+ * 1. Static platform device in board's code, eg.:
+ *
+ *	static struct platform_device v2m_virtio_device = {
+ *		.name = "virtio-mmio",
+ *		.id = -1,
+ *		.num_resources = 2,
+ *		.resource = (struct resource []) {
+ *			{
+ *				.start = 0x1001e000,
+ *				.end = 0x1001e0ff,
+ *				.flags = IORESOURCE_MEM,
+ *			}, {
+ *				.start = 42 + 32,
+ *				.end = 42 + 32,
+ *				.flags = IORESOURCE_IRQ,
+ *			},
+ *		}
+ *	};
+ *
+ * 2. Device Tree node, eg.:
+ *
+ *		virtio_block@1e000 {
+ *			compatible = "virtio,mmio";
+ *			reg = <0x1e000 0x100>;
+ *			interrupts = <42>;
+ *		}
+ *
+ * 3. Kernel module (or command line) parameter. Can be used more than once -
+ *    one device will be created for each one. Syntax:
+ *
+ *		[virtio_mmio.]device=<size>@<baseaddr>:<irq>[:<id>]
+ *    where:
+ *		<size>     := size (can use standard suffixes like K, M or G)
+ *		<baseaddr> := physical base address
+ *		<irq>      := interrupt number (as passed to request_irq())
+ *		<id>       := (optional) platform device id
+ *    eg.:
+ *		virtio_mmio.device=0x100@0x100b0000:48 \
+ *				virtio_mmio.device=1K@0x1001e000:74
+ *
+ *
+ *
  * Registers layout (all 32-bit wide):
  *
  * offset d. name             description
@@ -42,6 +86,8 @@
  * See the COPYING file in the top-level directory.
  */
 
+#define pr_fmt(fmt) "virtio-mmio: " fmt
+
 #include <linux/highmem.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
@@ -449,6 +495,122 @@ static int __devexit virtio_mmio_remove(struct platform_device *pdev)
 
 
 
+/* Devices list parameter */
+
+#if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
+
+static struct device vm_cmdline_parent = {
+	.init_name = "virtio-mmio-cmdline",
+};
+
+static int vm_cmdline_parent_registered;
+static int vm_cmdline_id;
+
+static int vm_cmdline_set(const char *device,
+		const struct kernel_param *kp)
+{
+	int err;
+	struct resource resources[2] = {};
+	char *str;
+	long long int base;
+	int processed, consumed = 0;
+	struct platform_device *pdev;
+
+	resources[0].flags = IORESOURCE_MEM;
+	resources[1].flags = IORESOURCE_IRQ;
+
+	resources[0].end = memparse(device, &str) - 1;
+
+	processed = sscanf(str, "@%lli:%u%n:%d%n",
+			&base, &resources[1].start, &consumed,
+			&vm_cmdline_id, &consumed);
+
+	if (processed < 2 || processed > 3 || str[consumed])
+		return -EINVAL;
+
+	resources[0].start = base;
+	resources[0].end += base;
+	resources[1].end = resources[1].start;
+
+	if (!vm_cmdline_parent_registered) {
+		err = device_register(&vm_cmdline_parent);
+		if (err) {
+			pr_err("Failed to register parent device!\n");
+			return err;
+		}
+		vm_cmdline_parent_registered = 1;
+	}
+
+	pr_info("Registering device virtio-mmio.%d at 0x%llx-0x%llx, IRQ %d.\n",
+		       vm_cmdline_id,
+		       (unsigned long long)resources[0].start,
+		       (unsigned long long)resources[0].end,
+		       (int)resources[1].start);
+
+	pdev = platform_device_register_resndata(&vm_cmdline_parent,
+			"virtio-mmio", vm_cmdline_id++,
+			resources, ARRAY_SIZE(resources), NULL, 0);
+	if (IS_ERR(pdev))
+		return PTR_ERR(pdev);
+
+	return 0;
+}
+
+static int vm_cmdline_get_device(struct device *dev, void *data)
+{
+	char *buffer = data;
+	unsigned int len = strlen(buffer);
+	struct platform_device *pdev = to_platform_device(dev);
+
+	snprintf(buffer + len, PAGE_SIZE - len, "0x%llx@0x%llx:%llu:%d\n",
+			pdev->resource[0].end - pdev->resource[0].start + 1ULL,
+			(unsigned long long)pdev->resource[0].start,
+			(unsigned long long)pdev->resource[1].start,
+			pdev->id);
+	return 0;
+}
+
+static int vm_cmdline_get(char *buffer, const struct kernel_param *kp)
+{
+	buffer[0] = '\0';
+	device_for_each_child(&vm_cmdline_parent, buffer,
+			vm_cmdline_get_device);
+	return strlen(buffer) + 1;
+}
+
+static struct kernel_param_ops vm_cmdline_param_ops = {
+	.set = vm_cmdline_set,
+	.get = vm_cmdline_get,
+};
+
+device_param_cb(device, &vm_cmdline_param_ops, NULL, S_IRUSR);
+
+static int vm_unregister_cmdline_device(struct device *dev,
+		void *data)
+{
+	platform_device_unregister(to_platform_device(dev));
+
+	return 0;
+}
+
+static void vm_unregister_cmdline_devices(void)
+{
+	if (vm_cmdline_parent_registered) {
+		device_for_each_child(&vm_cmdline_parent, NULL,
+				vm_unregister_cmdline_device);
+		device_unregister(&vm_cmdline_parent);
+		vm_cmdline_parent_registered = 0;
+	}
+}
+
+#else
+
+static void vm_unregister_cmdline_devices(void)
+{
+}
+
+#endif
+
 /* Platform driver */
 
 static struct of_device_id virtio_mmio_match[] = {
@@ -475,6 +637,7 @@ static int __init virtio_mmio_init(void)
 static void __exit virtio_mmio_exit(void)
 {
 	platform_driver_unregister(&virtio_mmio_driver);
+	vm_unregister_cmdline_devices();
 }
 
 module_init(virtio_mmio_init);
-- 
1.7.5.4

^ permalink raw reply related

* Re: [PATCH untested] virtio: allocate extra memory before the ring ( was Re: [RFC PATCH] virtio_console: link vq to port with a private) pointer in struct virtqueue
From: Michael S. Tsirkin @ 2012-05-08 11:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <7dee31a0-7b89-418c-a69b-436729ba55b0@zmail13.collab.prod.int.phx2.redhat.com>

On Tue, May 08, 2012 at 07:06:04AM -0400, Paolo Bonzini wrote:
> > And it's not a problem if virtqueue is exactly at start of
> > vring_virtqueue: we just need to allocate a bit more at start, and
> > offset when we free.  Here's how I would do this: first apply patch
> > below that adds the offset parameter, then update all transports, one
> > patch at a time to not use priv pointer, finally you can repurpose
> > priv pointer to let devices use it.
> 
> I like it, though I prefer to have the new arguments as sizeof()
> rather than offsets.

Well it tells us both how much extra memory
to allocate and where in the allocated buffer to put
struct virtqueue.

So size is kind of vague, offset is a more specific name.


> I'll pick up this and test it.

BTW still somewhat unhappy that the specific layout requirements make it a
bit fragile, even though I added BUG_ONs to at least fail in an obvious
way if we break the rules. But I wonder whether Rusty has a better idea.


> > As a bonus we get small incremental patches.
> 
> FWIW, that's possible with the patch I posted too.
> 
> Paolo

^ permalink raw reply

* Re: [PATCH untested] virtio: allocate extra memory before the ring ( was Re: [RFC PATCH] virtio_console: link vq to port with a private) pointer in struct virtqueue
From: Michael S. Tsirkin @ 2012-05-08 11:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <20120508105654.GB15598@redhat.com>

> virtio: allocate extra memory before the ring
> 
> This let transports put their data inline instead
> of indirect acces through priv pointer.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

And transports would do something like the below to avoid
using priv.  Warning: completely untested.

------------------------------>
    virtio_pci: put vq info structure inline
    
    vq->priv is now unused by virtio_pci.
    
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 3c3eea9..4092006 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -73,9 +73,6 @@ enum {
 
 struct virtio_pci_vq_info
 {
-	/* the actual virtqueue */
-	struct virtqueue *vq;
-
 	/* the number of entries in the queue */
 	int num;
 
@@ -90,8 +87,15 @@ struct virtio_pci_vq_info
 
 	/* MSI-X vector (or none) */
 	unsigned msix_vector;
+
+	/* the actual virtqueue. Note: must be the last field */
+	struct virtqueue vq;
 };
 
+#define to_vpinfo(vq) container_of(info, struct virtio_pci_vq_info, vq)
+/* How much space we need to reserve before struct virtqueue */
+#define VPINFO_PRIV offsetof(struct virtio_pci_vq_info, vq)
+
 /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
 static struct pci_device_id virtio_pci_id_table[] = {
 	{ 0x1af4, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
@@ -202,7 +206,7 @@ static void vp_reset(struct virtio_device *vdev)
 static void vp_notify(struct virtqueue *vq)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
-	struct virtio_pci_vq_info *info = vq->priv;
+	struct virtio_pci_vq_info *info = to_vpinfo(vq);
 
 	/* we write the queue's selector into the notification register to
 	 * signal the other end */
@@ -232,7 +236,7 @@ static irqreturn_t vp_vring_interrupt(int irq, void *opaque)
 
 	spin_lock_irqsave(&vp_dev->lock, flags);
 	list_for_each_entry(info, &vp_dev->virtqueues, node) {
-		if (vring_interrupt(irq, info->vq) == IRQ_HANDLED)
+		if (vring_interrupt(irq, &info->vq) == IRQ_HANDLED)
 			ret = IRQ_HANDLED;
 	}
 	spin_unlock_irqrestore(&vp_dev->lock, flags);
@@ -385,6 +389,7 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 	struct virtio_pci_vq_info *info;
 	struct virtqueue *vq;
 	unsigned long flags, size;
+	void *queue;
 	u16 num;
 	int err;
 
@@ -398,35 +403,34 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 
 	/* allocate and fill out our structure the represents an active
 	 * queue */
-	info = kmalloc(sizeof(struct virtio_pci_vq_info), GFP_KERNEL);
-	if (!info)
-		return ERR_PTR(-ENOMEM);
-
-	info->queue_index = index;
-	info->num = num;
-	info->msix_vector = msix_vec;
 
 	size = PAGE_ALIGN(vring_size(num, VIRTIO_PCI_VRING_ALIGN));
-	info->queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO);
-	if (info->queue == NULL) {
+	queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO);
+	if (queue == NULL) {
 		err = -ENOMEM;
 		goto out_info;
 	}
 
 	/* activate the queue */
-	iowrite32(virt_to_phys(info->queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT,
+	iowrite32(virt_to_phys(queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT,
 		  vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
-	/* create the vring */
-	vq = vring_new_virtqueue(0, info->num, VIRTIO_PCI_VRING_ALIGN, vdev,
-				 true, info->queue, vp_notify, callback, name);
+	/* We can store transport specific data before the vq field.
+	 * Make sure we don't have any fields after it. */
+	BUILD_BUG_ON(VPINFO_PRIV + sizeof info->vq != sizeof *info);
+
+	vq = vring_new_virtqueue(VPINFO_PRIV, num, VIRTIO_PCI_VRING_ALIGN, vdev,
+				 true, queue, vp_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
 		goto out_activate_queue;
 	}
 
-	vq->priv = info;
-	info->vq = vq;
+	info = to_vpinfo(vq);
+
+	info->queue_index = index;
+	info->num = num;
+	info->msix_vector = msix_vec;
 
 	if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
 		iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
@@ -448,20 +452,21 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 	return vq;
 
 out_assign:
-	vring_del_virtqueue(vq, 0);
+	vring_del_virtqueue(vq, VPINFO_PRIV);
 out_activate_queue:
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 	free_pages_exact(info->queue, size);
 out_info:
-	kfree(info);
 	return ERR_PTR(err);
 }
 
 static void vp_del_vq(struct virtqueue *vq)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
-	struct virtio_pci_vq_info *info = vq->priv;
+	struct virtio_pci_vq_info *info = to_vpinfo(vq);
 	unsigned long flags, size;
+	void *queue = info->queue;
+	u16 num = info->num;
 
 	spin_lock_irqsave(&vp_dev->lock, flags);
 	list_del(&info->node);
@@ -476,14 +481,13 @@ static void vp_del_vq(struct virtqueue *vq)
 		ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
 	}
 
-	vring_del_virtqueue(vq, 0);
+	vring_del_virtqueue(vq, VPINFO_PRIV);
 
 	/* Select and deactivate the queue */
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
 	size = PAGE_ALIGN(vring_size(info->num, VIRTIO_PCI_VRING_ALIGN));
-	free_pages_exact(info->queue, size);
-	kfree(info);
+	free_pages_exact(queue, size);
 }
 
 /* the config->del_vqs() implementation */
@@ -494,7 +498,7 @@ static void vp_del_vqs(struct virtio_device *vdev)
 	struct virtio_pci_vq_info *info;
 
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
-		info = vq->priv;
+		info = to_vpinfo(vq);
 		if (vp_dev->per_vq_vectors &&
 			info->msix_vector != VIRTIO_MSI_NO_VECTOR)
 			free_irq(vp_dev->msix_entries[info->msix_vector].vector,

^ permalink raw reply related

* Re: [PATCH untested] virtio: allocate extra memory before the ring ( was Re: [RFC PATCH] virtio_console: link vq to port with a private) pointer in struct virtqueue
From: Paolo Bonzini @ 2012-05-08 11:06 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <20120508105654.GB15598@redhat.com>

> And it's not a problem if virtqueue is exactly at start of
> vring_virtqueue: we just need to allocate a bit more at start, and
> offset when we free.  Here's how I would do this: first apply patch
> below that adds the offset parameter, then update all transports, one
> patch at a time to not use priv pointer, finally you can repurpose
> priv pointer to let devices use it.

I like it, though I prefer to have the new arguments as sizeof()
rather than offsets.

I'll pick up this and test it.

> As a bonus we get small incremental patches.

FWIW, that's possible with the patch I posted too.

Paolo

^ permalink raw reply

* [PATCH untested] virtio: allocate extra memory before the ring ( was Re: [RFC PATCH] virtio_console: link vq to port with a private) pointer in struct virtqueue
From: Michael S. Tsirkin @ 2012-05-08 10:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <1336470355-19257-1-git-send-email-pbonzini@redhat.com>

On Tue, May 08, 2012 at 11:45:55AM +0200, Paolo Bonzini wrote:
> > How bad would be it to get rid of the current ->priv and use
> > container_of() instead?  ie. have virtio_pci, virtio_mmio, lguest_bus
> > and s390's kvm_virtio embed the struct virtqueue?
> 
> Something like the following, compile-tested only...
> 
> The layout of vring_virtqueue gets a bit complex, with private vring
> data _before_ the struct virtqueue and private bus data after it.

If we want to do this, the right thing IMO is keeping the data inline
and at fixed offset and that means at the end.

And it's not a problem if virtqueue is exactly at start of
vring_virtqueue: we just need to allocate a bit more at start, and
offset when we free.  Here's how I would do this: first apply patch
below that adds the offset parameter, then update all transports, one
patch at a time to not use priv pointer, finally you can repurpose priv
pointer to let devices use it.
As a bonus we get small incremental patches.

Patch below is completely untested, just to give you the idea.
By the way, we need to document vring_new_virtqueue/vring_del_virtqueue.

----------------------------------------------->

virtio: allocate extra memory before the ring

This let transports put their data inline instead
of indirect acces through priv pointer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
index 9e8388e..555b5d5 100644
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -296,7 +296,7 @@ static struct virtqueue *lg_find_vq(struct virtio_device *vdev,
 	 * to 'true': the host just a(nother) SMP CPU, so we only need inter-cpu
 	 * barriers.
 	 */
-	vq = vring_new_virtqueue(lvq->config.num, LGUEST_VRING_ALIGN, vdev,
+	vq = vring_new_virtqueue(0, lvq->config.num, LGUEST_VRING_ALIGN, vdev,
 				 true, lvq->pages, lg_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
@@ -331,7 +331,7 @@ static struct virtqueue *lg_find_vq(struct virtio_device *vdev,
 free_desc:
 	irq_free_desc(lvq->config.irq);
 destroy_vring:
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 unmap:
 	lguest_unmap(lvq->pages);
 free_lvq:
@@ -348,7 +348,7 @@ static void lg_del_vq(struct virtqueue *vq)
 	/* Release the interrupt */
 	free_irq(lvq->config.irq, vq);
 	/* Tell virtio_ring.c to free the virtqueue. */
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 	/* Unmap the pages containing the ring. */
 	lguest_unmap(lvq->pages);
 	/* Free our own queue information. */
diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c
index ecf6121..cc714af 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -99,7 +99,7 @@ static struct virtqueue *rp_find_vq(struct virtio_device *vdev,
 	 * Create the new vq, and tell virtio we're not interested in
 	 * the 'weak' smp barriers, since we're talking with a real device.
 	 */
-	vq = vring_new_virtqueue(len, rvring->align, vdev, false, addr,
+	vq = vring_new_virtqueue(0, len, rvring->align, vdev, false, addr,
 					rproc_virtio_notify, callback, name);
 	if (!vq) {
 		dev_err(rproc->dev, "vring_new_virtqueue %s failed\n", name);
@@ -124,7 +124,7 @@ static void rproc_virtio_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
 		rvring = vq->priv;
 		rvring->vq = NULL;
-		vring_del_virtqueue(vq);
+		vring_del_virtqueue(vq, 0);
 	}
 }
 
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
index d74e9ae..e51fcdf 100644
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -197,7 +197,7 @@ static struct virtqueue *kvm_find_vq(struct virtio_device *vdev,
 	if (err)
 		goto out;
 
-	vq = vring_new_virtqueue(config->num, KVM_S390_VIRTIO_RING_ALIGN,
+	vq = vring_new_virtqueue(0, config->num, KVM_S390_VIRTIO_RING_ALIGN,
 				 vdev, true, (void *) config->address,
 				 kvm_notify, callback, name);
 	if (!vq) {
@@ -225,7 +225,7 @@ static void kvm_del_vq(struct virtqueue *vq)
 {
 	struct kvm_vqconfig *config = vq->priv;
 
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 	vmem_remove_mapping(config->address,
 			    vring_size(config->num,
 				       KVM_S390_VIRTIO_RING_ALIGN));
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 01d6dc2..98e56dd 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -229,7 +229,7 @@ static void vm_del_vq(struct virtqueue *vq)
 	list_del(&info->node);
 	spin_unlock_irqrestore(&vm_dev->lock, flags);
 
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 
 	/* Select and deactivate the queue */
 	writel(info->queue_index, vm_dev->base + VIRTIO_MMIO_QUEUE_SEL);
@@ -310,7 +310,7 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
 			vm_dev->base + VIRTIO_MMIO_QUEUE_PFN);
 
 	/* Create the vring */
-	vq = vring_new_virtqueue(info->num, VIRTIO_MMIO_VRING_ALIGN, vdev,
+	vq = vring_new_virtqueue(0, info->num, VIRTIO_MMIO_VRING_ALIGN, vdev,
 				 true, info->queue, vm_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 2e03d41..3c3eea9 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -418,7 +418,7 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 		  vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
 	/* create the vring */
-	vq = vring_new_virtqueue(info->num, VIRTIO_PCI_VRING_ALIGN, vdev,
+	vq = vring_new_virtqueue(0, info->num, VIRTIO_PCI_VRING_ALIGN, vdev,
 				 true, info->queue, vp_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
@@ -448,7 +448,7 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 	return vq;
 
 out_assign:
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 out_activate_queue:
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 	free_pages_exact(info->queue, size);
@@ -476,7 +476,7 @@ static void vp_del_vq(struct virtqueue *vq)
 		ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
 	}
 
-	vring_del_virtqueue(vq);
+	vring_del_virtqueue(vq, 0);
 
 	/* Select and deactivate the queue */
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5aa43c3..688859b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -616,7 +616,8 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 }
 EXPORT_SYMBOL_GPL(vring_interrupt);
 
-struct virtqueue *vring_new_virtqueue(unsigned int num,
+struct virtqueue *vring_new_virtqueue(unsigned int offset,
+				      unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
 				      bool weak_barriers,
@@ -627,6 +628,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 {
 	struct vring_virtqueue *vq;
 	unsigned int i;
+	void *buf;
 
 	/* We assume num is a power of 2. */
 	if (num & (num - 1)) {
@@ -634,9 +636,11 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 		return NULL;
 	}
 
-	vq = kmalloc(sizeof(*vq) + sizeof(void *)*num, GFP_KERNEL);
-	if (!vq)
+	buf = kmalloc(offset + sizeof(*vq) + sizeof(void *)*num, GFP_KERNEL);
+	if (!buf)
 		return NULL;
+	BUG_ON(offset % __alignof__(*vq));
+	vq = buf + offset;
 
 	vring_init(&vq->vring, num, pages, vring_align);
 	vq->vq.callback = callback;
@@ -669,14 +673,17 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	}
 	vq->data[i] = NULL;
 
+	/* Callers can overwrite off bytes before the returned pointer. */
+	BUILD_BUG_ON(offset_of(typeof(*vq), vq));
 	return &vq->vq;
 }
 EXPORT_SYMBOL_GPL(vring_new_virtqueue);
 
-void vring_del_virtqueue(struct virtqueue *vq)
+void vring_del_virtqueue(struct virtqueue *vq, unsigned int offset)
 {
+	void *buf = to_vvq(vq);
 	list_del(&vq->list);
-	kfree(to_vvq(vq));
+	kfree(buf - offset);
 }
 EXPORT_SYMBOL_GPL(vring_del_virtqueue);
 
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index e338730..b91b8c1 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -165,7 +165,8 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
 struct virtio_device;
 struct virtqueue;
 
-struct virtqueue *vring_new_virtqueue(unsigned int num,
+struct virtqueue *vring_new_virtqueue(unsigned int offset,
+				      unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
 				      bool weak_barriers,
diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
index 6bf95f9..1a24799 100644
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -92,7 +92,7 @@ static void vq_info_add(struct vdev_info *dev, int num)
 	assert(r >= 0);
 	memset(info->ring, 0, vring_size(num, 4096));
 	vring_init(&info->vring, num, info->ring, 4096);
-	info->vq = vring_new_virtqueue(info->vring.num, 4096, &dev->vdev,
+	info->vq = vring_new_virtqueue(0, info->vring.num, 4096, &dev->vdev,
 				       true, info->ring,
 				       vq_notify, vq_callback, "test");
 	assert(info->vq);

^ permalink raw reply related

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Michael S. Tsirkin @ 2012-05-08 10:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Amit Shah, linux-kernel, virtualization
In-Reply-To: <1336470355-19257-1-git-send-email-pbonzini@redhat.com>

On Tue, May 08, 2012 at 11:45:55AM +0200, Paolo Bonzini wrote:
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -76,8 +76,6 @@
>  
>  struct vring_virtqueue
>  {
> -	struct virtqueue vq;
> -
>  	/* Actual memory layout for this queue */
>  	struct vring vring;
>  
> @@ -106,6 +104,9 @@ struct vring_virtqueue
>  	/* How to notify other side. FIXME: commonalize hcalls! */
>  	void (*notify)(struct virtqueue *vq);
>  
> +	/* Tokens for callbacks. */
> +	void **data;
> +
>  #ifdef DEBUG
>  	/* They're supposed to lock for us. */
>  	unsigned int in_use;
> @@ -115,8 +116,9 @@ struct vring_virtqueue
>  	ktime_t last_add_time;
>  #endif
>  
> -	/* Tokens for callbacks. */
> -	void *data[];
> +	struct virtqueue vq;
> +
> +	/* Bus-specific virtqueue data goes here.  */
>  };
>  
>  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)


This moves the data out of line and the bus specific stuff inline.
But bus accesses only happen on io and interrupt which are already
slow, while data accesses happen on fast path.

From that POV this looks like a wrong thing to do.

-- 
MST

^ permalink raw reply

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Paolo Bonzini @ 2012-05-08  9:45 UTC (permalink / raw)
  To: Rusty Russell, linux-kernel; +Cc: Amit Shah, Michael S. Tsirkin, virtualization
In-Reply-To: <87397be80k.fsf@rustcorp.com.au>

> How bad would be it to get rid of the current ->priv and use
> container_of() instead?  ie. have virtio_pci, virtio_mmio, lguest_bus
> and s390's kvm_virtio embed the struct virtqueue?

Something like the following, compile-tested only...

The layout of vring_virtqueue gets a bit complex, with private vring
data _before_ the struct virtqueue and private bus data after it.

The alternative is to make vring_virtqueue public.  It has some
appeal (for example many backends duplicate the num and pages members
in their private data) but overall I think I prefer this.

Anyhow, it is doable, the patches aren't too small but they are easily
split and quite boring.  Shall I pursue this instead?

Paolo

diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 2e03d41..047ce01 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -71,10 +71,10 @@ enum {
 	VP_MSIX_VQ_VECTOR = 1,
 };
 
-struct virtio_pci_vq_info
+struct virtio_pci_vq
 {
 	/* the actual virtqueue */
-	struct virtqueue *vq;
+	struct virtqueue vq;
 
 	/* the number of entries in the queue */
 	int num;
@@ -106,6 +106,12 @@ static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
 	return container_of(vdev, struct virtio_pci_device, vdev);
 }
 
+/* Convert a generic virtio virtqueue to our structure */
+static struct virtio_pci_vq *to_vp_queue(struct virtqueue *vq)
+{
+	return container_of(vq, struct virtio_pci_vq, vq);
+}
+
 /* virtio config->get_features() implementation */
 static u32 vp_get_features(struct virtio_device *vdev)
 {
@@ -202,11 +208,11 @@ static void vp_reset(struct virtio_device *vdev)
 static void vp_notify(struct virtqueue *vq)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
-	struct virtio_pci_vq_info *info = vq->priv;
+	struct virtio_pci_vq *vp_queue = to_vp_queue(vq);
 
 	/* we write the queue's selector into the notification register to
 	 * signal the other end */
-	iowrite16(info->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY);
+	iowrite16(vp_queue->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY);
 }
 
 /* Handle a configuration change: Tell driver if it wants to know. */
@@ -226,13 +232,13 @@ static irqreturn_t vp_config_changed(int irq, void *opaque)
 static irqreturn_t vp_vring_interrupt(int irq, void *opaque)
 {
 	struct virtio_pci_device *vp_dev = opaque;
-	struct virtio_pci_vq_info *info;
+	struct virtio_pci_vq *vp_queue;
 	irqreturn_t ret = IRQ_NONE;
 	unsigned long flags;
 
 	spin_lock_irqsave(&vp_dev->lock, flags);
-	list_for_each_entry(info, &vp_dev->virtqueues, node) {
-		if (vring_interrupt(irq, info->vq) == IRQ_HANDLED)
+	list_for_each_entry(vp_queue, &vp_dev->virtqueues, node) {
+		if (vring_interrupt(irq, &vp_queue->vq) == IRQ_HANDLED)
 			ret = IRQ_HANDLED;
 	}
 	spin_unlock_irqrestore(&vp_dev->lock, flags);
@@ -382,9 +388,10 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 				  u16 msix_vec)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-	struct virtio_pci_vq_info *info;
+	struct virtio_pci_vq *vp_queue;
 	struct virtqueue *vq;
 	unsigned long flags, size;
+	void *queue;
 	u16 num;
 	int err;
 
@@ -398,35 +405,32 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 
 	/* allocate and fill out our structure the represents an active
 	 * queue */
-	info = kmalloc(sizeof(struct virtio_pci_vq_info), GFP_KERNEL);
-	if (!info)
-		return ERR_PTR(-ENOMEM);
-
-	info->queue_index = index;
-	info->num = num;
-	info->msix_vector = msix_vec;
-
 	size = PAGE_ALIGN(vring_size(num, VIRTIO_PCI_VRING_ALIGN));
-	info->queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO);
-	if (info->queue == NULL) {
+	queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO);
+	if (queue == NULL) {
 		err = -ENOMEM;
-		goto out_info;
+		goto out;
 	}
 
 	/* activate the queue */
-	iowrite32(virt_to_phys(info->queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT,
+	iowrite32(virt_to_phys(queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT,
 		  vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
 	/* create the vring */
-	vq = vring_new_virtqueue(info->num, VIRTIO_PCI_VRING_ALIGN, vdev,
-				 true, info->queue, vp_notify, callback, name);
+	vq = vring_new_virtqueue(sizeof(struct virtio_pci_vq),
+				 num, VIRTIO_PCI_VRING_ALIGN, vdev,
+				 true, queue, vp_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
 		goto out_activate_queue;
 	}
 
-	vq->priv = info;
-	info->vq = vq;
+	/* fill out our structure */
+	vp_queue = to_vp_queue(vq);
+	vp_queue->queue_index = index;
+	vp_queue->num = num;
+	vp_queue->msix_vector = msix_vec;
+	vp_queue->queue = queue;
 
 	if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
 		iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
@@ -439,10 +443,10 @@ static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index,
 
 	if (callback) {
 		spin_lock_irqsave(&vp_dev->lock, flags);
-		list_add(&info->node, &vp_dev->virtqueues);
+		list_add(&vp_queue->node, &vp_dev->virtqueues);
 		spin_unlock_irqrestore(&vp_dev->lock, flags);
 	} else {
-		INIT_LIST_HEAD(&info->node);
+		INIT_LIST_HEAD(&vp_queue->node);
 	}
 
 	return vq;
@@ -451,23 +455,22 @@ out_assign:
 	vring_del_virtqueue(vq);
 out_activate_queue:
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
-	free_pages_exact(info->queue, size);
-out_info:
-	kfree(info);
+	free_pages_exact(queue, size);
+out:
 	return ERR_PTR(err);
 }
 
 static void vp_del_vq(struct virtqueue *vq)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
-	struct virtio_pci_vq_info *info = vq->priv;
+	struct virtio_pci_vq *vp_queue = to_vp_queue(vq);
 	unsigned long flags, size;
 
 	spin_lock_irqsave(&vp_dev->lock, flags);
-	list_del(&info->node);
+	list_del(&vp_queue->node);
 	spin_unlock_irqrestore(&vp_dev->lock, flags);
 
-	iowrite16(info->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+	iowrite16(vp_queue->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
 
 	if (vp_dev->msix_enabled) {
 		iowrite16(VIRTIO_MSI_NO_VECTOR,
@@ -481,9 +484,8 @@ static void vp_del_vq(struct virtqueue *vq)
 	/* Select and deactivate the queue */
 	iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
-	size = PAGE_ALIGN(vring_size(info->num, VIRTIO_PCI_VRING_ALIGN));
-	free_pages_exact(info->queue, size);
-	kfree(info);
+	size = PAGE_ALIGN(vring_size(vp_queue->num, VIRTIO_PCI_VRING_ALIGN));
+	free_pages_exact(vp_queue->queue, size);
 }
 
 /* the config->del_vqs() implementation */
@@ -491,13 +493,13 @@ static void vp_del_vqs(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	struct virtqueue *vq, *n;
-	struct virtio_pci_vq_info *info;
+	struct virtio_pci_vq *vp_queue;
 
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
-		info = vq->priv;
+		vp_queue = to_vp_queue(vq);
 		if (vp_dev->per_vq_vectors &&
-			info->msix_vector != VIRTIO_MSI_NO_VECTOR)
-			free_irq(vp_dev->msix_entries[info->msix_vector].vector,
+			vp_queue->msix_vector != VIRTIO_MSI_NO_VECTOR)
+			free_irq(vp_dev->msix_entries[vp_queue->msix_vector].vector,
 				 vq);
 		vp_del_vq(vq);
 	}
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5aa43c3..8feee6b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -76,8 +76,6 @@
 
 struct vring_virtqueue
 {
-	struct virtqueue vq;
-
 	/* Actual memory layout for this queue */
 	struct vring vring;
 
@@ -106,6 +104,9 @@ struct vring_virtqueue
 	/* How to notify other side. FIXME: commonalize hcalls! */
 	void (*notify)(struct virtqueue *vq);
 
+	/* Tokens for callbacks. */
+	void **data;
+
 #ifdef DEBUG
 	/* They're supposed to lock for us. */
 	unsigned int in_use;
@@ -115,8 +116,9 @@ struct vring_virtqueue
 	ktime_t last_add_time;
 #endif
 
-	/* Tokens for callbacks. */
-	void *data[];
+	struct virtqueue vq;
+
+	/* Bus-specific virtqueue data goes here.  */
 };
 
 #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
@@ -616,7 +618,8 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 }
 EXPORT_SYMBOL_GPL(vring_interrupt);
 
-struct virtqueue *vring_new_virtqueue(unsigned int num,
+struct virtqueue *vring_new_virtqueue(size_t sz,
+				      unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
 				      bool weak_barriers,
@@ -634,7 +637,23 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 		return NULL;
 	}
 
-	vq = kmalloc(sizeof(*vq) + sizeof(void *)*num, GFP_KERNEL);
+	BUG_ON(sz < sizeof(vq->vq));
+
+	/* Add room for our own bits, which go before what we return.  */
+	sz += offsetof(struct vring_virtqueue, vq);
+
+	/* Make sure vq->data is properly aligned, since we carve it from
+	 * the same memory block.
+	 */
+	sz = round_up(sz, __alignof__(void *));
+
+	/* Now allocate the whole memory block:
+	 * 1) first goes the vring-specific parts;
+	 * 2) then the generic virtqueue data;
+	 * 3) then the bus-specific parts;
+	 * 4) then the callback tokens, pointed to by vq->data.
+	 */
+	vq = kmalloc(sz + sizeof(void *)*num, GFP_KERNEL);
 	if (!vq)
 		return NULL;
 
@@ -642,6 +661,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	vq->vq.callback = callback;
 	vq->vq.vdev = vdev;
 	vq->vq.name = name;
+	vq->data = (void **) ((char *)vq + sz);
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 8efd28a..7e61e2e 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -15,7 +15,7 @@
  * @callback: the function to call when buffers are consumed (can be NULL).
  * @name: the name of this virtqueue (mainly for debugging)
  * @vdev: the virtio device this queue was created for.
- * @priv: a pointer for the virtqueue implementation to use.
+ * @priv: a pointer for the virtio device driver to use.
  */
 struct virtqueue {
 	struct list_head list;

^ permalink raw reply related

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Avi Kivity @ 2012-05-08  9:08 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Raghavendra K T, KVM, linux-doc, Srivatsa Vaddagiri, Andi Kleen,
	H. Peter Anvin, Ingo Molnar, Stefano Stabellini, Xen Devel, X86,
	Ingo Molnar, Peter Zijlstra, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Virtualization, Greg Kroah-Hartman, LKML,
	Attilio Rao, Andrew Morton, Linus Torvalds, Stephan Diestelhorst
In-Reply-To: <4FA8579C.3000205@goop.org>

On 05/08/2012 02:15 AM, Jeremy Fitzhardinge wrote:
> On 05/07/2012 06:49 AM, Avi Kivity wrote:
> > On 05/07/2012 04:46 PM, Srivatsa Vaddagiri wrote:
> >> * Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> [2012-05-07 19:08:51]:
> >>
> >>> I 'll get hold of a PLE mc  and come up with the numbers soon. but I
> >>> 'll expect the improvement around 1-3% as it was in last version.
> >> Deferring preemption (when vcpu is holding lock) may give us better than 1-3% 
> >> results on PLE hardware. Something worth trying IMHO.
> > Is the improvement so low, because PLE is interfering with the patch, or
> > because PLE already does a good job?
>
> How does PLE help with ticket scheduling on unlock?  I thought it would
> just help with the actual spin loops.

PLE yields to up a random vcpu, hoping it is the lock holder.  This
patchset wakes up the right vcpu.  For small vcpu counts the difference
is a few bad wakeups (and even a bad wakeup sometimes works, since it
can put the spinner to sleep for a bit).  I expect that large vcpu
counts would show a greater difference.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Nikunj A Dadhania @ 2012-05-08  6:46 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar
  Cc: Jeremy Fitzhardinge, Raghavendra K T, KVM, linux-doc,
	Srivatsa Vaddagiri, Andi Kleen, H. Peter Anvin,
	Stefano Stabellini, Xen Devel, X86, Ingo Molnar, Avi Kivity,
	Peter Zijlstra, Konrad Rzeszutek Wilk, Virtualization,
	Greg Kroah-Hartman, LKML, Attilio Rao, Andrew Morton,
	Linus Torvalds, Stephan Diestelhorst
In-Reply-To: <alpine.LFD.2.02.1205072240020.6271@ionos>

On Mon, 7 May 2012 22:42:30 +0200 (CEST), Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 7 May 2012, Ingo Molnar wrote:
> > * Avi Kivity <avi@redhat.com> wrote:
> > 
> > > > PS: Nikunj had experimented that pv-flush tlb + 
> > > > paravirt-spinlock is a win on PLE where only one of them 
> > > > alone could not prove the benefit.
> > > 
Do not have PLE numbers yet for pvflush and pvspinlock. 

I have seen on Non-PLE having pvflush and pvspinlock patches -
kernbench, ebizzy, specjbb, hackbench and dbench all of them improved. 

I am chasing a race currently on pv-flush path, it is causing
file-system corruption. I will post these number along with my v2 post.

> > > I'd like to see those numbers, then.
> > > 
> > > Ingo, please hold on the kvm-specific patches, meanwhile.
> > 
> > I'll hold off on the whole thing - frankly, we don't want this 
> > kind of Xen-only complexity. If KVM can make use of PLE then Xen 
> > ought to be able to do it as well.
> > 
> > If both Xen and KVM makes good use of it then that's a different 
> > matter.
> 
> Aside of that, it's kinda strange that a dude named "Nikunj" is
> referenced in the argument chain, but I can't find him on the CC list.
> 
/me waves my hand

Regards
Nikunj

^ permalink raw reply

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Paolo Bonzini @ 2012-05-08  6:43 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, Michael S. Tsirkin, linux-kernel, virtualization
In-Reply-To: <87397be80k.fsf@rustcorp.com.au>

Il 08/05/2012 04:11, Rusty Russell ha scritto:
>> > For virtio-scsi multiqueue support I would like to have an easy and
>> > fast way to go from a virtqueue to the internal struct for that
>> > queue.
>> > 
>> > It turns out that virtio-serial has the same need, but it gets
>> > by with a simple list walk.
>> > 
>> > This patch adds a pointer to struct virtqueue that is reserved for
>> > the virtio device, and uses it in virtio-serial.
> I ike the concept, but share Michael's concern with naming confusion.
> 
> How bad would be it to get rid of the current ->priv and use
> container_of() instead?  ie. have virtio_pci, virtio_mmio, lguest_bus
> and s390's kvm_virtio embed the struct virtqueue?

How bad is not the question, it would actually be pretty good... the
question is more how hard! :)

lguest and s390 are a bit different from the others because ->priv
points into a memory-mapped descriptor provided by the host; PCI and
MMIO have lots of similarities.

Looks doable...

Paolo

^ permalink raw reply

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Raghavendra K T @ 2012-05-08  5:25 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jeremy Fitzhardinge, Greg Kroah-Hartman, KVM, linux-doc,
	Srivatsa Vaddagiri, Andi Kleen, H. Peter Anvin, Ingo Molnar,
	Stefano Stabellini, Xen Devel, X86, Ingo Molnar, Peter Zijlstra,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Virtualization, LKML,
	Attilio Rao, Andrew Morton, Linus Torvalds, Stephan Diestelhorst
In-Reply-To: <4FA7E1C8.7010509@redhat.com>

On 05/07/2012 08:22 PM, Avi Kivity wrote:
> On 05/07/2012 05:47 PM, Raghavendra K T wrote:
>>> Not good.  Solving a problem in software that is already solved by
>>> hardware?  It's okay if there are no costs involved, but here we're
>>> introducing a new ABI that we'll have to maintain for a long time.
>>>
>>
>>
>> Hmm agree that being a step ahead of mighty hardware (and just an
>> improvement of 1-3%) is no good for long term (where PLE is future).
>>
>
> PLE is the present, not the future.  It was introduced on later Nehalems
> and is present on all Westmeres.  Two more processor generations have
> passed meanwhile.  The AMD equivalent was also introduced around that
> timeframe.
>
>> Having said that, it is hard for me to resist saying :
>>   bottleneck is somewhere else on PLE m/c and IMHO answer would be
>> combination of paravirt-spinlock + pv-flush-tb.
>>
>> But I need to come up with good number to argue in favour of the claim.
>>
>> PS: Nikunj had experimented that pv-flush tlb + paravirt-spinlock is a
>> win on PLE where only one of them alone could not prove the benefit.
>>
>
> I'd like to see those numbers, then.
>
> Ingo, please hold on the kvm-specific patches, meanwhile.
>


Hmm. I think I messed up the fact while saying 1-3% improvement on PLE.

Going by what I had posted in  https://lkml.org/lkml/2012/4/5/73 (with
correct calculation)

   1x  	 70.475 (85.6979) 	63.5033 (72.7041)   15.7%
   2x  	 110.971 (132.829) 	105.099 (128.738)    5.56%	
   3x   	 150.265 (184.766) 	138.341 (172.69)     8.62%


It was around 12% with optimization patch posted separately with that 
(That one Needs more experiment though)

But anyways, I will come up with result for current patch series..

^ permalink raw reply

* Re: [PATCH] virtio_blk: Drop unused request tracking list
From: Rusty Russell @ 2012-05-08  4:08 UTC (permalink / raw)
  To: Asias He, virtualization, Michael S. Tsirkin; +Cc: kvm
In-Reply-To: <1333077850-21717-1-git-send-email-asias@redhat.com>

On Fri, 30 Mar 2012 11:24:10 +0800, Asias He <asias@redhat.com> wrote:
> Benchmark shows small performance improvement on fusion io device.
> 
> Before:
>   seq-read : io=1,024MB, bw=19,982KB/s, iops=39,964, runt= 52475msec
>   seq-write: io=1,024MB, bw=20,321KB/s, iops=40,641, runt= 51601msec
>   rnd-read : io=1,024MB, bw=15,404KB/s, iops=30,808, runt= 68070msec
>   rnd-write: io=1,024MB, bw=14,776KB/s, iops=29,552, runt= 70963msec
> 
> After:
>   seq-read : io=1,024MB, bw=20,343KB/s, iops=40,685, runt= 51546msec
>   seq-write: io=1,024MB, bw=20,803KB/s, iops=41,606, runt= 50404msec
>   rnd-read : io=1,024MB, bw=16,221KB/s, iops=32,442, runt= 64642msec
>   rnd-write: io=1,024MB, bw=15,199KB/s, iops=30,397, runt= 68991msec
> 
> Signed-off-by: Asias He <asias@redhat.com>

Thanks.  It didn't really need a benchmark to justify this cleanup, but
you certainly get points for thoroughness!

Applied,
Rusty.

^ permalink raw reply

* Re: [RFC PATCH] virtio_console: link vq to port with a private pointer in struct virtqueue
From: Rusty Russell @ 2012-05-08  2:11 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel; +Cc: Amit Shah, Michael S. Tsirkin, virtualization
In-Reply-To: <1334756013-11752-1-git-send-email-pbonzini@redhat.com>

On Wed, 18 Apr 2012 15:33:33 +0200, Paolo Bonzini <pbonzini@redhat.com> wrote:
> For virtio-scsi multiqueue support I would like to have an easy and
> fast way to go from a virtqueue to the internal struct for that
> queue.
> 
> It turns out that virtio-serial has the same need, but it gets
> by with a simple list walk.
> 
> This patch adds a pointer to struct virtqueue that is reserved for
> the virtio device, and uses it in virtio-serial.

I ike the concept, but share Michael's concern with naming confusion.

How bad would be it to get rid of the current ->priv and use
container_of() instead?  ie. have virtio_pci, virtio_mmio, lguest_bus
and s390's kvm_virtio embed the struct virtqueue?

Thanks,
Rusty.

^ permalink raw reply

* Re: virtio 3.4 patches
From: Rusty Russell @ 2012-05-08  2:09 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Amit Shah, kvm, virtualization
In-Reply-To: <20120507053020.GA15455@redhat.com>

On Mon, 7 May 2012 08:30:21 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Hi Rusty, please also pick two fixes from for_linus tag on my tree
> I think they should be sent to Linus for 3.4:
> 
> http://git.kernel.org/?p=linux/kernel/git/mst/vhost.git;a=tag;h=refs/tags/for_linus
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git refs/tags/for_linus

Done!

Thanks so much for handling virtio while I was away; knowing it was
safely in your hands allowed me to relax without reservation.  My wife
thanks you, too :)

Cheers,
Rusty.

^ permalink raw reply

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Raghavendra K T @ 2012-05-08  1:13 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Avi Kivity
  Cc: Greg Kroah-Hartman, KVM, linux-doc, Srivatsa Vaddagiri,
	Andi Kleen, H. Peter Anvin, Ingo Molnar, Stefano Stabellini,
	Xen Devel, X86, Ingo Molnar, Peter Zijlstra,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Virtualization, LKML,
	Attilio Rao, Andrew Morton, Linus Torvalds, Stephan Diestelhorst
In-Reply-To: <4FA8579C.3000205@goop.org>

On 05/08/2012 04:45 AM, Jeremy Fitzhardinge wrote:
> On 05/07/2012 06:49 AM, Avi Kivity wrote:
>> On 05/07/2012 04:46 PM, Srivatsa Vaddagiri wrote:
>>> * Raghavendra K T<raghavendra.kt@linux.vnet.ibm.com>  [2012-05-07 19:08:51]:
>>>
>>>> I 'll get hold of a PLE mc  and come up with the numbers soon. but I
>>>> 'll expect the improvement around 1-3% as it was in last version.
>>> Deferring preemption (when vcpu is holding lock) may give us better than 1-3%
>>> results on PLE hardware. Something worth trying IMHO.
>> Is the improvement so low, because PLE is interfering with the patch, or
>> because PLE already does a good job?
>
> How does PLE help with ticket scheduling on unlock?  I thought it would
> just help with the actual spin loops.

Hmm. This strikes something to me. I think I should replace while 1 hog
in with some *real job*  to measure over-commit case. I hope to see
greater improvements because of fairness and scheduling of the
patch-set.

May be all the way I was measuring something equal to 1x case.

^ permalink raw reply

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Jeremy Fitzhardinge @ 2012-05-07 23:15 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Raghavendra K T, KVM, linux-doc, Srivatsa Vaddagiri, Andi Kleen,
	H. Peter Anvin, Ingo Molnar, Stefano Stabellini, Xen Devel, X86,
	Ingo Molnar, Peter Zijlstra, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Virtualization, Greg Kroah-Hartman, LKML,
	Attilio Rao, Andrew Morton, Linus Torvalds, Stephan Diestelhorst
In-Reply-To: <4FA7D2E5.1020607@redhat.com>

On 05/07/2012 06:49 AM, Avi Kivity wrote:
> On 05/07/2012 04:46 PM, Srivatsa Vaddagiri wrote:
>> * Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> [2012-05-07 19:08:51]:
>>
>>> I 'll get hold of a PLE mc  and come up with the numbers soon. but I
>>> 'll expect the improvement around 1-3% as it was in last version.
>> Deferring preemption (when vcpu is holding lock) may give us better than 1-3% 
>> results on PLE hardware. Something worth trying IMHO.
> Is the improvement so low, because PLE is interfering with the patch, or
> because PLE already does a good job?

How does PLE help with ticket scheduling on unlock?  I thought it would
just help with the actual spin loops.

    J

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox