Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH] virtio-spec: split virtio-net device status filed into ro and rw byte
From: Michael S. Tsirkin @ 2012-03-18 12:22 UTC (permalink / raw)
  To: Jason Wang; +Cc: linux-kernel, qemu-devel, virtualization
In-Reply-To: <20120316152026.6428.6983.stgit@jason-ThinkPad-T400>

On Fri, Mar 16, 2012 at 11:20:26PM +0800, Jason Wang wrote:
> This patch splits the device status field of virtio-net into ro and rw
> byte. This would simplify the implementation of both host and guest
> and make the layout more clean. As VIRTIO_NET_S_ANNOUNCE is a rw bit,
> it was moved to bit 8 (0x100).
> 
> btw. looks like there's no implementation that depends on
> VIRTIO_NET_S_ANNOUNCE, so the move is safe.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Hmm, I know I proposed this myself, and I thought it will
prevent problems if we will add more rw bits,
but I missed the following race:

host writes VIRTIO_NET_S_ANNOUNCE = 1, interrupt

				guest reads VIRTIO_NET_S_ANNOUNCE = 1

host writes VIRTIO_NET_S_SOME_NEW_FIELD = 1, interrupt

				guest writes VIRTIO_NET_S_ANNOUNCE = 0
				VIRTIO_NET_S_SOME_NEW_FIELD is overwritten

				guest reads VIRTIO_NET_S_SOME_NEW_FIELD = 0

How about making the new bit write 1 to clear?
If we do, we can keep it where it is currently ...



> ---
>  virtio-0.9.4.lyx |   20 +++++++++++++++++++-
>  1 files changed, 19 insertions(+), 1 deletions(-)
> 
> diff --git a/virtio-0.9.4.lyx b/virtio-0.9.4.lyx
> index 6c7bab1..ef3951c 100644
> --- a/virtio-0.9.4.lyx
> +++ b/virtio-0.9.4.lyx
> @@ -58,6 +58,7 @@
>  \html_be_strict false
>  \author -608949062 "Rusty Russell,,," 
>  \author 1531152142 "pbonzini" 
> +\author 2090695081 "Jason" 
>  \end_header
>  
>  \begin_body
> @@ -4012,8 +4013,19 @@ configuration
>  layout Two configuration fields are currently defined.
>   The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
>   is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
> +
> +\change_inserted 2090695081 1331907586
> + The low byte of status field is read-only, guest write to this byte would
> + be ignored.
> + Currently only one bit is defined for this byte: VIRTIO_NET_S_LINK_UP.
> + The high byte of status field is read-writable.
> + Currently only one bit is defined for this byte: VIRTIO_NET_S_ANNOUNCE.
> +
> +\change_deleted 2090695081 1331907489
>   Two bits are currently defined for the status field: VIRTIO_NET_S_LINK_UP
>   and VIRTIO_NET_S_ANNOUNCE.
> +
> +\change_unchanged
>   
>  \begin_inset listings
>  inline false
> @@ -4026,7 +4038,13 @@ status open
>  
>  \begin_layout Plain Layout
>  
> -#define VIRTIO_NET_S_ANNOUNCE	2
> +#define VIRTIO_NET_S_ANNOUNCE	
> +\change_inserted 2090695081 1331907493
> +0x100
> +\change_deleted 2090695081 1331907491
> +2
> +\change_unchanged
> +
>  \end_layout
>  
>  \begin_layout Plain Layout

^ permalink raw reply

* vhost question
From: Steve Glass @ 2012-03-18 12:43 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 3086 bytes --]

Hi,

I'm using virtio to implement a mac80211 device in the guest. On the host
side I plan to use a simple network simulation to control delivery of
frames between guests. Right now I've used the vhost approach on the host
side and can pass buffers from guest to host and back using virtio.

My immediate problem is how to have the host-side tx kick handler get the
host-side rx kick handler to run in order to deliver a frame? In the tx
kick handler I copy the frame into a queue for the receiving guest. In the
rx_handler I take queued frames and copy them into the rx buffers and wake
the guest. The problem is that if no frames are ready for delivery when the
guest rx kick handler runs then it has to exit and somehow I have to
arrange that it runs again (in the receiver's process context) when there
are frames to deliver.

How can I get my host-side tx kick handler to wake the host-side rx kick
handler to deliver frames to the guest?

Steve

--

static void
handle_rx(struct vhost_work *work)
{
int n;
unsigned out, in;
struct transmission *t;
u16 frames = 0;
struct vhost_poll *p = container_of(work, struct vhost_poll, work);
struct vhost_virtqueue *vq = container_of(p, struct vhost_virtqueue, poll);
struct vhost_node *node = container_of(vq, struct vhost_node,
vqs[WLAN_VQ_RX]);
struct vhost_dev *dev = &node->vdev;

mutex_lock(&vq->mutex);
vhost_disable_notify(dev, vq);
while (!queue_empty(&node->rxq)) {
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out, &in,
NULL, NULL);
if (0 < n || n == vq->num)
break;
if ((t = queue_pop(&node->rxq))) {
BUG_ON(copy_to_user(vq->iov[0].iov_base, t->buf, t->buf_sz));
vq->iov[0].iov_len = t->buf_sz;
// ToDo: copy_to_user the rx_status
vhost_add_used(vq, n, out);
transmission_free(t);
++frames;
}
}
if (frames)
vhost_signal(dev, vq);
vhost_enable_notify(dev, vq);
mutex_unlock(&vq->mutex);
}

static void
handle_tx(struct vhost_work *work)
{
int n;
unsigned out, in;
struct transmission *t;
struct vhost_poll *p = container_of(work, struct vhost_poll, work);
struct vhost_virtqueue *vq = container_of(p, struct vhost_virtqueue, poll);
struct vhost_node *w = container_of(vq, struct vhost_node, vqs[WLAN_VQ_TX]);
struct vhost_dev *dev = &w->vdev;

mutex_lock(&vq->mutex);
do {
vhost_disable_notify(dev, vq);
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out, &in,
NULL, NULL);
while (n >= 0 && n != vq->num) {
struct vhost_node *receiver = net_get_receiver(w);
if(receiver) {
if((t = transmission_alloc())) {
BUG_ON(copy_from_user(t->buf, vq->iov[1].iov_base, vq->iov[1].iov_len));
t->buf_sz = vq->iov[1].iov_len;
queue_push(&receiver->rxq, t);
// ToDo: kick receiver's handle_rx
// ToDo: populate TX status
} else {
pr_warn("%s: out of memory - packet dropped!", __func__);
// ToDo: populate TX status
}
} else {
pr_debug("%s: no receivers in range!", __func__);
// ToDo: populate TX status
}
vhost_add_used(vq, n, out);
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out, &in,
NULL, NULL);
}
vhost_signal(dev, vq);
} while (vhost_enable_notify(dev, vq));
mutex_unlock(&vq->mutex);
}

[-- Attachment #1.2: Type: text/html, Size: 11189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* [PATCH RESEND  0000/0002] drivers: scsi: storvsc
From: K. Y. Srinivasan @ 2012-03-18 19:59 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi


The current Windows hosts only handle a subset of scsi commands
sent from the guest and for commands that are not supported, they are
filtered on the host side a generic failure is returned to the guest
as SRB status. The returned error code does not permit the guest to
figure out if there was an error or if the command was not supported.

Based on the input I got from the community, I have convinced the windows
developers to return an error code that allows the guest to distinguish between
unsupported command and true failures. However, it is not clear when this fix will
be shipped.

This patch set addresses this issue by filtering the ATA_16 command on the guest
(note that currently the host is filtering this command as it is not supported).
I also have a patch here the correctly handles SRB_STATUS_INVALID_REQUEST
error returns - note that the current windows hosts don't return this today.

This is a resend of the patches sent earlier based on suggestions from
Jeff Garzik <jgpobox@gmail.com> and Douglas Gilbert <dgilbert@interlog.com>. 


Regards,

K. Y

^ permalink raw reply

* [PATCH RESEND 1/2] Drivers: scsi: storvsc: Set the scsi result correctly when SRB status is INVALID
From: K. Y. Srinivasan @ 2012-03-18 20:00 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi
  Cc: K. Y. Srinivasan
In-Reply-To: <1332100789-25594-1-git-send-email-kys@microsoft.com>

Currently Windows hosts only support a subset of scsi commands and for commands
that are not supported, the host returns a generic SRB failure status.
However, they have agreed to change the return value to indicate that
the command is not supported. In preparation for that, handle the
SRB_STATUS_INVALID_REQUEST return value correctly.

I would like to thank Jeff Garzik <jgpobox@gmail.com> and
Douglas Gilbert <dgilbert@interlog.com> for suggesting the correct approach
here.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/scsi/storvsc_drv.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 44c7a48..d5448e8 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -202,6 +202,7 @@ enum storvsc_request_type {
 #define SRB_STATUS_INVALID_LUN	0x20
 #define SRB_STATUS_SUCCESS	0x01
 #define SRB_STATUS_ERROR	0x04
+#define SRB_STATUS_INVALID_REQUEST 0x06
 
 /*
  * This is the end of Protocol specific defines.
@@ -779,6 +780,17 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request)
 	}
 
 	/*
+	 * If the host returns with an invalid request, set
+	 * the scsi command result correctly.
+	 */
+	if (vm_srb->srb_status == SRB_STATUS_INVALID_REQUEST) {
+		scmnd->result = ((DRIVER_SENSE << 24) |
+                                 SAM_STAT_CHECK_CONDITION);
+		scsi_build_sense_buffer(0, scmnd->sense_buffer,
+					ILLEGAL_REQUEST, 0x20, 0);
+	}
+
+	/*
 	 * If there is an error; offline the device since all
 	 * error recovery strategies would have already been
 	 * deployed on the host side.
-- 
1.7.4.1

^ permalink raw reply related

* [PATCH RESEND 2/2] Drivers: scsi: storvsc: Don't pass ATA_16 command to the host
From: K. Y. Srinivasan @ 2012-03-18 20:00 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi
  Cc: K. Y. Srinivasan
In-Reply-To: <1332100817-25646-1-git-send-email-kys@microsoft.com>

The current Windows hosts don't handle the ATA_16 command and furthermore
return a generic error code after filtering the command on the host side.
For now filter the command on the guest side until the host is modified to
properly handle unsupported commands.

I would like to thank Jeff Garzik <jgpobox@gmail.com> and
Douglas Gilbert <dgilbert@interlog.com> for suggesting the correct approach
here.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/scsi/storvsc_drv.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index d5448e8..e9e37dd 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1233,9 +1233,21 @@ static bool storvsc_scsi_cmd_ok(struct scsi_cmnd *scmnd)
 	/*
 	 * smartd sends this command and the host does not handle
 	 * this. So, don't send it.
+	 * The current Windows hosts implement a subset of scsi commands
+	 * and for the commands that are not implemented, the host filters
+	 * them and returns a generic failure SRB status. I have been in
+	 * discussions with the Windows team to return the appropriate SRB
+	 * status code for unsupported scsi commands and while they have
+	 * agreed to implement this, it is not clear when this change will be
+	 * available. Consequently, filtering the command before submitting it
+	 * to the host is a resonable interim solution.
 	 */
 	case SET_WINDOW:
-		scmnd->result = ILLEGAL_REQUEST << 16;
+	case ATA_16:
+		scmnd->result = ((DRIVER_SENSE << 24) |
+                                 SAM_STAT_CHECK_CONDITION);
+		scsi_build_sense_buffer(0, scmnd->sense_buffer,
+					ILLEGAL_REQUEST, 0x20, 0);
 		allowed = false;
 		break;
 	default:
-- 
1.7.4.1


^ permalink raw reply related

* RE: [PATCH RESEND  0000/0002] drivers: scsi: storvsc
From: KY Srinivasan @ 2012-03-19  0:01 UTC (permalink / raw)
  To: KY Srinivasan, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, devel@linuxdriverproject.org,
	virtualization@lists.osdl.org, ohering@suse.com,
	jbottomley@parallels.com, hch@infradead.org,
	linux-scsi@vger.kernel.org
In-Reply-To: <1332100789-25594-1-git-send-email-kys@microsoft.com>

Please drop these two patches. It looks like I sent out the wrong set of patches.
I will send the ones I wanted to send shortly.

Regards,

K. Y

> -----Original Message-----
> From: K. Y. Srinivasan [mailto:kys@microsoft.com]
> Sent: Sunday, March 18, 2012 4:00 PM
> To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; virtualization@lists.osdl.org; ohering@suse.com;
> jbottomley@parallels.com; hch@infradead.org; linux-scsi@vger.kernel.org
> Cc: KY Srinivasan
> Subject: [PATCH RESEND 0000/0002] drivers: scsi: storvsc
> 
> 
> The current Windows hosts only handle a subset of scsi commands
> sent from the guest and for commands that are not supported, they are
> filtered on the host side a generic failure is returned to the guest
> as SRB status. The returned error code does not permit the guest to
> figure out if there was an error or if the command was not supported.
> 
> Based on the input I got from the community, I have convinced the windows
> developers to return an error code that allows the guest to distinguish between
> unsupported command and true failures. However, it is not clear when this fix
> will
> be shipped.
> 
> This patch set addresses this issue by filtering the ATA_16 command on the guest
> (note that currently the host is filtering this command as it is not supported).
> I also have a patch here the correctly handles SRB_STATUS_INVALID_REQUEST
> error returns - note that the current windows hosts don't return this today.
> 
> This is a resend of the patches sent earlier based on suggestions from
> Jeff Garzik <jgpobox@gmail.com> and Douglas Gilbert <dgilbert@interlog.com>.
> 
> 
> Regards,
> 
> K. Y

^ permalink raw reply

* [PATCH RESEND  0000/0002] drivers: scsi: storvsc
From: K. Y. Srinivasan @ 2012-03-19  0:11 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi
  Cc: K. Y. Srinivasan


The current Windows hosts only handle a subset of scsi commands
sent from the guest and for commands that are not supported, they are
filtered on the host side a generic failure is returned to the guest
as SRB status. The returned error code does not permit the guest to
figure out if there was an error or if the command was not supported.

Based on the input I got from the community, I have convinced the windows
developers to return an error code that allows the guest to distinguish between
unsupported command and true failures. However, it is not clear when this fix will
be shipped.

This patch set addresses this issue by filtering the ATA_16 command on the guest
(note that currently the host is filtering this command as it is not supported).
I also have a patch here the correctly handles SRB_STATUS_INVALID_REQUEST
error returns - note that the current windows hosts don't return this today.

This is a resend of the patches sent earlier based on suggestions from
Jeff Garzik <jgpobox@gmail.com> and Douglas Gilbert <dgilbert@interlog.com>. 


Regards,

K. Y

^ permalink raw reply

* [PATCH RESEND 1/2] Drivers: scsi: storvsc: Set the scsi result correctly when SRB status is INVALID
From: K. Y. Srinivasan @ 2012-03-19  0:12 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi
  Cc: K. Y. Srinivasan
In-Reply-To: <1332115915-26805-1-git-send-email-kys@microsoft.com>

Currently Windows hosts only support a subset of scsi commands and for commands
that are not supported, the host returns a generic SRB failure status.
However, they have agreed to change the return value to indicate that
the command is not supported. In preparation for that, handle the
SRB_STATUS_INVALID_REQUEST return value correctly.

I would like to thank Jeff Garzik <jgpobox@gmail.com> and
Douglas Gilbert <dgilbert@interlog.com> for suggesting the correct approach
here.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/scsi/storvsc_drv.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 44c7a48..018c363 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -202,6 +202,7 @@ enum storvsc_request_type {
 #define SRB_STATUS_INVALID_LUN	0x20
 #define SRB_STATUS_SUCCESS	0x01
 #define SRB_STATUS_ERROR	0x04
+#define SRB_STATUS_INVALID_REQUEST 0x06
 
 /*
  * This is the end of Protocol specific defines.
@@ -779,6 +780,17 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request)
 	}
 
 	/*
+	 * If the host returns with an invalid request, set
+	 * the scsi command result correctly.
+	 */
+	if (vm_srb->srb_status == SRB_STATUS_INVALID_REQUEST) {
+		scmnd->result = ((DRIVER_SENSE << 24) |
+				SAM_STAT_CHECK_CONDITION);
+		scsi_build_sense_buffer(0, scmnd->sense_buffer,
+					ILLEGAL_REQUEST, 0x20, 0);
+	}
+
+	/*
 	 * If there is an error; offline the device since all
 	 * error recovery strategies would have already been
 	 * deployed on the host side.
-- 
1.7.4.1

^ permalink raw reply related

* [PATCH RESEND 2/2] Drivers: scsi: storvsc: Don't pass ATA_16 command to the host
From: K. Y. Srinivasan @ 2012-03-19  0:12 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, virtualization, ohering, jbottomley,
	hch, linux-scsi
  Cc: K. Y. Srinivasan
In-Reply-To: <1332115963-26855-1-git-send-email-kys@microsoft.com>

The current Windows hosts don't handle the ATA_16 command and furthermore
return a generic error code after filtering the command on the host side.
For now filter the command on the guest side until the host is modified to
properly handle unsupported commands.

I would like to thank Jeff Garzik <jgpobox@gmail.com> and
Douglas Gilbert <dgilbert@interlog.com> for suggesting the correct approach
here.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/scsi/storvsc_drv.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 018c363..66e9bdd 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1233,9 +1233,21 @@ static bool storvsc_scsi_cmd_ok(struct scsi_cmnd *scmnd)
 	/*
 	 * smartd sends this command and the host does not handle
 	 * this. So, don't send it.
+	 * The current Windows hosts implement a subset of scsi commands
+	 * and for the commands that are not implemented, the host filters
+	 * them and returns a generic failure SRB status. I have been in
+	 * discussions with the Windows team to return the appropriate SRB
+	 * status code for unsupported scsi commands and while they have
+	 * agreed to implement this, it is not clear when this change will be
+	 * available. Consequently, filtering the command before submitting it
+	 * to the host is a resonable interim solution.
 	 */
 	case SET_WINDOW:
-		scmnd->result = ILLEGAL_REQUEST << 16;
+	case ATA_16:
+		scmnd->result = ((DRIVER_SENSE << 24) |
+				SAM_STAT_CHECK_CONDITION);
+		scsi_build_sense_buffer(0, scmnd->sense_buffer,
+					ILLEGAL_REQUEST, 0x20, 0);
 		allowed = false;
 		break;
 	default:
-- 
1.7.4.1

^ permalink raw reply related

* Re: [V4 PATCH] virtio-net: send gratuitous packet when needed
From: Rusty Russell @ 2012-03-19  2:16 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <20120313143331.GE14931@redhat.com>

On Tue, 13 Mar 2012 16:33:31 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index 970d5a2..44a38d6 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -49,8 +49,10 @@
> >  #define VIRTIO_NET_F_CTRL_RX	18	/* Control channel RX mode support */
> >  #define VIRTIO_NET_F_CTRL_VLAN	19	/* Control channel VLAN filtering */
> >  #define VIRTIO_NET_F_CTRL_RX_EXTRA 20	/* Extra RX mode control support */
> > +#define VIRTIO_NET_F_GUEST_ANNOUNCE 21  /* Guest can send gratituous packet */
> >  
> >  #define VIRTIO_NET_S_LINK_UP	1	/* Link is up */
> > +#define VIRTIO_NET_S_ANNOUNCE   2       /* Announcement is needed */
> 
> I would put this in bit 8 (0x100), this way low status byte
> is RO, high byte is RW.

The whole idea of acking by clearing the bit is unreliable, moving to a
separate byte just controls the damage.

How about you use bits 8-15 as a counter?  It's still theoretically
unreliable if 256 notifications pass before the guest notices, but it's
probably better and clearer than this.

I leave the final call to MST though.

Thanks,
Rusty.
-- 
  How could I marry someone with more hair than me?  http://baldalex.org

^ permalink raw reply

* Re: [PATCH] virtio-spec: split virtio-net device status filed into ro and rw byte
From: Jason Wang @ 2012-03-19  3:09 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, qemu-devel, virtualization
In-Reply-To: <20120318122252.GE29902@redhat.com>

On 03/18/2012 08:22 PM, Michael S. Tsirkin wrote:
> On Fri, Mar 16, 2012 at 11:20:26PM +0800, Jason Wang wrote:
>> This patch splits the device status field of virtio-net into ro and rw
>> byte. This would simplify the implementation of both host and guest
>> and make the layout more clean. As VIRTIO_NET_S_ANNOUNCE is a rw bit,
>> it was moved to bit 8 (0x100).
>>
>> btw. looks like there's no implementation that depends on
>> VIRTIO_NET_S_ANNOUNCE, so the move is safe.
>>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
> Hmm, I know I proposed this myself, and I thought it will
> prevent problems if we will add more rw bits,
> but I missed the following race:
>
> host writes VIRTIO_NET_S_ANNOUNCE = 1, interrupt
>
> 				guest reads VIRTIO_NET_S_ANNOUNCE = 1
>
> host writes VIRTIO_NET_S_SOME_NEW_FIELD = 1, interrupt
>
> 				guest writes VIRTIO_NET_S_ANNOUNCE = 0
> 				VIRTIO_NET_S_SOME_NEW_FIELD is overwritten
>
> 				guest reads VIRTIO_NET_S_SOME_NEW_FIELD = 0
>
> How about making the new bit write 1 to clear?
> If we do, we can keep it where it is currently ...
>

Didn't follow, do you mean to make VIRITO_NET_S_ANNOUNCE bit clear on 
read? Looks like this can prevent the race and keep what currently we have.

>
>> ---
>>   virtio-0.9.4.lyx |   20 +++++++++++++++++++-
>>   1 files changed, 19 insertions(+), 1 deletions(-)
>>
>> diff --git a/virtio-0.9.4.lyx b/virtio-0.9.4.lyx
>> index 6c7bab1..ef3951c 100644
>> --- a/virtio-0.9.4.lyx
>> +++ b/virtio-0.9.4.lyx
>> @@ -58,6 +58,7 @@
>>   \html_be_strict false
>>   \author -608949062 "Rusty Russell,,,"
>>   \author 1531152142 "pbonzini"
>> +\author 2090695081 "Jason"
>>   \end_header
>>
>>   \begin_body
>> @@ -4012,8 +4013,19 @@ configuration
>>   layout Two configuration fields are currently defined.
>>    The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
>>    is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
>> +
>> +\change_inserted 2090695081 1331907586
>> + The low byte of status field is read-only, guest write to this byte would
>> + be ignored.
>> + Currently only one bit is defined for this byte: VIRTIO_NET_S_LINK_UP.
>> + The high byte of status field is read-writable.
>> + Currently only one bit is defined for this byte: VIRTIO_NET_S_ANNOUNCE.
>> +
>> +\change_deleted 2090695081 1331907489
>>    Two bits are currently defined for the status field: VIRTIO_NET_S_LINK_UP
>>    and VIRTIO_NET_S_ANNOUNCE.
>> +
>> +\change_unchanged
>>
>>   \begin_inset listings
>>   inline false
>> @@ -4026,7 +4038,13 @@ status open
>>
>>   \begin_layout Plain Layout
>>
>> -#define VIRTIO_NET_S_ANNOUNCE	2
>> +#define VIRTIO_NET_S_ANNOUNCE	
>> +\change_inserted 2090695081 1331907493
>> +0x100
>> +\change_deleted 2090695081 1331907491
>> +2
>> +\change_unchanged
>> +
>>   \end_layout
>>
>>   \begin_layout Plain Layout
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply

* vhost question
From: Steve Glass @ 2012-03-19  3:11 UTC (permalink / raw)
  To: virtualization


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I'm using virtio to implement a mac80211 device in the guest. On the
host side I plan to use a simple network simulation to control
delivery of frames between guests. Right now I've used the vhost
approach on the host side and can pass buffers from guest to host and
back using virtio.

My immediate problem is how to have the host-side tx kick handler get
the host-side rx kick handler to run in order to deliver a frame? In
the tx kick handler I copy the frame into a queue for the receiving
guest. In the rx_handler I take queued frames and copy them into the
rx buffers and wake the guest. The problem is that if no frames are
ready for delivery when the guest rx kick handler runs then it has to
exit and somehow I have to arrange that it runs again (in the
receiver's process context) when there are frames to deliver.

How can I get my host-side tx kick handler to wake the host-side rx
kick handler to deliver frames to the guest?

Steve

static void
handle_rx(struct vhost_work *work)
{
int n;
unsigned out, in;
struct transmission *t;
u16 frames = 0;
struct vhost_poll *p = container_of(work, struct vhost_poll, work);
struct vhost_virtqueue *vq = container_of(p, struct vhost_virtqueue,
poll);
struct vhost_node *node = container_of(vq, struct vhost_node,
vqs[WLAN_VQ_RX]);
struct vhost_dev *dev = &node->vdev;

mutex_lock(&vq->mutex);
vhost_disable_notify(dev, vq);
while (!queue_empty(&node->rxq)) {
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out,
&in, NULL, NULL);
if (0 < n || n == vq->num)
break;
if ((t = queue_pop(&node->rxq))) {
BUG_ON(copy_to_user(vq->iov[0].iov_base, t->buf, t->buf_sz));
vq->iov[0].iov_len = t->buf_sz;
// ToDo: copy_to_user the rx_status
vhost_add_used(vq, n, out);
transmission_free(t);
++frames;
}
}
if (frames)
vhost_signal(dev, vq);
vhost_enable_notify(dev, vq);
mutex_unlock(&vq->mutex);
}

static void
handle_tx(struct vhost_work *work)
{
int n;
unsigned out, in;
struct transmission *t;
struct vhost_poll *p = container_of(work, struct vhost_poll, work);
struct vhost_virtqueue *vq = container_of(p, struct vhost_virtqueue,
poll);
struct vhost_node *w = container_of(vq, struct vhost_node,
vqs[WLAN_VQ_TX]);
struct vhost_dev *dev = &w->vdev;

mutex_lock(&vq->mutex);
do {
vhost_disable_notify(dev, vq);
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out,
&in, NULL, NULL);
while (n >= 0 && n != vq->num) {
struct vhost_node *receiver = net_get_receiver(w);
if(receiver) {
if((t = transmission_alloc())) {
BUG_ON(copy_from_user(t->buf, vq->iov[1].iov_base, vq->iov[1].iov_len));
t->buf_sz = vq->iov[1].iov_len;
queue_push(&receiver->rxq, t);
// ToDo: kick receiver's handle_rx
// ToDo: populate TX status
} else {
pr_warn("%s: out of memory - packet dropped!", __func__);
// ToDo: populate TX status
}
} else {
pr_debug("%s: no receivers in range!", __func__);
// ToDo: populate TX status
}
vhost_add_used(vq, n, out);
n = vhost_get_vq_desc(dev, vq, vq->iov, ARRAY_SIZE(vq->iov), &out,
&in, NULL, NULL);
}
vhost_signal(dev, vq);
} while (vhost_enable_notify(dev, vq));
mutex_unlock(&vq->mutex);
}


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9mo9cACgkQW7aAm65EWy4tagCgvsGxQM7Hhgjqc8/OAN5EsRrg
xW0An1jqNVvTICA5yhhj+6S9LtdkyHJi
=XGVm
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [PATCH] virtio-spec: split virtio-net device status filed into ro and rw byte
From: Michael S. Tsirkin @ 2012-03-19  8:23 UTC (permalink / raw)
  To: Jason Wang; +Cc: linux-kernel, qemu-devel, virtualization
In-Reply-To: <4F66A364.2080600@redhat.com>

On Mon, Mar 19, 2012 at 11:09:24AM +0800, Jason Wang wrote:
> On 03/18/2012 08:22 PM, Michael S. Tsirkin wrote:
> >On Fri, Mar 16, 2012 at 11:20:26PM +0800, Jason Wang wrote:
> >>This patch splits the device status field of virtio-net into ro and rw
> >>byte. This would simplify the implementation of both host and guest
> >>and make the layout more clean. As VIRTIO_NET_S_ANNOUNCE is a rw bit,
> >>it was moved to bit 8 (0x100).
> >>
> >>btw. looks like there's no implementation that depends on
> >>VIRTIO_NET_S_ANNOUNCE, so the move is safe.
> >>
> >>Signed-off-by: Jason Wang<jasowang@redhat.com>
> >Hmm, I know I proposed this myself, and I thought it will
> >prevent problems if we will add more rw bits,
> >but I missed the following race:
> >
> >host writes VIRTIO_NET_S_ANNOUNCE = 1, interrupt
> >
> >				guest reads VIRTIO_NET_S_ANNOUNCE = 1
> >
> >host writes VIRTIO_NET_S_SOME_NEW_FIELD = 1, interrupt
> >
> >				guest writes VIRTIO_NET_S_ANNOUNCE = 0
> >				VIRTIO_NET_S_SOME_NEW_FIELD is overwritten
> >
> >				guest reads VIRTIO_NET_S_SOME_NEW_FIELD = 0
> >
> >How about making the new bit write 1 to clear?
> >If we do, we can keep it where it is currently ...
> >
> 
> Didn't follow, do you mean to make VIRITO_NET_S_ANNOUNCE bit clear
> on read? Looks like this can prevent the race and keep what
> currently we have.

Clear on read is evil, it makes debugging harder
since you can't check the state without destroying it.
Write 1 to clear means writing 0 to a bit does not
change it, writing 1 clears it. This makes it possible
to flip individual bits in config space without
making Schrödinger's cat experiments.

> >
> >>---
> >>  virtio-0.9.4.lyx |   20 +++++++++++++++++++-
> >>  1 files changed, 19 insertions(+), 1 deletions(-)
> >>
> >>diff --git a/virtio-0.9.4.lyx b/virtio-0.9.4.lyx
> >>index 6c7bab1..ef3951c 100644
> >>--- a/virtio-0.9.4.lyx
> >>+++ b/virtio-0.9.4.lyx
> >>@@ -58,6 +58,7 @@
> >>  \html_be_strict false
> >>  \author -608949062 "Rusty Russell,,,"
> >>  \author 1531152142 "pbonzini"
> >>+\author 2090695081 "Jason"
> >>  \end_header
> >>
> >>  \begin_body
> >>@@ -4012,8 +4013,19 @@ configuration
> >>  layout Two configuration fields are currently defined.
> >>   The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
> >>   is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
> >>+
> >>+\change_inserted 2090695081 1331907586
> >>+ The low byte of status field is read-only, guest write to this byte would
> >>+ be ignored.
> >>+ Currently only one bit is defined for this byte: VIRTIO_NET_S_LINK_UP.
> >>+ The high byte of status field is read-writable.
> >>+ Currently only one bit is defined for this byte: VIRTIO_NET_S_ANNOUNCE.
> >>+
> >>+\change_deleted 2090695081 1331907489
> >>   Two bits are currently defined for the status field: VIRTIO_NET_S_LINK_UP
> >>   and VIRTIO_NET_S_ANNOUNCE.
> >>+
> >>+\change_unchanged
> >>
> >>  \begin_inset listings
> >>  inline false
> >>@@ -4026,7 +4038,13 @@ status open
> >>
> >>  \begin_layout Plain Layout
> >>
> >>-#define VIRTIO_NET_S_ANNOUNCE	2
> >>+#define VIRTIO_NET_S_ANNOUNCE	
> >>+\change_inserted 2090695081 1331907493
> >>+0x100
> >>+\change_deleted 2090695081 1331907491
> >>+2
> >>+\change_unchanged
> >>+
> >>  \end_layout
> >>
> >>  \begin_layout Plain Layout
> >--
> >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >the body of a message to majordomo@vger.kernel.org
> >More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply

* Re: [V4 PATCH] virtio-net: send gratuitous packet when needed
From: Michael S. Tsirkin @ 2012-03-19  8:44 UTC (permalink / raw)
  To: Rusty Russell; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <877gyhxrlu.fsf@rustcorp.com.au>

On Mon, Mar 19, 2012 at 12:46:29PM +1030, Rusty Russell wrote:
> On Tue, 13 Mar 2012 16:33:31 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > index 970d5a2..44a38d6 100644
> > > --- a/include/linux/virtio_net.h
> > > +++ b/include/linux/virtio_net.h
> > > @@ -49,8 +49,10 @@
> > >  #define VIRTIO_NET_F_CTRL_RX	18	/* Control channel RX mode support */
> > >  #define VIRTIO_NET_F_CTRL_VLAN	19	/* Control channel VLAN filtering */
> > >  #define VIRTIO_NET_F_CTRL_RX_EXTRA 20	/* Extra RX mode control support */
> > > +#define VIRTIO_NET_F_GUEST_ANNOUNCE 21  /* Guest can send gratituous packet */
> > >  
> > >  #define VIRTIO_NET_S_LINK_UP	1	/* Link is up */
> > > +#define VIRTIO_NET_S_ANNOUNCE   2       /* Announcement is needed */
> > 
> > I would put this in bit 8 (0x100), this way low status byte
> > is RO, high byte is RW.
> 
> The whole idea of acking by clearing the bit is unreliable, moving to a
> separate byte just controls the damage.
> 
> How about you use bits 8-15 as a counter?  It's still theoretically
> unreliable if 256 notifications pass before the guest notices, but it's
> probably better and clearer than this.
> 
> I leave the final call to MST though.
> 
> Thanks,
> Rusty.

I guess the point was that we want a single packet
so we don't care if multiple notifications are coalesced
into a single one.

> -- 
>   How could I marry someone with more hair than me?  http://baldalex.org

^ permalink raw reply

* Re: [V4 PATCH] virtio-net: send gratuitous packet when needed
From: Jason Wang @ 2012-03-19  9:42 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <20120319084449.GF17673@redhat.com>

On 03/19/2012 04:44 PM, Michael S. Tsirkin wrote:
> On Mon, Mar 19, 2012 at 12:46:29PM +1030, Rusty Russell wrote:
>> On Tue, 13 Mar 2012 16:33:31 +0200, "Michael S. Tsirkin"<mst@redhat.com>  wrote:
>>>> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
>>>> index 970d5a2..44a38d6 100644
>>>> --- a/include/linux/virtio_net.h
>>>> +++ b/include/linux/virtio_net.h
>>>> @@ -49,8 +49,10 @@
>>>>   #define VIRTIO_NET_F_CTRL_RX	18	/* Control channel RX mode support */
>>>>   #define VIRTIO_NET_F_CTRL_VLAN	19	/* Control channel VLAN filtering */
>>>>   #define VIRTIO_NET_F_CTRL_RX_EXTRA 20	/* Extra RX mode control support */
>>>> +#define VIRTIO_NET_F_GUEST_ANNOUNCE 21  /* Guest can send gratituous packet */
>>>>
>>>>   #define VIRTIO_NET_S_LINK_UP	1	/* Link is up */
>>>> +#define VIRTIO_NET_S_ANNOUNCE   2       /* Announcement is needed */
>>> I would put this in bit 8 (0x100), this way low status byte
>>> is RO, high byte is RW.
>> The whole idea of acking by clearing the bit is unreliable, moving to a
>> separate byte just controls the damage.
>>
>> How about you use bits 8-15 as a counter?  It's still theoretically
>> unreliable if 256 notifications pass before the guest notices, but it's
>> probably better and clearer than this.
>>
>> I leave the final call to MST though.
>>
>> Thanks,
>> Rusty.
> I guess the point was that we want a single packet
> so we don't care if multiple notifications are coalesced
> into a single one.
>

To reduce the possibility of dropping or losing of gratuitous packet by 
the network, qemu usually send the gratuitous packets for many times ( 
currently 5 time with a increment gap between them such as 50ms, 150ms, 
250ms ...). As there's no method can guarantee the gratuitous packet 
were received by switch in guest, no need to care about the coalesced 
notifications in guest. And we may leave the work to qemu or just don't 
care about this.

>> -- 
>>    How could I marry someone with more hair than me?  http://baldalex.org
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH RFC] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 15:56 UTC (permalink / raw)
  To: Alexey Kardashevskiy, rusty, kvm, virtualization, qemu-devel, avi
  Cc: Anthony Liguori

Currently virtio-pci is specified so that configuration of the device is
done through a PCI IO space (via BAR 0 of the virtual PCI device).
However, Linux guests happen to use ioread/iowrite/iomap primitives
for access, and these work uniformly across memory/io BARs.

While PCI IO accesses are faster than MMIO on x86 kvm,
MMIO might be helpful on other systems which don't
implement PIO or where PIO is slower than MMIO.

Add a property to make it possible to tweak the BAR type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

This is harmless by default but causes segfaults in memory.c
when enabled. Thus an RFC until I figure out what's wrong.

---
 hw/virtio-pci.c |   16 ++++++++++++++--
 hw/virtio-pci.h |    4 ++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 28498ec..6f338d2 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
 {
     uint8_t *config;
     uint32_t size;
+    uint8_t bar0_type;
 
     proxy->vdev = vdev;
 
@@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
 
     memory_region_init_io(&proxy->bar, &virtio_pci_config_ops, proxy,
                           "virtio-pci", size);
-    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
-                     &proxy->bar);
+
+    if (proxy->flags & VIRTIO_PCI_FLAG_USE_MMIO) {
+        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
+    } else {
+        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
+    }
+
+    pci_register_bar(&proxy->pci_dev, 0, bar0_type, &proxy->bar);
 
     if (!kvm_has_many_ioeventfds()) {
         proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
@@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
     DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
     DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
     DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
     DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
     DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
     DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
     DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
     DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
     DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
 
 static Property virtio_balloon_properties[] = {
     DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
 static Property virtio_scsi_properties[] = {
     DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
     DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
index e560428..e6a8861 100644
--- a/hw/virtio-pci.h
+++ b/hw/virtio-pci.h
@@ -24,6 +24,10 @@
 #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
 #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
 
+/* Some guests don't support port IO. Use MMIO instead. */
+#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
+#define VIRTIO_PCI_FLAG_USE_MMIO   (1 << VIRTIO_PCI_FLAG_USE_MMIO_BIT)
+
 typedef struct {
     PCIDevice pci_dev;
     VirtIODevice *vdev;
-- 
1.7.9.111.gf3fb0

^ permalink raw reply related

* Re: [PATCH RFC] virtio-pci: add MMIO property
From: Avi Kivity @ 2012-03-19 17:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, kvm, Alexey Kardashevskiy, qemu-devel,
	virtualization
In-Reply-To: <20120319155650.GA6430@redhat.com>

On 03/19/2012 05:56 PM, Michael S. Tsirkin wrote:
> Currently virtio-pci is specified so that configuration of the device is
> done through a PCI IO space (via BAR 0 of the virtual PCI device).
> However, Linux guests happen to use ioread/iowrite/iomap primitives
> for access, and these work uniformly across memory/io BARs.
>
> While PCI IO accesses are faster than MMIO on x86 kvm,
> MMIO might be helpful on other systems which don't
> implement PIO or where PIO is slower than MMIO.
>
> Add a property to make it possible to tweak the BAR type.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> This is harmless by default but causes segfaults in memory.c
> when enabled. Thus an RFC until I figure out what's wrong.
>

Should be done via an extra BAR (with the same layout, perhaps extended)
so compatibility is preserved.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [PATCH RFC] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 18:57 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Anthony Liguori, kvm, Alexey Kardashevskiy, qemu-devel,
	virtualization
In-Reply-To: <4F6773B4.2040209@redhat.com>

On Mon, Mar 19, 2012 at 07:58:12PM +0200, Avi Kivity wrote:
> On 03/19/2012 05:56 PM, Michael S. Tsirkin wrote:
> > Currently virtio-pci is specified so that configuration of the device is
> > done through a PCI IO space (via BAR 0 of the virtual PCI device).
> > However, Linux guests happen to use ioread/iowrite/iomap primitives
> > for access, and these work uniformly across memory/io BARs.
> >
> > While PCI IO accesses are faster than MMIO on x86 kvm,
> > MMIO might be helpful on other systems which don't
> > implement PIO or where PIO is slower than MMIO.
> >
> > Add a property to make it possible to tweak the BAR type.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > This is harmless by default but causes segfaults in memory.c
> > when enabled. Thus an RFC until I figure out what's wrong.
> >
> 
> Should be done via an extra BAR (with the same layout, perhaps extended)
> so compatibility is preserved.

No, that would need guest changes to be of use.  The point of this hack
is to make things work for Linux guests where PIO does not work.

> -- 
> error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Anthony Liguori @ 2012-03-19 19:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, kvm, Alexey Kardashevskiy, qemu-devel,
	virtualization, avi
In-Reply-To: <20120319155650.GA6430@redhat.com>

On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
> Currently virtio-pci is specified so that configuration of the device is
> done through a PCI IO space (via BAR 0 of the virtual PCI device).
> However, Linux guests happen to use ioread/iowrite/iomap primitives
> for access, and these work uniformly across memory/io BARs.
>
> While PCI IO accesses are faster than MMIO on x86 kvm,
> MMIO might be helpful on other systems which don't
> implement PIO or where PIO is slower than MMIO.
>
> Add a property to make it possible to tweak the BAR type.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>
> This is harmless by default but causes segfaults in memory.c
> when enabled. Thus an RFC until I figure out what's wrong.

Doesn't this violate the virtio-pci spec?

Making the same vendor/device ID have different semantics depending on a magic 
flag in QEMU seems like a pretty bad idea to me.

Regards,

Anthony Liguori

>
> ---
>   hw/virtio-pci.c |   16 ++++++++++++++--
>   hw/virtio-pci.h |    4 ++++
>   2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 28498ec..6f338d2 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>   {
>       uint8_t *config;
>       uint32_t size;
> +    uint8_t bar0_type;
>
>       proxy->vdev = vdev;
>
> @@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>
>       memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
>                             "virtio-pci", size);
> -    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
> -&proxy->bar);
> +
> +    if (proxy->flags&  VIRTIO_PCI_FLAG_USE_MMIO) {
> +        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> +    } else {
> +        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
> +    }
> +
> +    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
>
>       if (!kvm_has_many_ioeventfds()) {
>           proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> @@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
>       DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>       DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> @@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
>       DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
>       DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
>       DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> @@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
>       DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
>       DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> @@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
>
>   static Property virtio_balloon_properties[] = {
>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> @@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
>   static Property virtio_scsi_properties[] = {
>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>       DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
> index e560428..e6a8861 100644
> --- a/hw/virtio-pci.h
> +++ b/hw/virtio-pci.h
> @@ -24,6 +24,10 @@
>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<  VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
>
> +/* Some guests don't support port IO. Use MMIO instead. */
> +#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
> +#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<  VIRTIO_PCI_FLAG_USE_MMIO_BIT)
> +
>   typedef struct {
>       PCIDevice pci_dev;
>       VirtIODevice *vdev;

^ permalink raw reply

* [PATCHv2] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 20:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy, rusty, kvm, virtualization, qemu-devel, avi
  Cc: Anthony Liguori

Currently virtio-pci is specified so that configuration of the device is
done through a PCI IO space (via BAR 0 of the virtual PCI device).
However, Linux guests happen to use ioread/iowrite/iomap primitives
for access, and these work uniformly across memory/io BARs.

While PCI IO accesses are faster than MMIO on x86 kvm,
MMIO might be helpful on other systems:
for example IBM pSeries machines not all firmware/hypervisor
versions necessarily support PCI PIO access on all domains.

Add a property to make it possible to tweak the BAR type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

OK I added old_mmio (BTW: would be nice if ops were checked when
region is inited) and now things work in userspace.
However, when I add ioeventfd=on I get an assert:

qemu/kvm-all.c:747: kvm_mem_ioeventfd_add: Assertion `match_data &&
section->size == 4' failed.

How to reproduce:
1. apply patch
2. create virtio device with flags mmio=on,ioeventfd=on


 hw/virtio-pci.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/virtio-pci.h |    5 ++++
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 28498ec..b061000 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -510,8 +510,58 @@ const MemoryRegionPortio virtio_portio[] = {
     PORTIO_END_OF_LIST()
 };
 
+static void virtio_pci_config_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    virtio_pci_config_writeb(opaque, addr & proxy->bar0_mask, val);
+}
+
+static void virtio_pci_config_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    virtio_pci_config_writew(opaque, addr & proxy->bar0_mask, val);
+}
+
+static void virtio_pci_config_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    virtio_pci_config_writel(opaque, addr & proxy->bar0_mask, val);
+}
+
+static uint32_t virtio_pci_config_mmio_readb(void *opaque, target_phys_addr_t addr)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    return virtio_pci_config_readb(opaque, addr & proxy->bar0_mask);
+}
+
+static uint32_t virtio_pci_config_mmio_readw(void *opaque, target_phys_addr_t addr)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    uint32_t val = virtio_pci_config_readw(opaque, addr & proxy->bar0_mask);
+    return val;
+}
+
+static uint32_t virtio_pci_config_mmio_readl(void *opaque, target_phys_addr_t addr)
+{
+    VirtIOPCIProxy *proxy = opaque;
+    uint32_t val = virtio_pci_config_readl(opaque, addr & proxy->bar0_mask);
+    return val;
+}
+
 static const MemoryRegionOps virtio_pci_config_ops = {
     .old_portio = virtio_portio,
+    .old_mmio = {
+        .read = {
+            virtio_pci_config_mmio_readb,
+            virtio_pci_config_mmio_readw,
+            virtio_pci_config_mmio_readl,
+        },
+        .write = {
+            virtio_pci_config_mmio_writeb,
+            virtio_pci_config_mmio_writew,
+            virtio_pci_config_mmio_writel,
+        },
+    },
     .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
@@ -655,6 +705,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
 {
     uint8_t *config;
     uint32_t size;
+    uint8_t bar0_type;
 
     proxy->vdev = vdev;
 
@@ -682,10 +733,18 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
     if (size & (size-1))
         size = 1 << qemu_fls(size);
 
+    proxy->bar0_mask = size - 1;
+
     memory_region_init_io(&proxy->bar, &virtio_pci_config_ops, proxy,
                           "virtio-pci", size);
-    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
-                     &proxy->bar);
+
+    if (proxy->flags & VIRTIO_PCI_FLAG_USE_MMIO) {
+        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
+    } else {
+        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
+    }
+
+    pci_register_bar(&proxy->pci_dev, 0, bar0_type, &proxy->bar);
 
     if (!kvm_has_many_ioeventfds()) {
         proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
@@ -823,6 +882,7 @@ static Property virtio_blk_properties[] = {
     DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
     DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
     DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -856,6 +916,7 @@ static Property virtio_net_properties[] = {
     DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
     DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
     DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -888,6 +949,7 @@ static Property virtio_serial_properties[] = {
     DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
     DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
     DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -915,6 +977,7 @@ static TypeInfo virtio_serial_info = {
 
 static Property virtio_balloon_properties[] = {
     DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -969,6 +1032,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
 static Property virtio_scsi_properties[] = {
     DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
     DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
index e560428..71ae5c9 100644
--- a/hw/virtio-pci.h
+++ b/hw/virtio-pci.h
@@ -24,6 +24,10 @@
 #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
 #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
 
+/* Some guests don't support port IO. Use MMIO instead. */
+#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
+#define VIRTIO_PCI_FLAG_USE_MMIO   (1 << VIRTIO_PCI_FLAG_USE_MMIO_BIT)
+
 typedef struct {
     PCIDevice pci_dev;
     VirtIODevice *vdev;
@@ -44,6 +48,7 @@ typedef struct {
     VirtIOSCSIConf scsi;
     bool ioeventfd_disabled;
     bool ioeventfd_started;
+    unsigned bar0_mask;
 } VirtIOPCIProxy;
 
 void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev);
-- 
1.7.9.111.gf3fb0

^ permalink raw reply related

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 20:49 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, kvm, Alexey Kardashevskiy, qemu-devel,
	virtualization, avi
In-Reply-To: <4F6786C5.3080902@codemonkey.ws>

On Mon, Mar 19, 2012 at 02:19:33PM -0500, Anthony Liguori wrote:
> On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
> >Currently virtio-pci is specified so that configuration of the device is
> >done through a PCI IO space (via BAR 0 of the virtual PCI device).
> >However, Linux guests happen to use ioread/iowrite/iomap primitives
> >for access, and these work uniformly across memory/io BARs.
> >
> >While PCI IO accesses are faster than MMIO on x86 kvm,
> >MMIO might be helpful on other systems which don't
> >implement PIO or where PIO is slower than MMIO.
> >
> >Add a property to make it possible to tweak the BAR type.
> >
> >Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >
> >This is harmless by default but causes segfaults in memory.c
> >when enabled. Thus an RFC until I figure out what's wrong.
> 
> Doesn't this violate the virtio-pci spec?
> 

The point is to change the BAR type depending on the architecture.
IO is fastest on x86 but maybe not on other architectures.

> Making the same vendor/device ID have different semantics depending
> on a magic flag in QEMU seems like a pretty bad idea to me.
> 
> Regards,
> 
> Anthony Liguori

We do this with MSI-X so why not the BAR type?

> >
> >---
> >  hw/virtio-pci.c |   16 ++++++++++++++--
> >  hw/virtio-pci.h |    4 ++++
> >  2 files changed, 18 insertions(+), 2 deletions(-)
> >
> >diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> >index 28498ec..6f338d2 100644
> >--- a/hw/virtio-pci.c
> >+++ b/hw/virtio-pci.c
> >@@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >  {
> >      uint8_t *config;
> >      uint32_t size;
> >+    uint8_t bar0_type;
> >
> >      proxy->vdev = vdev;
> >
> >@@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >
> >      memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
> >                            "virtio-pci", size);
> >-    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
> >-&proxy->bar);
> >+
> >+    if (proxy->flags&  VIRTIO_PCI_FLAG_USE_MMIO) {
> >+        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> >+    } else {
> >+        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
> >+    }
> >+
> >+    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
> >
> >      if (!kvm_has_many_ioeventfds()) {
> >          proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> >@@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
> >      DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
> >      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >      DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
> >+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> >@@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
> >      DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
> >      DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
> >      DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
> >+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> >@@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
> >      DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
> >      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >      DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
> >+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> >@@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
> >
> >  static Property virtio_balloon_properties[] = {
> >      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> >@@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
> >  static Property virtio_scsi_properties[] = {
> >      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >      DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
> >+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> >diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
> >index e560428..e6a8861 100644
> >--- a/hw/virtio-pci.h
> >+++ b/hw/virtio-pci.h
> >@@ -24,6 +24,10 @@
> >  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
> >  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<  VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
> >
> >+/* Some guests don't support port IO. Use MMIO instead. */
> >+#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
> >+#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<  VIRTIO_PCI_FLAG_USE_MMIO_BIT)
> >+
> >  typedef struct {
> >      PCIDevice pci_dev;
> >      VirtIODevice *vdev;

^ permalink raw reply

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Anthony Liguori @ 2012-03-19 21:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Alexey Kardashevskiy, qemu-devel, virtualization, avi,
	Anthony Liguori
In-Reply-To: <20120319204952.GA9747@redhat.com>

On 03/19/2012 03:49 PM, Michael S. Tsirkin wrote:
> On Mon, Mar 19, 2012 at 02:19:33PM -0500, Anthony Liguori wrote:
>> On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
>>> Currently virtio-pci is specified so that configuration of the device is
>>> done through a PCI IO space (via BAR 0 of the virtual PCI device).
>>> However, Linux guests happen to use ioread/iowrite/iomap primitives
>>> for access, and these work uniformly across memory/io BARs.
>>>
>>> While PCI IO accesses are faster than MMIO on x86 kvm,
>>> MMIO might be helpful on other systems which don't
>>> implement PIO or where PIO is slower than MMIO.
>>>
>>> Add a property to make it possible to tweak the BAR type.
>>>
>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>
>>> This is harmless by default but causes segfaults in memory.c
>>> when enabled. Thus an RFC until I figure out what's wrong.
>>
>> Doesn't this violate the virtio-pci spec?
>>
>
> The point is to change the BAR type depending on the architecture.
> IO is fastest on x86 but maybe not on other architectures.

Are we going to document that the BAR is X on architecture Y in the spec?

I think the better way to do this is to use a separate device id range for MMIO 
virtio-pci.  You can make the same driver hand both ranges and that way the 
device is presented consistently to the guest regardless of what the 
architecture is.

>> Making the same vendor/device ID have different semantics depending
>> on a magic flag in QEMU seems like a pretty bad idea to me.
>>
>> Regards,
>>
>> Anthony Liguori
>
> We do this with MSI-X so why not the BAR type?

We extend the bar size with MSI-X and use a transport flag to indicate that it's 
available, right?

Regards,

Anthony LIguori

>
>>>
>>> ---
>>>   hw/virtio-pci.c |   16 ++++++++++++++--
>>>   hw/virtio-pci.h |    4 ++++
>>>   2 files changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>>> index 28498ec..6f338d2 100644
>>> --- a/hw/virtio-pci.c
>>> +++ b/hw/virtio-pci.c
>>> @@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>>>   {
>>>       uint8_t *config;
>>>       uint32_t size;
>>> +    uint8_t bar0_type;
>>>
>>>       proxy->vdev = vdev;
>>>
>>> @@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>>>
>>>       memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
>>>                             "virtio-pci", size);
>>> -    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
>>> -&proxy->bar);
>>> +
>>> +    if (proxy->flags&   VIRTIO_PCI_FLAG_USE_MMIO) {
>>> +        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
>>> +    } else {
>>> +        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
>>> +    }
>>> +
>>> +    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
>>>
>>>       if (!kvm_has_many_ioeventfds()) {
>>>           proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
>>> @@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
>>>       DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
>>>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>>>       DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>       DEFINE_PROP_END_OF_LIST(),
>>>   };
>>>
>>> @@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
>>>       DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
>>>       DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
>>>       DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>       DEFINE_PROP_END_OF_LIST(),
>>>   };
>>>
>>> @@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
>>>       DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
>>>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
>>>       DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>       DEFINE_PROP_END_OF_LIST(),
>>>   };
>>>
>>> @@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
>>>
>>>   static Property virtio_balloon_properties[] = {
>>>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>       DEFINE_PROP_END_OF_LIST(),
>>>   };
>>>
>>> @@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
>>>   static Property virtio_scsi_properties[] = {
>>>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>>>       DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>       DEFINE_PROP_END_OF_LIST(),
>>>   };
>>>
>>> diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
>>> index e560428..e6a8861 100644
>>> --- a/hw/virtio-pci.h
>>> +++ b/hw/virtio-pci.h
>>> @@ -24,6 +24,10 @@
>>>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
>>>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<   VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
>>>
>>> +/* Some guests don't support port IO. Use MMIO instead. */
>>> +#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
>>> +#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<   VIRTIO_PCI_FLAG_USE_MMIO_BIT)
>>> +
>>>   typedef struct {
>>>       PCIDevice pci_dev;
>>>       VirtIODevice *vdev;
>

^ permalink raw reply

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 21:20 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Alexey Kardashevskiy, qemu-devel, virtualization, avi,
	Anthony Liguori
In-Reply-To: <4F67A021.6040601@us.ibm.com>

On Mon, Mar 19, 2012 at 04:07:45PM -0500, Anthony Liguori wrote:
> On 03/19/2012 03:49 PM, Michael S. Tsirkin wrote:
> >On Mon, Mar 19, 2012 at 02:19:33PM -0500, Anthony Liguori wrote:
> >>On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
> >>>Currently virtio-pci is specified so that configuration of the device is
> >>>done through a PCI IO space (via BAR 0 of the virtual PCI device).
> >>>However, Linux guests happen to use ioread/iowrite/iomap primitives
> >>>for access, and these work uniformly across memory/io BARs.
> >>>
> >>>While PCI IO accesses are faster than MMIO on x86 kvm,
> >>>MMIO might be helpful on other systems which don't
> >>>implement PIO or where PIO is slower than MMIO.
> >>>
> >>>Add a property to make it possible to tweak the BAR type.
> >>>
> >>>Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >>>
> >>>This is harmless by default but causes segfaults in memory.c
> >>>when enabled. Thus an RFC until I figure out what's wrong.
> >>
> >>Doesn't this violate the virtio-pci spec?
> >>
> >
> >The point is to change the BAR type depending on the architecture.
> >IO is fastest on x86 but maybe not on other architectures.
> 
> Are we going to document that the BAR is X on architecture Y in the spec?
> 
> I think the better way to do this is to use a separate device id
> range for MMIO virtio-pci.  You can make the same driver hand both
> ranges and that way the device is presented consistently to the
> guest regardless of what the architecture is.

Yes there are endless ways to do this.
This specific hack is good for making existing linux drivers
on ppc, arm etc work.

> >>Making the same vendor/device ID have different semantics depending
> >>on a magic flag in QEMU seems like a pretty bad idea to me.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
> >
> >We do this with MSI-X so why not the BAR type?
> 
> We extend the bar size with MSI-X and use a transport flag to
> indicate that it's available, right?

No, we use regular pci capability. Just like BAR type is
a regular PCI register :)

> Regards,
> 
> Anthony LIguori
> 
> >
> >>>
> >>>---
> >>>  hw/virtio-pci.c |   16 ++++++++++++++--
> >>>  hw/virtio-pci.h |    4 ++++
> >>>  2 files changed, 18 insertions(+), 2 deletions(-)
> >>>
> >>>diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> >>>index 28498ec..6f338d2 100644
> >>>--- a/hw/virtio-pci.c
> >>>+++ b/hw/virtio-pci.c
> >>>@@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >>>  {
> >>>      uint8_t *config;
> >>>      uint32_t size;
> >>>+    uint8_t bar0_type;
> >>>
> >>>      proxy->vdev = vdev;
> >>>
> >>>@@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >>>
> >>>      memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
> >>>                            "virtio-pci", size);
> >>>-    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
> >>>-&proxy->bar);
> >>>+
> >>>+    if (proxy->flags&   VIRTIO_PCI_FLAG_USE_MMIO) {
> >>>+        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> >>>+    } else {
> >>>+        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
> >>>+    }
> >>>+
> >>>+    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
> >>>
> >>>      if (!kvm_has_many_ioeventfds()) {
> >>>          proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> >>>@@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
> >>>      DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
> >>>      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >>>      DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
> >>>      DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
> >>>      DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
> >>>      DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
> >>>      DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
> >>>      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >>>      DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
> >>>
> >>>  static Property virtio_balloon_properties[] = {
> >>>      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
> >>>  static Property virtio_scsi_properties[] = {
> >>>      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >>>      DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
> >>>index e560428..e6a8861 100644
> >>>--- a/hw/virtio-pci.h
> >>>+++ b/hw/virtio-pci.h
> >>>@@ -24,6 +24,10 @@
> >>>  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
> >>>  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<   VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
> >>>
> >>>+/* Some guests don't support port IO. Use MMIO instead. */
> >>>+#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
> >>>+#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<   VIRTIO_PCI_FLAG_USE_MMIO_BIT)
> >>>+
> >>>  typedef struct {
> >>>      PCIDevice pci_dev;
> >>>      VirtIODevice *vdev;
> >

^ permalink raw reply

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Michael S. Tsirkin @ 2012-03-19 21:29 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Alexey Kardashevskiy, qemu-devel, virtualization, avi,
	Anthony Liguori
In-Reply-To: <4F67A021.6040601@us.ibm.com>

On Mon, Mar 19, 2012 at 04:07:45PM -0500, Anthony Liguori wrote:
> On 03/19/2012 03:49 PM, Michael S. Tsirkin wrote:
> >On Mon, Mar 19, 2012 at 02:19:33PM -0500, Anthony Liguori wrote:
> >>On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
> >>>Currently virtio-pci is specified so that configuration of the device is
> >>>done through a PCI IO space (via BAR 0 of the virtual PCI device).
> >>>However, Linux guests happen to use ioread/iowrite/iomap primitives
> >>>for access, and these work uniformly across memory/io BARs.
> >>>
> >>>While PCI IO accesses are faster than MMIO on x86 kvm,
> >>>MMIO might be helpful on other systems which don't
> >>>implement PIO or where PIO is slower than MMIO.
> >>>
> >>>Add a property to make it possible to tweak the BAR type.
> >>>
> >>>Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
> >>>
> >>>This is harmless by default but causes segfaults in memory.c
> >>>when enabled. Thus an RFC until I figure out what's wrong.
> >>
> >>Doesn't this violate the virtio-pci spec?
> >>
> >
> >The point is to change the BAR type depending on the architecture.
> >IO is fastest on x86 but maybe not on other architectures.
> 
> Are we going to document that the BAR is X on architecture Y in the spec?
> 
> I think the better way to do this is to use a separate device id
> range for MMIO virtio-pci.  You can make the same driver hand both
> ranges and that way the device is presented consistently to the
> guest regardless of what the architecture is.

Maybe just make this a hidden option like x-miio?
This will ensure people dont turn it on by mistake on e.g. x86.

> >>Making the same vendor/device ID have different semantics depending
> >>on a magic flag in QEMU seems like a pretty bad idea to me.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
> >
> >We do this with MSI-X so why not the BAR type?
> 
> We extend the bar size with MSI-X and use a transport flag to
> indicate that it's available, right?
> 
> Regards,
> 
> Anthony LIguori
> 
> >
> >>>
> >>>---
> >>>  hw/virtio-pci.c |   16 ++++++++++++++--
> >>>  hw/virtio-pci.h |    4 ++++
> >>>  2 files changed, 18 insertions(+), 2 deletions(-)
> >>>
> >>>diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> >>>index 28498ec..6f338d2 100644
> >>>--- a/hw/virtio-pci.c
> >>>+++ b/hw/virtio-pci.c
> >>>@@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >>>  {
> >>>      uint8_t *config;
> >>>      uint32_t size;
> >>>+    uint8_t bar0_type;
> >>>
> >>>      proxy->vdev = vdev;
> >>>
> >>>@@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
> >>>
> >>>      memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
> >>>                            "virtio-pci", size);
> >>>-    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
> >>>-&proxy->bar);
> >>>+
> >>>+    if (proxy->flags&   VIRTIO_PCI_FLAG_USE_MMIO) {
> >>>+        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> >>>+    } else {
> >>>+        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
> >>>+    }
> >>>+
> >>>+    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
> >>>
> >>>      if (!kvm_has_many_ioeventfds()) {
> >>>          proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> >>>@@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
> >>>      DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
> >>>      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >>>      DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
> >>>      DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
> >>>      DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
> >>>      DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
> >>>      DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
> >>>      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >>>      DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
> >>>
> >>>  static Property virtio_balloon_properties[] = {
> >>>      DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>@@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
> >>>  static Property virtio_scsi_properties[] = {
> >>>      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> >>>      DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
> >>>+    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
> >>>      DEFINE_PROP_END_OF_LIST(),
> >>>  };
> >>>
> >>>diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
> >>>index e560428..e6a8861 100644
> >>>--- a/hw/virtio-pci.h
> >>>+++ b/hw/virtio-pci.h
> >>>@@ -24,6 +24,10 @@
> >>>  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
> >>>  #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<   VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
> >>>
> >>>+/* Some guests don't support port IO. Use MMIO instead. */
> >>>+#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
> >>>+#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<   VIRTIO_PCI_FLAG_USE_MMIO_BIT)
> >>>+
> >>>  typedef struct {
> >>>      PCIDevice pci_dev;
> >>>      VirtIODevice *vdev;
> >

^ permalink raw reply

* Re: [Qemu-devel] [PATCH RFC] virtio-pci: add MMIO property
From: Anthony Liguori @ 2012-03-19 22:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Anthony Liguori, kvm, Alexey Kardashevskiy, qemu-devel,
	virtualization, avi
In-Reply-To: <20120319212916.GC9747@redhat.com>

On 03/19/2012 04:29 PM, Michael S. Tsirkin wrote:
> On Mon, Mar 19, 2012 at 04:07:45PM -0500, Anthony Liguori wrote:
>> On 03/19/2012 03:49 PM, Michael S. Tsirkin wrote:
>>> On Mon, Mar 19, 2012 at 02:19:33PM -0500, Anthony Liguori wrote:
>>>> On 03/19/2012 10:56 AM, Michael S. Tsirkin wrote:
>>>>> Currently virtio-pci is specified so that configuration of the device is
>>>>> done through a PCI IO space (via BAR 0 of the virtual PCI device).
>>>>> However, Linux guests happen to use ioread/iowrite/iomap primitives
>>>>> for access, and these work uniformly across memory/io BARs.
>>>>>
>>>>> While PCI IO accesses are faster than MMIO on x86 kvm,
>>>>> MMIO might be helpful on other systems which don't
>>>>> implement PIO or where PIO is slower than MMIO.
>>>>>
>>>>> Add a property to make it possible to tweak the BAR type.
>>>>>
>>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>>>
>>>>> This is harmless by default but causes segfaults in memory.c
>>>>> when enabled. Thus an RFC until I figure out what's wrong.
>>>>
>>>> Doesn't this violate the virtio-pci spec?
>>>>
>>>
>>> The point is to change the BAR type depending on the architecture.
>>> IO is fastest on x86 but maybe not on other architectures.
>>
>> Are we going to document that the BAR is X on architecture Y in the spec?
>>
>> I think the better way to do this is to use a separate device id
>> range for MMIO virtio-pci.  You can make the same driver hand both
>> ranges and that way the device is presented consistently to the
>> guest regardless of what the architecture is.
>
> Maybe just make this a hidden option like x-miio?

x-violate-the-virtio-spec-to-trick-old-linux-drivers-into-working-on-power?

Really, aren't we just being too clever here?  From a practical perspective, I 
doubt anyone is ever going to support a driver that has *never* been tested on 
the platform just because it was accidentally compiled and happens to be there.

If we just do use a device PCI device id range for this, it's a 1-line patch 
that can be provided via an update to existing guests.

Regards,

Anthony Liguori

> This will ensure people dont turn it on by mistake on e.g. x86.
>
>>>> Making the same vendor/device ID have different semantics depending
>>>> on a magic flag in QEMU seems like a pretty bad idea to me.
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>
>>> We do this with MSI-X so why not the BAR type?
>>
>> We extend the bar size with MSI-X and use a transport flag to
>> indicate that it's available, right?
>>
>> Regards,
>>
>> Anthony LIguori
>>
>>>
>>>>>
>>>>> ---
>>>>>   hw/virtio-pci.c |   16 ++++++++++++++--
>>>>>   hw/virtio-pci.h |    4 ++++
>>>>>   2 files changed, 18 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>>>>> index 28498ec..6f338d2 100644
>>>>> --- a/hw/virtio-pci.c
>>>>> +++ b/hw/virtio-pci.c
>>>>> @@ -655,6 +655,7 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>>>>>   {
>>>>>       uint8_t *config;
>>>>>       uint32_t size;
>>>>> +    uint8_t bar0_type;
>>>>>
>>>>>       proxy->vdev = vdev;
>>>>>
>>>>> @@ -684,8 +685,14 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>>>>>
>>>>>       memory_region_init_io(&proxy->bar,&virtio_pci_config_ops, proxy,
>>>>>                             "virtio-pci", size);
>>>>> -    pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
>>>>> -&proxy->bar);
>>>>> +
>>>>> +    if (proxy->flags&    VIRTIO_PCI_FLAG_USE_MMIO) {
>>>>> +        bar0_type = PCI_BASE_ADDRESS_SPACE_MEMORY;
>>>>> +    } else {
>>>>> +        bar0_type = PCI_BASE_ADDRESS_SPACE_IO;
>>>>> +    }
>>>>> +
>>>>> +    pci_register_bar(&proxy->pci_dev, 0, bar0_type,&proxy->bar);
>>>>>
>>>>>       if (!kvm_has_many_ioeventfds()) {
>>>>>           proxy->flags&= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
>>>>> @@ -823,6 +830,7 @@ static Property virtio_blk_properties[] = {
>>>>>       DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
>>>>>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>>>>>       DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
>>>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>>>       DEFINE_PROP_END_OF_LIST(),
>>>>>   };
>>>>>
>>>>> @@ -856,6 +864,7 @@ static Property virtio_net_properties[] = {
>>>>>       DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy, net.txtimer, TX_TIMER_INTERVAL),
>>>>>       DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy, net.txburst, TX_BURST),
>>>>>       DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
>>>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>>>       DEFINE_PROP_END_OF_LIST(),
>>>>>   };
>>>>>
>>>>> @@ -888,6 +897,7 @@ static Property virtio_serial_properties[] = {
>>>>>       DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
>>>>>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
>>>>>       DEFINE_PROP_UINT32("max_ports", VirtIOPCIProxy, serial.max_virtserial_ports, 31),
>>>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>>>       DEFINE_PROP_END_OF_LIST(),
>>>>>   };
>>>>>
>>>>> @@ -915,6 +925,7 @@ static TypeInfo virtio_serial_info = {
>>>>>
>>>>>   static Property virtio_balloon_properties[] = {
>>>>>       DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
>>>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>>>       DEFINE_PROP_END_OF_LIST(),
>>>>>   };
>>>>>
>>>>> @@ -969,6 +980,7 @@ static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
>>>>>   static Property virtio_scsi_properties[] = {
>>>>>       DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>>>>>       DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOPCIProxy, host_features, scsi),
>>>>> +    DEFINE_PROP_BIT("mmio", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_MMIO_BIT, false),
>>>>>       DEFINE_PROP_END_OF_LIST(),
>>>>>   };
>>>>>
>>>>> diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
>>>>> index e560428..e6a8861 100644
>>>>> --- a/hw/virtio-pci.h
>>>>> +++ b/hw/virtio-pci.h
>>>>> @@ -24,6 +24,10 @@
>>>>>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
>>>>>   #define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1<<    VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
>>>>>
>>>>> +/* Some guests don't support port IO. Use MMIO instead. */
>>>>> +#define VIRTIO_PCI_FLAG_USE_MMIO_BIT 2
>>>>> +#define VIRTIO_PCI_FLAG_USE_MMIO   (1<<    VIRTIO_PCI_FLAG_USE_MMIO_BIT)
>>>>> +
>>>>>   typedef struct {
>>>>>       PCIDevice pci_dev;
>>>>>       VirtIODevice *vdev;
>>>
>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox