Linux virtualization list
 help / color / mirror / Atom feed
* Re: [RFC] virtio: support VIRTIO_F_IO_BARRIER
From: Tiwei Bie @ 2018-05-03  8:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, linux-kernel, virtualization, stefanha, zhihong.wang,
	pbonzini
In-Reply-To: <ab66939d-47fb-5f43-3938-0c156ffc6918@redhat.com>

On Thu, May 03, 2018 at 03:30:03PM +0800, Jason Wang wrote:
> On 2018年05月03日 10:59, Tiwei Bie wrote:
> > This patch introduces the support for VIRTIO_F_IO_BARRIER.
> > When this feature is negotiated, driver will use the barriers
> > suitable for hardware devices.
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> >   drivers/virtio/virtio_ring.c       | 5 +++++
> >   include/uapi/linux/virtio_config.h | 8 +++++++-
> >   2 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 21d464a29cf8..edb565643bf4 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -996,6 +996,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >   		!context;
> >   	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > +	if (virtio_has_feature(vdev, VIRTIO_F_IO_BARRIER))
> > +		vq->weak_barriers = false;
> > +
> >   	/* No callback?  Tell other side not to bother us. */
> >   	if (!callback) {
> >   		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > @@ -1164,6 +1167,8 @@ void vring_transport_features(struct virtio_device *vdev)
> >   			break;
> >   		case VIRTIO_F_IOMMU_PLATFORM:
> >   			break;
> > +		case VIRTIO_F_IO_BARRIER:
> > +			break;
> >   		default:
> >   			/* We don't understand this bit. */
> >   			__virtio_clear_bit(vdev, i);
> > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > index 308e2096291f..6ca8d24bf468 100644
> > --- a/include/uapi/linux/virtio_config.h
> > +++ b/include/uapi/linux/virtio_config.h
> > @@ -49,7 +49,7 @@
> >    * transport being used (eg. virtio_ring), the rest are per-device feature
> >    * bits. */
> >   #define VIRTIO_TRANSPORT_F_START	28
> > -#define VIRTIO_TRANSPORT_F_END		34
> > +#define VIRTIO_TRANSPORT_F_END		38
> >   #ifndef VIRTIO_CONFIG_NO_LEGACY
> >   /* Do we get callbacks when the ring is completely used, even if we've
> > @@ -71,4 +71,10 @@
> >    * this is for compatibility with legacy systems.
> >    */
> >   #define VIRTIO_F_IOMMU_PLATFORM		33
> > +
> > +/*
> > + * If clear - driver may use barriers suitable for CPU cores.
> > + * If set - driver must use barriers suitable for hardware devices.
> > + */
> > +#define VIRTIO_F_IO_BARRIER		37
> >   #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
> 
> Hi:
> 
> I believe this depends on Michael's patch of
> 
> "[PATCH] virtio_ring: switch to dma_XX barriers for rpmsg"
> 
> ?
> 
> Thanks

We already have below commit and some other related commits
in the tree:

7b21e34fd1c2 ("virtio: harsher barriers for rpmsg.")

They should have already guaranteed that virtio_Xmb() will
be OK for hardware devices when vq->weak_barriers is false.
If my understanding is correct, the barriers used in this
case are overkill. So Michael's patch is to make the barriers
weaker (or better).

Best regards,
Tiwei Bie
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC] virtio: support VIRTIO_F_IO_BARRIER
From: Jason Wang @ 2018-05-03  7:30 UTC (permalink / raw)
  To: Tiwei Bie, mst, pbonzini, stefanha, virtualization, linux-kernel
  Cc: zhihong.wang
In-Reply-To: <20180503025955.28816-1-tiwei.bie@intel.com>



On 2018年05月03日 10:59, Tiwei Bie wrote:
> This patch introduces the support for VIRTIO_F_IO_BARRIER.
> When this feature is negotiated, driver will use the barriers
> suitable for hardware devices.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   drivers/virtio/virtio_ring.c       | 5 +++++
>   include/uapi/linux/virtio_config.h | 8 +++++++-
>   2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 21d464a29cf8..edb565643bf4 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -996,6 +996,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>   		!context;
>   	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>   
> +	if (virtio_has_feature(vdev, VIRTIO_F_IO_BARRIER))
> +		vq->weak_barriers = false;
> +
>   	/* No callback?  Tell other side not to bother us. */
>   	if (!callback) {
>   		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> @@ -1164,6 +1167,8 @@ void vring_transport_features(struct virtio_device *vdev)
>   			break;
>   		case VIRTIO_F_IOMMU_PLATFORM:
>   			break;
> +		case VIRTIO_F_IO_BARRIER:
> +			break;
>   		default:
>   			/* We don't understand this bit. */
>   			__virtio_clear_bit(vdev, i);
> diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> index 308e2096291f..6ca8d24bf468 100644
> --- a/include/uapi/linux/virtio_config.h
> +++ b/include/uapi/linux/virtio_config.h
> @@ -49,7 +49,7 @@
>    * transport being used (eg. virtio_ring), the rest are per-device feature
>    * bits. */
>   #define VIRTIO_TRANSPORT_F_START	28
> -#define VIRTIO_TRANSPORT_F_END		34
> +#define VIRTIO_TRANSPORT_F_END		38
>   
>   #ifndef VIRTIO_CONFIG_NO_LEGACY
>   /* Do we get callbacks when the ring is completely used, even if we've
> @@ -71,4 +71,10 @@
>    * this is for compatibility with legacy systems.
>    */
>   #define VIRTIO_F_IOMMU_PLATFORM		33
> +
> +/*
> + * If clear - driver may use barriers suitable for CPU cores.
> + * If set - driver must use barriers suitable for hardware devices.
> + */
> +#define VIRTIO_F_IO_BARRIER		37
>   #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */

Hi:

I believe this depends on Michael's patch of

"[PATCH] virtio_ring: switch to dma_XX barriers for rpmsg"

?

Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Jason Wang @ 2018-05-03  7:25 UTC (permalink / raw)
  To: Tiwei Bie, Michael S. Tsirkin; +Cc: netdev, wexu, linux-kernel, virtualization
In-Reply-To: <20180503020949.5u3qz32gsk33z6vk@debian>



On 2018年05月03日 10:09, Tiwei Bie wrote:
>>>> So how about we use the straightforward way then?
>>> You mean we do new += vq->vring_packed.num instead
>>> of event_idx -= vq->vring_packed.num before calling
>>> vring_need_event()?
>>>
>>> The problem is that, the second param (new_idx) of
>>> vring_need_event() will be used for:
>>>
>>> (__u16)(new_idx - event_idx - 1)
>>> (__u16)(new_idx - old)
>>>
>>> So if we change new, we will need to change old too.
>> I think that since we have a branch there anyway,
>> we are better off just special-casing if (wrap_counter != vq->wrap_counter).
>> Treat is differenty and avoid casts.
>>
>>> And that would be an ugly hack..
>>>
>>> Best regards,
>>> Tiwei Bie
>> I consider casts and huge numbers with two's complement
>> games even uglier.
> The dependency on two's complement game is introduced
> since the split ring.
>
> In packed ring, old is calculated via:
>
> old = vq->next_avail_idx - vq->num_added;
>
> In split ring, old is calculated via:
>
> old = vq->avail_idx_shadow - vq->num_added;
>
> In both cases, when vq->num_added is bigger, old will
> be a big number.
>
> Best regards,
> Tiwei Bie
>

How about just do something like vhost:

static u16 vhost_idx_diff(struct vhost_virtqueue *vq, u16 old, u16 new)
{
     if (new > old)
         return new - old;
     return  (new + vq->num - old);
}

static bool vhost_vring_packed_need_event(struct vhost_virtqueue *vq,
                       __u16 event_off, __u16 new,
                       __u16 old)
{
     return (__u16)(vhost_idx_diff(vq, new, event_off) - 1) <
            (__u16)vhost_idx_diff(vq, new, old);
}

?
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC V3 PATCH 1/8] vhost: move get_rx_bufs to vhost.c
From: Jason Wang @ 2018-05-03  7:19 UTC (permalink / raw)
  To: Tiwei Bie; +Cc: kvm, mst, netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180502080518.h52wme46fnqpyfpf@debian>



On 2018年05月02日 16:05, Tiwei Bie wrote:
> On Mon, Apr 23, 2018 at 01:34:53PM +0800, Jason Wang wrote:
>> Move get_rx_bufs() to vhost.c and rename it to
>> vhost_get_rx_bufs(). This helps to hide vring internal layout from
> A small typo. Based on the code change in this patch, it
> seems that this function is renamed to vhost_get_bufs().
>
> Thanks
>

Right, let me fix it in the next version.

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH 0/6] virtio-console: spec compliance fixes
From: Amit Shah @ 2018-05-03  3:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Arnd Bergmann, Amit Shah, Greg Kroah-Hartman, linux-kernel,
	stable, virtualization
In-Reply-To: <20180424214104-mutt-send-email-mst@kernel.org>

On (Tue) 24 Apr 2018 [21:41:29], Michael S. Tsirkin wrote:
> On Fri, Apr 20, 2018 at 09:17:59PM +0300, Michael S. Tsirkin wrote:
> > Turns out virtio console tries to take a buffer out of an active vq.
> > Works by sheer luck, and is explicitly forbidden by spec.  And while
> > going over it I saw that error handling is also broken -
> > failure is easy to trigger if I force allocations to fail.
> > 
> > Lightly tested.
> 
> Amit - any feedback before I push these patches?

The changes look good spec-wise.

There are a bunch of tests in avocado-vt that test virtio-console
functionality.  Can you give those a try before pushing?


		Amit
-- 
http://amitshah.net/

^ permalink raw reply

* [RFC] virtio: support VIRTIO_F_IO_BARRIER
From: Tiwei Bie @ 2018-05-03  2:59 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, virtualization, linux-kernel
  Cc: zhihong.wang

This patch introduces the support for VIRTIO_F_IO_BARRIER.
When this feature is negotiated, driver will use the barriers
suitable for hardware devices.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 drivers/virtio/virtio_ring.c       | 5 +++++
 include/uapi/linux/virtio_config.h | 8 +++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 21d464a29cf8..edb565643bf4 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -996,6 +996,9 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 		!context;
 	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
 
+	if (virtio_has_feature(vdev, VIRTIO_F_IO_BARRIER))
+		vq->weak_barriers = false;
+
 	/* No callback?  Tell other side not to bother us. */
 	if (!callback) {
 		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
@@ -1164,6 +1167,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_F_IOMMU_PLATFORM:
 			break;
+		case VIRTIO_F_IO_BARRIER:
+			break;
 		default:
 			/* We don't understand this bit. */
 			__virtio_clear_bit(vdev, i);
diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index 308e2096291f..6ca8d24bf468 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -49,7 +49,7 @@
  * transport being used (eg. virtio_ring), the rest are per-device feature
  * bits. */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		34
+#define VIRTIO_TRANSPORT_F_END		38
 
 #ifndef VIRTIO_CONFIG_NO_LEGACY
 /* Do we get callbacks when the ring is completely used, even if we've
@@ -71,4 +71,10 @@
  * this is for compatibility with legacy systems.
  */
 #define VIRTIO_F_IOMMU_PLATFORM		33
+
+/*
+ * If clear - driver may use barriers suitable for CPU cores.
+ * If set - driver must use barriers suitable for hardware devices.
+ */
+#define VIRTIO_F_IO_BARRIER		37
 #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
-- 
2.11.0

^ permalink raw reply related

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Tiwei Bie @ 2018-05-03  2:09 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180503044218-mutt-send-email-mst@kernel.org>

On Thu, May 03, 2018 at 04:44:39AM +0300, Michael S. Tsirkin wrote:
> On Thu, May 03, 2018 at 09:11:16AM +0800, Tiwei Bie wrote:
> > On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote:
> > > > On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote:
> > > > > > On Wed, May 02, 2018 at 10:51:06AM +0800, Jason Wang wrote:
> > > > > > > On 2018年04月25日 13:15, Tiwei Bie wrote:
> > > > > > > > This commit introduces the event idx support in packed
> > > > > > > > ring. This feature is temporarily disabled, because the
> > > > > > > > implementation in this patch may not work as expected,
> > > > > > > > and some further discussions on the implementation are
> > > > > > > > needed, e.g. do we have to check the wrap counter when
> > > > > > > > checking whether a kick is needed?
> > > > > > > > 
> > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > ---
> > > > > > > >   drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++++++++++++----
> > > > > > > >   1 file changed, 49 insertions(+), 4 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > index 0181e93897be..b1039c2985b9 100644
> > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > @@ -986,7 +986,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> > > > > > > >   static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > > >   {
> > > > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > -	u16 flags;
> > > > > > > > +	u16 new, old, off_wrap, flags;
> > > > > > > >   	bool needs_kick;
> > > > > > > >   	u32 snapshot;
> > > > > > > > @@ -995,7 +995,12 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > > >   	 * suppressions. */
> > > > > > > >   	virtio_mb(vq->weak_barriers);
> > > > > > > > +	old = vq->next_avail_idx - vq->num_added;
> > > > > > > > +	new = vq->next_avail_idx;
> > > > > > > > +	vq->num_added = 0;
> > > > > > > > +
> > > > > > > >   	snapshot = *(u32 *)vq->vring_packed.device;
> > > > > > > > +	off_wrap = virtio16_to_cpu(_vq->vdev, snapshot & 0xffff);
> > > > > > > >   	flags = cpu_to_virtio16(_vq->vdev, snapshot >> 16) & 0x3;
> > > > > > > >   #ifdef DEBUG
> > > > > > > > @@ -1006,7 +1011,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > > >   	vq->last_add_time_valid = false;
> > > > > > > >   #endif
> > > > > > > > -	needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > > > > +	if (flags == VRING_EVENT_F_DESC)
> > > > > > > > +		needs_kick = vring_need_event(off_wrap & ~(1<<15), new, old);
> > > > > > > 
> > > > > > > I wonder whether or not the math is correct. Both new and event are in the
> > > > > > > unit of descriptor ring size, but old looks not.
> > > > > > 
> > > > > > What vring_need_event() cares is the distance between
> > > > > > `new` and `old`, i.e. vq->num_added. So I think there
> > > > > > is nothing wrong with `old`. But the calculation of the
> > > > > > distance between `new` and `event_idx` isn't right when
> > > > > > `new` wraps. How do you think about the below code:
> > > > > > 
> > > > > > 	wrap_counter = off_wrap >> 15;
> > > > > > 	event_idx = off_wrap & ~(1<<15);
> > > > > > 	if (wrap_counter != vq->wrap_counter)
> > > > > > 		event_idx -= vq->vring_packed.num;
> > > > > > 	
> > > > > > 	needs_kick = vring_need_event(event_idx, new, old);
> > > > > 
> > > > > I suspect this hack won't work for non power of 2 ring.
> > > > 
> > > > Above code doesn't require the ring size to be a power of 2.
> > > > 
> > > > For (__u16)(new_idx - old), what we want to get is vq->num_added.
> > > > 
> > > > old = vq->next_avail_idx - vq->num_added;
> > > > new = vq->next_avail_idx;
> > > > 
> > > > When vq->next_avail_idx >= vq->num_added, it's obvious that,
> > > > (__u16)(new_idx - old) is vq->num_added.
> > > > 
> > > > And when vq->next_avail_idx < vq->num_added, new will be smaller
> > > > than old (old will be a big unsigned number), but (__u16)(new_idx
> > > > - old) is still vq->num_added.
> > > > 
> > > > For (__u16)(new_idx - event_idx - 1), when new wraps and event_idx
> > > > doesn't wrap, the most straightforward way to calculate it is:
> > > > (new + vq->vring_packed.num) - event_idx - 1.
> > > 
> > > So how about we use the straightforward way then?
> > 
> > You mean we do new += vq->vring_packed.num instead
> > of event_idx -= vq->vring_packed.num before calling
> > vring_need_event()?
> > 
> > The problem is that, the second param (new_idx) of
> > vring_need_event() will be used for:
> > 
> > (__u16)(new_idx - event_idx - 1)
> > (__u16)(new_idx - old)
> > 
> > So if we change new, we will need to change old too.
> 
> I think that since we have a branch there anyway,
> we are better off just special-casing if (wrap_counter != vq->wrap_counter).
> Treat is differenty and avoid casts.
> 
> > And that would be an ugly hack..
> > 
> > Best regards,
> > Tiwei Bie
> 
> I consider casts and huge numbers with two's complement
> games even uglier.

The dependency on two's complement game is introduced
since the split ring.

In packed ring, old is calculated via:

old = vq->next_avail_idx - vq->num_added;

In split ring, old is calculated via:

old = vq->avail_idx_shadow - vq->num_added;

In both cases, when vq->num_added is bigger, old will
be a big number.

Best regards,
Tiwei Bie

> 
> > > 
> > > > But we can also calculate it in this way:
> > > > 
> > > > event_idx -= vq->vring_packed.num;
> > > > (event_idx will be a big unsigned number)
> > > > 
> > > > Then (__u16)(new_idx - event_idx - 1) will be the value we want.
> > > > 
> > > > Best regards,
> > > > Tiwei Bie
> > > 
> > > 
> > > > > 
> > > > > 
> > > > > > Best regards,
> > > > > > Tiwei Bie
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > Thanks
> > > > > > > 
> > > > > > > > +	else
> > > > > > > > +		needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > > > >   	END_USE(vq);
> > > > > > > >   	return needs_kick;
> > > > > > > >   }
> > > > > > > > @@ -1116,6 +1124,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > > >   	if (vq->last_used_idx >= vq->vring_packed.num)
> > > > > > > >   		vq->last_used_idx -= vq->vring_packed.num;
> > > > > > > > +	/* If we expect an interrupt for the next entry, tell host
> > > > > > > > +	 * by writing event index and flush out the write before
> > > > > > > > +	 * the read in the next get_buf call. */
> > > > > > > > +	if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> > > > > > > > +		virtio_store_mb(vq->weak_barriers,
> > > > > > > > +				&vq->vring_packed.driver->off_wrap,
> > > > > > > > +				cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> > > > > > > > +						(vq->wrap_counter << 15)));
> > > > > > > > +
> > > > > > > >   #ifdef DEBUG
> > > > > > > >   	vq->last_add_time_valid = false;
> > > > > > > >   #endif
> > > > > > > > @@ -1143,10 +1160,17 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> > > > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > > > >   	 * more to do. */
> > > > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > > > +
> > > > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > > > +			vq->last_used_idx | (vq->wrap_counter << 15));
> > > > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > > > >   							vq->event_flags_shadow);
> > > > > > > >   	}
> > > > > > > > @@ -1172,15 +1196,34 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> > > > > > > >   static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> > > > > > > >   {
> > > > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > +	u16 bufs, used_idx, wrap_counter;
> > > > > > > >   	START_USE(vq);
> > > > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > > > >   	 * more to do. */
> > > > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > > > +
> > > > > > > > +	/* TODO: tune this threshold */
> > > > > > > > +	bufs = (u16)(vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
> > > > > > > > +
> > > > > > > > +	used_idx = vq->last_used_idx + bufs;
> > > > > > > > +	wrap_counter = vq->wrap_counter;
> > > > > > > > +
> > > > > > > > +	if (used_idx >= vq->vring_packed.num) {
> > > > > > > > +		used_idx -= vq->vring_packed.num;
> > > > > > > > +		wrap_counter ^= 1;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > > > +			used_idx | (wrap_counter << 15));
> > > > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > > > >   							vq->event_flags_shadow);
> > > > > > > >   	}
> > > > > > > > @@ -1822,8 +1865,10 @@ void vring_transport_features(struct virtio_device *vdev)
> > > > > > > >   		switch (i) {
> > > > > > > >   		case VIRTIO_RING_F_INDIRECT_DESC:
> > > > > > > >   			break;
> > > > > > > > +#if 0
> > > > > > > >   		case VIRTIO_RING_F_EVENT_IDX:
> > > > > > > >   			break;
> > > > > > > > +#endif
> > > > > > > >   		case VIRTIO_F_VERSION_1:
> > > > > > > >   			break;
> > > > > > > >   		case VIRTIO_F_IOMMU_PLATFORM:
> > > > > > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Michael S. Tsirkin @ 2018-05-03  1:44 UTC (permalink / raw)
  To: Tiwei Bie; +Cc: netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180503011116.qvoyblcpklinrk26@debian>

On Thu, May 03, 2018 at 09:11:16AM +0800, Tiwei Bie wrote:
> On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote:
> > On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote:
> > > On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote:
> > > > On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote:
> > > > > On Wed, May 02, 2018 at 10:51:06AM +0800, Jason Wang wrote:
> > > > > > On 2018年04月25日 13:15, Tiwei Bie wrote:
> > > > > > > This commit introduces the event idx support in packed
> > > > > > > ring. This feature is temporarily disabled, because the
> > > > > > > implementation in this patch may not work as expected,
> > > > > > > and some further discussions on the implementation are
> > > > > > > needed, e.g. do we have to check the wrap counter when
> > > > > > > checking whether a kick is needed?
> > > > > > > 
> > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > ---
> > > > > > >   drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++++++++++++----
> > > > > > >   1 file changed, 49 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > index 0181e93897be..b1039c2985b9 100644
> > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > @@ -986,7 +986,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> > > > > > >   static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > >   {
> > > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > -	u16 flags;
> > > > > > > +	u16 new, old, off_wrap, flags;
> > > > > > >   	bool needs_kick;
> > > > > > >   	u32 snapshot;
> > > > > > > @@ -995,7 +995,12 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > >   	 * suppressions. */
> > > > > > >   	virtio_mb(vq->weak_barriers);
> > > > > > > +	old = vq->next_avail_idx - vq->num_added;
> > > > > > > +	new = vq->next_avail_idx;
> > > > > > > +	vq->num_added = 0;
> > > > > > > +
> > > > > > >   	snapshot = *(u32 *)vq->vring_packed.device;
> > > > > > > +	off_wrap = virtio16_to_cpu(_vq->vdev, snapshot & 0xffff);
> > > > > > >   	flags = cpu_to_virtio16(_vq->vdev, snapshot >> 16) & 0x3;
> > > > > > >   #ifdef DEBUG
> > > > > > > @@ -1006,7 +1011,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > > >   	vq->last_add_time_valid = false;
> > > > > > >   #endif
> > > > > > > -	needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > > > +	if (flags == VRING_EVENT_F_DESC)
> > > > > > > +		needs_kick = vring_need_event(off_wrap & ~(1<<15), new, old);
> > > > > > 
> > > > > > I wonder whether or not the math is correct. Both new and event are in the
> > > > > > unit of descriptor ring size, but old looks not.
> > > > > 
> > > > > What vring_need_event() cares is the distance between
> > > > > `new` and `old`, i.e. vq->num_added. So I think there
> > > > > is nothing wrong with `old`. But the calculation of the
> > > > > distance between `new` and `event_idx` isn't right when
> > > > > `new` wraps. How do you think about the below code:
> > > > > 
> > > > > 	wrap_counter = off_wrap >> 15;
> > > > > 	event_idx = off_wrap & ~(1<<15);
> > > > > 	if (wrap_counter != vq->wrap_counter)
> > > > > 		event_idx -= vq->vring_packed.num;
> > > > > 	
> > > > > 	needs_kick = vring_need_event(event_idx, new, old);
> > > > 
> > > > I suspect this hack won't work for non power of 2 ring.
> > > 
> > > Above code doesn't require the ring size to be a power of 2.
> > > 
> > > For (__u16)(new_idx - old), what we want to get is vq->num_added.
> > > 
> > > old = vq->next_avail_idx - vq->num_added;
> > > new = vq->next_avail_idx;
> > > 
> > > When vq->next_avail_idx >= vq->num_added, it's obvious that,
> > > (__u16)(new_idx - old) is vq->num_added.
> > > 
> > > And when vq->next_avail_idx < vq->num_added, new will be smaller
> > > than old (old will be a big unsigned number), but (__u16)(new_idx
> > > - old) is still vq->num_added.
> > > 
> > > For (__u16)(new_idx - event_idx - 1), when new wraps and event_idx
> > > doesn't wrap, the most straightforward way to calculate it is:
> > > (new + vq->vring_packed.num) - event_idx - 1.
> > 
> > So how about we use the straightforward way then?
> 
> You mean we do new += vq->vring_packed.num instead
> of event_idx -= vq->vring_packed.num before calling
> vring_need_event()?
> 
> The problem is that, the second param (new_idx) of
> vring_need_event() will be used for:
> 
> (__u16)(new_idx - event_idx - 1)
> (__u16)(new_idx - old)
> 
> So if we change new, we will need to change old too.

I think that since we have a branch there anyway,
we are better off just special-casing if (wrap_counter != vq->wrap_counter).
Treat is differenty and avoid casts.

> And that would be an ugly hack..
> 
> Best regards,
> Tiwei Bie

I consider casts and huge numbers with two's complement
games even uglier.

> > 
> > > But we can also calculate it in this way:
> > > 
> > > event_idx -= vq->vring_packed.num;
> > > (event_idx will be a big unsigned number)
> > > 
> > > Then (__u16)(new_idx - event_idx - 1) will be the value we want.
> > > 
> > > Best regards,
> > > Tiwei Bie
> > 
> > 
> > > > 
> > > > 
> > > > > Best regards,
> > > > > Tiwei Bie
> > > > > 
> > > > > 
> > > > > > 
> > > > > > Thanks
> > > > > > 
> > > > > > > +	else
> > > > > > > +		needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > > >   	END_USE(vq);
> > > > > > >   	return needs_kick;
> > > > > > >   }
> > > > > > > @@ -1116,6 +1124,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > >   	if (vq->last_used_idx >= vq->vring_packed.num)
> > > > > > >   		vq->last_used_idx -= vq->vring_packed.num;
> > > > > > > +	/* If we expect an interrupt for the next entry, tell host
> > > > > > > +	 * by writing event index and flush out the write before
> > > > > > > +	 * the read in the next get_buf call. */
> > > > > > > +	if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> > > > > > > +		virtio_store_mb(vq->weak_barriers,
> > > > > > > +				&vq->vring_packed.driver->off_wrap,
> > > > > > > +				cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> > > > > > > +						(vq->wrap_counter << 15)));
> > > > > > > +
> > > > > > >   #ifdef DEBUG
> > > > > > >   	vq->last_add_time_valid = false;
> > > > > > >   #endif
> > > > > > > @@ -1143,10 +1160,17 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> > > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > > >   	 * more to do. */
> > > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > > +
> > > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > > +			vq->last_used_idx | (vq->wrap_counter << 15));
> > > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > > >   							vq->event_flags_shadow);
> > > > > > >   	}
> > > > > > > @@ -1172,15 +1196,34 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> > > > > > >   static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> > > > > > >   {
> > > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > +	u16 bufs, used_idx, wrap_counter;
> > > > > > >   	START_USE(vq);
> > > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > > >   	 * more to do. */
> > > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > > +
> > > > > > > +	/* TODO: tune this threshold */
> > > > > > > +	bufs = (u16)(vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
> > > > > > > +
> > > > > > > +	used_idx = vq->last_used_idx + bufs;
> > > > > > > +	wrap_counter = vq->wrap_counter;
> > > > > > > +
> > > > > > > +	if (used_idx >= vq->vring_packed.num) {
> > > > > > > +		used_idx -= vq->vring_packed.num;
> > > > > > > +		wrap_counter ^= 1;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > > +			used_idx | (wrap_counter << 15));
> > > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > > >   							vq->event_flags_shadow);
> > > > > > >   	}
> > > > > > > @@ -1822,8 +1865,10 @@ void vring_transport_features(struct virtio_device *vdev)
> > > > > > >   		switch (i) {
> > > > > > >   		case VIRTIO_RING_F_INDIRECT_DESC:
> > > > > > >   			break;
> > > > > > > +#if 0
> > > > > > >   		case VIRTIO_RING_F_EVENT_IDX:
> > > > > > >   			break;
> > > > > > > +#endif
> > > > > > >   		case VIRTIO_F_VERSION_1:
> > > > > > >   			break;
> > > > > > >   		case VIRTIO_F_IOMMU_PLATFORM:
> > > > > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Tiwei Bie @ 2018-05-03  1:11 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180502184015-mutt-send-email-mst@kernel.org>

On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote:
> On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote:
> > On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote:
> > > > On Wed, May 02, 2018 at 10:51:06AM +0800, Jason Wang wrote:
> > > > > On 2018年04月25日 13:15, Tiwei Bie wrote:
> > > > > > This commit introduces the event idx support in packed
> > > > > > ring. This feature is temporarily disabled, because the
> > > > > > implementation in this patch may not work as expected,
> > > > > > and some further discussions on the implementation are
> > > > > > needed, e.g. do we have to check the wrap counter when
> > > > > > checking whether a kick is needed?
> > > > > > 
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > ---
> > > > > >   drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++++++++++++----
> > > > > >   1 file changed, 49 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index 0181e93897be..b1039c2985b9 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -986,7 +986,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> > > > > >   static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > >   {
> > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > -	u16 flags;
> > > > > > +	u16 new, old, off_wrap, flags;
> > > > > >   	bool needs_kick;
> > > > > >   	u32 snapshot;
> > > > > > @@ -995,7 +995,12 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > >   	 * suppressions. */
> > > > > >   	virtio_mb(vq->weak_barriers);
> > > > > > +	old = vq->next_avail_idx - vq->num_added;
> > > > > > +	new = vq->next_avail_idx;
> > > > > > +	vq->num_added = 0;
> > > > > > +
> > > > > >   	snapshot = *(u32 *)vq->vring_packed.device;
> > > > > > +	off_wrap = virtio16_to_cpu(_vq->vdev, snapshot & 0xffff);
> > > > > >   	flags = cpu_to_virtio16(_vq->vdev, snapshot >> 16) & 0x3;
> > > > > >   #ifdef DEBUG
> > > > > > @@ -1006,7 +1011,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > > >   	vq->last_add_time_valid = false;
> > > > > >   #endif
> > > > > > -	needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > > +	if (flags == VRING_EVENT_F_DESC)
> > > > > > +		needs_kick = vring_need_event(off_wrap & ~(1<<15), new, old);
> > > > > 
> > > > > I wonder whether or not the math is correct. Both new and event are in the
> > > > > unit of descriptor ring size, but old looks not.
> > > > 
> > > > What vring_need_event() cares is the distance between
> > > > `new` and `old`, i.e. vq->num_added. So I think there
> > > > is nothing wrong with `old`. But the calculation of the
> > > > distance between `new` and `event_idx` isn't right when
> > > > `new` wraps. How do you think about the below code:
> > > > 
> > > > 	wrap_counter = off_wrap >> 15;
> > > > 	event_idx = off_wrap & ~(1<<15);
> > > > 	if (wrap_counter != vq->wrap_counter)
> > > > 		event_idx -= vq->vring_packed.num;
> > > > 	
> > > > 	needs_kick = vring_need_event(event_idx, new, old);
> > > 
> > > I suspect this hack won't work for non power of 2 ring.
> > 
> > Above code doesn't require the ring size to be a power of 2.
> > 
> > For (__u16)(new_idx - old), what we want to get is vq->num_added.
> > 
> > old = vq->next_avail_idx - vq->num_added;
> > new = vq->next_avail_idx;
> > 
> > When vq->next_avail_idx >= vq->num_added, it's obvious that,
> > (__u16)(new_idx - old) is vq->num_added.
> > 
> > And when vq->next_avail_idx < vq->num_added, new will be smaller
> > than old (old will be a big unsigned number), but (__u16)(new_idx
> > - old) is still vq->num_added.
> > 
> > For (__u16)(new_idx - event_idx - 1), when new wraps and event_idx
> > doesn't wrap, the most straightforward way to calculate it is:
> > (new + vq->vring_packed.num) - event_idx - 1.
> 
> So how about we use the straightforward way then?

You mean we do new += vq->vring_packed.num instead
of event_idx -= vq->vring_packed.num before calling
vring_need_event()?

The problem is that, the second param (new_idx) of
vring_need_event() will be used for:

(__u16)(new_idx - event_idx - 1)
(__u16)(new_idx - old)

So if we change new, we will need to change old too.
And that would be an ugly hack..

Best regards,
Tiwei Bie

> 
> > But we can also calculate it in this way:
> > 
> > event_idx -= vq->vring_packed.num;
> > (event_idx will be a big unsigned number)
> > 
> > Then (__u16)(new_idx - event_idx - 1) will be the value we want.
> > 
> > Best regards,
> > Tiwei Bie
> 
> 
> > > 
> > > 
> > > > Best regards,
> > > > Tiwei Bie
> > > > 
> > > > 
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > > +	else
> > > > > > +		needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > >   	END_USE(vq);
> > > > > >   	return needs_kick;
> > > > > >   }
> > > > > > @@ -1116,6 +1124,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > >   	if (vq->last_used_idx >= vq->vring_packed.num)
> > > > > >   		vq->last_used_idx -= vq->vring_packed.num;
> > > > > > +	/* If we expect an interrupt for the next entry, tell host
> > > > > > +	 * by writing event index and flush out the write before
> > > > > > +	 * the read in the next get_buf call. */
> > > > > > +	if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> > > > > > +		virtio_store_mb(vq->weak_barriers,
> > > > > > +				&vq->vring_packed.driver->off_wrap,
> > > > > > +				cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> > > > > > +						(vq->wrap_counter << 15)));
> > > > > > +
> > > > > >   #ifdef DEBUG
> > > > > >   	vq->last_add_time_valid = false;
> > > > > >   #endif
> > > > > > @@ -1143,10 +1160,17 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > >   	 * more to do. */
> > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > +
> > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > +			vq->last_used_idx | (vq->wrap_counter << 15));
> > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > >   							vq->event_flags_shadow);
> > > > > >   	}
> > > > > > @@ -1172,15 +1196,34 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> > > > > >   static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> > > > > >   {
> > > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > +	u16 bufs, used_idx, wrap_counter;
> > > > > >   	START_USE(vq);
> > > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > > >   	 * more to do. */
> > > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > > +
> > > > > > +	/* TODO: tune this threshold */
> > > > > > +	bufs = (u16)(vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
> > > > > > +
> > > > > > +	used_idx = vq->last_used_idx + bufs;
> > > > > > +	wrap_counter = vq->wrap_counter;
> > > > > > +
> > > > > > +	if (used_idx >= vq->vring_packed.num) {
> > > > > > +		used_idx -= vq->vring_packed.num;
> > > > > > +		wrap_counter ^= 1;
> > > > > > +	}
> > > > > > +
> > > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > > +			used_idx | (wrap_counter << 15));
> > > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > > +						     VRING_EVENT_F_ENABLE;
> > > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > > >   							vq->event_flags_shadow);
> > > > > >   	}
> > > > > > @@ -1822,8 +1865,10 @@ void vring_transport_features(struct virtio_device *vdev)
> > > > > >   		switch (i) {
> > > > > >   		case VIRTIO_RING_F_INDIRECT_DESC:
> > > > > >   			break;
> > > > > > +#if 0
> > > > > >   		case VIRTIO_RING_F_EVENT_IDX:
> > > > > >   			break;
> > > > > > +#endif
> > > > > >   		case VIRTIO_F_VERSION_1:
> > > > > >   			break;
> > > > > >   		case VIRTIO_F_IOMMU_PLATFORM:
> > > > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Jiri Pirko @ 2018-05-02 21:39 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <052e2c60-dc4c-d6a0-0159-fc1821cb4365@intel.com>

Wed, May 02, 2018 at 07:51:12PM CEST, sridhar.samudrala@intel.com wrote:
>
>
>On 5/2/2018 9:15 AM, Jiri Pirko wrote:
>> Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
>> > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
>> [...]
>> 
>> 
>> > > +
>> > > +	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
>> > > +					 failover_dev);
>> > > +	if (err) {
>> > > +		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
>> > > +			   err);
>> > > +		goto err_handler_register;
>> > > +	}
>> > > +
>> > > +	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
>> > Please use netdev_master_upper_dev_link().
>> Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP
>> 
>> 
>> Also, please call netdev_lower_state_changed() when the active slave
>> device changes from primary->backup of backup->primary and whenever link
>> state of a slave changes
>> 
>Sure. will look into it.  Do you think this will help with the issue
>you saw with having to change mac on standy twice to get the init scripts
>working? We are now going to block changing the mac on both standby and
>failover.

I don't see any relation to that.

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Jiri Pirko @ 2018-05-02 21:39 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <052e2c60-dc4c-d6a0-0159-fc1821cb4365@intel.com>

Wed, May 02, 2018 at 07:51:12PM CEST, sridhar.samudrala@intel.com wrote:
>
>
>On 5/2/2018 9:15 AM, Jiri Pirko wrote:
>> Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
>> > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
>> [...]
>> 
>> 
>> > > +
>> > > +	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
>> > > +					 failover_dev);
>> > > +	if (err) {
>> > > +		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
>> > > +			   err);
>> > > +		goto err_handler_register;
>> > > +	}
>> > > +
>> > > +	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
>> > Please use netdev_master_upper_dev_link().
>> Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP
>> 
>> 
>> Also, please call netdev_lower_state_changed() when the active slave
>> device changes from primary->backup of backup->primary and whenever link
>> state of a slave changes
>> 
>Sure. will look into it.  Do you think this will help with the issue
>you saw with having to change mac on standy twice to get the init scripts
>working? We are now going to block changing the mac on both standby and
>failover.
>
>Also, i was wondering if we should set dev->flags to IFF_MASTER on failover
>and IFF_SLAVE on primary and standby. netvsc does this.

No. Don't set it. It is wrong.



>Does this help with the init scripts and network manager to skip slave
>devices for dhcp requests?
>

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Samudrala, Sridhar @ 2018-05-02 21:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alexander.h.duyck, virtio-dev, Jiri Pirko, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <20180502232907-mutt-send-email-mst@kernel.org>

On 5/2/2018 1:30 PM, Michael S. Tsirkin wrote:
> On Wed, May 02, 2018 at 10:51:12AM -0700, Samudrala, Sridhar wrote:
>>
>> On 5/2/2018 9:15 AM, Jiri Pirko wrote:
>>> Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
>>>> Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
>>> [...]
>>>
>>>
>>>>> +
>>>>> +	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
>>>>> +					 failover_dev);
>>>>> +	if (err) {
>>>>> +		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
>>>>> +			   err);
>>>>> +		goto err_handler_register;
>>>>> +	}
>>>>> +
>>>>> +	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
>>>> Please use netdev_master_upper_dev_link().
>>> Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP
>>>
>>>
>>> Also, please call netdev_lower_state_changed() when the active slave
>>> device changes from primary->backup of backup->primary and whenever link
>>> state of a slave changes
>>>
>> Sure. will look into it.  Do you think this will help with the issue
>> you saw with having to change mac on standy twice to get the init scripts
>> working? We are now going to block changing the mac on both standby and
>> failover.
>>
>> Also, i was wondering if we should set dev->flags to IFF_MASTER on failover
>> and IFF_SLAVE on primary and standby.
> We do need a way to find things out, that's for sure.
> How does userspace know it's a failover
> config and find the failover device right now?

# ethtool -i ens12|grep driver
driver: failover

# ethtool -i ens12n_sby|grep driver
driver: virtio_net


_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Michael S. Tsirkin @ 2018-05-02 20:30 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, Jiri Pirko, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <052e2c60-dc4c-d6a0-0159-fc1821cb4365@intel.com>

On Wed, May 02, 2018 at 10:51:12AM -0700, Samudrala, Sridhar wrote:
> 
> 
> On 5/2/2018 9:15 AM, Jiri Pirko wrote:
> > Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
> > > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
> > [...]
> > 
> > 
> > > > +
> > > > +	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
> > > > +					 failover_dev);
> > > > +	if (err) {
> > > > +		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
> > > > +			   err);
> > > > +		goto err_handler_register;
> > > > +	}
> > > > +
> > > > +	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
> > > Please use netdev_master_upper_dev_link().
> > Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP
> > 
> > 
> > Also, please call netdev_lower_state_changed() when the active slave
> > device changes from primary->backup of backup->primary and whenever link
> > state of a slave changes
> > 
> Sure. will look into it.  Do you think this will help with the issue
> you saw with having to change mac on standy twice to get the init scripts
> working? We are now going to block changing the mac on both standby and
> failover.
> 
> Also, i was wondering if we should set dev->flags to IFF_MASTER on failover
> and IFF_SLAVE on primary and standby.

We do need a way to find things out, that's for sure.
How does userspace know it's a failover
config and find the failover device right now?

> netvsc does this.
> Does this help with the init scripts and network manager to skip slave
> devices for dhcp requests?

Try it?

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Samudrala, Sridhar @ 2018-05-02 17:51 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <20180502161542.GI19250@nanopsycho>



On 5/2/2018 9:15 AM, Jiri Pirko wrote:
> Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
>> Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
> [...]
>
>
>>> +
>>> +	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
>>> +					 failover_dev);
>>> +	if (err) {
>>> +		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
>>> +			   err);
>>> +		goto err_handler_register;
>>> +	}
>>> +
>>> +	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
>> Please use netdev_master_upper_dev_link().
> Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP
>
>
> Also, please call netdev_lower_state_changed() when the active slave
> device changes from primary->backup of backup->primary and whenever link
> state of a slave changes
>
Sure. will look into it.  Do you think this will help with the issue
you saw with having to change mac on standy twice to get the init scripts
working? We are now going to block changing the mac on both standby and
failover.

Also, i was wondering if we should set dev->flags to IFF_MASTER on failover
and IFF_SLAVE on primary and standby. netvsc does this.
Does this help with the init scripts and network manager to skip slave
devices for dhcp requests?

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH V2 net-next 4/6] tun: Add support for SCTP checksum offload
From: Vlad Yasevich @ 2018-05-02 17:17 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Vladislav Yasevich
  Cc: virtio-dev, nhorman, mst, netdev, virtualization, linux-sctp
In-Reply-To: <20180502145607.GH5105@localhost.localdomain>

On 05/02/2018 10:56 AM, Marcelo Ricardo Leitner wrote:
> On Wed, May 02, 2018 at 11:53:47AM -0300, Marcelo Ricardo Leitner wrote:
>> On Tue, May 01, 2018 at 10:07:37PM -0400, Vladislav Yasevich wrote:
>>> Adds a new tun offload flag to allow for SCTP checksum offload.
>>> The flag has to be set by the user and defaults to "no offload".
>>
>> I'm confused here:
>>
>>> +++ b/drivers/net/tun.c
>>> @@ -216,7 +216,7 @@ struct tun_struct {
>>>  	struct net_device	*dev;
>>>  	netdev_features_t	set_features;
>>>  #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
>>> -			  NETIF_F_TSO6)
>>> +			  NETIF_F_TSO6|NETIF_F_SCTP_CRC)
>>
>> Doesn't adding it here mean it defaults to "offload", instead?
>>
>> later on, it does:
>>                 dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST |
>>                                    TUN_USER_FEATURES | NETIF_F_HW_VLAN_CTAG_TX |
>>                                    NETIF_F_HW_VLAN_STAG_TX;
> 
> Missed to paste the next line too:
>                 dev->features = dev->hw_features | NETIF_F_LLTX;
> 

Yes, as a software device, we can default it to on.  However, qemu will 0-out
the features and then set them up based on the guest (just like regular checksum).

-vlad

^ permalink raw reply

* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Jiri Pirko @ 2018-05-02 16:15 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <20180428090601.GL5632@nanopsycho.orion>

Sat, Apr 28, 2018 at 11:06:01AM CEST, jiri@resnulli.us wrote:
>Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:

[...]


>>+
>>+	err = netdev_rx_handler_register(slave_dev, net_failover_handle_frame,
>>+					 failover_dev);
>>+	if (err) {
>>+		netdev_err(slave_dev, "can not register failover rx handler (err = %d)\n",
>>+			   err);
>>+		goto err_handler_register;
>>+	}
>>+
>>+	err = netdev_upper_dev_link(slave_dev, failover_dev, NULL);
>
>Please use netdev_master_upper_dev_link().

Don't forget to fillup struct netdev_lag_upper_info - NETDEV_LAG_TX_TYPE_ACTIVEBACKUP


Also, please call netdev_lower_state_changed() when the active slave
device changes from primary->backup of backup->primary and whenever link
state of a slave changes

[...]

^ permalink raw reply

* Re: [PATCH net-next v9 3/4] virtio_net: Extend virtio to use VF datapath when available
From: Jiri Pirko @ 2018-05-02 16:05 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <c94ce6a9-6f36-675c-cf5b-414454fc2244@intel.com>

Wed, May 02, 2018 at 05:34:44PM CEST, sridhar.samudrala@intel.com wrote:
>On 5/2/2018 12:50 AM, Jiri Pirko wrote:
>> Wed, May 02, 2018 at 02:20:26AM CEST, sridhar.samudrala@intel.com wrote:
>> > On 4/30/2018 12:20 AM, Jiri Pirko wrote:
>> > > > > Now I try to change mac of the failover master:
>> > > > > [root@test1 ~]# ip link set ens3 addr 52:54:00:b2:a7:f3
>> > > > > RTNETLINK answers: Operation not supported
>> > > > > 
>> > > > > That I did expect to work. I would expect this would change the mac of
>> > > > > the master and both standby and primary slaves.
>> > > > If a VF is untrusted, a VM will not able to change its MAC and moreover
>> > > Note that at this point, I have no VF. So I'm not sure why you mention
>> > > that.
>> > > 
>> > > 
>> > > > in this mode we are assuming that the hypervisor has assigned the MAC and
>> > > > guest is not expected to change the MAC.
>> > > Wait, for ordinary old-fashioned virtio_net, as a VM user, I can change
>> > > mac and all works fine. How is this different? Change mac on "failover
>> > > instance" should work and should propagate the mac down to its slaves.
>> > > 
>> > > 
>> > > > For the initial implementation, i would propose not allowing the guest to
>> > > > change the MAC of failover or standby dev.
>> > > I see no reason for such restriction.
>> > > 
>> > It is true that a VM user can change mac address of a normal virtio-net interface,
>> > however when it is in STANDBY mode i think we should not allow this change specifically
>> > because we are creating a failover instance based on a MAC that is assigned by the
>> > hypervisor.
>> > 
>> > Moreover,  in a cloud environment i would think that PF/hypervisor assigns a MAC to
>> > the VF and it cannot be changed by the guest.
>> So that is easy. You allow the change of the mac and in the "failover"
>> mac change implementation you propagate the change down to slaves. If
>> one slave does not support the change, you bail out. And since VF does
>> not allow it as you say, once it will be enslaved, the mac change could
>> not be done. Seems like a correct behavior to me and is in-sync with how
>> bond/team behaves.
>
>If we allow the mac to be changed when the VF is not yet plugged in
>and the guest changes the mac, then VF cannot join the failover when
>it is hot plugged with the original mac assigned by the hypervisor.
>Specifically there could be timing window during the live migration where
>VF would be unplugged at the source and if we allow the guest to change the
>failover mac, then VF would not be able to register with the failover after
>the VM is migrated to the destination.

Okay. So as suggested by mst, just forbid the mac changes all together.


>
>> 
>> > So for the initial implementation, do you see any issues with having this restriction
>> > in STANDBY mode.
>> > 
>> > 
>

^ permalink raw reply

* Re: [PATCH net-next v9 3/4] virtio_net: Extend virtio to use VF datapath when available
From: Jiri Pirko @ 2018-05-02 16:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alexander.h.duyck, virtio-dev, kubakici, Samudrala, Sridhar,
	virtualization, loseweigh, netdev, aaron.f.brown, davem
In-Reply-To: <20180502184326-mutt-send-email-mst@kernel.org>

Wed, May 02, 2018 at 05:47:27PM CEST, mst@redhat.com wrote:
>On Wed, May 02, 2018 at 09:50:21AM +0200, Jiri Pirko wrote:
>> Wed, May 02, 2018 at 02:20:26AM CEST, sridhar.samudrala@intel.com wrote:
>> >On 4/30/2018 12:20 AM, Jiri Pirko wrote:
>> >> 
>> >> > > Now I try to change mac of the failover master:
>> >> > > [root@test1 ~]# ip link set ens3 addr 52:54:00:b2:a7:f3
>> >> > > RTNETLINK answers: Operation not supported
>> >> > > 
>> >> > > That I did expect to work. I would expect this would change the mac of
>> >> > > the master and both standby and primary slaves.
>> >> > If a VF is untrusted, a VM will not able to change its MAC and moreover
>> >> Note that at this point, I have no VF. So I'm not sure why you mention
>> >> that.
>> >> 
>> >> 
>> >> > in this mode we are assuming that the hypervisor has assigned the MAC and
>> >> > guest is not expected to change the MAC.
>> >> Wait, for ordinary old-fashioned virtio_net, as a VM user, I can change
>> >> mac and all works fine. How is this different? Change mac on "failover
>> >> instance" should work and should propagate the mac down to its slaves.
>> >> 
>> >> 
>> >> > For the initial implementation, i would propose not allowing the guest to
>> >> > change the MAC of failover or standby dev.
>> >> I see no reason for such restriction.
>> >> 
>> >
>> >It is true that a VM user can change mac address of a normal virtio-net interface,
>> >however when it is in STANDBY mode i think we should not allow this change specifically
>> >because we are creating a failover instance based on a MAC that is assigned by the
>> >hypervisor.
>> >
>> >Moreover,  in a cloud environment i would think that PF/hypervisor assigns a MAC to
>> >the VF and it cannot be changed by the guest.
>> 
>> So that is easy. You allow the change of the mac and in the "failover"
>> mac change implementation you propagate the change down to slaves. If
>> one slave does not support the change, you bail out. And since VF does
>
>I wish people would say primary/standby and not "VF" :)

Sure, sorry.


>
>> not allow it as you say, once it will be enslaved, the mac change could
>> not be done. Seems like a correct behavior to me
>
>
>what if primary does not allow mac changes and is attached after
>mac is changed on standy?

Mac should be changed on failover. In that case, the primary would have
a different mac and therefore it won't get enslaved.

>
>
>> and is in-sync with how
>> bond/team behaves.
>
>I think in the end virtio can just block MAC changes for simplicity.
>
>I personally don't see softmac as a must have feature in v1,
>we can add it later.

Okay.


>
>What's the situation with init scripts and whether it's
>possible to make them work well would be a better question.
>
>> 
>> >
>> >So for the initial implementation, do you see any issues with having this restriction
>> >in STANDBY mode.
>> >
>> >

^ permalink raw reply

* Re: [PATCH net-next v9 3/4] virtio_net: Extend virtio to use VF datapath when available
From: Michael S. Tsirkin @ 2018-05-02 15:47 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: alexander.h.duyck, virtio-dev, kubakici, Samudrala, Sridhar,
	virtualization, loseweigh, netdev, aaron.f.brown, davem
In-Reply-To: <20180502075021.GC19250@nanopsycho>

On Wed, May 02, 2018 at 09:50:21AM +0200, Jiri Pirko wrote:
> Wed, May 02, 2018 at 02:20:26AM CEST, sridhar.samudrala@intel.com wrote:
> >On 4/30/2018 12:20 AM, Jiri Pirko wrote:
> >> 
> >> > > Now I try to change mac of the failover master:
> >> > > [root@test1 ~]# ip link set ens3 addr 52:54:00:b2:a7:f3
> >> > > RTNETLINK answers: Operation not supported
> >> > > 
> >> > > That I did expect to work. I would expect this would change the mac of
> >> > > the master and both standby and primary slaves.
> >> > If a VF is untrusted, a VM will not able to change its MAC and moreover
> >> Note that at this point, I have no VF. So I'm not sure why you mention
> >> that.
> >> 
> >> 
> >> > in this mode we are assuming that the hypervisor has assigned the MAC and
> >> > guest is not expected to change the MAC.
> >> Wait, for ordinary old-fashioned virtio_net, as a VM user, I can change
> >> mac and all works fine. How is this different? Change mac on "failover
> >> instance" should work and should propagate the mac down to its slaves.
> >> 
> >> 
> >> > For the initial implementation, i would propose not allowing the guest to
> >> > change the MAC of failover or standby dev.
> >> I see no reason for such restriction.
> >> 
> >
> >It is true that a VM user can change mac address of a normal virtio-net interface,
> >however when it is in STANDBY mode i think we should not allow this change specifically
> >because we are creating a failover instance based on a MAC that is assigned by the
> >hypervisor.
> >
> >Moreover,  in a cloud environment i would think that PF/hypervisor assigns a MAC to
> >the VF and it cannot be changed by the guest.
> 
> So that is easy. You allow the change of the mac and in the "failover"
> mac change implementation you propagate the change down to slaves. If
> one slave does not support the change, you bail out. And since VF does

I wish people would say primary/standby and not "VF" :)

> not allow it as you say, once it will be enslaved, the mac change could
> not be done. Seems like a correct behavior to me


what if primary does not allow mac changes and is attached after
mac is changed on standy?


> and is in-sync with how
> bond/team behaves.

I think in the end virtio can just block MAC changes for simplicity.

I personally don't see softmac as a must have feature in v1,
we can add it later.

What's the situation with init scripts and whether it's
possible to make them work well would be a better question.

> 
> >
> >So for the initial implementation, do you see any issues with having this restriction
> >in STANDBY mode.
> >
> >

^ permalink raw reply

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Michael S. Tsirkin @ 2018-05-02 15:42 UTC (permalink / raw)
  To: Tiwei Bie; +Cc: netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180502151255.h3x6rhszxa3euinl@debian>

On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote:
> On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote:
> > On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote:
> > > On Wed, May 02, 2018 at 10:51:06AM +0800, Jason Wang wrote:
> > > > On 2018年04月25日 13:15, Tiwei Bie wrote:
> > > > > This commit introduces the event idx support in packed
> > > > > ring. This feature is temporarily disabled, because the
> > > > > implementation in this patch may not work as expected,
> > > > > and some further discussions on the implementation are
> > > > > needed, e.g. do we have to check the wrap counter when
> > > > > checking whether a kick is needed?
> > > > > 
> > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > ---
> > > > >   drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++++++++++++----
> > > > >   1 file changed, 49 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 0181e93897be..b1039c2985b9 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -986,7 +986,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> > > > >   static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > >   {
> > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > -	u16 flags;
> > > > > +	u16 new, old, off_wrap, flags;
> > > > >   	bool needs_kick;
> > > > >   	u32 snapshot;
> > > > > @@ -995,7 +995,12 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > >   	 * suppressions. */
> > > > >   	virtio_mb(vq->weak_barriers);
> > > > > +	old = vq->next_avail_idx - vq->num_added;
> > > > > +	new = vq->next_avail_idx;
> > > > > +	vq->num_added = 0;
> > > > > +
> > > > >   	snapshot = *(u32 *)vq->vring_packed.device;
> > > > > +	off_wrap = virtio16_to_cpu(_vq->vdev, snapshot & 0xffff);
> > > > >   	flags = cpu_to_virtio16(_vq->vdev, snapshot >> 16) & 0x3;
> > > > >   #ifdef DEBUG
> > > > > @@ -1006,7 +1011,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > > >   	vq->last_add_time_valid = false;
> > > > >   #endif
> > > > > -	needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > > +	if (flags == VRING_EVENT_F_DESC)
> > > > > +		needs_kick = vring_need_event(off_wrap & ~(1<<15), new, old);
> > > > 
> > > > I wonder whether or not the math is correct. Both new and event are in the
> > > > unit of descriptor ring size, but old looks not.
> > > 
> > > What vring_need_event() cares is the distance between
> > > `new` and `old`, i.e. vq->num_added. So I think there
> > > is nothing wrong with `old`. But the calculation of the
> > > distance between `new` and `event_idx` isn't right when
> > > `new` wraps. How do you think about the below code:
> > > 
> > > 	wrap_counter = off_wrap >> 15;
> > > 	event_idx = off_wrap & ~(1<<15);
> > > 	if (wrap_counter != vq->wrap_counter)
> > > 		event_idx -= vq->vring_packed.num;
> > > 	
> > > 	needs_kick = vring_need_event(event_idx, new, old);
> > 
> > I suspect this hack won't work for non power of 2 ring.
> 
> Above code doesn't require the ring size to be a power of 2.
> 
> For (__u16)(new_idx - old), what we want to get is vq->num_added.
> 
> old = vq->next_avail_idx - vq->num_added;
> new = vq->next_avail_idx;
> 
> When vq->next_avail_idx >= vq->num_added, it's obvious that,
> (__u16)(new_idx - old) is vq->num_added.
> 
> And when vq->next_avail_idx < vq->num_added, new will be smaller
> than old (old will be a big unsigned number), but (__u16)(new_idx
> - old) is still vq->num_added.
> 
> For (__u16)(new_idx - event_idx - 1), when new wraps and event_idx
> doesn't wrap, the most straightforward way to calculate it is:
> (new + vq->vring_packed.num) - event_idx - 1.

So how about we use the straightforward way then?

> But we can also calculate it in this way:
> 
> event_idx -= vq->vring_packed.num;
> (event_idx will be a big unsigned number)
> 
> Then (__u16)(new_idx - event_idx - 1) will be the value we want.
> 
> Best regards,
> Tiwei Bie


> > 
> > 
> > > Best regards,
> > > Tiwei Bie
> > > 
> > > 
> > > > 
> > > > Thanks
> > > > 
> > > > > +	else
> > > > > +		needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > >   	END_USE(vq);
> > > > >   	return needs_kick;
> > > > >   }
> > > > > @@ -1116,6 +1124,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > >   	if (vq->last_used_idx >= vq->vring_packed.num)
> > > > >   		vq->last_used_idx -= vq->vring_packed.num;
> > > > > +	/* If we expect an interrupt for the next entry, tell host
> > > > > +	 * by writing event index and flush out the write before
> > > > > +	 * the read in the next get_buf call. */
> > > > > +	if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> > > > > +		virtio_store_mb(vq->weak_barriers,
> > > > > +				&vq->vring_packed.driver->off_wrap,
> > > > > +				cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> > > > > +						(vq->wrap_counter << 15)));
> > > > > +
> > > > >   #ifdef DEBUG
> > > > >   	vq->last_add_time_valid = false;
> > > > >   #endif
> > > > > @@ -1143,10 +1160,17 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > >   	 * more to do. */
> > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > +
> > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > +			vq->last_used_idx | (vq->wrap_counter << 15));
> > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > +						     VRING_EVENT_F_ENABLE;
> > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > >   							vq->event_flags_shadow);
> > > > >   	}
> > > > > @@ -1172,15 +1196,34 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> > > > >   static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> > > > >   {
> > > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > +	u16 bufs, used_idx, wrap_counter;
> > > > >   	START_USE(vq);
> > > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > > >   	 * more to do. */
> > > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > > +	 * either clear the flags bit or point the event index at the next
> > > > > +	 * entry. Always update the event index to keep code simple. */
> > > > > +
> > > > > +	/* TODO: tune this threshold */
> > > > > +	bufs = (u16)(vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
> > > > > +
> > > > > +	used_idx = vq->last_used_idx + bufs;
> > > > > +	wrap_counter = vq->wrap_counter;
> > > > > +
> > > > > +	if (used_idx >= vq->vring_packed.num) {
> > > > > +		used_idx -= vq->vring_packed.num;
> > > > > +		wrap_counter ^= 1;
> > > > > +	}
> > > > > +
> > > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > > +			used_idx | (wrap_counter << 15));
> > > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > > >   		virtio_wmb(vq->weak_barriers);
> > > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > > +						     VRING_EVENT_F_ENABLE;
> > > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > > >   							vq->event_flags_shadow);
> > > > >   	}
> > > > > @@ -1822,8 +1865,10 @@ void vring_transport_features(struct virtio_device *vdev)
> > > > >   		switch (i) {
> > > > >   		case VIRTIO_RING_F_INDIRECT_DESC:
> > > > >   			break;
> > > > > +#if 0
> > > > >   		case VIRTIO_RING_F_EVENT_IDX:
> > > > >   			break;
> > > > > +#endif
> > > > >   		case VIRTIO_F_VERSION_1:
> > > > >   			break;
> > > > >   		case VIRTIO_F_IOMMU_PLATFORM:
> > > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v9 3/4] virtio_net: Extend virtio to use VF datapath when available
From: Samudrala, Sridhar @ 2018-05-02 15:34 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <20180502075021.GC19250@nanopsycho>

On 5/2/2018 12:50 AM, Jiri Pirko wrote:
> Wed, May 02, 2018 at 02:20:26AM CEST, sridhar.samudrala@intel.com wrote:
>> On 4/30/2018 12:20 AM, Jiri Pirko wrote:
>>>>> Now I try to change mac of the failover master:
>>>>> [root@test1 ~]# ip link set ens3 addr 52:54:00:b2:a7:f3
>>>>> RTNETLINK answers: Operation not supported
>>>>>
>>>>> That I did expect to work. I would expect this would change the mac of
>>>>> the master and both standby and primary slaves.
>>>> If a VF is untrusted, a VM will not able to change its MAC and moreover
>>> Note that at this point, I have no VF. So I'm not sure why you mention
>>> that.
>>>
>>>
>>>> in this mode we are assuming that the hypervisor has assigned the MAC and
>>>> guest is not expected to change the MAC.
>>> Wait, for ordinary old-fashioned virtio_net, as a VM user, I can change
>>> mac and all works fine. How is this different? Change mac on "failover
>>> instance" should work and should propagate the mac down to its slaves.
>>>
>>>
>>>> For the initial implementation, i would propose not allowing the guest to
>>>> change the MAC of failover or standby dev.
>>> I see no reason for such restriction.
>>>
>> It is true that a VM user can change mac address of a normal virtio-net interface,
>> however when it is in STANDBY mode i think we should not allow this change specifically
>> because we are creating a failover instance based on a MAC that is assigned by the
>> hypervisor.
>>
>> Moreover,  in a cloud environment i would think that PF/hypervisor assigns a MAC to
>> the VF and it cannot be changed by the guest.
> So that is easy. You allow the change of the mac and in the "failover"
> mac change implementation you propagate the change down to slaves. If
> one slave does not support the change, you bail out. And since VF does
> not allow it as you say, once it will be enslaved, the mac change could
> not be done. Seems like a correct behavior to me and is in-sync with how
> bond/team behaves.

If we allow the mac to be changed when the VF is not yet plugged in
and the guest changes the mac, then VF cannot join the failover when
it is hot plugged with the original mac assigned by the hypervisor.
Specifically there could be timing window during the live migration where
VF would be unplugged at the source and if we allow the guest to change the
failover mac, then VF would not be able to register with the failover after
the VM is migrated to the destination.

>
>> So for the initial implementation, do you see any issues with having this restriction
>> in STANDBY mode.
>>
>>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* CFP PhyCS 2018 - 5th Int.l Conf. on Physiological Computing Systems (Seville/Spain)
From: phycs @ 2018-05-02 15:27 UTC (permalink / raw)
  To: virtualization

SUBMISSION DEADLINE 

5th International Conference on Physiological Computing Systems

Submission Deadline: May 24, 2018

http://www.phycs.org/

September 19 - 21, 2018
Seville, Spain.

 PhyCS is organized in 4 major tracks:

 - Devices
 - Methodologies and Methods
 - Human Factors
 - Applications


 
With the presence of internationally distinguished keynote speakers:
Eduardo Rocon, Consejo Superior de Investigaciones Científicas, Spain
Lucas Noldus, Noldus Information Technology bv, Netherlands
Georgios N. Yannakakis, University of Malta, Malta


A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
 
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
  
Should you have any question please don’t hesitate contacting me.
 

Kind regards,
PhyCS Secretariat

Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 184
Fax: +351 265 520 186
Web: http://www.phycs.org/
e-mail: phycs.secretariat@insticc.org

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* CFP WEBIST 2018 - 14th Int.l Conf. on Web Information Systems and Technologies (Seville/Spain)
From: webist @ 2018-05-02 15:27 UTC (permalink / raw)
  To: virtualization

SUBMISSION DEADLINE 

14th International Conference on Web Information Systems and Technologies

Submission Deadline: May 24, 2018

http://www.webist.org/

September 18 - 20, 2018
Seville, Spain.

 WEBIST is organized in 5 major tracks:

 - Internet Technology
 - Mobile and NLP Information Systems
 - Service Based Information Systems, Platforms and Eco-Systems
 - Web Intelligence
 - Web Interfaces


 
With the presence of internationally distinguished keynote speakers:
Antonia Bertolino, Italian National Research Council - CNR, Italy

A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
 
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
  
Should you have any question please don’t hesitate contacting me.
 

Kind regards,
WEBIST Secretariat

Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 184
Fax: +351 265 520 186
Web: http://www.webist.org/
e-mail: webist.secretariat@insticc.org

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring
From: Tiwei Bie @ 2018-05-02 15:12 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, virtualization, wexu
In-Reply-To: <20180502164828-mutt-send-email-mst@kernel.org>

On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote:
> On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote:
> > On Wed, May 02, 2018 at 10:51:06AM +0800, Jason Wang wrote:
> > > On 2018年04月25日 13:15, Tiwei Bie wrote:
> > > > This commit introduces the event idx support in packed
> > > > ring. This feature is temporarily disabled, because the
> > > > implementation in this patch may not work as expected,
> > > > and some further discussions on the implementation are
> > > > needed, e.g. do we have to check the wrap counter when
> > > > checking whether a kick is needed?
> > > > 
> > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > ---
> > > >   drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++++++++++++----
> > > >   1 file changed, 49 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 0181e93897be..b1039c2985b9 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -986,7 +986,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> > > >   static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > >   {
> > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > -	u16 flags;
> > > > +	u16 new, old, off_wrap, flags;
> > > >   	bool needs_kick;
> > > >   	u32 snapshot;
> > > > @@ -995,7 +995,12 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > >   	 * suppressions. */
> > > >   	virtio_mb(vq->weak_barriers);
> > > > +	old = vq->next_avail_idx - vq->num_added;
> > > > +	new = vq->next_avail_idx;
> > > > +	vq->num_added = 0;
> > > > +
> > > >   	snapshot = *(u32 *)vq->vring_packed.device;
> > > > +	off_wrap = virtio16_to_cpu(_vq->vdev, snapshot & 0xffff);
> > > >   	flags = cpu_to_virtio16(_vq->vdev, snapshot >> 16) & 0x3;
> > > >   #ifdef DEBUG
> > > > @@ -1006,7 +1011,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> > > >   	vq->last_add_time_valid = false;
> > > >   #endif
> > > > -	needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > > +	if (flags == VRING_EVENT_F_DESC)
> > > > +		needs_kick = vring_need_event(off_wrap & ~(1<<15), new, old);
> > > 
> > > I wonder whether or not the math is correct. Both new and event are in the
> > > unit of descriptor ring size, but old looks not.
> > 
> > What vring_need_event() cares is the distance between
> > `new` and `old`, i.e. vq->num_added. So I think there
> > is nothing wrong with `old`. But the calculation of the
> > distance between `new` and `event_idx` isn't right when
> > `new` wraps. How do you think about the below code:
> > 
> > 	wrap_counter = off_wrap >> 15;
> > 	event_idx = off_wrap & ~(1<<15);
> > 	if (wrap_counter != vq->wrap_counter)
> > 		event_idx -= vq->vring_packed.num;
> > 	
> > 	needs_kick = vring_need_event(event_idx, new, old);
> 
> I suspect this hack won't work for non power of 2 ring.

Above code doesn't require the ring size to be a power of 2.

For (__u16)(new_idx - old), what we want to get is vq->num_added.

old = vq->next_avail_idx - vq->num_added;
new = vq->next_avail_idx;

When vq->next_avail_idx >= vq->num_added, it's obvious that,
(__u16)(new_idx - old) is vq->num_added.

And when vq->next_avail_idx < vq->num_added, new will be smaller
than old (old will be a big unsigned number), but (__u16)(new_idx
- old) is still vq->num_added.

For (__u16)(new_idx - event_idx - 1), when new wraps and event_idx
doesn't wrap, the most straightforward way to calculate it is:
(new + vq->vring_packed.num) - event_idx - 1.

But we can also calculate it in this way:

event_idx -= vq->vring_packed.num;
(event_idx will be a big unsigned number)

Then (__u16)(new_idx - event_idx - 1) will be the value we want.

Best regards,
Tiwei Bie

> 
> 
> > Best regards,
> > Tiwei Bie
> > 
> > 
> > > 
> > > Thanks
> > > 
> > > > +	else
> > > > +		needs_kick = (flags != VRING_EVENT_F_DISABLE);
> > > >   	END_USE(vq);
> > > >   	return needs_kick;
> > > >   }
> > > > @@ -1116,6 +1124,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > >   	if (vq->last_used_idx >= vq->vring_packed.num)
> > > >   		vq->last_used_idx -= vq->vring_packed.num;
> > > > +	/* If we expect an interrupt for the next entry, tell host
> > > > +	 * by writing event index and flush out the write before
> > > > +	 * the read in the next get_buf call. */
> > > > +	if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> > > > +		virtio_store_mb(vq->weak_barriers,
> > > > +				&vq->vring_packed.driver->off_wrap,
> > > > +				cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> > > > +						(vq->wrap_counter << 15)));
> > > > +
> > > >   #ifdef DEBUG
> > > >   	vq->last_add_time_valid = false;
> > > >   #endif
> > > > @@ -1143,10 +1160,17 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > >   	 * more to do. */
> > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > +	 * either clear the flags bit or point the event index at the next
> > > > +	 * entry. Always update the event index to keep code simple. */
> > > > +
> > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > +			vq->last_used_idx | (vq->wrap_counter << 15));
> > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > >   		virtio_wmb(vq->weak_barriers);
> > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > +						     VRING_EVENT_F_ENABLE;
> > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > >   							vq->event_flags_shadow);
> > > >   	}
> > > > @@ -1172,15 +1196,34 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> > > >   static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> > > >   {
> > > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > > +	u16 bufs, used_idx, wrap_counter;
> > > >   	START_USE(vq);
> > > >   	/* We optimistically turn back on interrupts, then check if there was
> > > >   	 * more to do. */
> > > > +	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> > > > +	 * either clear the flags bit or point the event index at the next
> > > > +	 * entry. Always update the event index to keep code simple. */
> > > > +
> > > > +	/* TODO: tune this threshold */
> > > > +	bufs = (u16)(vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
> > > > +
> > > > +	used_idx = vq->last_used_idx + bufs;
> > > > +	wrap_counter = vq->wrap_counter;
> > > > +
> > > > +	if (used_idx >= vq->vring_packed.num) {
> > > > +		used_idx -= vq->vring_packed.num;
> > > > +		wrap_counter ^= 1;
> > > > +	}
> > > > +
> > > > +	vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> > > > +			used_idx | (wrap_counter << 15));
> > > >   	if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> > > >   		virtio_wmb(vq->weak_barriers);
> > > > -		vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> > > > +		vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> > > > +						     VRING_EVENT_F_ENABLE;
> > > >   		vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> > > >   							vq->event_flags_shadow);
> > > >   	}
> > > > @@ -1822,8 +1865,10 @@ void vring_transport_features(struct virtio_device *vdev)
> > > >   		switch (i) {
> > > >   		case VIRTIO_RING_F_INDIRECT_DESC:
> > > >   			break;
> > > > +#if 0
> > > >   		case VIRTIO_RING_F_EVENT_IDX:
> > > >   			break;
> > > > +#endif
> > > >   		case VIRTIO_F_VERSION_1:
> > > >   			break;
> > > >   		case VIRTIO_F_IOMMU_PLATFORM:
> > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] Revert "vhost: make msg padding explicit"
From: David Miller @ 2018-05-02 14:27 UTC (permalink / raw)
  To: mst; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <1525270740-460646-1-git-send-email-mst@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 2 May 2018 17:19:05 +0300

> This reverts commit 93c0d549c4c5a7382ad70de6b86610b7aae57406.
> 
> Unfortunately the padding will break 32 bit userspace.
> Ouch. Need to add some compat code, revert for now.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Applied, thanks Michael.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox