Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH net] virtio-net: keep vnet header zeroed after processing XDP
From: David Miller @ 2018-12-01  1:25 UTC (permalink / raw)
  To: jasowang; +Cc: netdev, virtualization, linux-kernel, mst
In-Reply-To: <20181129055316.24090-1-jasowang@redhat.com>

From: Jason Wang <jasowang@redhat.com>
Date: Thu, 29 Nov 2018 13:53:16 +0800

> We copy vnet header unconditionally in page_to_skb() this is wrong
> since XDP may modify the packet data. So let's keep a zeroed vnet
> header for not confusing the conversion between vnet header and skb
> metadata.
> 
> In the future, we should able to detect whether or not the packet was
> modified and keep using the vnet header when packet was not touched.
> 
> Fixes: f600b6905015 ("virtio_net: Add XDP support")
> Reported-by: Pavel Popa <pashinho1990@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PULL] vhost: cleanups and fixes
From: Michael S. Tsirkin @ 2018-11-30 19:55 UTC (permalink / raw)
  To: Bijan Mottahedeh
  Cc: lenaic, mhocko, Kees Cook, kvm, netdev, liang.z.li,
	Linux Kernel Mailing List, virtualization, stefanha, joe,
	Andrew Morton, mhocko, Linus Torvalds
In-Reply-To: <e5eb0188-61fa-7556-4c36-68a618ba39bf@oracle.com>

On Fri, Nov 30, 2018 at 11:01:03AM -0800, Bijan Mottahedeh wrote:
> On 11/30/2018 5:44 AM, Michael S. Tsirkin wrote:
> > On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> > > On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <keescook@chromium.org> wrote:
> > > > +       memset(&rsp, 0, sizeof(rsp));
> > > > +       rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > > > +       resp = vq->iov[out].iov_base;
> > > > +       ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> > > > 
> > > > Is it actually safe to trust that iov_base has passed an earlier
> > > > access_ok() check here? Why not just use copy_to_user() instead?
> > > Good point.
> > > 
> > > We really should have removed those double-underscore things ages ago.
> > > 
> > > Also, apart from the address, what about the size? Wouldn't it be
> > > better to use copy_to_iter() rather than implement it badly by hand?
> > > 
> > >                 Linus
> > Bijan can you respond please?
> > Are you going to look into this and convert code to copy_to_iter?
> > I don't think we should release Linux like this, so if you don't
> > have the time I'd rather revert for now and you can look
> > into reposting for the next release.
> > 
> > Thanks,
> > 
> 
> Sure, will do.  Can I send an individual patch for the fix to
> vhost_scsi_send_tmf_reject()?
> 
> Thanks.
> 
> --bijan

Please go ahead.

-- 
MST

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Michael S. Tsirkin @ 2018-11-30 16:46 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, Maxime Coquelin,
	wexu
In-Reply-To: <20181130162416.GA19234@debian>

On Sat, Dec 01, 2018 at 12:24:16AM +0800, Tiwei Bie wrote:
> On Fri, Nov 30, 2018 at 10:53:07AM -0500, Michael S. Tsirkin wrote:
> > On Fri, Nov 30, 2018 at 11:37:37PM +0800, Tiwei Bie wrote:
> > > On Fri, Nov 30, 2018 at 08:52:42AM -0500, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 30, 2018 at 02:01:06PM +0100, Maxime Coquelin wrote:
> > > > > On 11/30/18 1:47 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> > > > > > > On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > > > > > > > 
> > > > > > > > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > > > > > > > Add types and macros for packed ring.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > > ---
> > > > > > > > >    include/uapi/linux/virtio_config.h |  3 +++
> > > > > > > > >    include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > > > > > > > >    2 files changed, 55 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > > > > > > > index 449132c76b1c..1196e1c1d4f6 100644
> > > > > > > > > --- a/include/uapi/linux/virtio_config.h
> > > > > > > > > +++ b/include/uapi/linux/virtio_config.h
> > > > > > > > > @@ -75,6 +75,9 @@
> > > > > > > > >     */
> > > > > > > > >    #define VIRTIO_F_IOMMU_PLATFORM		33
> > > > > > > > > +/* This feature indicates support for the packed virtqueue layout. */
> > > > > > > > > +#define VIRTIO_F_RING_PACKED		34
> > > > > > > > > +
> > > > > > > > >    /*
> > > > > > > > >     * Does the device support Single Root I/O Virtualization?
> > > > > > > > >     */
> > > > > > > > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > > > > > > > index 6d5d5faa989b..2414f8af26b3 100644
> > > > > > > > > --- a/include/uapi/linux/virtio_ring.h
> > > > > > > > > +++ b/include/uapi/linux/virtio_ring.h
> > > > > > > > > @@ -44,6 +44,13 @@
> > > > > > > > >    /* This means the buffer contains a list of buffer descriptors. */
> > > > > > > > >    #define VRING_DESC_F_INDIRECT	4
> > > > > > > > > +/*
> > > > > > > > > + * Mark a descriptor as available or used in packed ring.
> > > > > > > > > + * Notice: they are defined as shifts instead of shifted values.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > This looks inconsistent to previous flags, any reason for using shifts?
> > > > > > > 
> > > > > > > Yeah, it was suggested to use shifts, as _F_ should be a bit
> > > > > > > number, not a shifted value:
> > > > > > > 
> > > > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > > > 
> > > > > > But let's add a _SPLIT_ variant that uses shifts consistently.
> > > > > 
> > > > > Maybe we could avoid adding SPLIT and PACKED, but define as follow:
> > > > > 
> > > > > #define VRING_DESC_F_INDIRECT_SHIFT 2
> > > > > #define VRING_DESC_F_INDIRECT (1ull <<  VRING_DESC_F_INDIRECT_SHIFT)
> > > > > 
> > > > > #define VRING_DESC_F_AVAIL_SHIFT 7
> > > > > #define VRING_DESC_F_AVAIL (1ull << VRING_DESC_F_AVAIL_SHIFT)
> > > > > 
> > > > > I think it would be better for consistency.
> > > > 
> > > > I don't think that will help. the problem is that
> > > > most of the existing virtio code consistently uses _F_ as shifts.
> > > > So we just need to do something about these 5 being inconsistent:
> > > > 
> > > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_NEXT      1
> > > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_WRITE     2
> > > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_INDIRECT  4
> > > > include/uapi/linux/virtio_ring.h:#define VRING_USED_F_NO_NOTIFY 1
> > > > include/uapi/linux/virtio_ring.h:#define VRING_AVAIL_F_NO_INTERRUPT     1
> > > > 
> > > > I do not want all of virtio to become verbose with _SHIFT, ergo
> > > > we need to change the above 5 to have names which are with _F_ and
> > > > have the bit number.
> > > 
> > > How about something like this:
> > > 
> > > #define VRING_COMM_DESC_F_NEXT			0
> > > #define VRING_COMM_DESC_F_WRITE			1
> > > #define VRING_COMM_DESC_F_INDIRECT		2
> > > 
> > > #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> > > #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> > > 
> > > or
> > > 
> > > #define VRING_SPLIT_DESC_F_NEXT			0
> > > #define VRING_SPLIT_DESC_F_WRITE		1
> > > #define VRING_SPLIT_DESC_F_INDIRECT		2
> > > 
> > > #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> > > #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> > > 
> > > #define VRING_PACKED_DESC_F_NEXT		0
> > > #define VRING_PACKED_DESC_F_WRITE		1
> > > #define VRING_PACKED_DESC_F_INDIRECT		2
> > 
> > As we aren't sharing code I think I prefer the second form.
> > Maybe add VRING_NO_LEGACY so people can audit their code
> > for assumptions?
> 
> Maybe it's better to name it as VIRTIO_RING_NO_LEGACY
> which is more consistent with the names in other files.

OK

> > 
> > We also want to guard layout definitions at the end of that file
> > I think.
> 
> Do you mean guard the definitions of `struct vring` and
> `struct vring_packed` with _NO_LEGACY?

I really mean vring_size/vring_init/vring_used_event/vring_avail_event
these are pre- virtio 1 and should be disabled when
building with no legacy.

But yes I would say let's make sure people can set a flag and
find all dependencies of the split layout in their code easily.

> > 
> > > > 
> > > > 
> > > > 
> > > > > > 
> > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > + */
> > > > > > > > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > > > > > > > +#define VRING_PACKED_DESC_F_USED	15
> > > > > > > > > +
> > > > > > > > >    /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > > > > > > > >     * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > > > > > > > >     * will still kick if it's out of buffers. */
> > > > > > > > > @@ -53,6 +60,23 @@
> > > > > > > > >     * optimization.  */
> > > > > > > > >    #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > > > > > > > +/* Enable events in packed ring. */
> > > > > > > > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > > > > > > > +/* Disable events in packed ring. */
> > > > > > > > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > > > > > > > +/*
> > > > > > > > > + * Enable events for a specific descriptor in packed ring.
> > > > > > > > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > > > > > > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > > > > > > > + */
> > > > > > > > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Any reason for using _FLAG_ instead of _F_?
> > > > > > > 
> > > > > > > Yeah, it was suggested to not use _F_, as these are values,
> > > > > > > should not have _F_:
> > > > > > > 
> > > > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > > > > 
> > > > > > > Regards,
> > > > > > > Tiwei
> > > > > > > 
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > +
> > > > > > > > > +/*
> > > > > > > > > + * Wrap counter bit shift in event suppression structure
> > > > > > > > > + * of packed ring.
> > > > > > > > > + */
> > > > > > > > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > > > > > > > +
> > > > > > > > >    /* We support indirect buffer descriptors */
> > > > > > > > >    #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > > > > > > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > > > > > > > >    	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > > > > > > > >    }
> > > > > > > > > +struct vring_packed_desc_event {
> > > > > > > > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > > > > > > > +	__le16 off_wrap;
> > > > > > > > > +	/* Descriptor Ring Change Event Flags. */
> > > > > > > > > +	__le16 flags;
> > > > > > > > > +};
> > > > > > > > > +
> > > > > > > > > +struct vring_packed_desc {
> > > > > > > > > +	/* Buffer Address. */
> > > > > > > > > +	__le64 addr;
> > > > > > > > > +	/* Buffer Length. */
> > > > > > > > > +	__le32 len;
> > > > > > > > > +	/* Buffer ID. */
> > > > > > > > > +	__le16 id;
> > > > > > > > > +	/* The flags depending on descriptor type. */
> > > > > > > > > +	__le16 flags;
> > > > > > > > > +};
> > > > > > > > > +
> > > > > > > > > +struct vring_packed {
> > > > > > > > > +	unsigned int num;
> > > > > > > > > +
> > > > > > > > > +	struct vring_packed_desc *desc;
> > > > > > > > > +
> > > > > > > > > +	struct vring_packed_desc_event *driver;
> > > > > > > > > +
> > > > > > > > > +	struct vring_packed_desc_event *device;
> > > > > > > > > +};
> > > > > > > > > +
> > > > > > > > >    #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Tiwei Bie @ 2018-11-30 16:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, netdev, linux-kernel, virtualization, Maxime Coquelin,
	wexu
In-Reply-To: <20181130104957-mutt-send-email-mst@kernel.org>

On Fri, Nov 30, 2018 at 10:53:07AM -0500, Michael S. Tsirkin wrote:
> On Fri, Nov 30, 2018 at 11:37:37PM +0800, Tiwei Bie wrote:
> > On Fri, Nov 30, 2018 at 08:52:42AM -0500, Michael S. Tsirkin wrote:
> > > On Fri, Nov 30, 2018 at 02:01:06PM +0100, Maxime Coquelin wrote:
> > > > On 11/30/18 1:47 PM, Michael S. Tsirkin wrote:
> > > > > On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> > > > > > On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > > > > > > 
> > > > > > > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > > > > > > Add types and macros for packed ring.
> > > > > > > > 
> > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > ---
> > > > > > > >    include/uapi/linux/virtio_config.h |  3 +++
> > > > > > > >    include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > > > > > > >    2 files changed, 55 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > > > > > > index 449132c76b1c..1196e1c1d4f6 100644
> > > > > > > > --- a/include/uapi/linux/virtio_config.h
> > > > > > > > +++ b/include/uapi/linux/virtio_config.h
> > > > > > > > @@ -75,6 +75,9 @@
> > > > > > > >     */
> > > > > > > >    #define VIRTIO_F_IOMMU_PLATFORM		33
> > > > > > > > +/* This feature indicates support for the packed virtqueue layout. */
> > > > > > > > +#define VIRTIO_F_RING_PACKED		34
> > > > > > > > +
> > > > > > > >    /*
> > > > > > > >     * Does the device support Single Root I/O Virtualization?
> > > > > > > >     */
> > > > > > > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > > > > > > index 6d5d5faa989b..2414f8af26b3 100644
> > > > > > > > --- a/include/uapi/linux/virtio_ring.h
> > > > > > > > +++ b/include/uapi/linux/virtio_ring.h
> > > > > > > > @@ -44,6 +44,13 @@
> > > > > > > >    /* This means the buffer contains a list of buffer descriptors. */
> > > > > > > >    #define VRING_DESC_F_INDIRECT	4
> > > > > > > > +/*
> > > > > > > > + * Mark a descriptor as available or used in packed ring.
> > > > > > > > + * Notice: they are defined as shifts instead of shifted values.
> > > > > > > 
> > > > > > > 
> > > > > > > This looks inconsistent to previous flags, any reason for using shifts?
> > > > > > 
> > > > > > Yeah, it was suggested to use shifts, as _F_ should be a bit
> > > > > > number, not a shifted value:
> > > > > > 
> > > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > > 
> > > > > But let's add a _SPLIT_ variant that uses shifts consistently.
> > > > 
> > > > Maybe we could avoid adding SPLIT and PACKED, but define as follow:
> > > > 
> > > > #define VRING_DESC_F_INDIRECT_SHIFT 2
> > > > #define VRING_DESC_F_INDIRECT (1ull <<  VRING_DESC_F_INDIRECT_SHIFT)
> > > > 
> > > > #define VRING_DESC_F_AVAIL_SHIFT 7
> > > > #define VRING_DESC_F_AVAIL (1ull << VRING_DESC_F_AVAIL_SHIFT)
> > > > 
> > > > I think it would be better for consistency.
> > > 
> > > I don't think that will help. the problem is that
> > > most of the existing virtio code consistently uses _F_ as shifts.
> > > So we just need to do something about these 5 being inconsistent:
> > > 
> > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_NEXT      1
> > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_WRITE     2
> > > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_INDIRECT  4
> > > include/uapi/linux/virtio_ring.h:#define VRING_USED_F_NO_NOTIFY 1
> > > include/uapi/linux/virtio_ring.h:#define VRING_AVAIL_F_NO_INTERRUPT     1
> > > 
> > > I do not want all of virtio to become verbose with _SHIFT, ergo
> > > we need to change the above 5 to have names which are with _F_ and
> > > have the bit number.
> > 
> > How about something like this:
> > 
> > #define VRING_COMM_DESC_F_NEXT			0
> > #define VRING_COMM_DESC_F_WRITE			1
> > #define VRING_COMM_DESC_F_INDIRECT		2
> > 
> > #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> > #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> > 
> > or
> > 
> > #define VRING_SPLIT_DESC_F_NEXT			0
> > #define VRING_SPLIT_DESC_F_WRITE		1
> > #define VRING_SPLIT_DESC_F_INDIRECT		2
> > 
> > #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> > #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> > 
> > #define VRING_PACKED_DESC_F_NEXT		0
> > #define VRING_PACKED_DESC_F_WRITE		1
> > #define VRING_PACKED_DESC_F_INDIRECT		2
> 
> As we aren't sharing code I think I prefer the second form.
> Maybe add VRING_NO_LEGACY so people can audit their code
> for assumptions?

Maybe it's better to name it as VIRTIO_RING_NO_LEGACY
which is more consistent with the names in other files.

> 
> We also want to guard layout definitions at the end of that file
> I think.

Do you mean guard the definitions of `struct vring` and
`struct vring_packed` with _NO_LEGACY?

> 
> > > 
> > > 
> > > 
> > > > > 
> > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > + */
> > > > > > > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > > > > > > +#define VRING_PACKED_DESC_F_USED	15
> > > > > > > > +
> > > > > > > >    /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > > > > > > >     * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > > > > > > >     * will still kick if it's out of buffers. */
> > > > > > > > @@ -53,6 +60,23 @@
> > > > > > > >     * optimization.  */
> > > > > > > >    #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > > > > > > +/* Enable events in packed ring. */
> > > > > > > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > > > > > > +/* Disable events in packed ring. */
> > > > > > > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > > > > > > +/*
> > > > > > > > + * Enable events for a specific descriptor in packed ring.
> > > > > > > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > > > > > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > > > > > > + */
> > > > > > > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > > > > > > 
> > > > > > > 
> > > > > > > Any reason for using _FLAG_ instead of _F_?
> > > > > > 
> > > > > > Yeah, it was suggested to not use _F_, as these are values,
> > > > > > should not have _F_:
> > > > > > 
> > > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > > > 
> > > > > > Regards,
> > > > > > Tiwei
> > > > > > 
> > > > > > > 
> > > > > > > Thanks
> > > > > > > 
> > > > > > > 
> > > > > > > > +
> > > > > > > > +/*
> > > > > > > > + * Wrap counter bit shift in event suppression structure
> > > > > > > > + * of packed ring.
> > > > > > > > + */
> > > > > > > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > > > > > > +
> > > > > > > >    /* We support indirect buffer descriptors */
> > > > > > > >    #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > > > > > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > > > > > > >    	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > > > > > > >    }
> > > > > > > > +struct vring_packed_desc_event {
> > > > > > > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > > > > > > +	__le16 off_wrap;
> > > > > > > > +	/* Descriptor Ring Change Event Flags. */
> > > > > > > > +	__le16 flags;
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > +struct vring_packed_desc {
> > > > > > > > +	/* Buffer Address. */
> > > > > > > > +	__le64 addr;
> > > > > > > > +	/* Buffer Length. */
> > > > > > > > +	__le32 len;
> > > > > > > > +	/* Buffer ID. */
> > > > > > > > +	__le16 id;
> > > > > > > > +	/* The flags depending on descriptor type. */
> > > > > > > > +	__le16 flags;
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > +struct vring_packed {
> > > > > > > > +	unsigned int num;
> > > > > > > > +
> > > > > > > > +	struct vring_packed_desc *desc;
> > > > > > > > +
> > > > > > > > +	struct vring_packed_desc_event *driver;
> > > > > > > > +
> > > > > > > > +	struct vring_packed_desc_event *device;
> > > > > > > > +};
> > > > > > > > +
> > > > > > > >    #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* [PATCH v2] vhost: fix IOTLB locking
From: Jean-Philippe Brucker @ 2018-11-30 16:05 UTC (permalink / raw)
  To: mst, jasowang; +Cc: netdev, davem, kvm, virtualization

Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq
lock to improve scalability, but introduced a possible deadlock in
vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding
the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the
spinlock is taken while holding vq->mutex.

Since calling vhost_poll_queue() doesn't require any lock, avoid the
deadlock by not taking vq->mutex.

Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one")
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/vhost/vhost.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 3a5f81a66d34..6b98d8e3a5bf 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -944,10 +944,7 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
 		if (msg->iova <= vq_msg->iova &&
 		    msg->iova + msg->size - 1 >= vq_msg->iova &&
 		    vq_msg->type == VHOST_IOTLB_MISS) {
-			mutex_lock(&node->vq->mutex);
 			vhost_poll_queue(&node->vq->poll);
-			mutex_unlock(&node->vq->mutex);
-
 			list_del(&node->node);
 			kfree(node);
 		}
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Michael S. Tsirkin @ 2018-11-30 15:53 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, Maxime Coquelin,
	wexu
In-Reply-To: <20181130153737.GA15105@debian>

On Fri, Nov 30, 2018 at 11:37:37PM +0800, Tiwei Bie wrote:
> On Fri, Nov 30, 2018 at 08:52:42AM -0500, Michael S. Tsirkin wrote:
> > On Fri, Nov 30, 2018 at 02:01:06PM +0100, Maxime Coquelin wrote:
> > > On 11/30/18 1:47 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> > > > > On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > > > > > 
> > > > > > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > > > > > Add types and macros for packed ring.
> > > > > > > 
> > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > ---
> > > > > > >    include/uapi/linux/virtio_config.h |  3 +++
> > > > > > >    include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > > > > > >    2 files changed, 55 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > > > > > index 449132c76b1c..1196e1c1d4f6 100644
> > > > > > > --- a/include/uapi/linux/virtio_config.h
> > > > > > > +++ b/include/uapi/linux/virtio_config.h
> > > > > > > @@ -75,6 +75,9 @@
> > > > > > >     */
> > > > > > >    #define VIRTIO_F_IOMMU_PLATFORM		33
> > > > > > > +/* This feature indicates support for the packed virtqueue layout. */
> > > > > > > +#define VIRTIO_F_RING_PACKED		34
> > > > > > > +
> > > > > > >    /*
> > > > > > >     * Does the device support Single Root I/O Virtualization?
> > > > > > >     */
> > > > > > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > > > > > index 6d5d5faa989b..2414f8af26b3 100644
> > > > > > > --- a/include/uapi/linux/virtio_ring.h
> > > > > > > +++ b/include/uapi/linux/virtio_ring.h
> > > > > > > @@ -44,6 +44,13 @@
> > > > > > >    /* This means the buffer contains a list of buffer descriptors. */
> > > > > > >    #define VRING_DESC_F_INDIRECT	4
> > > > > > > +/*
> > > > > > > + * Mark a descriptor as available or used in packed ring.
> > > > > > > + * Notice: they are defined as shifts instead of shifted values.
> > > > > > 
> > > > > > 
> > > > > > This looks inconsistent to previous flags, any reason for using shifts?
> > > > > 
> > > > > Yeah, it was suggested to use shifts, as _F_ should be a bit
> > > > > number, not a shifted value:
> > > > > 
> > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > 
> > > > But let's add a _SPLIT_ variant that uses shifts consistently.
> > > 
> > > Maybe we could avoid adding SPLIT and PACKED, but define as follow:
> > > 
> > > #define VRING_DESC_F_INDIRECT_SHIFT 2
> > > #define VRING_DESC_F_INDIRECT (1ull <<  VRING_DESC_F_INDIRECT_SHIFT)
> > > 
> > > #define VRING_DESC_F_AVAIL_SHIFT 7
> > > #define VRING_DESC_F_AVAIL (1ull << VRING_DESC_F_AVAIL_SHIFT)
> > > 
> > > I think it would be better for consistency.
> > 
> > I don't think that will help. the problem is that
> > most of the existing virtio code consistently uses _F_ as shifts.
> > So we just need to do something about these 5 being inconsistent:
> > 
> > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_NEXT      1
> > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_WRITE     2
> > include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_INDIRECT  4
> > include/uapi/linux/virtio_ring.h:#define VRING_USED_F_NO_NOTIFY 1
> > include/uapi/linux/virtio_ring.h:#define VRING_AVAIL_F_NO_INTERRUPT     1
> > 
> > I do not want all of virtio to become verbose with _SHIFT, ergo
> > we need to change the above 5 to have names which are with _F_ and
> > have the bit number.
> 
> How about something like this:
> 
> #define VRING_COMM_DESC_F_NEXT			0
> #define VRING_COMM_DESC_F_WRITE			1
> #define VRING_COMM_DESC_F_INDIRECT		2
> 
> #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> 
> or
> 
> #define VRING_SPLIT_DESC_F_NEXT			0
> #define VRING_SPLIT_DESC_F_WRITE		1
> #define VRING_SPLIT_DESC_F_INDIRECT		2
> 
> #define VRING_SPLIT_USED_F_NO_NOTIFY		0
> #define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
> 
> #define VRING_PACKED_DESC_F_NEXT		0
> #define VRING_PACKED_DESC_F_WRITE		1
> #define VRING_PACKED_DESC_F_INDIRECT		2

As we aren't sharing code I think I prefer the second form.
Maybe add VRING_NO_LEGACY so people can audit their code
for assumptions?

We also want to guard layout definitions at the end of that file
I think.

> > 
> > 
> > 
> > > > 
> > > > 
> > > > > > 
> > > > > > 
> > > > > > > + */
> > > > > > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > > > > > +#define VRING_PACKED_DESC_F_USED	15
> > > > > > > +
> > > > > > >    /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > > > > > >     * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > > > > > >     * will still kick if it's out of buffers. */
> > > > > > > @@ -53,6 +60,23 @@
> > > > > > >     * optimization.  */
> > > > > > >    #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > > > > > +/* Enable events in packed ring. */
> > > > > > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > > > > > +/* Disable events in packed ring. */
> > > > > > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > > > > > +/*
> > > > > > > + * Enable events for a specific descriptor in packed ring.
> > > > > > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > > > > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > > > > > + */
> > > > > > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > > > > > 
> > > > > > 
> > > > > > Any reason for using _FLAG_ instead of _F_?
> > > > > 
> > > > > Yeah, it was suggested to not use _F_, as these are values,
> > > > > should not have _F_:
> > > > > 
> > > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > > 
> > > > > Regards,
> > > > > Tiwei
> > > > > 
> > > > > > 
> > > > > > Thanks
> > > > > > 
> > > > > > 
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * Wrap counter bit shift in event suppression structure
> > > > > > > + * of packed ring.
> > > > > > > + */
> > > > > > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > > > > > +
> > > > > > >    /* We support indirect buffer descriptors */
> > > > > > >    #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > > > > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > > > > > >    	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > > > > > >    }
> > > > > > > +struct vring_packed_desc_event {
> > > > > > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > > > > > +	__le16 off_wrap;
> > > > > > > +	/* Descriptor Ring Change Event Flags. */
> > > > > > > +	__le16 flags;
> > > > > > > +};
> > > > > > > +
> > > > > > > +struct vring_packed_desc {
> > > > > > > +	/* Buffer Address. */
> > > > > > > +	__le64 addr;
> > > > > > > +	/* Buffer Length. */
> > > > > > > +	__le32 len;
> > > > > > > +	/* Buffer ID. */
> > > > > > > +	__le16 id;
> > > > > > > +	/* The flags depending on descriptor type. */
> > > > > > > +	__le16 flags;
> > > > > > > +};
> > > > > > > +
> > > > > > > +struct vring_packed {
> > > > > > > +	unsigned int num;
> > > > > > > +
> > > > > > > +	struct vring_packed_desc *desc;
> > > > > > > +
> > > > > > > +	struct vring_packed_desc_event *driver;
> > > > > > > +
> > > > > > > +	struct vring_packed_desc_event *device;
> > > > > > > +};
> > > > > > > +
> > > > > > >    #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Tiwei Bie @ 2018-11-30 15:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, netdev, linux-kernel, virtualization, Maxime Coquelin,
	wexu
In-Reply-To: <20181130084722-mutt-send-email-mst@kernel.org>

On Fri, Nov 30, 2018 at 08:52:42AM -0500, Michael S. Tsirkin wrote:
> On Fri, Nov 30, 2018 at 02:01:06PM +0100, Maxime Coquelin wrote:
> > On 11/30/18 1:47 PM, Michael S. Tsirkin wrote:
> > > On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> > > > On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > > > > 
> > > > > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > > > > Add types and macros for packed ring.
> > > > > > 
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > ---
> > > > > >    include/uapi/linux/virtio_config.h |  3 +++
> > > > > >    include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > > > > >    2 files changed, 55 insertions(+)
> > > > > > 
> > > > > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > > > > index 449132c76b1c..1196e1c1d4f6 100644
> > > > > > --- a/include/uapi/linux/virtio_config.h
> > > > > > +++ b/include/uapi/linux/virtio_config.h
> > > > > > @@ -75,6 +75,9 @@
> > > > > >     */
> > > > > >    #define VIRTIO_F_IOMMU_PLATFORM		33
> > > > > > +/* This feature indicates support for the packed virtqueue layout. */
> > > > > > +#define VIRTIO_F_RING_PACKED		34
> > > > > > +
> > > > > >    /*
> > > > > >     * Does the device support Single Root I/O Virtualization?
> > > > > >     */
> > > > > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > > > > index 6d5d5faa989b..2414f8af26b3 100644
> > > > > > --- a/include/uapi/linux/virtio_ring.h
> > > > > > +++ b/include/uapi/linux/virtio_ring.h
> > > > > > @@ -44,6 +44,13 @@
> > > > > >    /* This means the buffer contains a list of buffer descriptors. */
> > > > > >    #define VRING_DESC_F_INDIRECT	4
> > > > > > +/*
> > > > > > + * Mark a descriptor as available or used in packed ring.
> > > > > > + * Notice: they are defined as shifts instead of shifted values.
> > > > > 
> > > > > 
> > > > > This looks inconsistent to previous flags, any reason for using shifts?
> > > > 
> > > > Yeah, it was suggested to use shifts, as _F_ should be a bit
> > > > number, not a shifted value:
> > > > 
> > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > 
> > > But let's add a _SPLIT_ variant that uses shifts consistently.
> > 
> > Maybe we could avoid adding SPLIT and PACKED, but define as follow:
> > 
> > #define VRING_DESC_F_INDIRECT_SHIFT 2
> > #define VRING_DESC_F_INDIRECT (1ull <<  VRING_DESC_F_INDIRECT_SHIFT)
> > 
> > #define VRING_DESC_F_AVAIL_SHIFT 7
> > #define VRING_DESC_F_AVAIL (1ull << VRING_DESC_F_AVAIL_SHIFT)
> > 
> > I think it would be better for consistency.
> 
> I don't think that will help. the problem is that
> most of the existing virtio code consistently uses _F_ as shifts.
> So we just need to do something about these 5 being inconsistent:
> 
> include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_NEXT      1
> include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_WRITE     2
> include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_INDIRECT  4
> include/uapi/linux/virtio_ring.h:#define VRING_USED_F_NO_NOTIFY 1
> include/uapi/linux/virtio_ring.h:#define VRING_AVAIL_F_NO_INTERRUPT     1
> 
> I do not want all of virtio to become verbose with _SHIFT, ergo
> we need to change the above 5 to have names which are with _F_ and
> have the bit number.

How about something like this:

#define VRING_COMM_DESC_F_NEXT			0
#define VRING_COMM_DESC_F_WRITE			1
#define VRING_COMM_DESC_F_INDIRECT		2

#define VRING_SPLIT_USED_F_NO_NOTIFY		0
#define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0

or

#define VRING_SPLIT_DESC_F_NEXT			0
#define VRING_SPLIT_DESC_F_WRITE		1
#define VRING_SPLIT_DESC_F_INDIRECT		2

#define VRING_SPLIT_USED_F_NO_NOTIFY		0
#define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0

#define VRING_PACKED_DESC_F_NEXT		0
#define VRING_PACKED_DESC_F_WRITE		1
#define VRING_PACKED_DESC_F_INDIRECT		2

> 
> 
> 
> > > 
> > > 
> > > > > 
> > > > > 
> > > > > > + */
> > > > > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > > > > +#define VRING_PACKED_DESC_F_USED	15
> > > > > > +
> > > > > >    /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > > > > >     * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > > > > >     * will still kick if it's out of buffers. */
> > > > > > @@ -53,6 +60,23 @@
> > > > > >     * optimization.  */
> > > > > >    #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > > > > +/* Enable events in packed ring. */
> > > > > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > > > > +/* Disable events in packed ring. */
> > > > > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > > > > +/*
> > > > > > + * Enable events for a specific descriptor in packed ring.
> > > > > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > > > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > > > > + */
> > > > > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > > > > 
> > > > > 
> > > > > Any reason for using _FLAG_ instead of _F_?
> > > > 
> > > > Yeah, it was suggested to not use _F_, as these are values,
> > > > should not have _F_:
> > > > 
> > > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > > 
> > > > Regards,
> > > > Tiwei
> > > > 
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > 
> > > > > > +
> > > > > > +/*
> > > > > > + * Wrap counter bit shift in event suppression structure
> > > > > > + * of packed ring.
> > > > > > + */
> > > > > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > > > > +
> > > > > >    /* We support indirect buffer descriptors */
> > > > > >    #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > > > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > > > > >    	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > > > > >    }
> > > > > > +struct vring_packed_desc_event {
> > > > > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > > > > +	__le16 off_wrap;
> > > > > > +	/* Descriptor Ring Change Event Flags. */
> > > > > > +	__le16 flags;
> > > > > > +};
> > > > > > +
> > > > > > +struct vring_packed_desc {
> > > > > > +	/* Buffer Address. */
> > > > > > +	__le64 addr;
> > > > > > +	/* Buffer Length. */
> > > > > > +	__le32 len;
> > > > > > +	/* Buffer ID. */
> > > > > > +	__le16 id;
> > > > > > +	/* The flags depending on descriptor type. */
> > > > > > +	__le16 flags;
> > > > > > +};
> > > > > > +
> > > > > > +struct vring_packed {
> > > > > > +	unsigned int num;
> > > > > > +
> > > > > > +	struct vring_packed_desc *desc;
> > > > > > +
> > > > > > +	struct vring_packed_desc_event *driver;
> > > > > > +
> > > > > > +	struct vring_packed_desc_event *device;
> > > > > > +};
> > > > > > +
> > > > > >    #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] vhost: fix IOTLB locking
From: Jean-Philippe Brucker @ 2018-11-30 15:33 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, davem, kvm, virtualization
In-Reply-To: <20181130075347-mutt-send-email-mst@kernel.org>

On 30/11/2018 13:32, Michael S. Tsirkin wrote:
> On Fri, Nov 30, 2018 at 11:37:02AM +0000, Jean-Philippe Brucker wrote:
>> Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq
>> lock to improve scalability, but introduced a possible deadlock in
>> vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding
>> the device's IOTLB spinlock.
> 
> Indeed spin_lock is just outside this snippet. Yack.
> 
>> And on the vhost_iotlb_miss() path, the
>> spinlock is taken while holding vq->mutex.
>>
>> As long as we hold dev->mutex to prevent an ioctl from modifying
>> vq->poll concurrently, we can safely call vhost_poll_queue() without
>> holding vq->mutex. Since vhost_process_iotlb_msg() holds dev->mutex when
>> calling vhost_iotlb_notify_vq(), avoid the deadlock by not taking
>> vq->mutex.
>>
>> Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one")
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> 
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> but see below for a minor comment.
> 
> I guess we now need this on stable?

I don't think so, the bug is introduced in 4.20

> 
>> ---
>>  drivers/vhost/vhost.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 3a5f81a66d34..1cbb17f898f7 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -944,10 +944,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>  		if (msg->iova <= vq_msg->iova &&
>>  		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>>  		    vq_msg->type == VHOST_IOTLB_MISS) {
>> -			mutex_lock(&node->vq->mutex);
>> +			/* Safe to call outside vq->mutex as long as dev->mutex
>> +			 * is held.
>> +			 */
>>  			vhost_poll_queue(&node->vq->poll);
>> -			mutex_unlock(&node->vq->mutex);
>> -
> 
> In fact vhost_poll_queue is generally lockless so it's
> safe to call without any locks.

Right, I'll remove the comment

Thanks,
Jean

> 
> 
>>  			list_del(&node->node);
>>  			kfree(node);
> 
>>  		}
>> -- 
>> 2.19.1
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> 

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Michael S. Tsirkin @ 2018-11-30 13:52 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: virtio-dev, netdev, linux-kernel, virtualization, wexu
In-Reply-To: <13593b5c-c1dc-3ab3-a843-d62908e0c7e0@redhat.com>

On Fri, Nov 30, 2018 at 02:01:06PM +0100, Maxime Coquelin wrote:
> 
> 
> On 11/30/18 1:47 PM, Michael S. Tsirkin wrote:
> > On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> > > On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > > > 
> > > > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > > > Add types and macros for packed ring.
> > > > > 
> > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > ---
> > > > >    include/uapi/linux/virtio_config.h |  3 +++
> > > > >    include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > > > >    2 files changed, 55 insertions(+)
> > > > > 
> > > > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > > > index 449132c76b1c..1196e1c1d4f6 100644
> > > > > --- a/include/uapi/linux/virtio_config.h
> > > > > +++ b/include/uapi/linux/virtio_config.h
> > > > > @@ -75,6 +75,9 @@
> > > > >     */
> > > > >    #define VIRTIO_F_IOMMU_PLATFORM		33
> > > > > +/* This feature indicates support for the packed virtqueue layout. */
> > > > > +#define VIRTIO_F_RING_PACKED		34
> > > > > +
> > > > >    /*
> > > > >     * Does the device support Single Root I/O Virtualization?
> > > > >     */
> > > > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > > > index 6d5d5faa989b..2414f8af26b3 100644
> > > > > --- a/include/uapi/linux/virtio_ring.h
> > > > > +++ b/include/uapi/linux/virtio_ring.h
> > > > > @@ -44,6 +44,13 @@
> > > > >    /* This means the buffer contains a list of buffer descriptors. */
> > > > >    #define VRING_DESC_F_INDIRECT	4
> > > > > +/*
> > > > > + * Mark a descriptor as available or used in packed ring.
> > > > > + * Notice: they are defined as shifts instead of shifted values.
> > > > 
> > > > 
> > > > This looks inconsistent to previous flags, any reason for using shifts?
> > > 
> > > Yeah, it was suggested to use shifts, as _F_ should be a bit
> > > number, not a shifted value:
> > > 
> > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > 
> > But let's add a _SPLIT_ variant that uses shifts consistently.
> 
> Maybe we could avoid adding SPLIT and PACKED, but define as follow:
> 
> #define VRING_DESC_F_INDIRECT_SHIFT 2
> #define VRING_DESC_F_INDIRECT (1ull <<  VRING_DESC_F_INDIRECT_SHIFT)
> 
> #define VRING_DESC_F_AVAIL_SHIFT 7
> #define VRING_DESC_F_AVAIL (1ull << VRING_DESC_F_AVAIL_SHIFT)
> 
> I think it would be better for consistency.

I don't think that will help. the problem is that
most of the existing virtio code consistently uses _F_ as shifts.
So we just need to do something about these 5 being inconsistent:

include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_NEXT      1
include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_WRITE     2
include/uapi/linux/virtio_ring.h:#define VRING_DESC_F_INDIRECT  4
include/uapi/linux/virtio_ring.h:#define VRING_USED_F_NO_NOTIFY 1
include/uapi/linux/virtio_ring.h:#define VRING_AVAIL_F_NO_INTERRUPT     1

I do not want all of virtio to become verbose with _SHIFT, ergo
we need to change the above 5 to have names which are with _F_ and
have the bit number.



> > 
> > 
> > > > 
> > > > 
> > > > > + */
> > > > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > > > +#define VRING_PACKED_DESC_F_USED	15
> > > > > +
> > > > >    /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > > > >     * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > > > >     * will still kick if it's out of buffers. */
> > > > > @@ -53,6 +60,23 @@
> > > > >     * optimization.  */
> > > > >    #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > > > +/* Enable events in packed ring. */
> > > > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > > > +/* Disable events in packed ring. */
> > > > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > > > +/*
> > > > > + * Enable events for a specific descriptor in packed ring.
> > > > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > > > + */
> > > > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > > > 
> > > > 
> > > > Any reason for using _FLAG_ instead of _F_?
> > > 
> > > Yeah, it was suggested to not use _F_, as these are values,
> > > should not have _F_:
> > > 
> > > https://patchwork.ozlabs.org/patch/942296/#1989390
> > > 
> > > Regards,
> > > Tiwei
> > > 
> > > > 
> > > > Thanks
> > > > 
> > > > 
> > > > > +
> > > > > +/*
> > > > > + * Wrap counter bit shift in event suppression structure
> > > > > + * of packed ring.
> > > > > + */
> > > > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > > > +
> > > > >    /* We support indirect buffer descriptors */
> > > > >    #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > > > >    	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > > > >    }
> > > > > +struct vring_packed_desc_event {
> > > > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > > > +	__le16 off_wrap;
> > > > > +	/* Descriptor Ring Change Event Flags. */
> > > > > +	__le16 flags;
> > > > > +};
> > > > > +
> > > > > +struct vring_packed_desc {
> > > > > +	/* Buffer Address. */
> > > > > +	__le64 addr;
> > > > > +	/* Buffer Length. */
> > > > > +	__le32 len;
> > > > > +	/* Buffer ID. */
> > > > > +	__le16 id;
> > > > > +	/* The flags depending on descriptor type. */
> > > > > +	__le16 flags;
> > > > > +};
> > > > > +
> > > > > +struct vring_packed {
> > > > > +	unsigned int num;
> > > > > +
> > > > > +	struct vring_packed_desc *desc;
> > > > > +
> > > > > +	struct vring_packed_desc_event *driver;
> > > > > +
> > > > > +	struct vring_packed_desc_event *device;
> > > > > +};
> > > > > +
> > > > >    #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PULL] vhost: cleanups and fixes
From: Michael S. Tsirkin @ 2018-11-30 13:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: lenaic, mhocko, Kees Cook, kvm, netdev, liang.z.li,
	Linux Kernel Mailing List, virtualization, stefanha, joe,
	Andrew Morton, mhocko, bijan.mottahedeh
In-Reply-To: <CAHk-=wicvws38JqzVF6oNEZ0jGzP6RecR6yAGtyzX3AkoJ321g@mail.gmail.com>

On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > +       memset(&rsp, 0, sizeof(rsp));
> > +       rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > +       resp = vq->iov[out].iov_base;
> > +       ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> >
> > Is it actually safe to trust that iov_base has passed an earlier
> > access_ok() check here? Why not just use copy_to_user() instead?
> 
> Good point.
> 
> We really should have removed those double-underscore things ages ago.
> 
> Also, apart from the address, what about the size? Wouldn't it be
> better to use copy_to_iter() rather than implement it badly by hand?
> 
>                Linus

Bijan can you respond please?
Are you going to look into this and convert code to copy_to_iter?
I don't think we should release Linux like this, so if you don't
have the time I'd rather revert for now and you can look
into reposting for the next release.

Thanks,

-- 
MST

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Michael S. Tsirkin @ 2018-11-30 13:40 UTC (permalink / raw)
  To: Jason Wang; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <27cd8ac6-e892-cfaa-cd39-74f39b452681@redhat.com>

On Fri, Nov 30, 2018 at 09:10:03PM +0800, Jason Wang wrote:
> 
> On 2018/11/30 下午8:55, Jason Wang wrote:
> > 
> > On 2018/11/30 下午8:52, Michael S. Tsirkin wrote:
> > > > >    If you want to compare it with
> > > > > something that would be TCP or QUIC.  The fundamental
> > > > > difference between
> > > > > virtio-vsock and e.g. TCP is that TCP operates in a packet
> > > > > loss environment.
> > > > > So they are using timers for reliability, and receiver is
> > > > > always free to
> > > > > discard any unacked data.
> > > > Virtio-net knows nothing above L2, so they are totally
> > > > transparent to device
> > > > itself. I still don't get why not using virtio-net instead.
> > > > 
> > > > 
> > > > Thanks
> > > Is your question why is virtio-vsock used instead of TCP on top of IP
> > > on top of virtio-net?
> > > 
> > > 
> > 
> > No, my question is why not do vsock through virtio-net.
> > 
> > Thanks
> > 
> 
> Just to clarify, it's not about vosck over ethernet, and it's not about
> inventing new features or APIs. It's probably something like:
> 
> - Let virtio-net driver probe vsock device and do vosck specific things if
> needed to share as much codes.
> 
> - A new kind of sockfd (which is vsock based) for vhost-net for it to do
> vsock specific things (hopefully it can be transparent).
> 
> The change should be totally transparent to userspace applications.
> 
> Thanks

Which code is duplicated between virtio vsock and virtio net right now?

-- 
MST
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Michael S. Tsirkin @ 2018-11-30 13:34 UTC (permalink / raw)
  To: Jason Wang; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <55352308-9ceb-413e-44f6-e3dfd8f642cc@redhat.com>

On Fri, Nov 30, 2018 at 08:55:17PM +0800, Jason Wang wrote:
> 
> On 2018/11/30 下午8:52, Michael S. Tsirkin wrote:
> > > >    If you want to compare it with
> > > > something that would be TCP or QUIC.  The fundamental difference between
> > > > virtio-vsock and e.g. TCP is that TCP operates in a packet loss environment.
> > > > So they are using timers for reliability, and receiver is always free to
> > > > discard any unacked data.
> > > Virtio-net knows nothing above L2, so they are totally transparent to device
> > > itself. I still don't get why not using virtio-net instead.
> > > 
> > > 
> > > Thanks
> > Is your question why is virtio-vsock used instead of TCP on top of IP
> > on top of virtio-net?
> > 
> > 
> 
> No, my question is why not do vsock through virtio-net.
> 
> Thanks

Because apps need reliability, multiplexing and flow control and
virtio-net does not provide it.

-- 
MST
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] vhost: fix IOTLB locking
From: Michael S. Tsirkin @ 2018-11-30 13:32 UTC (permalink / raw)
  To: Jean-Philippe Brucker; +Cc: kvm, netdev, virtualization, davem
In-Reply-To: <20181130113702.1565-1-jean-philippe.brucker@arm.com>

On Fri, Nov 30, 2018 at 11:37:02AM +0000, Jean-Philippe Brucker wrote:
> Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq
> lock to improve scalability, but introduced a possible deadlock in
> vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding
> the device's IOTLB spinlock.

Indeed spin_lock is just outside this snippet. Yack.

> And on the vhost_iotlb_miss() path, the
> spinlock is taken while holding vq->mutex.
> 
> As long as we hold dev->mutex to prevent an ioctl from modifying
> vq->poll concurrently, we can safely call vhost_poll_queue() without
> holding vq->mutex. Since vhost_process_iotlb_msg() holds dev->mutex when
> calling vhost_iotlb_notify_vq(), avoid the deadlock by not taking
> vq->mutex.
> 
> Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one")
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>


Acked-by: Michael S. Tsirkin <mst@redhat.com>

but see below for a minor comment.

I guess we now need this on stable?

> ---
>  drivers/vhost/vhost.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 3a5f81a66d34..1cbb17f898f7 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -944,10 +944,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>  		if (msg->iova <= vq_msg->iova &&
>  		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>  		    vq_msg->type == VHOST_IOTLB_MISS) {
> -			mutex_lock(&node->vq->mutex);
> +			/* Safe to call outside vq->mutex as long as dev->mutex
> +			 * is held.
> +			 */
>  			vhost_poll_queue(&node->vq->poll);
> -			mutex_unlock(&node->vq->mutex);
> -

In fact vhost_poll_queue is generally lockless so it's
safe to call without any locks.


>  			list_del(&node->node);
>  			kfree(node);

>  		}
> -- 
> 2.19.1

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Jason Wang @ 2018-11-30 13:10 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <55352308-9ceb-413e-44f6-e3dfd8f642cc@redhat.com>


On 2018/11/30 下午8:55, Jason Wang wrote:
>
> On 2018/11/30 下午8:52, Michael S. Tsirkin wrote:
>>>>    If you want to compare it with
>>>> something that would be TCP or QUIC.  The fundamental difference 
>>>> between
>>>> virtio-vsock and e.g. TCP is that TCP operates in a packet loss 
>>>> environment.
>>>> So they are using timers for reliability, and receiver is always 
>>>> free to
>>>> discard any unacked data.
>>> Virtio-net knows nothing above L2, so they are totally transparent 
>>> to device
>>> itself. I still don't get why not using virtio-net instead.
>>>
>>>
>>> Thanks
>> Is your question why is virtio-vsock used instead of TCP on top of IP
>> on top of virtio-net?
>>
>>
>
> No, my question is why not do vsock through virtio-net.
>
> Thanks
>

Just to clarify, it's not about vosck over ethernet, and it's not about 
inventing new features or APIs. It's probably something like:

- Let virtio-net driver probe vsock device and do vosck specific things 
if needed to share as much codes.

- A new kind of sockfd (which is vsock based) for vhost-net for it to do 
vsock specific things (hopefully it can be transparent).

The change should be totally transparent to userspace applications.

Thanks


_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] vhost: fix IOTLB locking
From: Jason Wang @ 2018-11-30 12:56 UTC (permalink / raw)
  To: Jean-Philippe Brucker, mst; +Cc: netdev, davem, kvm, virtualization
In-Reply-To: <20181130113702.1565-1-jean-philippe.brucker@arm.com>


On 2018/11/30 下午7:37, Jean-Philippe Brucker wrote:
> Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq
> lock to improve scalability, but introduced a possible deadlock in
> vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding
> the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the
> spinlock is taken while holding vq->mutex.
>
> As long as we hold dev->mutex to prevent an ioctl from modifying
> vq->poll concurrently, we can safely call vhost_poll_queue() without
> holding vq->mutex. Since vhost_process_iotlb_msg() holds dev->mutex when
> calling vhost_iotlb_notify_vq(), avoid the deadlock by not taking
> vq->mutex.
>
> Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one")
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
>   drivers/vhost/vhost.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 3a5f81a66d34..1cbb17f898f7 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -944,10 +944,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>   		if (msg->iova <= vq_msg->iova &&
>   		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>   		    vq_msg->type == VHOST_IOTLB_MISS) {
> -			mutex_lock(&node->vq->mutex);
> +			/* Safe to call outside vq->mutex as long as dev->mutex
> +			 * is held.
> +			 */
>   			vhost_poll_queue(&node->vq->poll);
> -			mutex_unlock(&node->vq->mutex);
> -
>   			list_del(&node->node);
>   			kfree(node);
>   		}


Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Jason Wang @ 2018-11-30 12:55 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <20181130075134-mutt-send-email-mst@kernel.org>


On 2018/11/30 下午8:52, Michael S. Tsirkin wrote:
>>>    If you want to compare it with
>>> something that would be TCP or QUIC.  The fundamental difference between
>>> virtio-vsock and e.g. TCP is that TCP operates in a packet loss environment.
>>> So they are using timers for reliability, and receiver is always free to
>>> discard any unacked data.
>> Virtio-net knows nothing above L2, so they are totally transparent to device
>> itself. I still don't get why not using virtio-net instead.
>>
>>
>> Thanks
> Is your question why is virtio-vsock used instead of TCP on top of IP
> on top of virtio-net?
>
>

No, my question is why not do vsock through virtio-net.

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Michael S. Tsirkin @ 2018-11-30 12:52 UTC (permalink / raw)
  To: Jason Wang; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <7e78fc3d-0d5a-090f-476d-03ad490ff8a2@redhat.com>

On Fri, Nov 30, 2018 at 08:45:39PM +0800, Jason Wang wrote:
> 
> On 2018/11/29 下午10:00, Michael S. Tsirkin wrote:
> > On Thu, Nov 15, 2018 at 04:24:38PM +0800, Jason Wang wrote:
> > > On 2018/11/15 下午3:04, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 15, 2018 at 11:56:03AM +0800, jiangyiwen wrote:
> > > > > Hi Stefan, Michael, Jason and everyone,
> > > > > 
> > > > > Several days ago, I discussed with jason about "Vsock over Virtio-net".
> > > > > This idea has two advantages:
> > > > > First, it can use many great features of virtio-net, like batching,
> > > > > mergeable rx buffer and multiqueue, etc.
> > > > > Second, it can reduce many duplicate codes and make it easy to be
> > > > > maintained.
> > > > I'm not sure I get the motivation. Which features of
> > > > virtio net are relevant to vsock?
> > > 
> > > Vsock is just a L2 (and above) protocol from the view of the device.
> > I don't believe so. I think virtio-vsock operates at a transport level.
> > There is in theory a bit of network level but we don't really implement
> > it as it's only host to guest.  I am not aware of any data link
> > functionality n virtio-vsock. virtio-vsock provides services such as
> > connection-oriented communication, reliability, flow control and
> > multiplexing.
> 
> 
> Ok, consider it doesn't implement L2, it's pretty fit for virtio-net I
> believe?
> 
> 
> > 
> > > So I
> > > think we should answer the question why we need two different paths for
> > > networking traffic? Or what is the fundamental reason that makes vsock does
> > > not go for virtio-net?
> > So virtio-vsock ensures reliability.
> 
> 
> It's done at the level of protocol instead of virtio transport or virtio
> device.
> 
> 
> >   If you want to compare it with
> > something that would be TCP or QUIC.  The fundamental difference between
> > virtio-vsock and e.g. TCP is that TCP operates in a packet loss environment.
> > So they are using timers for reliability, and receiver is always free to
> > discard any unacked data.
> 
> 
> Virtio-net knows nothing above L2, so they are totally transparent to device
> itself. I still don't get why not using virtio-net instead.
> 
> 
> Thanks

Is your question why is virtio-vsock used instead of TCP on top of IP
on top of virtio-net?


> 
> > 
> > 
> > > I agree they could be different type of devices but codes could be shared in
> > > both guest and host (or even qemu) for not duplicating features(bugs).
> > > 
> > > Thanks
> > > 
> > > 
> > > > The ones that you mention
> > > > all seem to be mostly of use to the networking stack.
> > > > 
> > > > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH 0/5] VSOCK: support mergeable rx buffer in vhost-vsock
From: Jason Wang @ 2018-11-30 12:52 UTC (permalink / raw)
  To: Stefan Hajnoczi, jiangyiwen; +Cc: netdev, kvm, virtualization
In-Reply-To: <20181129141928.GC17554@stefanha-x1.localdomain>


On 2018/11/29 下午10:19, Stefan Hajnoczi wrote:
> On Tue, Nov 06, 2018 at 01:53:54PM +0800, jiangyiwen wrote:
>> On 2018/11/6 11:32, Jason Wang wrote:
>>> On 2018/11/6 上午11:17, jiangyiwen wrote:
>>>> On 2018/11/6 10:41, Jason Wang wrote:
>>>>> On 2018/11/6 上午10:17, jiangyiwen wrote:
>>>>>> On 2018/11/5 17:21, Jason Wang wrote:
>>>>>>> On 2018/11/5 下午3:43, jiangyiwen wrote:
>>>>>>>> Now vsock only support send/receive small packet, it can't achieve
>>>>>>>> high performance. As previous discussed with Jason Wang, I revisit the
>>>>>>>> idea of vhost-net about mergeable rx buffer and implement the mergeable
>>>>>>>> rx buffer in vhost-vsock, it can allow big packet to be scattered in
>>>>>>>> into different buffers and improve performance obviously.
>>>>>>>>
>>>>>>>> I write a tool to test the vhost-vsock performance, mainly send big
>>>>>>>> packet(64K) included guest->Host and Host->Guest. The result as
>>>>>>>> follows:
>>>>>>>>
>>>>>>>> Before performance:
>>>>>>>>                   Single socket            Multiple sockets(Max Bandwidth)
>>>>>>>> Guest->Host   ~400MB/s                 ~480MB/s
>>>>>>>> Host->Guest   ~1450MB/s                ~1600MB/s
>>>>>>>>
>>>>>>>> After performance:
>>>>>>>>                   Single socket            Multiple sockets(Max Bandwidth)
>>>>>>>> Guest->Host   ~1700MB/s                ~2900MB/s
>>>>>>>> Host->Guest   ~1700MB/s                ~2900MB/s
>>>>>>>>
>>>>>>>>     From the test results, the performance is improved obviously, and guest
>>>>>>>> memory will not be wasted.
>>>>>>> Hi:
>>>>>>>
>>>>>>> Thanks for the patches and the numbers are really impressive.
>>>>>>>
>>>>>>> But instead of duplicating codes between sock and net. I was considering to use virtio-net as a transport of vsock. Then we may have all existed features likes batching, mergeable rx buffers and multiqueue. Want to consider this idea? Thoughts?
>>>>>>>
>>>>>>>
>>>>>> Hi Jason,
>>>>>>
>>>>>> I am not very familiar with virtio-net, so I am afraid I can't give too
>>>>>> much effective advice. Then I have several problems:
>>>>>>
>>>>>> 1. If use virtio-net as a transport, guest should see a virtio-net
>>>>>> device instead of virtio-vsock device, right? Is vsock only as a
>>>>>> transport between socket and net_device? User should still use
>>>>>> AF_VSOCK type to create socket, right?
>>>>> Well, there're many choices. What you need is just to keep the socket API and hide the implementation. For example, you can keep the vosck device in guest and switch to use vhost-net in host. We probably need a new feature bit or header to let vhost know we are passing vsock packet. And vhost-net could forward the packet to vsock core on host.
>>>>>
>>>>>
>>>>>> 2. I want to know if this idea has already started, and how is
>>>>>> the current progress?
>>>>> Not yet started.  Just want to listen from the community. If this sounds good, do you have interest in implementing this?
>>>>>
>>>>>
>>>>>> 3. And what is stefan's idea?
>>>>> Talk with Stefan a little on this during KVM Forum. I think he tends to agree on this idea. Anyway, let's wait for his reply.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>> Hi Jason,
>>>>
>>>> Thanks your reply, what you want is try to avoid duplicate code, and still
>>>> use the existed features with virtio-net.
>>>
>>> Yes, technically we can use virtio-net driver is guest as well but we could do it step by step.
>>>
>>>
>>>> Yes, if this sounds good and most people can recognize this idea, I am very
>>>> happy to implement this.
>>>
>>> Cool, thanks.
>>>
>>>
>>>> In addition, I hope you can review these patches before the new idea is
>>>> implemented, after all the performance can be improved. :-)
>>>
>>> Ok.
>>>
>>>
>>> So the patch actually did three things:
>>>
>>> - mergeable buffer implementation
>>>
>>> - increase the default rx buffer size
>>>
>>> - add used and signal guest in a batch
>>>
>>> It would be helpful if you can measure the performance improvement independently. This can give reviewer a better understanding on how much did each part help.
>>>
>>> Thanks
>>>
>>>
>> Great, I will test the performance independently in the later version.
> I'm catching up on email so maybe you've already discussed this, but a
> key design point in virtio-vsock is reliable in-order delivery.  When
> using virtio-net code it's important to keep those properties so that
> AF_VSOCK SOCK_STREAM sockets work as expected.  Packets must not be
> reordered or dropped.


Yes, vhost-net does not drop packet itself and it's not hard to forbid 
virtio-net to drop packets.


>
> In addition, there's the virtio-vsock flow control scheme that allows
> multiple sockets to share a ring without starvation or denial-of-service
> problems.  The guest knows how much socket buffer space is available on
> the host (and vice versa).  A well-behaved guest only sends up to the
> available buffer space so that the host can copy the data into the
> socket buffer and free up ring space for other sockets.  This scheme is
> how virtio-vsock achieves guaranteed delivery while avoiding starvation
> or denial-of-service.
>
> So you'll need to use some kind of framing (protocol) that preserves
> these properties on top of virtio-net.  This framing could be based on
> virtio-vsock's packet headers.


Current plan is to reuse those headers.

Thanks


>
> Stefan
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Michael S. Tsirkin @ 2018-11-30 12:47 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181130095339.GA11984@debian>

On Fri, Nov 30, 2018 at 05:53:40PM +0800, Tiwei Bie wrote:
> On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> > 
> > On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > > Add types and macros for packed ring.
> > > 
> > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > ---
> > >   include/uapi/linux/virtio_config.h |  3 +++
> > >   include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> > >   2 files changed, 55 insertions(+)
> > > 
> > > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > > index 449132c76b1c..1196e1c1d4f6 100644
> > > --- a/include/uapi/linux/virtio_config.h
> > > +++ b/include/uapi/linux/virtio_config.h
> > > @@ -75,6 +75,9 @@
> > >    */
> > >   #define VIRTIO_F_IOMMU_PLATFORM		33
> > > +/* This feature indicates support for the packed virtqueue layout. */
> > > +#define VIRTIO_F_RING_PACKED		34
> > > +
> > >   /*
> > >    * Does the device support Single Root I/O Virtualization?
> > >    */
> > > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > > index 6d5d5faa989b..2414f8af26b3 100644
> > > --- a/include/uapi/linux/virtio_ring.h
> > > +++ b/include/uapi/linux/virtio_ring.h
> > > @@ -44,6 +44,13 @@
> > >   /* This means the buffer contains a list of buffer descriptors. */
> > >   #define VRING_DESC_F_INDIRECT	4
> > > +/*
> > > + * Mark a descriptor as available or used in packed ring.
> > > + * Notice: they are defined as shifts instead of shifted values.
> > 
> > 
> > This looks inconsistent to previous flags, any reason for using shifts?
> 
> Yeah, it was suggested to use shifts, as _F_ should be a bit
> number, not a shifted value:
> 
> https://patchwork.ozlabs.org/patch/942296/#1989390

But let's add a _SPLIT_ variant that uses shifts consistently.


> > 
> > 
> > > + */
> > > +#define VRING_PACKED_DESC_F_AVAIL	7
> > > +#define VRING_PACKED_DESC_F_USED	15
> > > +
> > >   /* The Host uses this in used->flags to advise the Guest: don't kick me when
> > >    * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> > >    * will still kick if it's out of buffers. */
> > > @@ -53,6 +60,23 @@
> > >    * optimization.  */
> > >   #define VRING_AVAIL_F_NO_INTERRUPT	1
> > > +/* Enable events in packed ring. */
> > > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > > +/* Disable events in packed ring. */
> > > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > > +/*
> > > + * Enable events for a specific descriptor in packed ring.
> > > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > > + */
> > > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> > 
> > 
> > Any reason for using _FLAG_ instead of _F_?
> 
> Yeah, it was suggested to not use _F_, as these are values,
> should not have _F_:
> 
> https://patchwork.ozlabs.org/patch/942296/#1989390
> 
> Regards,
> Tiwei
> 
> > 
> > Thanks
> > 
> > 
> > > +
> > > +/*
> > > + * Wrap counter bit shift in event suppression structure
> > > + * of packed ring.
> > > + */
> > > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > > +
> > >   /* We support indirect buffer descriptors */
> > >   #define VIRTIO_RING_F_INDIRECT_DESC	28
> > > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> > >   	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> > >   }
> > > +struct vring_packed_desc_event {
> > > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > > +	__le16 off_wrap;
> > > +	/* Descriptor Ring Change Event Flags. */
> > > +	__le16 flags;
> > > +};
> > > +
> > > +struct vring_packed_desc {
> > > +	/* Buffer Address. */
> > > +	__le64 addr;
> > > +	/* Buffer Length. */
> > > +	__le32 len;
> > > +	/* Buffer ID. */
> > > +	__le16 id;
> > > +	/* The flags depending on descriptor type. */
> > > +	__le16 flags;
> > > +};
> > > +
> > > +struct vring_packed {
> > > +	unsigned int num;
> > > +
> > > +	struct vring_packed_desc *desc;
> > > +
> > > +	struct vring_packed_desc_event *driver;
> > > +
> > > +	struct vring_packed_desc_event *device;
> > > +};
> > > +
> > >   #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC] Discuss about an new idea "Vsock over Virtio-net"
From: Jason Wang @ 2018-11-30 12:45 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, virtualization, stefanha, netdev
In-Reply-To: <20181129085049-mutt-send-email-mst@kernel.org>


On 2018/11/29 下午10:00, Michael S. Tsirkin wrote:
> On Thu, Nov 15, 2018 at 04:24:38PM +0800, Jason Wang wrote:
>> On 2018/11/15 下午3:04, Michael S. Tsirkin wrote:
>>> On Thu, Nov 15, 2018 at 11:56:03AM +0800, jiangyiwen wrote:
>>>> Hi Stefan, Michael, Jason and everyone,
>>>>
>>>> Several days ago, I discussed with jason about "Vsock over Virtio-net".
>>>> This idea has two advantages:
>>>> First, it can use many great features of virtio-net, like batching,
>>>> mergeable rx buffer and multiqueue, etc.
>>>> Second, it can reduce many duplicate codes and make it easy to be
>>>> maintained.
>>> I'm not sure I get the motivation. Which features of
>>> virtio net are relevant to vsock?
>>
>> Vsock is just a L2 (and above) protocol from the view of the device.
> I don't believe so. I think virtio-vsock operates at a transport level.
> There is in theory a bit of network level but we don't really implement
> it as it's only host to guest.  I am not aware of any data link
> functionality n virtio-vsock. virtio-vsock provides services such as
> connection-oriented communication, reliability, flow control and
> multiplexing.


Ok, consider it doesn't implement L2, it's pretty fit for virtio-net I 
believe?


>
>> So I
>> think we should answer the question why we need two different paths for
>> networking traffic? Or what is the fundamental reason that makes vsock does
>> not go for virtio-net?
> So virtio-vsock ensures reliability.


It's done at the level of protocol instead of virtio transport or virtio 
device.


>   If you want to compare it with
> something that would be TCP or QUIC.  The fundamental difference between
> virtio-vsock and e.g. TCP is that TCP operates in a packet loss environment.
> So they are using timers for reliability, and receiver is always free to
> discard any unacked data.


Virtio-net knows nothing above L2, so they are totally transparent to 
device itself. I still don't get why not using virtio-net instead.


Thanks


>
>
>> I agree they could be different type of devices but codes could be shared in
>> both guest and host (or even qemu) for not duplicating features(bugs).
>>
>> Thanks
>>
>>
>>> The ones that you mention
>>> all seem to be mostly of use to the networking stack.
>>>
>>>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* [PATCH] vhost: fix IOTLB locking
From: Jean-Philippe Brucker @ 2018-11-30 11:37 UTC (permalink / raw)
  To: mst, jasowang; +Cc: netdev, davem, kvm, virtualization

Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq
lock to improve scalability, but introduced a possible deadlock in
vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding
the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the
spinlock is taken while holding vq->mutex.

As long as we hold dev->mutex to prevent an ioctl from modifying
vq->poll concurrently, we can safely call vhost_poll_queue() without
holding vq->mutex. Since vhost_process_iotlb_msg() holds dev->mutex when
calling vhost_iotlb_notify_vq(), avoid the deadlock by not taking
vq->mutex.

Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one")
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/vhost/vhost.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 3a5f81a66d34..1cbb17f898f7 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -944,10 +944,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
 		if (msg->iova <= vq_msg->iova &&
 		    msg->iova + msg->size - 1 >= vq_msg->iova &&
 		    vq_msg->type == VHOST_IOTLB_MISS) {
-			mutex_lock(&node->vq->mutex);
+			/* Safe to call outside vq->mutex as long as dev->mutex
+			 * is held.
+			 */
 			vhost_poll_queue(&node->vq->poll);
-			mutex_unlock(&node->vq->mutex);
-
 			list_del(&node->node);
 			kfree(node);
 		}
-- 
2.19.1

^ permalink raw reply related

* Re: [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one
From: Jean-Philippe Brucker @ 2018-11-30 10:28 UTC (permalink / raw)
  To: Jason Wang, xiangxia.m.yue@gmail.com, mst@redhat.com,
	makita.toshiaki@lab.ntt.co.jp, davem@davemloft.net
  Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org
In-Reply-To: <ebbf9c7b-519a-f2c1-547c-fd47b1e07550@redhat.com>

On 30/11/2018 02:34, Jason Wang wrote:
> 
> On 2018/11/30 上午3:28, Jean-Philippe Brucker wrote:
>> Hi,
>>
>> On 25/09/2018 13:36,xiangxia.m.yue@gmail.com  wrote:
>>> From: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>>
>>> This patch changes the way that lock all vqs
>>> at the same, to lock them one by one. It will
>>> be used for next patch to avoid the deadlock.
>>>
>>> Signed-off-by: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>> Acked-by: Jason Wang<jasowang@redhat.com>
>>> Signed-off-by: Jason Wang<jasowang@redhat.com>
>>> ---
>>>   drivers/vhost/vhost.c | 24 +++++++-----------------
>>>   1 file changed, 7 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index b13c6b4..f52008b 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>>>   {
>>>   	int i;
>>>   
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> +	for (i = 0; i < d->nvqs; ++i) {
>>> +		mutex_lock(&d->vqs[i]->mutex);
>>>   		__vhost_vq_meta_reset(d->vqs[i]);
>>> +		mutex_unlock(&d->vqs[i]->mutex);
>>> +	}
>>>   }
>>>   
>>>   static void vhost_vq_reset(struct vhost_dev *dev,
>>> @@ -891,20 +894,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>>>   #define vhost_get_used(vq, x, ptr) \
>>>   	vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>>>   
>>> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
>>> -{
>>> -	int i = 0;
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> -		mutex_lock_nested(&d->vqs[i]->mutex, i);
>>> -}
>>> -
>>> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
>>> -{
>>> -	int i = 0;
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> -		mutex_unlock(&d->vqs[i]->mutex);
>>> -}
>>> -
>>>   static int vhost_new_umem_range(struct vhost_umem *umem,
>>>   				u64 start, u64 size, u64 end,
>>>   				u64 userspace_addr, int perm)
>>> @@ -954,7 +943,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>>   		if (msg->iova <= vq_msg->iova &&
>>>   		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>>>   		    vq_msg->type == VHOST_IOTLB_MISS) {
>>> +			mutex_lock(&node->vq->mutex);
>> This seems to introduce a deadlock (and sleep-in-atomic): the vq->mutex
>> is taken while the IOTLB spinlock is held (taken earlier in
>> vhost_iotlb_notify_vq()). On the vhost_iotlb_miss() path, the IOTLB
>> spinlock is taken while the vq->mutex is held.
> 
> 
> Good catch.
> 
> 
>> I'm not sure how to fix it. Given that we're holding dev->mutex, that
>> vq->poll only seems to be modified under dev->mutex, and assuming that
>> vhost_poll_queue(vq->poll) can be called concurrently, is it safe to
>> simply not take vq->mutex here?
> 
> 
> Yes, I think it can be removed here.
> 
> Want to post a patch for this?

Yes, I'll post it shortly

Thanks,
Jean
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Tiwei Bie @ 2018-11-30  9:53 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtio-dev, mst, netdev, linux-kernel, virtualization,
	maxime.coquelin, wexu
In-Reply-To: <1928ac96-3684-45c4-1e8c-09dff03bf308@redhat.com>

On Fri, Nov 30, 2018 at 04:10:55PM +0800, Jason Wang wrote:
> 
> On 2018/11/21 下午6:03, Tiwei Bie wrote:
> > Add types and macros for packed ring.
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> >   include/uapi/linux/virtio_config.h |  3 +++
> >   include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
> >   2 files changed, 55 insertions(+)
> > 
> > diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> > index 449132c76b1c..1196e1c1d4f6 100644
> > --- a/include/uapi/linux/virtio_config.h
> > +++ b/include/uapi/linux/virtio_config.h
> > @@ -75,6 +75,9 @@
> >    */
> >   #define VIRTIO_F_IOMMU_PLATFORM		33
> > +/* This feature indicates support for the packed virtqueue layout. */
> > +#define VIRTIO_F_RING_PACKED		34
> > +
> >   /*
> >    * Does the device support Single Root I/O Virtualization?
> >    */
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 6d5d5faa989b..2414f8af26b3 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -44,6 +44,13 @@
> >   /* This means the buffer contains a list of buffer descriptors. */
> >   #define VRING_DESC_F_INDIRECT	4
> > +/*
> > + * Mark a descriptor as available or used in packed ring.
> > + * Notice: they are defined as shifts instead of shifted values.
> 
> 
> This looks inconsistent to previous flags, any reason for using shifts?

Yeah, it was suggested to use shifts, as _F_ should be a bit
number, not a shifted value:

https://patchwork.ozlabs.org/patch/942296/#1989390

> 
> 
> > + */
> > +#define VRING_PACKED_DESC_F_AVAIL	7
> > +#define VRING_PACKED_DESC_F_USED	15
> > +
> >   /* The Host uses this in used->flags to advise the Guest: don't kick me when
> >    * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
> >    * will still kick if it's out of buffers. */
> > @@ -53,6 +60,23 @@
> >    * optimization.  */
> >   #define VRING_AVAIL_F_NO_INTERRUPT	1
> > +/* Enable events in packed ring. */
> > +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> > +/* Disable events in packed ring. */
> > +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> > +/*
> > + * Enable events for a specific descriptor in packed ring.
> > + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> > + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> > + */
> > +#define VRING_PACKED_EVENT_FLAG_DESC	0x2
> 
> 
> Any reason for using _FLAG_ instead of _F_?

Yeah, it was suggested to not use _F_, as these are values,
should not have _F_:

https://patchwork.ozlabs.org/patch/942296/#1989390

Regards,
Tiwei

> 
> Thanks
> 
> 
> > +
> > +/*
> > + * Wrap counter bit shift in event suppression structure
> > + * of packed ring.
> > + */
> > +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> > +
> >   /* We support indirect buffer descriptors */
> >   #define VIRTIO_RING_F_INDIRECT_DESC	28
> > @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> >   	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> >   }
> > +struct vring_packed_desc_event {
> > +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> > +	__le16 off_wrap;
> > +	/* Descriptor Ring Change Event Flags. */
> > +	__le16 flags;
> > +};
> > +
> > +struct vring_packed_desc {
> > +	/* Buffer Address. */
> > +	__le64 addr;
> > +	/* Buffer Length. */
> > +	__le32 len;
> > +	/* Buffer ID. */
> > +	__le16 id;
> > +	/* The flags depending on descriptor type. */
> > +	__le16 flags;
> > +};
> > +
> > +struct vring_packed {
> > +	unsigned int num;
> > +
> > +	struct vring_packed_desc *desc;
> > +
> > +	struct vring_packed_desc_event *driver;
> > +
> > +	struct vring_packed_desc_event *device;
> > +};
> > +
> >   #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: PROPOSAL: Extend inline asm syntax with size spec
From: Boris Petkov via Virtualization @ 2018-11-30  9:06 UTC (permalink / raw)
  To: Segher Boessenkool, Masahiro Yamada
  Cc: Kate Stewart, Peter Zijlstra (Intel), Christopher Li,
	virtualization, Max Filippov, Nadav Amit, Jan Beulich,
	H. Peter Anvin, Sam Ravnborg, Thomas Gleixner, X86 ML,
	linux-sparse, Ingo Molnar, linux-xtensa, Kees Cook, Chris Zankel,
	matz, Josh Poimboeuf, Alok Kataria, Juergen Gross, gcc, rguenther,
	Greg Kroah-Hartman, Linux Kernel Mailing List,
	Philippe Ombredanne, Linus
In-Reply-To: <20181129122500.GX23873@gate.crashing.org>

On November 29, 2018 1:25:02 PM GMT+01:00, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>This will only be fixed from GCC 9 on, if the compiler adopts it.  The
>kernel wants to support ancient GCC, so it will need to have a
>workaround
>for older GCC versions anyway.

What about backporting it, like Richard says?


-- 
Sent from a small device: formatting sux and brevity is inevitable. 

^ permalink raw reply

* Re: [PATCH net-next v3 01/13] virtio: add packed ring types and macros
From: Jason Wang @ 2018-11-30  8:10 UTC (permalink / raw)
  To: Tiwei Bie, mst, virtualization, linux-kernel, netdev, virtio-dev
  Cc: maxime.coquelin, wexu
In-Reply-To: <20181121100330.24846-2-tiwei.bie@intel.com>


On 2018/11/21 下午6:03, Tiwei Bie wrote:
> Add types and macros for packed ring.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   include/uapi/linux/virtio_config.h |  3 +++
>   include/uapi/linux/virtio_ring.h   | 52 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 55 insertions(+)
>
> diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> index 449132c76b1c..1196e1c1d4f6 100644
> --- a/include/uapi/linux/virtio_config.h
> +++ b/include/uapi/linux/virtio_config.h
> @@ -75,6 +75,9 @@
>    */
>   #define VIRTIO_F_IOMMU_PLATFORM		33
>   
> +/* This feature indicates support for the packed virtqueue layout. */
> +#define VIRTIO_F_RING_PACKED		34
> +
>   /*
>    * Does the device support Single Root I/O Virtualization?
>    */
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 6d5d5faa989b..2414f8af26b3 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -44,6 +44,13 @@
>   /* This means the buffer contains a list of buffer descriptors. */
>   #define VRING_DESC_F_INDIRECT	4
>   
> +/*
> + * Mark a descriptor as available or used in packed ring.
> + * Notice: they are defined as shifts instead of shifted values.


This looks inconsistent to previous flags, any reason for using shifts?


> + */
> +#define VRING_PACKED_DESC_F_AVAIL	7
> +#define VRING_PACKED_DESC_F_USED	15
> +
>   /* The Host uses this in used->flags to advise the Guest: don't kick me when
>    * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
>    * will still kick if it's out of buffers. */
> @@ -53,6 +60,23 @@
>    * optimization.  */
>   #define VRING_AVAIL_F_NO_INTERRUPT	1
>   
> +/* Enable events in packed ring. */
> +#define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
> +/* Disable events in packed ring. */
> +#define VRING_PACKED_EVENT_FLAG_DISABLE	0x1
> +/*
> + * Enable events for a specific descriptor in packed ring.
> + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> + */
> +#define VRING_PACKED_EVENT_FLAG_DESC	0x2


Any reason for using _FLAG_ instead of _F_?

Thanks


> +
> +/*
> + * Wrap counter bit shift in event suppression structure
> + * of packed ring.
> + */
> +#define VRING_PACKED_EVENT_F_WRAP_CTR	15
> +
>   /* We support indirect buffer descriptors */
>   #define VIRTIO_RING_F_INDIRECT_DESC	28
>   
> @@ -171,4 +195,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
>   	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
>   }
>   
> +struct vring_packed_desc_event {
> +	/* Descriptor Ring Change Event Offset/Wrap Counter. */
> +	__le16 off_wrap;
> +	/* Descriptor Ring Change Event Flags. */
> +	__le16 flags;
> +};
> +
> +struct vring_packed_desc {
> +	/* Buffer Address. */
> +	__le64 addr;
> +	/* Buffer Length. */
> +	__le32 len;
> +	/* Buffer ID. */
> +	__le16 id;
> +	/* The flags depending on descriptor type. */
> +	__le16 flags;
> +};
> +
> +struct vring_packed {
> +	unsigned int num;
> +
> +	struct vring_packed_desc *desc;
> +
> +	struct vring_packed_desc_event *driver;
> +
> +	struct vring_packed_desc_event *device;
> +};
> +
>   #endif /* _UAPI_LINUX_VIRTIO_RING_H */
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox