* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] <cover.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-08-15 20:39 ` Michael S. Tsirkin
2022-08-16 7:00 ` Stefano Garzarella
` (6 subsequent siblings)
7 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15 20:39 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Dexuan Cui, Haiyang Zhang,
linux-kernel, virtualization, Eric Dumazet, netdev,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni, linux-hyperv,
David S. Miller
On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
> Hey everybody,
>
> This series introduces datagrams, packet scheduling, and sk_buff usage
> to virtio vsock.
>
> The usage of struct sk_buff benefits users by a) preparing vsock to use
> other related systems that require sk_buff, such as sockmap and qdisc,
> b) supporting basic congestion control via sock_alloc_send_skb, and c)
> reducing copying when delivering packets to TAP.
>
> The socket layer no longer forces errors to be -ENOMEM, as typically
> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
> messages are being sent with option MSG_DONTWAIT.
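For illustration, a minimal userspace sketch of the behaviour described above: a vsock SOCK_DGRAM send with MSG_DONTWAIT, where EAGAIN now signals a full send buffer. The CID and port are placeholders, and SOCK_DGRAM over virtio vsock assumes the support added later in this series.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = 3,		/* placeholder guest CID */
		.svm_port = 1234,	/* placeholder port */
	};
	const char msg[] = "hello";
	int fd = socket(AF_VSOCK, SOCK_DGRAM, 0);

	if (fd < 0)
		return 1;

	/* With MSG_DONTWAIT, hitting the sk_sndbuf threshold is expected
	 * to show up as EAGAIN rather than ENOMEM.
	 */
	if (sendto(fd, msg, strlen(msg), MSG_DONTWAIT,
		   (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
	    errno == EAGAIN)
		fprintf(stderr, "send buffer full, try again later\n");

	return 0;
}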
>
> The datagram work is based off previous patches by Jiang Wang[1].
>
> The introduction of datagrams creates a transport layer fairness issue
> where datagrams may freely starve streams of queue access. This happens
> because, unlike streams, datagrams lack the transactions necessary for
> calculating credits and throttling.
>
> Previous proposals introduce changes to the spec to add an additional
> virtqueue pair for datagrams[1]. Although this solution works, using
> Linux's qdisc for packet scheduling leverages already existing systems,
> avoids the need to change the virtio specification, and gives additional
> capabilities. The usage of SFQ or fq_codel, for example, may solve the
> transport layer starvation problem. It is easy to imagine other use
> cases as well. For example, services of varying importance may be
> assigned different priorities, and qdisc will apply appropriate
> priority-based scheduling. By default, the system default pfifo qdisc is
> used. The qdisc may be bypassed and legacy queuing is resumed by simply
> setting the virtio-vsock%d network device to state DOWN. This technique
> still allows vsock to work with zero-configuration.
>
> In summary, this series introduces these major changes to vsock:
>
> - virtio vsock supports datagrams
> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
> - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
> which applies the throttling threshold sk_sndbuf.
> - The vsock socket layer supports returning errors other than -ENOMEM.
> - This is used to return -EAGAIN when the sk_sndbuf threshold is
> reached.
> - virtio vsock uses a net_device, through which qdisc may be used.
> - qdisc allows scheduling policies to be applied to vsock flows.
> - Some qdiscs, like SFQ, may allow vsock to avoid transport layer congestion. That is,
> it may prevent datagrams from flooding out stream flows. The benefit
> of this is that additional virtqueues are not needed for datagrams.
> - The net_device and qdisc are bypassed by simply setting the
> net_device state to DOWN.
>
> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
Given this affects the driver/device interface, I'd like to ask you to
please copy the virtio-dev mailing list on these patches.
It's subscriber-only, I'm afraid, so you will need to subscribe :(
> Bobby Eshleman (5):
> vsock: replace virtio_vsock_pkt with sk_buff
> vsock: return errors other than -ENOMEM to socket
> vsock: add netdev to vhost/virtio vsock
> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> virtio/vsock: add support for dgram
>
> Jiang Wang (1):
> vsock_test: add tests for vsock dgram
>
> drivers/vhost/vsock.c | 238 ++++----
> include/linux/virtio_vsock.h | 73 ++-
> include/net/af_vsock.h | 2 +
> include/uapi/linux/virtio_vsock.h | 2 +
> net/vmw_vsock/af_vsock.c | 30 +-
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/virtio_transport.c | 237 +++++---
> net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
> net/vmw_vsock/vmci_transport.c | 9 +-
> net/vmw_vsock/vsock_loopback.c | 51 +-
> tools/testing/vsock/util.c | 105 ++++
> tools/testing/vsock/util.h | 4 +
> tools/testing/vsock/vsock_test.c | 195 ++++++
> 13 files changed, 1176 insertions(+), 543 deletions(-)
>
> --
> 2.35.1
* [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] <cover.1660362668.git.bobby.eshleman@bytedance.com>
2022-08-15 20:39 ` [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc Michael S. Tsirkin
@ 2022-08-16 7:00 ` Stefano Garzarella
2022-08-17 6:54 ` Michael S. Tsirkin
` (5 subsequent siblings)
7 siblings, 0 replies; 23+ messages in thread
From: Stefano Garzarella @ 2022-08-16 7:00 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Wei Liu, Cong Wang, Stephen Hemminger, Bobby Eshleman, Jiang Wang,
Michael S. Tsirkin, Dexuan Cui, Haiyang Zhang,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, Eric Dumazet,
netdev@vger.kernel.org, Stefan Hajnoczi, kvm@vger.kernel.org,
Jakub Kicinski, Paolo Abeni, linux-hyperv@vger.kernel.org,
David S. Miller
Hi Bobby,
I’m on vacation so I’ll review this series when I’m back on 28th.
Please send next versions of this series as RFC until we have at least an
agreement on the spec changes.
I think it will be better to agree on the spec before merging the Linux changes.
Thanks,
Stefano
On Monday, August 15, 2022, Bobby Eshleman <bobby.eshleman@gmail.com> wrote:
> Hey everybody,
>
> This series introduces datagrams, packet scheduling, and sk_buff usage
> to virtio vsock.
>
> The usage of struct sk_buff benefits users by a) preparing vsock to use
> other related systems that require sk_buff, such as sockmap and qdisc,
> b) supporting basic congestion control via sock_alloc_send_skb, and c)
> reducing copying when delivering packets to TAP.
>
> The socket layer no longer forces errors to be -ENOMEM, as typically
> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
> messages are being sent with option MSG_DONTWAIT.
>
> The datagram work is based off previous patches by Jiang Wang[1].
>
> The introduction of datagrams creates a transport layer fairness issue
> where datagrams may freely starve streams of queue access. This happens
> because, unlike streams, datagrams lack the transactions necessary for
> calculating credits and throttling.
>
> Previous proposals introduce changes to the spec to add an additional
> virtqueue pair for datagrams[1]. Although this solution works, using
> Linux's qdisc for packet scheduling leverages already existing systems,
> avoids the need to change the virtio specification, and gives additional
> capabilities. The usage of SFQ or fq_codel, for example, may solve the
> transport layer starvation problem. It is easy to imagine other use
> cases as well. For example, services of varying importance may be
> assigned different priorities, and qdisc will apply appropriate
> priority-based scheduling. By default, the system default pfifo qdisc is
> used. The qdisc may be bypassed and legacy queuing is resumed by simply
> setting the virtio-vsock%d network device to state DOWN. This technique
> still allows vsock to work with zero-configuration.
>
> In summary, this series introduces these major changes to vsock:
>
> - virtio vsock supports datagrams
> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
> - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
> which applies the throttling threshold sk_sndbuf.
> - The vsock socket layer supports returning errors other than -ENOMEM.
> - This is used to return -EAGAIN when the sk_sndbuf threshold is
> reached.
> - virtio vsock uses a net_device, through which qdisc may be used.
> - qdisc allows scheduling policies to be applied to vsock flows.
> - Some qdiscs, like SFQ, may allow vsock to avoid transport layer
> congestion. That is,
> it may prevent datagrams from flooding out stream flows. The benefit
> of this is that additional virtqueues are not needed for datagrams.
> - The net_device and qdisc are bypassed by simply setting the
> net_device state to DOWN.
>
> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>
> Bobby Eshleman (5):
> vsock: replace virtio_vsock_pkt with sk_buff
> vsock: return errors other than -ENOMEM to socket
> vsock: add netdev to vhost/virtio vsock
> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> virtio/vsock: add support for dgram
>
> Jiang Wang (1):
> vsock_test: add tests for vsock dgram
>
> drivers/vhost/vsock.c | 238 ++++----
> include/linux/virtio_vsock.h | 73 ++-
> include/net/af_vsock.h | 2 +
> include/uapi/linux/virtio_vsock.h | 2 +
> net/vmw_vsock/af_vsock.c | 30 +-
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/virtio_transport.c | 237 +++++---
> net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
> net/vmw_vsock/vmci_transport.c | 9 +-
> net/vmw_vsock/vsock_loopback.c | 51 +-
> tools/testing/vsock/util.c | 105 ++++
> tools/testing/vsock/util.h | 4 +
> tools/testing/vsock/vsock_test.c | 195 ++++++
> 13 files changed, 1176 insertions(+), 543 deletions(-)
>
> --
> 2.35.1
>
>
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] <cover.1660362668.git.bobby.eshleman@bytedance.com>
2022-08-15 20:39 ` [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc Michael S. Tsirkin
2022-08-16 7:00 ` Stefano Garzarella
@ 2022-08-17 6:54 ` Michael S. Tsirkin
[not found] ` <YvtmYpMieMFb80qR@bullseye>
2022-08-18 4:28 ` Jason Wang
[not found] ` <5a93c5aad99d79f028d349cb7e3c128c65d5d7e2.1660362668.git.bobby.eshleman@bytedance.com>
` (4 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2022-08-17 6:54 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Dexuan Cui, Haiyang Zhang,
linux-kernel, virtualization, Eric Dumazet, netdev,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni, linux-hyperv,
David S. Miller
On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
> Hey everybody,
>
> This series introduces datagrams, packet scheduling, and sk_buff usage
> to virtio vsock.
>
> The usage of struct sk_buff benefits users by a) preparing vsock to use
> other related systems that require sk_buff, such as sockmap and qdisc,
> b) supporting basic congestion control via sock_alloc_send_skb, and c)
> reducing copying when delivering packets to TAP.
>
> The socket layer no longer forces errors to be -ENOMEM, as typically
> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
> messages are being sent with option MSG_DONTWAIT.
>
> The datagram work is based off previous patches by Jiang Wang[1].
>
> The introduction of datagrams creates a transport layer fairness issue
> where datagrams may freely starve streams of queue access. This happens
> because, unlike streams, datagrams lack the transactions necessary for
> calculating credits and throttling.
>
> Previous proposals introduce changes to the spec to add an additional
> virtqueue pair for datagrams[1]. Although this solution works, using
> Linux's qdisc for packet scheduling leverages already existing systems,
> avoids the need to change the virtio specification, and gives additional
> capabilities. The usage of SFQ or fq_codel, for example, may solve the
> transport layer starvation problem. It is easy to imagine other use
> cases as well. For example, services of varying importance may be
> assigned different priorities, and qdisc will apply appropriate
> priority-based scheduling. By default, the system default pfifo qdisc is
> used. The qdisc may be bypassed and legacy queuing is resumed by simply
> setting the virtio-vsock%d network device to state DOWN. This technique
> still allows vsock to work with zero-configuration.
The basic question to answer then is this: with a net device qdisc
etc in the picture, how is this different from virtio net then?
Why do you still want to use vsock?
> In summary, this series introduces these major changes to vsock:
>
> - virtio vsock supports datagrams
> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
> - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
> which applies the throttling threshold sk_sndbuf.
> - The vsock socket layer supports returning errors other than -ENOMEM.
> - This is used to return -EAGAIN when the sk_sndbuf threshold is
> reached.
> - virtio vsock uses a net_device, through which qdisc may be used.
> - qdisc allows scheduling policies to be applied to vsock flows.
> - Some qdiscs, like SFQ, may allow vsock to avoid transport layer congestion. That is,
> it may prevent datagrams from flooding out stream flows. The benefit
> of this is that additional virtqueues are not needed for datagrams.
> - The net_device and qdisc are bypassed by simply setting the
> net_device state to DOWN.
>
> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>
> Bobby Eshleman (5):
> vsock: replace virtio_vsock_pkt with sk_buff
> vsock: return errors other than -ENOMEM to socket
> vsock: add netdev to vhost/virtio vsock
> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> virtio/vsock: add support for dgram
>
> Jiang Wang (1):
> vsock_test: add tests for vsock dgram
>
> drivers/vhost/vsock.c | 238 ++++----
> include/linux/virtio_vsock.h | 73 ++-
> include/net/af_vsock.h | 2 +
> include/uapi/linux/virtio_vsock.h | 2 +
> net/vmw_vsock/af_vsock.c | 30 +-
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/virtio_transport.c | 237 +++++---
> net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
> net/vmw_vsock/vmci_transport.c | 9 +-
> net/vmw_vsock/vsock_loopback.c | 51 +-
> tools/testing/vsock/util.c | 105 ++++
> tools/testing/vsock/util.h | 4 +
> tools/testing/vsock/vsock_test.c | 195 ++++++
> 13 files changed, 1176 insertions(+), 543 deletions(-)
>
> --
> 2.35.1
[parent not found: <YvtmYpMieMFb80qR@bullseye>]
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] ` <YvtmYpMieMFb80qR@bullseye>
@ 2022-08-17 17:02 ` Michael S. Tsirkin
[not found] ` <Yvt6nxUYMfDrLd/A@bullseye>
0 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2022-08-17 17:02 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Wei Liu, Cong Wang, Stephen Hemminger, Bobby Eshleman, Jiang Wang,
Dexuan Cui, Bobby Eshleman, linux-kernel, Haiyang Zhang,
virtualization, Eric Dumazet, netdev, Stefan Hajnoczi, kvm,
Jakub Kicinski, Paolo Abeni, linux-hyperv, David S. Miller
On Tue, Aug 16, 2022 at 09:42:51AM +0000, Bobby Eshleman wrote:
> > The basic question to answer then is this: with a net device qdisc
> > etc in the picture, how is this different from virtio net then?
> > Why do you still want to use vsock?
> >
>
> When using virtio-net, users looking for inter-VM communication are
> required to setup bridges, TAPs, allocate IP addresses or setup DNS,
> etc... and then finally when you have a network, you can open a socket
> on an IP address and port. This is the configuration that vsock avoids.
> For vsock, we just need a CID and a port, but no network configuration.
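As a concrete sketch of that zero-configuration point, a guest-side connect needs nothing beyond a CID and a port (the helper name and port value here are only for illustration):

#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

static int vsock_connect_to_host(unsigned int port)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = VMADDR_CID_HOST,	/* well-known host CID */
		.svm_port = port,
	};
	int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* No bridge, TAP, IP address, or DNS setup is involved. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}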
Surely when you mention DNS you are going overboard? vsock doesn't
remove the need for DNS as much as it does not support it.
--
MST
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
2022-08-17 6:54 ` Michael S. Tsirkin
[not found] ` <YvtmYpMieMFb80qR@bullseye>
@ 2022-08-18 4:28 ` Jason Wang
2022-09-06 9:00 ` Stefano Garzarella
1 sibling, 1 reply; 23+ messages in thread
From: Jason Wang @ 2022-08-18 4:28 UTC (permalink / raw)
To: Michael S. Tsirkin, Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, netdev, Haiyang Zhang, Dexuan Cui,
linux-kernel, virtualization, Eric Dumazet, linux-hyperv,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni,
David S. Miller
在 2022/8/17 14:54, Michael S. Tsirkin 写道:
> On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
>> Hey everybody,
>>
>> This series introduces datagrams, packet scheduling, and sk_buff usage
>> to virtio vsock.
>>
>> The usage of struct sk_buff benefits users by a) preparing vsock to use
>> other related systems that require sk_buff, such as sockmap and qdisc,
>> b) supporting basic congestion control via sock_alloc_send_skb, and c)
>> reducing copying when delivering packets to TAP.
>>
>> The socket layer no longer forces errors to be -ENOMEM, as typically
>> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
>> messages are being sent with option MSG_DONTWAIT.
>>
>> The datagram work is based off previous patches by Jiang Wang[1].
>>
>> The introduction of datagrams creates a transport layer fairness issue
>> where datagrams may freely starve streams of queue access. This happens
>> because, unlike streams, datagrams lack the transactions necessary for
>> calculating credits and throttling.
>>
>> Previous proposals introduce changes to the spec to add an additional
>> virtqueue pair for datagrams[1]. Although this solution works, using
>> Linux's qdisc for packet scheduling leverages already existing systems,
>> avoids the need to change the virtio specification, and gives additional
>> capabilities. The usage of SFQ or fq_codel, for example, may solve the
>> transport layer starvation problem. It is easy to imagine other use
>> cases as well. For example, services of varying importance may be
>> assigned different priorities, and qdisc will apply appropriate
>> priority-based scheduling. By default, the system default pfifo qdisc is
>> used. The qdisc may be bypassed and legacy queuing is resumed by simply
>> setting the virtio-vsock%d network device to state DOWN. This technique
>> still allows vsock to work with zero-configuration.
> The basic question to answer then is this: with a net device qdisc
> etc in the picture, how is this different from virtio net then?
> Why do you still want to use vsock?
Or maybe it's time to revisit an old idea[1] to unify at least the
driver part (e.g. using the virtio-net driver for vsock, so we can get
all the features that vsock is lacking now)?
Thanks
[1]
https://lists.linuxfoundation.org/pipermail/virtualization/2018-November/039783.html
>
>> In summary, this series introduces these major changes to vsock:
>>
>> - virtio vsock supports datagrams
>> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
>> - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
>> which applies the throttling threshold sk_sndbuf.
>> - The vsock socket layer supports returning errors other than -ENOMEM.
>> - This is used to return -EAGAIN when the sk_sndbuf threshold is
>> reached.
>> - virtio vsock uses a net_device, through which qdisc may be used.
>> - qdisc allows scheduling policies to be applied to vsock flows.
>> - Some qdiscs, like SFQ, may allow vsock to avoid transport layer congestion. That is,
>> it may prevent datagrams from flooding out stream flows. The benefit
>> of this is that additional virtqueues are not needed for datagrams.
>> - The net_device and qdisc are bypassed by simply setting the
>> net_device state to DOWN.
>>
>> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>>
>> Bobby Eshleman (5):
>> vsock: replace virtio_vsock_pkt with sk_buff
>> vsock: return errors other than -ENOMEM to socket
>> vsock: add netdev to vhost/virtio vsock
>> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
>> virtio/vsock: add support for dgram
>>
>> Jiang Wang (1):
>> vsock_test: add tests for vsock dgram
>>
>> drivers/vhost/vsock.c | 238 ++++----
>> include/linux/virtio_vsock.h | 73 ++-
>> include/net/af_vsock.h | 2 +
>> include/uapi/linux/virtio_vsock.h | 2 +
>> net/vmw_vsock/af_vsock.c | 30 +-
>> net/vmw_vsock/hyperv_transport.c | 2 +-
>> net/vmw_vsock/virtio_transport.c | 237 +++++---
>> net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
>> net/vmw_vsock/vmci_transport.c | 9 +-
>> net/vmw_vsock/vsock_loopback.c | 51 +-
>> tools/testing/vsock/util.c | 105 ++++
>> tools/testing/vsock/util.h | 4 +
>> tools/testing/vsock/vsock_test.c | 195 ++++++
>> 13 files changed, 1176 insertions(+), 543 deletions(-)
>>
>> --
>> 2.35.1
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
2022-08-18 4:28 ` Jason Wang
@ 2022-09-06 9:00 ` Stefano Garzarella
0 siblings, 0 replies; 23+ messages in thread
From: Stefano Garzarella @ 2022-09-06 9:00 UTC (permalink / raw)
To: Jason Wang
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Michael S. Tsirkin, Dexuan Cui,
Haiyang Zhang, Bobby Eshleman, linux-kernel, virtualization,
Eric Dumazet, netdev, Stefan Hajnoczi, kvm, Jakub Kicinski,
Paolo Abeni, linux-hyperv, David S. Miller
On Thu, Aug 18, 2022 at 12:28:48PM +0800, Jason Wang wrote:
>
>在 2022/8/17 14:54, Michael S. Tsirkin 写道:
>>On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
>>>Hey everybody,
>>>
>>>This series introduces datagrams, packet scheduling, and sk_buff usage
>>>to virtio vsock.
>>>
>>>The usage of struct sk_buff benefits users by a) preparing vsock to use
>>>other related systems that require sk_buff, such as sockmap and qdisc,
>>>b) supporting basic congestion control via sock_alloc_send_skb, and c)
>>>reducing copying when delivering packets to TAP.
>>>
>>>The socket layer no longer forces errors to be -ENOMEM, as typically
>>>userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
>>>messages are being sent with option MSG_DONTWAIT.
>>>
>>>The datagram work is based off previous patches by Jiang Wang[1].
>>>
>>>The introduction of datagrams creates a transport layer fairness issue
>>>where datagrams may freely starve streams of queue access. This happens
>>>because, unlike streams, datagrams lack the transactions necessary for
>>>calculating credits and throttling.
>>>
>>>Previous proposals introduce changes to the spec to add an additional
>>>virtqueue pair for datagrams[1]. Although this solution works, using
>>>Linux's qdisc for packet scheduling leverages already existing systems,
>>>avoids the need to change the virtio specification, and gives additional
>>>capabilities. The usage of SFQ or fq_codel, for example, may solve the
>>>transport layer starvation problem. It is easy to imagine other use
>>>cases as well. For example, services of varying importance may be
>>>assigned different priorities, and qdisc will apply appropriate
>>>priority-based scheduling. By default, the system default pfifo qdisc is
>>>used. The qdisc may be bypassed and legacy queuing is resumed by simply
>>>setting the virtio-vsock%d network device to state DOWN. This technique
>>>still allows vsock to work with zero-configuration.
>>The basic question to answer then is this: with a net device qdisc
>>etc in the picture, how is this different from virtio net then?
>>Why do you still want to use vsock?
>
>
>Or maybe it's time to revisit an old idea[1] to unify at least the
>driver part (e.g using virtio-net driver for vsock then we can all
>features that vsock is lacking now)?
Sorry for coming late to the discussion!
This would be great, though the last time I looked at it, I found
it quite complicated. The main problem is trying to avoid all the
net-specific stuff (MTU, ethernet header, HW offloading, etc.).
Maybe we could start thinking about this idea by adding a new transport
to vsock (e.g. virtio-net-vsock) completely separate from what we have
now.
Thanks,
Stefano
[parent not found: <5a93c5aad99d79f028d349cb7e3c128c65d5d7e2.1660362668.git.bobby.eshleman@bytedance.com>]
* Re: [PATCH 3/6] vsock: add netdev to vhost/virtio vsock
[not found] ` <5a93c5aad99d79f028d349cb7e3c128c65d5d7e2.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-08-16 16:38 ` Michael S. Tsirkin
[not found] ` <20220816110717.5422e976@kernel.org>
2022-09-06 10:58 ` Michael S. Tsirkin
1 sibling, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2022-08-16 16:38 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Cong Wang, Bobby Eshleman, Jiang Wang, netdev,
linux-kernel, virtualization, Eric Dumazet, Stefan Hajnoczi, kvm,
Jakub Kicinski, Paolo Abeni, David S. Miller
On Mon, Aug 15, 2022 at 10:56:06AM -0700, Bobby Eshleman wrote:
> In order to support usage of qdisc on vsock traffic, this commit
> introduces a struct net_device to vhost and virtio vsock.
>
> Two new devices are created, vhost-vsock for vhost and virtio-vsock
> for virtio. The devices are attached to the respective transports.
>
> To bypass the usage of the device, the user may "down" the associated
> network interface using common tools. For example, "ip link set dev
> virtio-vsock down" lets vsock bypass the net_device and qdisc entirely,
> simply using the FIFO logic of the prior implementation.
Ugh. That's quite a hack. Mark my words, at some point we will decide to
have down mean "down". Besides, multiple net devices with link up tend
to confuse userspace. So you might want to keep it down at all times,
even in the short term.
> For both hosts and guests, there is one device for all G2H vsock sockets
> and one device for all H2G vsock sockets. This makes sense for guests
> because the driver only supports a single vsock channel (one pair of
> TX/RX virtqueues), so one device and qdisc fits. For hosts, this may not
> seem ideal for some workloads. However, it is possible to use a
> multi-queue qdisc, where a given queue is responsible for a range of
> sockets. This seems to be a better solution than having one device per
> socket, which may yield a very large number of devices and qdiscs, all
> of which are dynamically being created and destroyed. Because of this
> dynamism, it would also require a complex policy management daemon, as
> devices would constantly be spun up and down as sockets were created and
> destroyed. To avoid this, one device and qdisc also applies to all H2G
> sockets.
>
> Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
> ---
> drivers/vhost/vsock.c | 19 +++-
> include/linux/virtio_vsock.h | 10 +++
> net/vmw_vsock/virtio_transport.c | 19 +++-
> net/vmw_vsock/virtio_transport_common.c | 112 +++++++++++++++++++++++-
> 4 files changed, 152 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index f8601d93d94d..b20ddec2664b 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -927,13 +927,30 @@ static int __init vhost_vsock_init(void)
> VSOCK_TRANSPORT_F_H2G);
> if (ret < 0)
> return ret;
> - return misc_register(&vhost_vsock_misc);
> +
> + ret = virtio_transport_init(&vhost_transport, "vhost-vsock");
> + if (ret < 0)
> + goto out_unregister;
> +
> + ret = misc_register(&vhost_vsock_misc);
> + if (ret < 0)
> + goto out_transport_exit;
> + return ret;
> +
> +out_transport_exit:
> + virtio_transport_exit(&vhost_transport);
> +
> +out_unregister:
> + vsock_core_unregister(&vhost_transport.transport);
> + return ret;
> +
> };
>
> static void __exit vhost_vsock_exit(void)
> {
> misc_deregister(&vhost_vsock_misc);
> vsock_core_unregister(&vhost_transport.transport);
> + virtio_transport_exit(&vhost_transport);
> };
>
> module_init(vhost_vsock_init);
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> index 9a37eddbb87a..5d7e7fbd75f8 100644
> --- a/include/linux/virtio_vsock.h
> +++ b/include/linux/virtio_vsock.h
> @@ -91,10 +91,20 @@ struct virtio_transport {
> /* This must be the first field */
> struct vsock_transport transport;
>
> + /* Used almost exclusively for qdisc */
> + struct net_device *dev;
> +
> /* Takes ownership of the packet */
> int (*send_pkt)(struct sk_buff *skb);
> };
>
> +int
> +virtio_transport_init(struct virtio_transport *t,
> + const char *name);
> +
> +void
> +virtio_transport_exit(struct virtio_transport *t);
> +
> ssize_t
> virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> struct msghdr *msg,
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index 3bb293fd8607..c6212eb38d3c 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -131,7 +131,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> * the vq
> */
> if (ret < 0) {
> - skb_queue_head(&vsock->send_pkt_queue, skb);
> + spin_lock_bh(&vsock->send_pkt_queue.lock);
> + __skb_queue_head(&vsock->send_pkt_queue, skb);
> + spin_unlock_bh(&vsock->send_pkt_queue.lock);
> break;
> }
>
> @@ -676,7 +678,9 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
> kfree_skb(skb);
> mutex_unlock(&vsock->tx_lock);
>
> - skb_queue_purge(&vsock->send_pkt_queue);
> + spin_lock_bh(&vsock->send_pkt_queue.lock);
> + __skb_queue_purge(&vsock->send_pkt_queue);
> + spin_unlock_bh(&vsock->send_pkt_queue.lock);
>
> /* Delete virtqueues and flush outstanding callbacks if any */
> vdev->config->del_vqs(vdev);
> @@ -760,6 +764,8 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> flush_work(&vsock->event_work);
> flush_work(&vsock->send_pkt_work);
>
> + virtio_transport_exit(&virtio_transport);
> +
> mutex_unlock(&the_virtio_vsock_mutex);
>
> kfree(vsock);
> @@ -844,12 +850,18 @@ static int __init virtio_vsock_init(void)
> if (ret)
> goto out_wq;
>
> - ret = register_virtio_driver(&virtio_vsock_driver);
> + ret = virtio_transport_init(&virtio_transport, "virtio-vsock");
> if (ret)
> goto out_vci;
>
> + ret = register_virtio_driver(&virtio_vsock_driver);
> + if (ret)
> + goto out_transport;
> +
> return 0;
>
> +out_transport:
> + virtio_transport_exit(&virtio_transport);
> out_vci:
> vsock_core_unregister(&virtio_transport.transport);
> out_wq:
> @@ -861,6 +873,7 @@ static void __exit virtio_vsock_exit(void)
> {
> unregister_virtio_driver(&virtio_vsock_driver);
> vsock_core_unregister(&virtio_transport.transport);
> + virtio_transport_exit(&virtio_transport);
> destroy_workqueue(virtio_vsock_workqueue);
> }
>
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index d5780599fe93..bdf16fff054f 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -16,6 +16,7 @@
>
> #include <net/sock.h>
> #include <net/af_vsock.h>
> +#include <net/pkt_sched.h>
>
> #define CREATE_TRACE_POINTS
> #include <trace/events/vsock_virtio_transport_common.h>
> @@ -23,6 +24,93 @@
> /* How long to wait for graceful shutdown of a connection */
> #define VSOCK_CLOSE_TIMEOUT (8 * HZ)
>
> +struct virtio_transport_priv {
> + struct virtio_transport *trans;
> +};
> +
> +static netdev_tx_t virtio_transport_start_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> + struct virtio_transport *t =
> + ((struct virtio_transport_priv *)netdev_priv(dev))->trans;
> + int ret;
> +
> + ret = t->send_pkt(skb);
> + if (unlikely(ret == -ENODEV))
> + return NETDEV_TX_BUSY;
> +
> + return NETDEV_TX_OK;
> +}
> +
> +const struct net_device_ops virtio_transport_netdev_ops = {
> + .ndo_start_xmit = virtio_transport_start_xmit,
> +};
> +
> +static void virtio_transport_setup(struct net_device *dev)
> +{
> + dev->netdev_ops = &virtio_transport_netdev_ops;
> + dev->needs_free_netdev = true;
> + dev->flags = IFF_NOARP;
> + dev->mtu = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE;
> + dev->tx_queue_len = DEFAULT_TX_QUEUE_LEN;
> +}
> +
> +static int ifup(struct net_device *dev)
> +{
> + int ret;
> +
> + rtnl_lock();
> + ret = dev_open(dev, NULL) ? -ENOMEM : 0;
> + rtnl_unlock();
> +
> + return ret;
> +}
> +
> +/* virtio_transport_init - initialize a virtio vsock transport layer
> + *
> + * @t: ptr to the virtio transport struct to initialize
> + * @name: the name of the net_device to be created.
> + *
> + * Return 0 on success, otherwise negative errno.
> + */
> +int virtio_transport_init(struct virtio_transport *t, const char *name)
> +{
> + struct virtio_transport_priv *priv;
> + int ret;
> +
> + t->dev = alloc_netdev(sizeof(*priv), name, NET_NAME_UNKNOWN, virtio_transport_setup);
> + if (!t->dev)
> + return -ENOMEM;
> +
> + priv = netdev_priv(t->dev);
> + priv->trans = t;
> +
> + ret = register_netdev(t->dev);
> + if (ret < 0)
> + goto out_free_netdev;
> +
> + ret = ifup(t->dev);
> + if (ret < 0)
> + goto out_unregister_netdev;
> +
> + return 0;
> +
> +out_unregister_netdev:
> + unregister_netdev(t->dev);
> +
> +out_free_netdev:
> + free_netdev(t->dev);
> +
> + return ret;
> +}
> +
> +void virtio_transport_exit(struct virtio_transport *t)
> +{
> + if (t->dev) {
> + unregister_netdev(t->dev);
> + free_netdev(t->dev);
> + }
> +}
> +
> static const struct virtio_transport *
> virtio_transport_get_ops(struct vsock_sock *vsk)
> {
> @@ -147,6 +235,24 @@ static u16 virtio_transport_get_type(struct sock *sk)
> return VIRTIO_VSOCK_TYPE_SEQPACKET;
> }
>
> +/* Return pkt->len on success, otherwise negative errno */
> +static int virtio_transport_send_pkt(const struct virtio_transport *t, struct sk_buff *skb)
> +{
> + int ret;
> + int len = skb->len;
> +
> + if (unlikely(!t->dev || !(t->dev->flags & IFF_UP)))
> + return t->send_pkt(skb);
> +
> + skb->dev = t->dev;
> + ret = dev_queue_xmit(skb);
> +
> + if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN))
> + return len;
> +
> + return -ENOMEM;
> +}
> +
> /* This function can only be used on connecting/connected sockets,
> * since a socket assigned to a transport is required.
> *
> @@ -202,9 +308,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>
> virtio_transport_inc_tx_pkt(vvs, skb);
>
> - err = t_ops->send_pkt(skb);
> -
> - return err < 0 ? -ENOMEM : err;
> + return virtio_transport_send_pkt(t_ops, skb);
> }
>
> static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
> @@ -834,7 +938,7 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t,
> return -ENOTCONN;
> }
>
> - return t->send_pkt(reply);
> + return virtio_transport_send_pkt(t, reply);
> }
>
> /* This function should be called with sk_lock held and SOCK_DONE set */
> --
> 2.35.1
* Re: [PATCH 3/6] vsock: add netdev to vhost/virtio vsock
[not found] ` <5a93c5aad99d79f028d349cb7e3c128c65d5d7e2.1660362668.git.bobby.eshleman@bytedance.com>
2022-08-16 16:38 ` [PATCH 3/6] vsock: add netdev to vhost/virtio vsock Michael S. Tsirkin
@ 2022-09-06 10:58 ` Michael S. Tsirkin
1 sibling, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2022-09-06 10:58 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Cong Wang, Bobby Eshleman, Jiang Wang, netdev,
linux-kernel, virtualization, Eric Dumazet, Stefan Hajnoczi, kvm,
Jakub Kicinski, Paolo Abeni, David S. Miller
On Mon, Aug 15, 2022 at 10:56:06AM -0700, Bobby Eshleman wrote:
> In order to support usage of qdisc on vsock traffic, this commit
> introduces a struct net_device to vhost and virtio vsock.
>
> Two new devices are created, vhost-vsock for vhost and virtio-vsock
> for virtio. The devices are attached to the respective transports.
>
> To bypass the usage of the device, the user may "down" the associated
> network interface using common tools. For example, "ip link set dev
> virtio-vsock down" lets vsock bypass the net_device and qdisc entirely,
> simply using the FIFO logic of the prior implementation.
>
> For both hosts and guests, there is one device for all G2H vsock sockets
> and one device for all H2G vsock sockets. This makes sense for guests
> because the driver only supports a single vsock channel (one pair of
> TX/RX virtqueues), so one device and qdisc fits. For hosts, this may not
> seem ideal for some workloads. However, it is possible to use a
> multi-queue qdisc, where a given queue is responsible for a range of
> sockets. This seems to be a better solution than having one device per
> socket, which may yield a very large number of devices and qdiscs, all
> of which are dynamically being created and destroyed. Because of this
> dynamism, it would also require a complex policy management daemon, as
> devices would constantly be spun up and down as sockets were created and
> destroyed. To avoid this, one device and qdisc also applies to all H2G
> sockets.
>
> Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
I've been thinking about this generally. vsock currently
assumes reliability, but with qdisc can't we get
packet drops e.g. depending on the queueing?
What prevents user from configuring such a discipline?
One thing people like about vsock is that it's very hard
to break H2G communication even with misconfigured
networking.
--
MST
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] <cover.1660362668.git.bobby.eshleman@bytedance.com>
` (3 preceding siblings ...)
[not found] ` <5a93c5aad99d79f028d349cb7e3c128c65d5d7e2.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-09-06 13:26 ` Stefan Hajnoczi
[not found] ` <Yv5PFz1YrSk8jxzY@bullseye>
[not found] ` <3d1f32c4da81f8a0870e126369ba12bc8c4ad048.1660362668.git.bobby.eshleman@bytedance.com>
` (2 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2022-09-06 13:26 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Michael S. Tsirkin, Dexuan Cui,
Haiyang Zhang, linux-kernel, virtualization, Eric Dumazet,
linux-hyperv, kvm, netdev, Jakub Kicinski, Paolo Abeni,
David S. Miller
Hi Bobby,
If you are attending Linux Foundation conferences in Dublin, Ireland
next week (Linux Plumbers Conference, Open Source Summit Europe, KVM
Forum, ContainerCon Europe, CloudOpen Europe, etc) then you could meet
Stefano Garzarella and others to discuss this patch series.
Using netdev and sk_buff is a big change to vsock. Discussing your
requirements and the future direction of vsock in person could help.
If you won't be in Dublin, don't worry. You can schedule a video call if
you feel it would be helpful to discuss these topics.
Stefan
[parent not found: <3d1f32c4da81f8a0870e126369ba12bc8c4ad048.1660362668.git.bobby.eshleman@bytedance.com>]
* Re: [PATCH 4/6] virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
[not found] ` <3d1f32c4da81f8a0870e126369ba12bc8c4ad048.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-09-26 13:17 ` Stefano Garzarella
0 siblings, 0 replies; 23+ messages in thread
From: Stefano Garzarella @ 2022-09-26 13:17 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Cong Wang, Bobby Eshleman, Jiang Wang,
Michael S. Tsirkin, netdev, linux-kernel, virtualization,
Eric Dumazet, Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni,
David S. Miller
On Mon, Aug 15, 2022 at 10:56:07AM -0700, Bobby Eshleman wrote:
>This commit adds a feature bit for virtio vsock to support datagrams.
>
>Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
>Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
>---
> drivers/vhost/vsock.c | 3 ++-
> include/uapi/linux/virtio_vsock.h | 1 +
> net/vmw_vsock/virtio_transport.c | 8 ++++++--
> 3 files changed, 9 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index b20ddec2664b..a5d1bdb786fe 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -32,7 +32,8 @@
> enum {
> VHOST_VSOCK_FEATURES = VHOST_FEATURES |
> (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
>- (1ULL << VIRTIO_VSOCK_F_SEQPACKET)
>+ (1ULL << VIRTIO_VSOCK_F_SEQPACKET) |
>+ (1ULL << VIRTIO_VSOCK_F_DGRAM)
> };
>
> enum {
>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>index 64738838bee5..857df3a3a70d 100644
>--- a/include/uapi/linux/virtio_vsock.h
>+++ b/include/uapi/linux/virtio_vsock.h
>@@ -40,6 +40,7 @@
>
> /* The feature bitmap for virtio vsock */
> #define VIRTIO_VSOCK_F_SEQPACKET 1 /* SOCK_SEQPACKET supported */
>+#define VIRTIO_VSOCK_F_DGRAM 2 /* Host support dgram vsock */
We already allocated bit 2 for F_NO_IMPLIED_STREAM, so we should use 3:
https://github.com/oasis-tcs/virtio-spec/blob/26ed30ccb049fd51d6e20aad3de2807d678edb3a/virtio-vsock.tex#L22
(I'll send patches to implement F_STREAM and F_NO_IMPLIED_STREAM
negotiation soon).
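A minimal sketch of the renumbering being suggested here; the F_NO_IMPLIED_STREAM name and its bit come from the linked spec rather than from this series, and the DGRAM bit is the proposed value:

/* The feature bitmap for virtio vsock (sketch only) */
#define VIRTIO_VSOCK_F_SEQPACKET         1 /* SOCK_SEQPACKET supported */
#define VIRTIO_VSOCK_F_NO_IMPLIED_STREAM 2 /* already allocated in the spec */
#define VIRTIO_VSOCK_F_DGRAM             3 /* SOCK_DGRAM supported (proposed) */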
As long as it's an RFC it's fine to introduce F_DGRAM, but we should
first change the virtio spec before merging this series.
About the patch, we should only negotiate the new feature when we really
have DGRAM support. So, it's better to move this patch after the one
adding datagram support.
Thanks,
Stefano
>
> struct virtio_vsock_config {
> __le64 guest_cid;
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index c6212eb38d3c..073314312683 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -35,6 +35,7 @@ static struct virtio_transport virtio_transport; /*
>forward declaration */
> struct virtio_vsock {
> struct virtio_device *vdev;
> struct virtqueue *vqs[VSOCK_VQ_MAX];
>+ bool has_dgram;
>
> /* Virtqueue processing is deferred to a workqueue */
> struct work_struct tx_work;
>@@ -709,7 +710,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> }
>
> vsock->vdev = vdev;
>-
> vsock->rx_buf_nr = 0;
> vsock->rx_buf_max_nr = 0;
> atomic_set(&vsock->queued_replies, 0);
>@@ -726,6 +726,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> if (virtio_has_feature(vdev, VIRTIO_VSOCK_F_SEQPACKET))
> vsock->seqpacket_allow = true;
>
>+ if (virtio_has_feature(vdev, VIRTIO_VSOCK_F_DGRAM))
>+ vsock->has_dgram = true;
>+
> vdev->priv = vsock;
>
> ret = virtio_vsock_vqs_init(vsock);
>@@ -820,7 +823,8 @@ static struct virtio_device_id id_table[] = {
> };
>
> static unsigned int features[] = {
>- VIRTIO_VSOCK_F_SEQPACKET
>+ VIRTIO_VSOCK_F_SEQPACKET,
>+ VIRTIO_VSOCK_F_DGRAM
> };
>
> static struct virtio_driver virtio_vsock_driver = {
>--
>2.35.1
>
[parent not found: <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>]
* Re: [PATCH 2/6] vsock: return errors other than -ENOMEM to socket
[not found] ` <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-08-15 20:01 ` kernel test robot
2022-08-15 23:13 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2022-08-15 20:01 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, kbuild-all, kvm, Jiang Wang,
Michael S. Tsirkin, netdev, Dexuan Cui, linux-kernel,
virtualization, Eric Dumazet, linux-hyperv, Stefan Hajnoczi,
Jakub Kicinski, Paolo Abeni, Stephen Hemminger, Haiyang Zhang
Hi Bobby,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on mst-vhost/linux-next]
[also build test WARNING on linus/master v6.0-rc1 next-20220815]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20220816/202208160300.HYFUSTbF-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/68c9c8216a573cdfe2170cad677854e2f4a34634
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
git checkout 68c9c8216a573cdfe2170cad677854e2f4a34634
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash net/vmw_vsock/
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> net/vmw_vsock/virtio_transport.c:178: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Merge the two most recent skbs together if possible.
vim +178 net/vmw_vsock/virtio_transport.c
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 176
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 177 /**
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 @178 * Merge the two most recent skbs together if possible.
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 179 *
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 180 * Caller must hold the queue lock.
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 181 */
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 182 static void
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 183 virtio_transport_add_to_queue(struct sk_buff_head *queue, struct sk_buff *new)
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 184 {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 185 struct sk_buff *old;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 186
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 187 spin_lock_bh(&queue->lock);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 188 /* In order to reduce skb memory overhead, we merge new packets with
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 189 * older packets if they pass virtio_transport_skbs_can_merge().
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 190 */
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 191 if (skb_queue_empty_lockless(queue)) {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 192 __skb_queue_tail(queue, new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 193 goto out;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 194 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 195
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 196 old = skb_peek_tail(queue);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 197
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 198 if (!virtio_transport_skbs_can_merge(old, new)) {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 199 __skb_queue_tail(queue, new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 200 goto out;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 201 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 202
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 203 memcpy(skb_put(old, new->len), new->data, new->len);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 204 vsock_hdr(old)->len = cpu_to_le32(old->len);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 205 vsock_hdr(old)->buf_alloc = vsock_hdr(new)->buf_alloc;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 206 vsock_hdr(old)->fwd_cnt = vsock_hdr(new)->fwd_cnt;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 207 dev_kfree_skb_any(new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 208
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 209 out:
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 210 spin_unlock_bh(&queue->lock);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 211 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 212
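One way to address this warning, sketched for illustration, is either to open the comment with '/*' (a plain comment) or to make it a real kernel-doc comment that names the function, e.g.:

/**
 * virtio_transport_add_to_queue - queue an skb, merging with the tail if possible
 * @queue: the sk_buff queue to add to
 * @new: the packet to queue or merge
 */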
--
0-DAY CI Kernel Test Service
https://01.org/lkp
* Re: [PATCH 2/6] vsock: return errors other than -ENOMEM to socket
[not found] ` <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>
2022-08-15 20:01 ` [PATCH 2/6] vsock: return errors other than -ENOMEM to socket kernel test robot
@ 2022-08-15 23:13 ` kernel test robot
2022-08-16 2:16 ` kernel test robot
2022-09-26 13:21 ` Stefano Garzarella
3 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2022-08-15 23:13 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, linux-hyperv, Cong Wang, kvm, Michael S. Tsirkin,
llvm, virtualization, Eric Dumazet, Wei Liu, Stephen Hemminger,
Dexuan Cui, Jakub Kicinski, Paolo Abeni, Haiyang Zhang,
Stefan Hajnoczi, kbuild-all, Jiang Wang, netdev, linux-kernel
Hi Bobby,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on mst-vhost/linux-next]
[also build test WARNING on linus/master v6.0-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: i386-randconfig-a014-20220815 (https://download.01.org/0day-ci/archive/20220816/202208160737.gXXFmPbY-lkp@intel.com/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6afcc4a459ead8809a0d6d9b4bf7b64bcc13582b)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/68c9c8216a573cdfe2170cad677854e2f4a34634
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
git checkout 68c9c8216a573cdfe2170cad677854e2f4a34634
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash net/vmw_vsock/
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> net/vmw_vsock/virtio_transport.c:178: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Merge the two most recent skbs together if possible.
vim +178 net/vmw_vsock/virtio_transport.c
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 176
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 177 /**
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 @178 * Merge the two most recent skbs together if possible.
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 179 *
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 180 * Caller must hold the queue lock.
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 181 */
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 182 static void
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 183 virtio_transport_add_to_queue(struct sk_buff_head *queue, struct sk_buff *new)
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 184 {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 185 struct sk_buff *old;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 186
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 187 spin_lock_bh(&queue->lock);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 188 /* In order to reduce skb memory overhead, we merge new packets with
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 189 * older packets if they pass virtio_transport_skbs_can_merge().
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 190 */
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 191 if (skb_queue_empty_lockless(queue)) {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 192 __skb_queue_tail(queue, new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 193 goto out;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 194 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 195
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 196 old = skb_peek_tail(queue);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 197
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 198 if (!virtio_transport_skbs_can_merge(old, new)) {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 199 __skb_queue_tail(queue, new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 200 goto out;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 201 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 202
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 203 memcpy(skb_put(old, new->len), new->data, new->len);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 204 vsock_hdr(old)->len = cpu_to_le32(old->len);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 205 vsock_hdr(old)->buf_alloc = vsock_hdr(new)->buf_alloc;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 206 vsock_hdr(old)->fwd_cnt = vsock_hdr(new)->fwd_cnt;
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 207 dev_kfree_skb_any(new);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 208
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 209 out:
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 210 spin_unlock_bh(&queue->lock);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 211 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 212
--
0-DAY CI Kernel Test Service
https://01.org/lkp
* Re: [PATCH 2/6] vsock: return errors other than -ENOMEM to socket
[not found] ` <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>
2022-08-15 20:01 ` [PATCH 2/6] vsock: return errors other than -ENOMEM to socket kernel test robot
2022-08-15 23:13 ` kernel test robot
@ 2022-08-16 2:16 ` kernel test robot
2022-09-26 13:21 ` Stefano Garzarella
3 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2022-08-16 2:16 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, kbuild-all, kvm, Jiang Wang,
Michael S. Tsirkin, netdev, Dexuan Cui, linux-kernel,
virtualization, Eric Dumazet, linux-hyperv, Stefan Hajnoczi,
Jakub Kicinski, Paolo Abeni, Stephen Hemminger, Haiyang Zhang
Hi Bobby,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on mst-vhost/linux-next]
[also build test WARNING on linus/master v6.0-rc1 next-20220815]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: i386-randconfig-s001-20220815 (https://download.01.org/0day-ci/archive/20220816/202208161059.GPIlPpvd-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-5) 11.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://github.com/intel-lab-lkp/linux/commit/68c9c8216a573cdfe2170cad677854e2f4a34634
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Bobby-Eshleman/virtio-vsock-introduce-dgrams-sk_buff-and-qdisc/20220816-015812
git checkout 68c9c8216a573cdfe2170cad677854e2f4a34634
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=i386 SHELL=/bin/bash fs/nilfs2/ net/vmw_vsock/
If you fix the issue, kindly add the following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
>> net/vmw_vsock/virtio_transport.c:173:31: sparse: sparse: restricted __le16 degrades to integer
net/vmw_vsock/virtio_transport.c:174:31: sparse: sparse: restricted __le16 degrades to integer
vim +173 net/vmw_vsock/virtio_transport.c
0ea9e1d3a9e3ef Asias He 2016-07-28 161
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 162 static inline bool
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 163 virtio_transport_skbs_can_merge(struct sk_buff *old, struct sk_buff *new)
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 164 {
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 165 return (new->len < GOOD_COPY_LEN &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 166 skb_tailroom(old) >= new->len &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 167 vsock_hdr(new)->src_cid == vsock_hdr(old)->src_cid &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 168 vsock_hdr(new)->dst_cid == vsock_hdr(old)->dst_cid &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 169 vsock_hdr(new)->src_port == vsock_hdr(old)->src_port &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 170 vsock_hdr(new)->dst_port == vsock_hdr(old)->dst_port &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 171 vsock_hdr(new)->type == vsock_hdr(old)->type &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 172 vsock_hdr(new)->flags == vsock_hdr(old)->flags &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 @173 vsock_hdr(old)->op == VIRTIO_VSOCK_OP_RW &&
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 174 vsock_hdr(new)->op == VIRTIO_VSOCK_OP_RW);
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 175 }
93afaf2cdefaa9 Bobby Eshleman 2022-08-15 176
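For reference, a minimal sketch of one way the two flagged comparisons could
be made sparse-clean (not part of the posted series; the helper name below is
made up, and it assumes vsock_hdr() is the accessor introduced earlier in this
series and that the header's 'op' field stays __le16, as in
include/uapi/linux/virtio_vsock.h):

    /* Sketch only: convert the restricted __le16 field to host endianness
     * before comparing it with the host-endian VIRTIO_VSOCK_OP_RW constant,
     * which is what sparse flags at lines 173-174 above.
     */
    static inline bool virtio_transport_both_ops_rw(const struct sk_buff *old,
                                                    const struct sk_buff *new)
    {
            return le16_to_cpu(vsock_hdr(old)->op) == VIRTIO_VSOCK_OP_RW &&
                   le16_to_cpu(vsock_hdr(new)->op) == VIRTIO_VSOCK_OP_RW;
    }

Alternatively, keeping the comparison in little endian with
cpu_to_le16(VIRTIO_VSOCK_OP_RW) on the constant side is equally acceptable to
sparse and avoids the run-time swap on big-endian hosts.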
--
0-DAY CI Kernel Test Service
https://01.org/lkp
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/6] vsock: return errors other than -ENOMEM to socket
[not found] ` <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>
` (2 preceding siblings ...)
2022-08-16 2:16 ` kernel test robot
@ 2022-09-26 13:21 ` Stefano Garzarella
3 siblings, 0 replies; 23+ messages in thread
From: Stefano Garzarella @ 2022-09-26 13:21 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Michael S. Tsirkin, Dexuan Cui,
Haiyang Zhang, linux-kernel, virtualization, Eric Dumazet, netdev,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni, linux-hyperv,
David S. Miller
On Mon, Aug 15, 2022 at 10:56:05AM -0700, Bobby Eshleman wrote:
>This commit allows vsock implementations to return errors
>to the socket layer other than -ENOMEM. One immediate effect
>of this is that upon the sk_sndbuf threshold being reached -EAGAIN
>will be returned and userspace may throttle appropriately.
>
>As a result, a known issue with uperf is resolved[1].
>
>Additionally, to preserve legacy behavior for non-virtio
>implementations, hyperv/vmci force errors to be -ENOMEM so that behavior
>is unchanged.
>
>[1]: https://gitlab.com/vsock/vsock/-/issues/1
>
>Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
>---
> include/linux/virtio_vsock.h | 3 +++
> net/vmw_vsock/af_vsock.c | 3 ++-
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/virtio_transport_common.c | 3 ---
> net/vmw_vsock/vmci_transport.c | 9 ++++++++-
> 5 files changed, 14 insertions(+), 6 deletions(-)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 17ed01466875..9a37eddbb87a 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -8,6 +8,9 @@
> #include <net/sock.h>
> #include <net/af_vsock.h>
>
>+/* Threshold for detecting small packets to copy */
>+#define GOOD_COPY_LEN 128
>+
This change seems unrelated.
Please move it to the patch where you actually need it.
Maybe it's better to add a prefix if we move it into a header file (e.g.
VIRTIO_VSOCK_...).
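For instance, something like this (the name is purely illustrative, just to
show the prefix idea; keep whatever value and wording you end up needing):

    /* Illustrative sketch only, not a binding name: threshold below which a
     * received packet may be copied into the tail of the previously queued
     * skb instead of being queued on its own.
     */
    #define VIRTIO_VSOCK_GOOD_COPY_LEN	128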
Thanks,
Stefano
> enum virtio_vsock_metadata_flags {
> VIRTIO_VSOCK_METADATA_FLAGS_REPLY = BIT(0),
> VIRTIO_VSOCK_METADATA_FLAGS_TAP_DELIVERED = BIT(1),
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index e348b2d09eac..1893f8aafa48 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1844,8 +1844,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> written = transport->stream_enqueue(vsk,
> msg, len - total_written);
> }
>+
> if (written < 0) {
>- err = -ENOMEM;
>+ err = written;
> goto out_err;
> }
>
>diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
>index fd98229e3db3..e99aea571f6f 100644
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -687,7 +687,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
> if (bytes_written)
> ret = bytes_written;
> kfree(send_buf);
>- return ret;
>+ return ret < 0 ? -ENOMEM : ret;
> }
>
> static s64 hvs_stream_has_data(struct vsock_sock *vsk)
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 920578597bb9..d5780599fe93 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -23,9 +23,6 @@
> /* How long to wait for graceful shutdown of a connection */
> #define VSOCK_CLOSE_TIMEOUT (8 * HZ)
>
>-/* Threshold for detecting small packets to copy */
>-#define GOOD_COPY_LEN 128
>-
> static const struct virtio_transport *
> virtio_transport_get_ops(struct vsock_sock *vsk)
> {
>diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>index b14f0ed7427b..c927a90dc859 100644
>--- a/net/vmw_vsock/vmci_transport.c
>+++ b/net/vmw_vsock/vmci_transport.c
>@@ -1838,7 +1838,14 @@ static ssize_t vmci_transport_stream_enqueue(
> struct msghdr *msg,
> size_t len)
> {
>- return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>+ int err;
>+
>+ err = vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>+
>+ if (err < 0)
>+ err = -ENOMEM;
>+
>+ return err;
> }
>
> static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
>--
>2.35.1
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
[not found] <cover.1660362668.git.bobby.eshleman@bytedance.com>
` (6 preceding siblings ...)
[not found] ` <d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com>
@ 2022-09-26 13:42 ` Stefano Garzarella
2022-09-27 17:45 ` Stefano Garzarella
7 siblings, 1 reply; 23+ messages in thread
From: Stefano Garzarella @ 2022-09-26 13:42 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Michael S. Tsirkin, Dexuan Cui,
Haiyang Zhang, linux-kernel, virtualization, Eric Dumazet, netdev,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni, linux-hyperv,
David S. Miller
Hi,
On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
>Hey everybody,
>
>This series introduces datagrams, packet scheduling, and sk_buff usage
>to virtio vsock.
Just a reminder for those who are interested: tomorrow, Sep 27 @ 16:00
UTC, we will discuss the next steps for this series in this room:
https://meet.google.com/fxi-vuzr-jjb
(I'll try to record it and take notes that we will share.)
Bobby, thank you so much for working on this! It would be great to solve
the fairness issue and support datagrams!
I took a look at the series, left some comments on the individual
patches, and added some advice here that we could pick up tomorrow:
- it would be nice to run benchmarks (e.g., iperf-vsock, uperf, etc.) to
see how much the changes cost (e.g., the sk_buff usage)
- we should also take care of the other transports (i.e., vmci, hyperv); the
uAPI should be as close as possible regardless of the transport
About the use of netdev: it seems to be the most controversial point, and I
understand Jakub's and Michael's concerns. It would be great if tomorrow you
could update us on whether you have found a way to avoid it, perhaps by
reusing a packet scheduler in some other way.
It would also be great if we could make this available to all transports (I'm
not asking you to implement it for all of them, but to have a generic API
that others can use).
But we can talk about that tomorrow!
Thanks,
Stefano
>
>The usage of struct sk_buff benefits users by a) preparing vsock to use
>other related systems that require sk_buff, such as sockmap and qdisc,
>b) supporting basic congestion control via sock_alloc_send_skb, and c)
>reducing copying when delivering packets to TAP.
>
>The socket layer no longer forces errors to be -ENOMEM, as typically
>userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
>messages are being sent with option MSG_DONTWAIT.
>
>The datagram work is based off previous patches by Jiang Wang[1].
>
>The introduction of datagrams creates a transport layer fairness issue
>where datagrams may freely starve streams of queue access. This happens
>because, unlike streams, datagrams lack the transactions necessary for
>calculating credits and throttling.
>
>Previous proposals introduce changes to the spec to add an additional
>virtqueue pair for datagrams[1]. Although this solution works, using
>Linux's qdisc for packet scheduling leverages already existing systems,
>avoids the need to change the virtio specification, and gives additional
>capabilities. The usage of SFQ or fq_codel, for example, may solve the
>transport layer starvation problem. It is easy to imagine other use
>cases as well. For example, services of varying importance may be
>assigned different priorities, and qdisc will apply appropriate
>priority-based scheduling. By default, the system default pfifo qdisc is
>used. The qdisc may be bypassed and legacy queuing is resumed by simply
>setting the virtio-vsock%d network device to state DOWN. This technique
>still allows vsock to work with zero-configuration.
>
>In summary, this series introduces these major changes to vsock:
>
>- virtio vsock supports datagrams
>- virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
> - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
> which applies the throttling threshold sk_sndbuf.
>- The vsock socket layer supports returning errors other than -ENOMEM.
> - This is used to return -EAGAIN when the sk_sndbuf threshold is
> reached.
>- virtio vsock uses a net_device, through which qdisc may be used.
> - qdisc allows scheduling policies to be applied to vsock flows.
> - Some qdiscs, like SFQ, may allow vsock to avoid transport layer congestion. That is,
> it may avoid datagrams from flooding out stream flows. The benefit
> to this is that additional virtqueues are not needed for datagrams.
> - The net_device and qdisc is bypassed by simply setting the
> net_device state to DOWN.
>
>[1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>
>Bobby Eshleman (5):
> vsock: replace virtio_vsock_pkt with sk_buff
> vsock: return errors other than -ENOMEM to socket
> vsock: add netdev to vhost/virtio vsock
> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> virtio/vsock: add support for dgram
>
>Jiang Wang (1):
> vsock_test: add tests for vsock dgram
>
> drivers/vhost/vsock.c | 238 ++++----
> include/linux/virtio_vsock.h | 73 ++-
> include/net/af_vsock.h | 2 +
> include/uapi/linux/virtio_vsock.h | 2 +
> net/vmw_vsock/af_vsock.c | 30 +-
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/virtio_transport.c | 237 +++++---
> net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
> net/vmw_vsock/vmci_transport.c | 9 +-
> net/vmw_vsock/vsock_loopback.c | 51 +-
> tools/testing/vsock/util.c | 105 ++++
> tools/testing/vsock/util.h | 4 +
> tools/testing/vsock/vsock_test.c | 195 ++++++
> 13 files changed, 1176 insertions(+), 543 deletions(-)
>
>--
>2.35.1
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
2022-09-26 13:42 ` [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc Stefano Garzarella
@ 2022-09-27 17:45 ` Stefano Garzarella
0 siblings, 0 replies; 23+ messages in thread
From: Stefano Garzarella @ 2022-09-27 17:45 UTC (permalink / raw)
To: Bobby Eshleman
Cc: Bobby Eshleman, Wei Liu, Cong Wang, Stephen Hemminger,
Bobby Eshleman, Jiang Wang, Michael S. Tsirkin, Dexuan Cui,
Haiyang Zhang, linux-kernel, virtualization, Eric Dumazet, netdev,
Stefan Hajnoczi, kvm, Jakub Kicinski, Paolo Abeni, linux-hyperv,
David S. Miller
On Mon, Sep 26, 2022 at 03:42:19PM +0200, Stefano Garzarella wrote:
>Hi,
>
>On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
>>Hey everybody,
>>
>>This series introduces datagrams, packet scheduling, and sk_buff usage
>>to virtio vsock.
>
>Just a reminder for those who are interested, tomorrow Sep 27 @ 16:00
>UTC we will discuss more about the next steps for this series in this
>room: https://meet.google.com/fxi-vuzr-jjb
>(I'll try to record it and take notes that we will share)
>
Thank you all for participating in the call!
I'm attaching the video/audio recording and the notes (feel free to update them).
Notes:
https://docs.google.com/document/d/14UHH0tEaBKfElLZjNkyKUs_HnOgHhZZBqIS86VEIqR0/edit?usp=sharing
Video recording:
https://drive.google.com/file/d/1vUvTc_aiE1mB30tLPeJjANnb915-CIKa/view?usp=sharing
Thanks,
Stefano
^ permalink raw reply [flat|nested] 23+ messages in thread