virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
       [not found] <20210207151259.803917-1-arseny.krasnov@kaspersky.com>
@ 2021-02-07 16:20 ` Michael S. Tsirkin
       [not found]   ` <8bd3789c-8df1-4383-f233-b4b854b30970@kaspersky.com>
       [not found] ` <20210207151426.804348-1-arseny.krasnov@kaspersky.com>
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2021-02-07 16:20 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, netdev, stsp2, linux-kernel, virtualization,
	oxffffaa, Stefan Hajnoczi, Colin Ian King, Jakub Kicinski,
	Alexander Popov, David S. Miller, Jorgen Hansen

On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
> 	This patchset impelements support of SOCK_SEQPACKET for virtio
> transport.
> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
> do it, two new packet operations were added: first for start of record
>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
> both operations carries metadata - to maintain boundaries and payload
> integrity. Metadata is introduced by adding special header with two
> fields - message count and message length:
> 
> 	struct virtio_vsock_seq_hdr {
> 		__le32  msg_cnt;
> 		__le32  msg_len;
> 	} __attribute__((packed));
> 
> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
> packets(buffer of second virtio descriptor in chain) in the same way as
> data transmitted in RW packets. Payload was chosen as buffer for this
> header to avoid touching first virtio buffer which carries header of
> packet, because someone could check that size of this buffer is equal
> to size of packet header. To send record, packet with start marker is
> sent first(it's header contains length of record and counter), then
> counter is incremented and all data is sent as usual 'RW' packets and
> finally SEQ_END is sent(it also carries counter of message, which is
> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
> incremented again. On receiver's side, length of record is known from
> packet with start record marker. To check that no packets were dropped
> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
> 1) and length of data between two markers is compared to length in
> SEQ_BEGIN header.
> 	Now as  packets of one socket are not reordered neither on
> vsock nor on vhost transport layers, such markers allows to restore
> original record on receiver's side. If user's buffer is smaller that
> record length, when all out of size data is dropped.
> 	Maximum length of datagram is not limited as in stream socket,
> because same credit logic is used. Difference with stream socket is
> that user is not woken up until whole record is received or error
> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
> 	Tests also implemented.
> 
>  Arseny Krasnov (17):
>   af_vsock: update functions for connectible socket
>   af_vsock: separate wait data loop
>   af_vsock: separate receive data loop
>   af_vsock: implement SEQPACKET receive loop
>   af_vsock: separate wait space loop
>   af_vsock: implement send logic for SEQPACKET
>   af_vsock: rest of SEQPACKET support
>   af_vsock: update comments for stream sockets
>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>   virtio/vsock: fetch length for SEQPACKET record
>   virtio/vsock: add SEQPACKET receive logic
>   virtio/vsock: rest of SOCK_SEQPACKET support
>   virtio/vsock: setup SEQPACKET ops for transport
>   vhost/vsock: setup SEQPACKET ops for transport
>   vsock_test: add SOCK_SEQPACKET tests
>   loopback/vsock: setup SEQPACKET ops for transport
>   virtio/vsock: simplify credit update function API
> 
>  drivers/vhost/vsock.c                   |   8 +-
>  include/linux/virtio_vsock.h            |  15 +
>  include/net/af_vsock.h                  |   9 +
>  include/uapi/linux/virtio_vsock.h       |  16 +
>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>  net/vmw_vsock/virtio_transport.c        |   5 +
>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>  net/vmw_vsock/vsock_loopback.c          |   5 +
>  tools/testing/vsock/util.c              |  32 +-
>  tools/testing/vsock/util.h              |   3 +
>  tools/testing/vsock/vsock_test.c        | 126 +++++
>  11 files changed, 895 insertions(+), 228 deletions(-)
> 
>  TODO:
>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>    implementation RST is replied in the same way when listening port
>    is not found. I think that current RST is enough,because case when
>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>    no listener in both cases).

   - virtio spec patch

>  v3 -> v4:
>  - callbacks for loopback transport
>  - SEQPACKET specific metadata moved from packet header to payload
>    and called 'virtio_vsock_seq_hdr'
>  - record integrity check:
>    1) SEQ_END operation was added, which marks end of record.
>    2) Both SEQ_BEGIN and SEQ_END carries counter which is incremented
>       on every marker send.
>  - af_vsock.c: socket operations for STREAM and SEQPACKET call same
>    functions instead of having own "gates" differs only by names:
>    'vsock_seqpacket/stream_getsockopt()' now replaced with
>    'vsock_connectible_getsockopt()'.
>  - af_vsock.c: 'seqpacket_dequeue' callback returns error and flag that
>    record ready. There is no need to return number of copied bytes,
>    because case when record received successfully is checked at virtio
>    transport layer, when SEQ_END is processed. Also user doesn't need
>    number of copied bytes, because 'recv()' from SEQPACKET could return
>    error, length of users's buffer or length of whole record(both are
>    known in af_vsock.c).
>  - af_vsock.c: both wait loops in af_vsock.c(for data and space) moved
>    to separate functions because now both called from several places.
>  - af_vsock.c: 'vsock_assign_transport()' checks that 'new_transport'
>    pointer is not NULL and returns 'ESOCKTNOSUPPORT' instead of 'ENODEV'
>    if failed to use transport.
>  - tools/testing/vsock/vsock_test.c: rename tests
> 
>  v2 -> v3:
>  - patches reorganized: split for prepare and implementation patches
>  - local variables are declared in "Reverse Christmas tree" manner
>  - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
>    fields access
>  - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
>    between stream and seqpacket sockets.
>  - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
>  - af_vsock.c: 'vsock_wait_data()' refactored.
> 
>  v1 -> v2:
>  - patches reordered: af_vsock.c related changes now before virtio vsock
>  - patches reorganized: more small patches, where +/- are not mixed
>  - tests for SOCK_SEQPACKET added
>  - all commit messages updated
>  - af_vsock.c: 'vsock_pre_recv_check()' inlined to
>    'vsock_connectible_recvmsg()'
>  - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
>    was not found
>  - virtio_transport_common.c: transport callback for seqpacket dequeue
>  - virtio_transport_common.c: simplified
>    'virtio_transport_recv_connected()'
>  - virtio_transport_common.c: send reset on socket and packet type
> 			      mismatch.
> 
> -- 
> 2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket
       [not found] ` <20210207151426.804348-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 10:52   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 10:52 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:14:23PM +0300, Arseny Krasnov wrote:
>This prepares af_vsock.c for SEQPACKET support: some functions such
>as setsockopt(), getsockopt(), connect(), recvmsg(), sendmsg() are
>shared between both types of sockets, so rename them in general
>manner.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 64 +++++++++++++++++++++-------------------
> 1 file changed, 34 insertions(+), 30 deletions(-)

This patch LGTM:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Stefano

>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 6894f21dc147..f4fabec50650 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -604,8 +604,8 @@ static void vsock_pending_work(struct work_struct *work)
>
> /**** SOCKET OPERATIONS ****/
>
>-static int __vsock_bind_stream(struct vsock_sock *vsk,
>-			       struct sockaddr_vm *addr)
>+static int __vsock_bind_connectible(struct vsock_sock *vsk,
>+				    struct sockaddr_vm *addr)
> {
> 	static u32 port;
> 	struct sockaddr_vm new_addr;
>@@ -685,7 +685,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
> 	switch (sk->sk_socket->type) {
> 	case SOCK_STREAM:
> 		spin_lock_bh(&vsock_table_lock);
>-		retval = __vsock_bind_stream(vsk, addr);
>+		retval = __vsock_bind_connectible(vsk, addr);
> 		spin_unlock_bh(&vsock_table_lock);
> 		break;
>
>@@ -767,6 +767,11 @@ static struct sock *__vsock_create(struct net *net,
> 	return sk;
> }
>
>+static bool sock_type_connectible(u16 type)
>+{
>+	return type == SOCK_STREAM;
>+}
>+
> static void __vsock_release(struct sock *sk, int level)
> {
> 	if (sk) {
>@@ -785,7 +790,7 @@ static void __vsock_release(struct sock *sk, int level)
>
> 		if (vsk->transport)
> 			vsk->transport->release(vsk);
>-		else if (sk->sk_type == SOCK_STREAM)
>+		else if (sock_type_connectible(sk->sk_type))
> 			vsock_remove_sock(vsk);
>
> 		sock_orphan(sk);
>@@ -945,7 +950,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 	sk = sock->sk;
> 	if (sock->state == SS_UNCONNECTED) {
> 		err = -ENOTCONN;
>-		if (sk->sk_type == SOCK_STREAM)
>+		if (sock_type_connectible(sk->sk_type))
> 			return err;
> 	} else {
> 		sock->state = SS_DISCONNECTING;
>@@ -960,7 +965,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 		sk->sk_state_change(sk);
> 		release_sock(sk);
>
>-		if (sk->sk_type == SOCK_STREAM) {
>+		if (sock_type_connectible(sk->sk_type)) {
> 			sock_reset_flag(sk, SOCK_DONE);
> 			vsock_send_shutdown(sk, mode);
> 		}
>@@ -1013,7 +1018,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
> 		if (!(sk->sk_shutdown & SEND_SHUTDOWN))
> 			mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
>
>-	} else if (sock->type == SOCK_STREAM) {
>+	} else if (sock_type_connectible(sk->sk_type)) {
> 		const struct vsock_transport *transport;
>
> 		lock_sock(sk);
>@@ -1263,8 +1268,8 @@ static void vsock_connect_timeout(struct work_struct *work)
> 	sock_put(sk);
> }
>
>-static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
>-				int addr_len, int flags)
>+static int vsock_connect(struct socket *sock, struct sockaddr *addr,
>+			 int addr_len, int flags)
> {
> 	int err;
> 	struct sock *sk;
>@@ -1414,7 +1419,7 @@ static int vsock_accept(struct socket *sock, struct socket *newsock, int flags,
>
> 	lock_sock(listener);
>
>-	if (sock->type != SOCK_STREAM) {
>+	if (!sock_type_connectible(sock->type)) {
> 		err = -EOPNOTSUPP;
> 		goto out;
> 	}
>@@ -1491,7 +1496,7 @@ static int vsock_listen(struct socket *sock, int backlog)
>
> 	lock_sock(sk);
>
>-	if (sock->type != SOCK_STREAM) {
>+	if (!sock_type_connectible(sk->sk_type)) {
> 		err = -EOPNOTSUPP;
> 		goto out;
> 	}
>@@ -1535,11 +1540,11 @@ static void vsock_update_buffer_size(struct vsock_sock *vsk,
> 	vsk->buffer_size = val;
> }
>
>-static int vsock_stream_setsockopt(struct socket *sock,
>-				   int level,
>-				   int optname,
>-				   sockptr_t optval,
>-				   unsigned int optlen)
>+static int vsock_connectible_setsockopt(struct socket *sock,
>+					int level,
>+					int optname,
>+					sockptr_t optval,
>+					unsigned int optlen)
> {
> 	int err;
> 	struct sock *sk;
>@@ -1617,10 +1622,10 @@ static int vsock_stream_setsockopt(struct socket *sock,
> 	return err;
> }
>
>-static int vsock_stream_getsockopt(struct socket *sock,
>-				   int level, int optname,
>-				   char __user *optval,
>-				   int __user *optlen)
>+static int vsock_connectible_getsockopt(struct socket *sock,
>+					int level, int optname,
>+					char __user *optval,
>+					int __user *optlen)
> {
> 	int err;
> 	int len;
>@@ -1688,8 +1693,8 @@ static int vsock_stream_getsockopt(struct socket *sock,
> 	return 0;
> }
>
>-static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
>-				size_t len)
>+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>+				     size_t len)
> {
> 	struct sock *sk;
> 	struct vsock_sock *vsk;
>@@ -1828,10 +1833,9 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
>
>-
> static int
>-vsock_stream_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>-		     int flags)
>+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>+			  int flags)
> {
> 	struct sock *sk;
> 	struct vsock_sock *vsk;
>@@ -2007,7 +2011,7 @@ static const struct proto_ops vsock_stream_ops = {
> 	.owner = THIS_MODULE,
> 	.release = vsock_release,
> 	.bind = vsock_bind,
>-	.connect = vsock_stream_connect,
>+	.connect = vsock_connect,
> 	.socketpair = sock_no_socketpair,
> 	.accept = vsock_accept,
> 	.getname = vsock_getname,
>@@ -2015,10 +2019,10 @@ static const struct proto_ops vsock_stream_ops = {
> 	.ioctl = sock_no_ioctl,
> 	.listen = vsock_listen,
> 	.shutdown = vsock_shutdown,
>-	.setsockopt = vsock_stream_setsockopt,
>-	.getsockopt = vsock_stream_getsockopt,
>-	.sendmsg = vsock_stream_sendmsg,
>-	.recvmsg = vsock_stream_recvmsg,
>+	.setsockopt = vsock_connectible_setsockopt,
>+	.getsockopt = vsock_connectible_getsockopt,
>+	.sendmsg = vsock_connectible_sendmsg,
>+	.recvmsg = vsock_connectible_recvmsg,
> 	.mmap = sock_no_mmap,
> 	.sendpage = sock_no_sendpage,
> };
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop
       [not found] ` <20210207151451.804498-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 11:24   ` Stefano Garzarella
  2021-02-11 15:11   ` Jorgen Hansen
  1 sibling, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:24 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, netdev, stsp2,
	linux-kernel, virtualization, oxffffaa, Stefan Hajnoczi,
	Colin Ian King, Jakub Kicinski, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Sun, Feb 07, 2021 at 06:14:48PM +0300, Arseny Krasnov wrote:
>This moves wait loop for data to dedicated function, because later
>it will be used by SEQPACKET data receive loop.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
> 1 file changed, 86 insertions(+), 72 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f4fabec50650..38927695786f 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
>
>+static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
>+			   long timeout,
>+			   struct vsock_transport_recv_notify_data *recv_data,
>+			   size_t target)
>+{
>+	const struct vsock_transport *transport;
>+	struct vsock_sock *vsk;
>+	s64 data;
>+	int err;
>+
>+	vsk = vsock_sk(sk);
>+	err = 0;
>+	transport = vsk->transport;
>+	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
>+
>+	while ((data = vsock_stream_has_data(vsk)) == 0) {
>+		if (sk->sk_err != 0 ||
>+		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>+		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>+			goto out;
>+		}
>+
>+		/* Don't wait for non-blocking sockets. */
>+		if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out;
>+		}
>+
>+		if (recv_data) {
>+			err = transport->notify_recv_pre_block(vsk, target, recv_data);
>+			if (err < 0)
>+				goto out;
>+		}
>+
>+		release_sock(sk);
>+		timeout = schedule_timeout(timeout);
>+		lock_sock(sk);
>+
>+		if (signal_pending(current)) {
>+			err = sock_intr_errno(timeout);
>+			goto out;
>+		} else if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out;
>+		}
>+	}
>+
>+	finish_wait(sk_sleep(sk), wait);
>+
>+	/* Invalid queue pair content. XXX This should
>+	 * be changed to a connection reset in a later
>+	 * change.
>+	 */
>+	if (data < 0)
>+		return -ENOMEM;
>+
>+	/* Have some data, return. */
>+	if (data)
>+		return data;

IIUC here data must be != 0 so you can simply return data in any case.

Or cleaner, you can do 'break' instead of 'goto out' in the error paths 
and after the while loop you can do something like this:

	finish_wait(sk_sleep(sk), wait);

	if (err)
		return err;

	if (data < 0)
		return -ENOMEM;

	return data;
}

>+
>+out:
>+	finish_wait(sk_sleep(sk), wait);
>+	return err;
>+}
>+
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
>@@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>
>
> 	while (1) {
>-		s64 ready;
>+		ssize_t read;
>
>-		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>-		ready = vsock_stream_has_data(vsk);
>-
>-		if (ready == 0) {
>-			if (sk->sk_err != 0 ||
>-			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
>-			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-			/* Don't wait for non-blocking sockets. */
>-			if (timeout == 0) {
>-				err = -EAGAIN;
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-
>-			err = transport->notify_recv_pre_block(
>-					vsk, target, &recv_data);
>-			if (err < 0) {
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-			release_sock(sk);
>-			timeout = schedule_timeout(timeout);
>-			lock_sock(sk);
>-
>-			if (signal_pending(current)) {
>-				err = sock_intr_errno(timeout);
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			} else if (timeout == 0) {
>-				err = -EAGAIN;
>-				finish_wait(sk_sleep(sk), &wait);
>-				break;
>-			}
>-		} else {
>-			ssize_t read;
>+		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
>+		if (err <= 0)
>+			break;
>
>-			finish_wait(sk_sleep(sk), &wait);
>-
>-			if (ready < 0) {
>-				/* Invalid queue pair content. XXX This should
>-				* be changed to a connection reset in a later
>-				* change.
>-				*/
>-
>-				err = -ENOMEM;
>-				goto out;
>-			}
>-
>-			err = transport->notify_recv_pre_dequeue(
>-					vsk, target, &recv_data);
>-			if (err < 0)
>-				break;
>+		err = transport->notify_recv_pre_dequeue(vsk, target,
>+							 &recv_data);
>+		if (err < 0)
>+			break;
>
>-			read = transport->stream_dequeue(
>-					vsk, msg,
>-					len - copied, flags);
>-			if (read < 0) {
>-				err = -ENOMEM;
>-				break;
>-			}
>+		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
>+		if (read < 0) {
>+			err = -ENOMEM;
>+			break;
>+		}
>
>-			copied += read;
>+		copied += read;
>
>-			err = transport->notify_recv_post_dequeue(
>-					vsk, target, read,
>-					!(flags & MSG_PEEK), &recv_data);
>-			if (err < 0)
>-				goto out;
>+		err = transport->notify_recv_post_dequeue(vsk, target, read,
>+						!(flags & MSG_PEEK), &recv_data);
>+		if (err < 0)
>+			goto out;
>
>-			if (read >= target || flags & MSG_PEEK)
>-				break;
>+		if (read >= target || flags & MSG_PEEK)
>+			break;
>
>-			target -= read;
>-		}
>+		target -= read;
> 	}

This part looks okay, maybe we could improve the loop a bit and make it 
more readable, but it's out of the scope of this patch.

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 03/17] af_vsock: separate receive data loop
       [not found] ` <20210207151508.804615-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 11:37   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:37 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:15:05PM +0300, Arseny Krasnov wrote:
>This moves STREAM specific data receive logic to dedicated function:
>'__vsock_stream_recvmsg()', while checks that will be same for both
>types of socket are in shared function: 'vsock_connectible_recvmsg()'.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 117 +++++++++++++++++++++++----------------
> 1 file changed, 68 insertions(+), 49 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 38927695786f..66c8a932f49b 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1898,65 +1898,22 @@ static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
> 	return err;
> }
>
>-static int
>-vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>-			  int flags)
>+static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
>+				  size_t len, int flags)
> {
>-	struct sock *sk;
>-	struct vsock_sock *vsk;
>+	struct vsock_transport_recv_notify_data recv_data;
> 	const struct vsock_transport *transport;
>-	int err;
>-	size_t target;
>+	struct vsock_sock *vsk;
> 	ssize_t copied;
>+	size_t target;
> 	long timeout;
>-	struct vsock_transport_recv_notify_data recv_data;
>+	int err;
>
> 	DEFINE_WAIT(wait);
>
>-	sk = sock->sk;
> 	vsk = vsock_sk(sk);
>-	err = 0;
>-
>-	lock_sock(sk);
>-
> 	transport = vsk->transport;
>
>-	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
>-		/* Recvmsg is supposed to return 0 if a peer performs an
>-		 * orderly shutdown. Differentiate between that case and when a
>-		 * peer has not connected or a local shutdown occured with the
>-		 * SOCK_DONE flag.
>-		 */
>-		if (sock_flag(sk, SOCK_DONE))
>-			err = 0;
>-		else
>-			err = -ENOTCONN;
>-
>-		goto out;
>-	}
>-
>-	if (flags & MSG_OOB) {
>-		err = -EOPNOTSUPP;
>-		goto out;
>-	}
>-
>-	/* We don't check peer_shutdown flag here since peer may actually shut
>-	 * down, but there can be data in the queue that a local socket can
>-	 * receive.
>-	 */
>-	if (sk->sk_shutdown & RCV_SHUTDOWN) {
>-		err = 0;
>-		goto out;
>-	}
>-
>-	/* It is valid on Linux to pass in a zero-length receive buffer.  This
>-	 * is not an error.  We may as well bail out now.
>-	 */
>-	if (!len) {
>-		err = 0;
>-		goto out;
>-	}
>-
> 	/* We must not copy less than target bytes into the user's buffer
> 	 * before returning successfully, so we wait for the consume queue to
> 	 * have that much data to consume before dequeueing.  Note that this

At the end of __vsock_stream_recvmsg() you are calling release_sock(sk) 
and it's wrong since we are releasing it in vsock_connectible_recvmsg().

Please fix it.

>@@ -2020,6 +1977,68 @@ vsock_connectible_recvmsg(struct socket *sock, 
>struct msghdr *msg, size_t len,
> 	return err;
> }
>
>+static int
>+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>+			  int flags)
>+{
>+	struct sock *sk;
>+	struct vsock_sock *vsk;
>+	const struct vsock_transport *transport;
>+	int err;
>+
>+	DEFINE_WAIT(wait);
>+
>+	sk = sock->sk;
>+	vsk = vsock_sk(sk);
>+	err = 0;
>+
>+	lock_sock(sk);
>+
>+	transport = vsk->transport;
>+
>+	if (!transport || sk->sk_state != TCP_ESTABLISHED) {
>+		/* Recvmsg is supposed to return 0 if a peer performs an
>+		 * orderly shutdown. Differentiate between that case and when a
>+		 * peer has not connected or a local shutdown occurred with the
>+		 * SOCK_DONE flag.
>+		 */
>+		if (sock_flag(sk, SOCK_DONE))
>+			err = 0;
>+		else
>+			err = -ENOTCONN;
>+
>+		goto out;
>+	}
>+
>+	if (flags & MSG_OOB) {
>+		err = -EOPNOTSUPP;
>+		goto out;
>+	}
>+
>+	/* We don't check peer_shutdown flag here since peer may actually shut
>+	 * down, but there can be data in the queue that a local socket can
>+	 * receive.
>+	 */
>+	if (sk->sk_shutdown & RCV_SHUTDOWN) {
>+		err = 0;
>+		goto out;
>+	}
>+
>+	/* It is valid on Linux to pass in a zero-length receive buffer.  This
>+	 * is not an error.  We may as well bail out now.
>+	 */
>+	if (!len) {
>+		err = 0;
>+		goto out;
>+	}
>+
>+	err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+
>+out:
>+	release_sock(sk);
>+	return err;
>+}
>+

The rest of the patch LGTM.

Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop
       [not found] ` <20210207151526.804741-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 11:47   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 11:47 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, netdev, stsp2,
	linux-kernel, virtualization, oxffffaa, Stefan Hajnoczi,
	Colin Ian King, Jakub Kicinski, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Sun, Feb 07, 2021 at 06:15:22PM +0300, Arseny Krasnov wrote:
>This adds receive loop for SEQPACKET. It looks like receive loop for
>STREAM, but there is a little bit difference:
>1) It doesn't call notify callbacks.
>2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
>   there is no sense for these values in SEQPACKET case.
>3) It waits until whole record is received or error is found during
>   receiving.
>4) It processes and sets 'MSG_TRUNC' flag.
>
>So to avoid extra conditions for two types of socket inside one loop, two
>independent functions were created.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  5 +++
> net/vmw_vsock/af_vsock.c | 96 +++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 100 insertions(+), 1 deletion(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index b1c717286993..bb6a0e52be86 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -135,6 +135,11 @@ struct vsock_transport {
> 	bool (*stream_is_active)(struct vsock_sock *);
> 	bool (*stream_allow)(u32 cid, u32 port);
>
>+	/* SEQ_PACKET. */
>+	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
>+	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
>+				     int flags, bool *msg_ready);

CHECK: Alignment should match open parenthesis
#35: FILE: include/net/af_vsock.h:141:
+	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
+				     int flags, bool *msg_ready);

And to make checkpatch.pl happy please use the identifier name also for 
the others parameter. I know we haven't done this before, but for new 
code I think we can do it.

>+
> 	/* Notification. */
> 	int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
> 	int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 66c8a932f49b..3d8af987216a 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1977,6 +1977,97 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
> 	return err;
> }
>
>+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
>+				     size_t len, int flags)
>+{
>+	const struct vsock_transport *transport;
>+	const struct iovec *orig_iov;
>+	unsigned long orig_nr_segs;
>+	bool msg_ready;
>+	struct vsock_sock *vsk;
>+	size_t record_len;
>+	long timeout;
>+	int err = 0;
>+	DEFINE_WAIT(wait);
>+
>+	vsk = vsock_sk(sk);
>+	transport = vsk->transport;
>+
>+	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
>+	orig_nr_segs = msg->msg_iter.nr_segs;
>+	orig_iov = msg->msg_iter.iov;
>+	msg_ready = false;
>+	record_len = 0;
>+
>+	while (1) {
>+		err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
>+
>+		if (err <= 0) {
>+			/* In case of any loop break(timeout, signal
>+			 * interrupt or shutdown), we report user that
>+			 * nothing was copied.
>+			 */
>+			err = 0;
>+			break;
>+		}
>+
>+		if (record_len == 0) {
>+			record_len =
>+				transport->seqpacket_seq_get_len(vsk);
>+
>+			if (record_len == 0)
>+				continue;
>+		}
>+
>+		err = transport->seqpacket_dequeue(vsk, msg,
>+					flags, &msg_ready);

A single line here should be okay.

>+		if (err < 0) {
>+			if (err == -EAGAIN) {
>+				iov_iter_init(&msg->msg_iter, READ,
>+					      orig_iov, orig_nr_segs,
>+					      len);
>+				/* Clear 'MSG_EOR' here, because dequeue
>+				 * callback above set it again if it was
>+				 * set by sender. This 'MSG_EOR' is from
>+				 * dropped record.
>+				 */
>+				msg->msg_flags &= ~MSG_EOR;
>+				record_len = 0;
>+				continue;
>+			}
>+
>+			err = -ENOMEM;
>+			break;
>+		}
>+
>+		if (msg_ready)
>+			break;
>+	}
>+
>+	if (sk->sk_err)
>+		err = -sk->sk_err;
>+	else if (sk->sk_shutdown & RCV_SHUTDOWN)
>+		err = 0;
>+
>+	if (msg_ready) {
>+		/* User sets MSG_TRUNC, so return real length of
>+		 * packet.
>+		 */
>+		if (flags & MSG_TRUNC)
>+			err = record_len;
>+		else
>+			err = len - msg->msg_iter.count;
>+
>+		/* Always set MSG_TRUNC if real length of packet is
>+		 * bigger than user's buffer.
>+		 */
>+		if (record_len > len)
>+			msg->msg_flags |= MSG_TRUNC;
>+	}
>+
>+	return err;
>+}
>+
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
>@@ -2032,7 +2123,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 		goto out;
> 	}
>
>-	err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+	if (sk->sk_type == SOCK_STREAM)
>+		err = __vsock_stream_recvmsg(sk, msg, len, flags);
>+	else
>+		err = __vsock_seqpacket_recvmsg(sk, msg, len, flags);
>
> out:
> 	release_sock(sk);

The rest seems ok to me, but I need to get more familiar with SEQPACKET 
before giving my R-b.

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 05/17] af_vsock: separate wait space loop
       [not found] ` <20210207151545.804889-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 12:14   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:14 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:15:41PM +0300, Arseny Krasnov wrote:
>This moves loop that waits for space on send to separate function,
>because it will be used for SEQ_BEGIN/SEQ_END sending before and
>after data transmission. Waiting for SEQ_BEGIN/SEQ_END is needed
>because such packets carries SEQPACKET header that couldn't be
>fragmented by credit mechanism, so to avoid it, sender waits until
>enough space will be ready.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  2 +
> net/vmw_vsock/af_vsock.c | 93 ++++++++++++++++++++++++++--------------
> 2 files changed, 62 insertions(+), 33 deletions(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index bb6a0e52be86..19f6f22821ec 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -205,6 +205,8 @@ void vsock_remove_sock(struct vsock_sock *vsk);
> void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
> int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
> bool vsock_find_cid(unsigned int cid);
>+int vsock_wait_space(struct sock *sk, size_t space, int flags,
>+		     struct vsock_transport_send_notify_data *send_data);
>
> /**** TAP ****/
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 3d8af987216a..ea99261e88ac 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1693,6 +1693,64 @@ static int vsock_connectible_getsockopt(struct socket *sock,
> 	return 0;
> }
>
>+int vsock_wait_space(struct sock *sk, size_t space, int flags,
>+		     struct vsock_transport_send_notify_data *send_data)
>+{
>+	const struct vsock_transport *transport;
>+	struct vsock_sock *vsk;
>+	long timeout;
>+	int err;
>+
>+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>+
>+	vsk = vsock_sk(sk);
>+	transport = vsk->transport;
>+	timeout = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>+	err = 0;
>+
>+	add_wait_queue(sk_sleep(sk), &wait);
>+
>+	while (vsock_stream_has_space(vsk) < space &&
>+	       sk->sk_err == 0 &&
>+	       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
>+	       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {

Maybe a new line here, like in the original code, would help the 
readability.

>+		/* Don't wait for non-blocking sockets. */
>+		if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out_err;
>+		}
>+
>+		if (send_data) {
>+			err = transport->notify_send_pre_block(vsk, send_data);
>+			if (err < 0)
>+				goto out_err;
>+		}
>+
>+		release_sock(sk);
>+		timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
>+		lock_sock(sk);
>+		if (signal_pending(current)) {
>+			err = sock_intr_errno(timeout);
>+			goto out_err;
>+		} else if (timeout == 0) {
>+			err = -EAGAIN;
>+			goto out_err;
>+		}
>+	}
>+
>+	if (sk->sk_err) {
>+		err = -sk->sk_err;
>+	} else if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
>+		   (vsk->peer_shutdown & RCV_SHUTDOWN)) {
>+		err = -EPIPE;
>+	}
>+
>+out_err:
>+	remove_wait_queue(sk_sleep(sk), &wait);
>+	return err;
>+}
>+EXPORT_SYMBOL_GPL(vsock_wait_space);
>+
> static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 				     size_t len)
> {

After removing the wait loop in vsock_connectible_sendmsg(), we should 
remove the 'timeout' variable because it is no longer used.

>@@ -1751,39 +1809,8 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	while (total_written < len) {
> 		ssize_t written;
>
>-		add_wait_queue(sk_sleep(sk), &wait);
>-		while (vsock_stream_has_space(vsk) == 0 &&
>-		       sk->sk_err == 0 &&
>-		       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
>-		       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
>-
>-			/* Don't wait for non-blocking sockets. */
>-			if (timeout == 0) {
>-				err = -EAGAIN;
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-
>-			err = transport->notify_send_pre_block(vsk, &send_data);
>-			if (err < 0) {
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-
>-			release_sock(sk);
>-			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
>-			lock_sock(sk);
>-			if (signal_pending(current)) {
>-				err = sock_intr_errno(timeout);
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			} else if (timeout == 0) {
>-				err = -EAGAIN;
>-				remove_wait_queue(sk_sleep(sk), &wait);
>-				goto out_err;
>-			}
>-		}
>-		remove_wait_queue(sk_sleep(sk), &wait);
>+		if (vsock_wait_space(sk, 1, msg->msg_flags, &send_data))
>+			goto out_err;
>
> 		/* These checks occur both as part of and after the loop
> 		 * conditional since we need to check before and after
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET
       [not found] ` <20210207151600.804998-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 12:17   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:17 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:15:57PM +0300, Arseny Krasnov wrote:
>This adds some logic to current stream enqueue function for SEQPACKET
>support:
>1) Send record's begin/end marker.
>2) Return value from enqueue function is whole record length or error
>   for SOCK_SEQPACKET.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/net/af_vsock.h   |  2 ++
> net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++--
> 2 files changed, 22 insertions(+), 2 deletions(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index 19f6f22821ec..198d58c4c7ee 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -136,6 +136,8 @@ struct vsock_transport {
> 	bool (*stream_allow)(u32 cid, u32 port);
>
> 	/* SEQ_PACKET. */
>+	int (*seqpacket_seq_send_len)(struct vsock_sock *, size_t len, int flags);
>+	int (*seqpacket_seq_send_eor)(struct vsock_sock *, int flags);

As before, we could add the identifier of the parameters.

Other than that, the patch LGTM.

Stefano

> 	size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
> 	int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
> 				     int flags, bool *msg_ready);
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index ea99261e88ac..a033d3340ac4 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1806,6 +1806,12 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	if (err < 0)
> 		goto out;
>
>+	if (sk->sk_type == SOCK_SEQPACKET) {
>+		err = transport->seqpacket_seq_send_len(vsk, len, msg->msg_flags);
>+		if (err < 0)
>+			goto out;
>+	}
>+
> 	while (total_written < len) {
> 		ssize_t written;
>
>@@ -1852,9 +1858,21 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>
> 	}
>
>+	if (sk->sk_type == SOCK_SEQPACKET) {
>+		err = transport->seqpacket_seq_send_eor(vsk, msg->msg_flags);
>+		if (err < 0)
>+			goto out;
>+	}
>+
> out_err:
>-	if (total_written > 0)
>-		err = total_written;
>+	if (total_written > 0) {
>+		/* Return number of written bytes only if:
>+		 * 1) SOCK_STREAM socket.
>+		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
>+		 */
>+		if (sk->sk_type == SOCK_STREAM || total_written == len)
>+			err = total_written;
>+	}
> out:
> 	release_sock(sk);
> 	return err;
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support
       [not found] ` <20210207151615.805115-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 12:27   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 12:27 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote:
>This does rest of SOCK_SEQPACKET support:
>1) Adds socket ops for SEQPACKET type.
>2) Allows to create socket with SEQPACKET type.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 37 ++++++++++++++++++++++++++++++++++++-
> 1 file changed, 36 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index a033d3340ac4..c77998a14018 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 		new_transport = transport_dgram;
> 		break;
> 	case SOCK_STREAM:
>+	case SOCK_SEQPACKET:
> 		if (vsock_use_local_transport(remote_cid))
> 			new_transport = transport_local;
> 		else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
>@@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 			new_transport = transport_g2h;
> 		else
> 			new_transport = transport_h2g;
>+
>+		if (sk->sk_type == SOCK_SEQPACKET) {
>+			if (!new_transport ||
>+			    !new_transport->seqpacket_seq_send_len ||
>+			    !new_transport->seqpacket_seq_send_eor ||
>+			    !new_transport->seqpacket_seq_get_len ||
>+			    !new_transport->seqpacket_dequeue)
>+				return -ESOCKTNOSUPPORT;
>+		}

Maybe we should move this check after the try_module_get() call, since 
the memory pointed by 'new_transport' pointer can be deallocated in the 
meantime.

Also, if the socket had a transport before, we should deassign it before 
returning an error.

> 		break;
> 	default:
> 		return -ESOCKTNOSUPPORT;
>@@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
>
> 	switch (sk->sk_socket->type) {
> 	case SOCK_STREAM:
>+	case SOCK_SEQPACKET:
> 		spin_lock_bh(&vsock_table_lock);
> 		retval = __vsock_bind_connectible(vsk, addr);
> 		spin_unlock_bh(&vsock_table_lock);
>@@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,
>
> static bool sock_type_connectible(u16 type)
> {
>-	return type == SOCK_STREAM;
>+	return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
> }
>
> static void __vsock_release(struct sock *sk, int level)
>@@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
> 	.sendpage = sock_no_sendpage,
> };
>
>+static const struct proto_ops vsock_seqpacket_ops = {
>+	.family = PF_VSOCK,
>+	.owner = THIS_MODULE,
>+	.release = vsock_release,
>+	.bind = vsock_bind,
>+	.connect = vsock_connect,
>+	.socketpair = sock_no_socketpair,
>+	.accept = vsock_accept,
>+	.getname = vsock_getname,
>+	.poll = vsock_poll,
>+	.ioctl = sock_no_ioctl,
>+	.listen = vsock_listen,
>+	.shutdown = vsock_shutdown,
>+	.setsockopt = vsock_connectible_setsockopt,
>+	.getsockopt = vsock_connectible_getsockopt,
>+	.sendmsg = vsock_connectible_sendmsg,
>+	.recvmsg = vsock_connectible_recvmsg,
>+	.mmap = sock_no_mmap,
>+	.sendpage = sock_no_sendpage,
>+};
>+
> static int vsock_create(struct net *net, struct socket *sock,
> 			int protocol, int kern)
> {
>@@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket *sock,
> 	case SOCK_STREAM:
> 		sock->ops = &vsock_stream_ops;
> 		break;
>+	case SOCK_SEQPACKET:
>+		sock->ops = &vsock_seqpacket_ops;
>+		break;
> 	default:
> 		return -ESOCKTNOSUPPORT;
> 	}
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets
       [not found] ` <20210207151632.805240-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 13:19   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:19 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:16:29PM +0300, Arseny Krasnov wrote:
>This replaces 'stream' to 'connect oriented' in comments as SEQPACKET is
>also connect oriented.

I'm not a native speaker but maybe is better 'connection oriented' or 
looking at socket(2) man page 'connection-based' is also fine.

Thanks,
Stefano

>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/af_vsock.c | 31 +++++++++++++++++--------------
> 1 file changed, 17 insertions(+), 14 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index c77998a14018..6e5e192cb703 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -415,8 +415,8 @@ static void vsock_deassign_transport(struct vsock_sock *vsk)
>
> /* Assign a transport to a socket and call the .init transport callback.
>  *
>- * Note: for stream socket this must be called when vsk->remote_addr is set
>- * (e.g. during the connect() or when a connection request on a listener
>+ * Note: for connect oriented socket this must be called when vsk->remote_addr
>+ * is set (e.g. during the connect() or when a connection request on a listener
>  * socket is received).
>  * The vsk->remote_addr is used to decide which transport to use:
>  *  - remote CID == VMADDR_CID_LOCAL or g2h->local_cid or VMADDR_CID_HOST if
>@@ -479,10 +479,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 			return 0;
>
> 		/* transport->release() must be called with sock lock acquired.
>-		 * This path can only be taken during vsock_stream_connect(),
>-		 * where we have already held the sock lock.
>-		 * In the other cases, this function is called on a new socket
>-		 * which is not assigned to any transport.
>+		 * This path can only be taken during vsock_connect(), where we
>+		 * have already held the sock lock. In the other cases, this
>+		 * function is called on a new socket which is not assigned to
>+		 * any transport.
> 		 */
> 		vsk->transport->release(vsk);
> 		vsock_deassign_transport(vsk);
>@@ -659,9 +659,10 @@ static int __vsock_bind_connectible(struct vsock_sock *vsk,
>
> 	vsock_addr_init(&vsk->local_addr, new_addr.svm_cid, new_addr.svm_port);
>
>-	/* Remove stream sockets from the unbound list and add them to the hash
>-	 * table for easy lookup by its address.  The unbound list is simply an
>-	 * extra entry at the end of the hash table, a trick used by AF_UNIX.
>+	/* Remove connect oriented sockets from the unbound list and add them
>+	 * to the hash table for easy lookup by its address.  The unbound list
>+	 * is simply an extra entry at the end of the hash table, a trick used
>+	 * by AF_UNIX.
> 	 */
> 	__vsock_remove_bound(vsk);
> 	__vsock_insert_bound(vsock_bound_sockets(&vsk->local_addr), vsk);
>@@ -952,10 +953,10 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 	if ((mode & ~SHUTDOWN_MASK) || !mode)
> 		return -EINVAL;
>
>-	/* If this is a STREAM socket and it is not connected then bail out
>-	 * immediately.  If it is a DGRAM socket then we must first kick the
>-	 * socket so that it wakes up from any sleeping calls, for example
>-	 * recv(), and then afterwards return the error.
>+	/* If this is a connect oriented socket and it is not connected then
>+	 * bail out immediately.  If it is a DGRAM socket then we must first
>+	 * kick the socket so that it wakes up from any sleeping calls, for
>+	 * example recv(), and then afterwards return the error.
> 	 */
>
> 	sk = sock->sk;
>@@ -1786,7 +1787,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
>
> 	transport = vsk->transport;
>
>-	/* Callers should not provide a destination with stream sockets. */
>+	/* Callers should not provide a destination with connect oriented
>+	 * sockets.
>+	 */
> 	if (msg->msg_namelen) {
> 		err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
> 		goto out;
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
       [not found] ` <20210207151649.805359-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 13:54   ` Stefano Garzarella
  2021-02-11 14:03     ` Stefano Garzarella
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:54 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:
>This adds transport callback and it's logic for SEQPACKET dequeue.
>Callback fetches RW packets from rx queue of socket until whole record
>is copied(if user's buffer is full, user is not woken up). This is done
>to not stall sender, because if we wake up user and it leaves syscall,
>nobody will send credit update for rest of record, and sender will wait
>for next enter of read syscall at receiver's side. So if user buffer is
>full, we just send credit update and drop data. If during copy SEQ_BEGIN
>was found(and not all data was copied), copying is restarted by reset
>user's iov iterator(previous unfinished data is dropped).
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |   5 +
> include/uapi/linux/virtio_vsock.h       |  16 ++++
> net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
> 3 files changed, 141 insertions(+)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index dc636b727179..4d0de3dee9a4 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
> 	u32 rx_bytes;
> 	u32 buf_alloc;
> 	struct list_head rx_queue;
>+
>+	/* For SOCK_SEQPACKET */
>+	u32 user_read_seq_len;
>+	u32 user_read_copied;
>+	u32 curr_rx_msg_cnt;
> };
>
> struct virtio_vsock_pkt {
>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>index 1d57ed3d84d2..cf9c165e5cca 100644
>--- a/include/uapi/linux/virtio_vsock.h
>+++ b/include/uapi/linux/virtio_vsock.h
>@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
> 	__le32	fwd_cnt;
> } __attribute__((packed));
>
>+struct virtio_vsock_seq_hdr {
>+	__le32  msg_cnt;
>+	__le32  msg_len;
>+} __attribute__((packed));
>+
> enum virtio_vsock_type {
> 	VIRTIO_VSOCK_TYPE_STREAM = 1,
>+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
> };
>
> enum virtio_vsock_op {
>@@ -83,6 +89,11 @@ enum virtio_vsock_op {
> 	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> 	/* Request the peer to send the credit info to us */
> 	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>+
>+	/* Record begin for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>+	/* Record end for SOCK_SEQPACKET */
>+	VIRTIO_VSOCK_OP_SEQ_END = 9,
> };
>
> /* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
> 	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
> };
>
>+/* VIRTIO_VSOCK_OP_RW flags values */
>+enum virtio_vsock_rw {
>+	VIRTIO_VSOCK_RW_EOR = 1,
>+};
>+
> #endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 5956939eebb7..4572d01c8ea5 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	return err;
> }
>
>+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>+{
>+	list_del(&pkt->list);
>+	virtio_transport_free_pkt(pkt);
>+}
>+
>+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
>+{

This function is not used here, but in the next patch, so I'd add this 
with the next patch.

>+	struct virtio_vsock_pkt *pkt, *n;
>+	size_t bytes_dropped = 0;
>+
>+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
>+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
>+			break;
>+
>+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
>+		virtio_transport_dec_rx_pkt(vvs, pkt);
>+		virtio_transport_remove_pkt(pkt);
>+	}
>+
>+	return bytes_dropped;
>+}
>+
>+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>+						 struct msghdr *msg,
>+						 bool *msg_ready)
>+{

Also this function is not used, maybe you can add in this patch the 
virtio_transport_seqpacket_dequeue() implementation.

>+	struct virtio_vsock_sock *vvs = vsk->trans;
>+	struct virtio_vsock_pkt *pkt;
>+	int err = 0;
>+	size_t user_buf_len = msg->msg_iter.count;
>+
>+	*msg_ready = false;
>+	spin_lock_bh(&vvs->rx_lock);
>+
>+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
>+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>+
>+		switch (le16_to_cpu(pkt->hdr.op)) {
>+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
>+			/* Unexpected 'SEQ_BEGIN' during record copy:
>+			 * Leave receive loop, 'EAGAIN' will restart it from
>+			 * outer receive loop, packet is still in queue and
>+			 * counters are cleared. So in next loop enter,
>+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
>+			 * iterator will be reset in outer loop. Also
>+			 * send credit update, because some bytes could be
>+			 * copied. User will never see unfinished record.
>+			 */
>+			err = -EAGAIN;
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_SEQ_END: {
>+			struct virtio_vsock_seq_hdr *seq_hdr;
>+
>+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>+			/* First check that whole record is received. */
>+
>+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
>+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
>+				/* Tail of current record and head of next missed,
>+				 * so this EOR is from next record. Restart receive.
>+				 * Current record will be dropped, next headless will
>+				 * be dropped on next attempt to get record length.
>+				 */
>+				err = -EAGAIN;
>+			} else {
>+				/* Success. */
>+				*msg_ready = true;
>+			}
>+
>+			break;
>+		}
>+		case VIRTIO_VSOCK_OP_RW: {
>+			size_t bytes_to_copy;
>+			size_t pkt_len;
>+
>+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>+			bytes_to_copy = min(user_buf_len, pkt_len);
>+
>+			/* sk_lock is held by caller so no one else can dequeue.
>+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>+			 */
>+			spin_unlock_bh(&vvs->rx_lock);
>+
>+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
>+				spin_lock_bh(&vvs->rx_lock);
>+				err = -EINVAL;
>+				break;
>+			}
>+
>+			spin_lock_bh(&vvs->rx_lock);
>+			user_buf_len -= bytes_to_copy;
>+			vvs->user_read_copied += pkt_len;
>+
>+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
>+				msg->msg_flags |= MSG_EOR;
>+			break;
>+		}
>+		default:
>+			;
>+		}
>+
>+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
>+		 * but drop any other type of packet.
>+		 */
>+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
>+			virtio_transport_dec_rx_pkt(vvs, pkt);
>+			virtio_transport_remove_pkt(pkt);
>+		}
>+	}
>+
>+	spin_unlock_bh(&vvs->rx_lock);
>+
>+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>+					    NULL);
>+
>+	return err;
>+}
>+
> ssize_t
> virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> 				struct msghdr *msg,
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record
       [not found] ` <20210207151711.805503-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 13:58   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 13:58 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Sun, Feb 07, 2021 at 06:17:08PM +0300, Arseny Krasnov wrote:
>This adds transport callback which tries to fetch record begin marker
>from socket's rx queue. It is called from af_vsock.c before reading data
>packets of record.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |  1 +
> net/vmw_vsock/virtio_transport_common.c | 40 +++++++++++++++++++++++++
> 2 files changed, 41 insertions(+)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index 4d0de3dee9a4..a5e8681bfc6a 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -85,6 +85,7 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
> 			       size_t len, int flags);
>
>+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 4572d01c8ea5..7ac552bfd90b 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -420,6 +420,46 @@ static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vv
> 	return bytes_dropped;
> }
>
>+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
>+{
>+	struct virtio_vsock_seq_hdr *seq_hdr;
>+	struct virtio_vsock_sock *vvs;
>+	struct virtio_vsock_pkt *pkt;
>+	size_t bytes_dropped;
>+
>+	vvs = vsk->trans;
>+
>+	spin_lock_bh(&vvs->rx_lock);
>+
>+	/* Fetch all orphaned 'RW', packets, and
>+	 * send credit update.

Single line?

>+	 */
>+	bytes_dropped = virtio_transport_drop_until_seq_begin(vvs);
>+
>+	if (list_empty(&vvs->rx_queue))
>+		goto out;
>+
>+	pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>+
>+	vvs->user_read_copied = 0;
>+
>+	seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>+	vvs->user_read_seq_len = le32_to_cpu(seq_hdr->msg_len);
>+	vvs->curr_rx_msg_cnt = le32_to_cpu(seq_hdr->msg_cnt);
>+	virtio_transport_dec_rx_pkt(vvs, pkt);
>+	virtio_transport_remove_pkt(pkt);
>+out:
>+	spin_unlock_bh(&vvs->rx_lock);
>+
>+	if (bytes_dropped)
>+		virtio_transport_send_credit_update(vsk,
>+						    VIRTIO_VSOCK_TYPE_SEQPACKET,
>+						    NULL);
>+
>+	return vvs->user_read_seq_len;
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_get_len);
>+
> static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
> 						 struct msghdr *msg,
> 						 bool *msg_ready)
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET
  2021-02-11 13:54   ` [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET Stefano Garzarella
@ 2021-02-11 14:03     ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:03 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

On Thu, Feb 11, 2021 at 02:54:28PM +0100, Stefano Garzarella wrote:
>On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:
>>This adds transport callback and it's logic for SEQPACKET dequeue.
>>Callback fetches RW packets from rx queue of socket until whole record
>>is copied(if user's buffer is full, user is not woken up). This is done
>>to not stall sender, because if we wake up user and it leaves syscall,
>>nobody will send credit update for rest of record, and sender will wait
>>for next enter of read syscall at receiver's side. So if user buffer is
>>full, we just send credit update and drop data. If during copy SEQ_BEGIN
>>was found(and not all data was copied), copying is restarted by reset
>>user's iov iterator(previous unfinished data is dropped).
>>
>>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>>---
>>include/linux/virtio_vsock.h            |   5 +
>>include/uapi/linux/virtio_vsock.h       |  16 ++++
>>net/vmw_vsock/virtio_transport_common.c | 120 ++++++++++++++++++++++++
>>3 files changed, 141 insertions(+)
>>
>>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>>index dc636b727179..4d0de3dee9a4 100644
>>--- a/include/linux/virtio_vsock.h
>>+++ b/include/linux/virtio_vsock.h
>>@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
>>	u32 rx_bytes;
>>	u32 buf_alloc;
>>	struct list_head rx_queue;
>>+
>>+	/* For SOCK_SEQPACKET */
>>+	u32 user_read_seq_len;
>>+	u32 user_read_copied;
>>+	u32 curr_rx_msg_cnt;
>>};
>>
>>struct virtio_vsock_pkt {
>>diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>>index 1d57ed3d84d2..cf9c165e5cca 100644
>>--- a/include/uapi/linux/virtio_vsock.h
>>+++ b/include/uapi/linux/virtio_vsock.h
>>@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
>>	__le32	fwd_cnt;
>>} __attribute__((packed));
>>
>>+struct virtio_vsock_seq_hdr {
>>+	__le32  msg_cnt;

Maybe it's better 'msg_id' for this field, since we use it to identify a 
message. Then whether we use a counter or a random number, I think it's 
just an implementation detail.

As Michael said, perhaps this detail should be discussed in the proposal 
for VIRTIO spec changes.

>>+	__le32  msg_len;
>>+} __attribute__((packed));
>>+
>>enum virtio_vsock_type {
>>	VIRTIO_VSOCK_TYPE_STREAM = 1,
>>+	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
>>};
>>
>>enum virtio_vsock_op {
>>@@ -83,6 +89,11 @@ enum virtio_vsock_op {
>>	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
>>	/* Request the peer to send the credit info to us */
>>	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
>>+
>>+	/* Record begin for SOCK_SEQPACKET */
>>+	VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
>>+	/* Record end for SOCK_SEQPACKET */
>>+	VIRTIO_VSOCK_OP_SEQ_END = 9,
>>};
>>
>>/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
>>@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
>>	VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
>>};
>>
>>+/* VIRTIO_VSOCK_OP_RW flags values */
>>+enum virtio_vsock_rw {
>>+	VIRTIO_VSOCK_RW_EOR = 1,
>>+};
>>+
>>#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
>>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>>index 5956939eebb7..4572d01c8ea5 100644
>>--- a/net/vmw_vsock/virtio_transport_common.c
>>+++ b/net/vmw_vsock/virtio_transport_common.c
>>@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
>>	return err;
>>}
>>
>>+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
>>+{
>>+	list_del(&pkt->list);
>>+	virtio_transport_free_pkt(pkt);
>>+}
>>+
>>+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock *vvs)
>>+{
>
>This function is not used here, but in the next patch, so I'd add this 
>with the next patch.
>
>>+	struct virtio_vsock_pkt *pkt, *n;
>>+	size_t bytes_dropped = 0;
>>+
>>+	list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
>>+		if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
>>+			break;
>>+
>>+		bytes_dropped += le32_to_cpu(pkt->hdr.len);
>>+		virtio_transport_dec_rx_pkt(vvs, pkt);
>>+		virtio_transport_remove_pkt(pkt);
>>+	}
>>+
>>+	return bytes_dropped;
>>+}
>>+
>>+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>>+						 struct msghdr *msg,
>>+						 bool *msg_ready)
>>+{
>
>Also this function is not used, maybe you can add in this patch the 
>virtio_transport_seqpacket_dequeue() implementation.
>
>>+	struct virtio_vsock_sock *vvs = vsk->trans;
>>+	struct virtio_vsock_pkt *pkt;
>>+	int err = 0;
>>+	size_t user_buf_len = msg->msg_iter.count;
>>+
>>+	*msg_ready = false;
>>+	spin_lock_bh(&vvs->rx_lock);
>>+
>>+	while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
>>+		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>>+
>>+		switch (le16_to_cpu(pkt->hdr.op)) {
>>+		case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
>>+			/* Unexpected 'SEQ_BEGIN' during record copy:
>>+			 * Leave receive loop, 'EAGAIN' will restart it from
>>+			 * outer receive loop, packet is still in queue and
>>+			 * counters are cleared. So in next loop enter,
>>+			 * 'SEQ_BEGIN' will be dequeued first. User's iov
>>+			 * iterator will be reset in outer loop. Also
>>+			 * send credit update, because some bytes could be
>>+			 * copied. User will never see unfinished record.
>>+			 */
>>+			err = -EAGAIN;
>>+			break;
>>+		}
>>+		case VIRTIO_VSOCK_OP_SEQ_END: {
>>+			struct virtio_vsock_seq_hdr *seq_hdr;
>>+
>>+			seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
>>+			/* First check that whole record is received. */
>>+
>>+			if (vvs->user_read_copied != vvs->user_read_seq_len ||
>>+			    (le32_to_cpu(seq_hdr->msg_cnt) - vvs->curr_rx_msg_cnt) != 1) {
>>+				/* Tail of current record and head of next missed,
>>+				 * so this EOR is from next record. Restart receive.
>>+				 * Current record will be dropped, next headless will
>>+				 * be dropped on next attempt to get record length.
>>+				 */
>>+				err = -EAGAIN;
>>+			} else {
>>+				/* Success. */
>>+				*msg_ready = true;
>>+			}
>>+
>>+			break;
>>+		}
>>+		case VIRTIO_VSOCK_OP_RW: {
>>+			size_t bytes_to_copy;
>>+			size_t pkt_len;
>>+
>>+			pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>>+			bytes_to_copy = min(user_buf_len, pkt_len);
>>+
>>+			/* sk_lock is held by caller so no one else can dequeue.
>>+			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>>+			 */
>>+			spin_unlock_bh(&vvs->rx_lock);
>>+
>>+			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy)) {
>>+				spin_lock_bh(&vvs->rx_lock);
>>+				err = -EINVAL;
>>+				break;
>>+			}
>>+
>>+			spin_lock_bh(&vvs->rx_lock);
>>+			user_buf_len -= bytes_to_copy;
>>+			vvs->user_read_copied += pkt_len;
>>+
>>+			if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_RW_EOR)
>>+				msg->msg_flags |= MSG_EOR;
>>+			break;
>>+		}
>>+		default:
>>+			;
>>+		}
>>+
>>+		/* For unexpected 'SEQ_BEGIN', keep such packet in queue,
>>+		 * but drop any other type of packet.
>>+		 */
>>+		if (le16_to_cpu(pkt->hdr.op) != VIRTIO_VSOCK_OP_SEQ_BEGIN) {
>>+			virtio_transport_dec_rx_pkt(vvs, pkt);
>>+			virtio_transport_remove_pkt(pkt);
>>+		}
>>+	}
>>+
>>+	spin_unlock_bh(&vvs->rx_lock);
>>+
>>+	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>>+					    NULL);
>>+
>>+	return err;
>>+}
>>+
>>ssize_t
>>virtio_transport_stream_dequeue(struct vsock_sock *vsk,
>>				struct msghdr *msg,
>>-- 
>>2.25.1
>>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support
       [not found] ` <20210207151747.805754-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 14:29   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:29 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, netdev, stsp2,
	linux-kernel, virtualization, oxffffaa, Stefan Hajnoczi,
	Colin Ian King, Jakub Kicinski, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Sun, Feb 07, 2021 at 06:17:44PM +0300, Arseny Krasnov wrote:
>This adds rest of logic for SEQPACKET:
>1) Packet's type is now set in 'virtio_send_pkt_info()' using
>   type of socket.
>2) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
>   Note that both functions may sleep to wait enough space for
>   SEQPACKET header.
>3) SEQ_BEGIN/SEQ_END to TAP packet capture.
>4) Send SHUTDOWN on socket close for SEQPACKET type.
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> include/linux/virtio_vsock.h            |  9 +++
> net/vmw_vsock/virtio_transport_common.c | 99 +++++++++++++++++++++----
> 2 files changed, 95 insertions(+), 13 deletions(-)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index a5e8681bfc6a..c4a39424686d 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -41,6 +41,7 @@ struct virtio_vsock_sock {
> 	u32 user_read_seq_len;
> 	u32 user_read_copied;
> 	u32 curr_rx_msg_cnt;
>+	u32 next_tx_msg_cnt;
> };
>
> struct virtio_vsock_pkt {
>@@ -85,7 +86,15 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
> 			       size_t len, int flags);
>
>+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags);
>+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags);
> size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
>+int
>+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>+				   struct msghdr *msg,
>+				   int flags,
>+				   bool *msg_ready);
>+
> s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
> s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 51b66f8dd7c7..0aa0fd33e9d6 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
> 		break;
> 	case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
> 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
>+	case VIRTIO_VSOCK_OP_SEQ_BEGIN:
>+	case VIRTIO_VSOCK_OP_SEQ_END:
> 		hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
> 		break;
> 	default:
>@@ -165,6 +167,14 @@ void virtio_transport_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
> }
> EXPORT_SYMBOL_GPL(virtio_transport_deliver_tap_pkt);
>
>+static u16 virtio_transport_get_type(struct sock *sk)
>+{
>+	if (sk->sk_type == SOCK_STREAM)
>+		return VIRTIO_VSOCK_TYPE_STREAM;
>+	else
>+		return VIRTIO_VSOCK_TYPE_SEQPACKET;
>+}
>+

Maybe add this function in this part of the file from the first patch, 
so you don't need to move it in this series.

> /* This function can only be used on connecting/connected sockets,
>  * since a socket assigned to a transport is required.
>  *
>@@ -179,6 +189,13 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> 	struct virtio_vsock_pkt *pkt;
> 	u32 pkt_len = info->pkt_len;
>
>+	info->type = virtio_transport_get_type(sk_vsock(vsk));

I'd this change in another patch before this one, since this touch also 
the stream part.

>+
>+	if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
>+	    info->msg &&
>+	    info->msg->msg_flags & MSG_EOR)
>+		info->flags |= VIRTIO_VSOCK_RW_EOR;
>+
> 	t_ops = virtio_transport_get_ops(vsk);
> 	if (unlikely(!t_ops))
> 		return -EFAULT;
>@@ -397,13 +414,61 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	return err;
> }
>
>-static u16 virtio_transport_get_type(struct sock *sk)
>+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
>+						int type,
>+						size_t len,
>+						int flags)
> {
>-	if (sk->sk_type == SOCK_STREAM)
>-		return VIRTIO_VSOCK_TYPE_STREAM;
>-	else
>-		return VIRTIO_VSOCK_TYPE_SEQPACKET;
>+	struct virtio_vsock_sock *vvs = vsk->trans;
>+	struct virtio_vsock_pkt_info info = {
>+		.op = type,
>+		.vsk = vsk,
>+		.pkt_len = sizeof(struct virtio_vsock_seq_hdr)
>+	};
>+
>+	struct virtio_vsock_seq_hdr seq_hdr = {
>+		.msg_cnt = vvs->next_tx_msg_cnt,
>+		.msg_len = len
>+	};
>+
>+	struct kvec seq_hdr_kiov = {
>+		.iov_base = (void *)&seq_hdr,
>+		.iov_len = sizeof(struct virtio_vsock_seq_hdr)
>+	};
>+
>+	struct msghdr msg = {0};
>+
>+	//XXX: do we need 'vsock_transport_send_notify_data' pointer?
>+	if (vsock_wait_space(sk_vsock(vsk),
>+			     sizeof(struct virtio_vsock_seq_hdr),
>+			     flags, NULL))
>+		return -1;
>+
>+	iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
>+
>+	info.msg = &msg;
>+	vvs->next_tx_msg_cnt++;
>+
>+	return virtio_transport_send_pkt_info(vsk, &info);
>+}
>+
>+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t len, int flags)
>+{
>+	return virtio_transport_seqpacket_send_ctrl(vsk,
>+						    VIRTIO_VSOCK_OP_SEQ_BEGIN,
>+						    len,
>+						    flags);
> }
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_len);
>+
>+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags)
>+{
>+	return virtio_transport_seqpacket_send_ctrl(vsk,
>+						    VIRTIO_VSOCK_OP_SEQ_END,
>+						    0,
>+						    flags);
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_send_eor);
>
> static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
> {
>@@ -577,6 +642,18 @@ virtio_transport_stream_dequeue(struct vsock_sock *vsk,
> }
> EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
>
>+int
>+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>+				   struct msghdr *msg,
>+				   int flags, bool *msg_ready)
>+{
>+	if (flags & MSG_PEEK)
>+		return -EOPNOTSUPP;
>+
>+	return virtio_transport_seqpacket_do_dequeue(vsk, msg, msg_ready);
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
>+
> int
> virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> 			       struct msghdr *msg,
>@@ -658,14 +735,15 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
> void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
> {
> 	struct virtio_vsock_sock *vvs = vsk->trans;
>+	int type;
>
> 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
>
> 	vvs->buf_alloc = *val;
>
>-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
>-					    NULL);
>+	type = virtio_transport_get_type(sk_vsock(vsk));
>+	virtio_transport_send_credit_update(vsk, type, NULL);

I think we can remove the 'type' parameter of 
virtio_transport_send_credit_update() since 
virtio_transport_send_pkt_info() will overwrite it.

> }
> EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
>
>@@ -792,7 +870,6 @@ int virtio_transport_connect(struct vsock_sock *vsk)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_REQUEST,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.vsk = vsk,
> 	};
>
>@@ -804,7 +881,6 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_SHUTDOWN,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.flags = (mode & RCV_SHUTDOWN ?
> 			  VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
> 			 (mode & SEND_SHUTDOWN ?
>@@ -833,7 +909,6 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RW,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.msg = msg,
> 		.pkt_len = len,
> 		.vsk = vsk,
>@@ -856,7 +931,6 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RST,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.reply = !!pkt,
> 		.vsk = vsk,
> 	};

These changes could go with the new patch to handle the type directly in 
the virtio_transport_send_pkt_info().


>@@ -1001,7 +1075,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
> 	struct sock *sk = &vsk->sk;
> 	bool remove_sock = true;
>
>-	if (sk->sk_type == SOCK_STREAM)
>+	if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
> 		remove_sock = virtio_transport_close(vsk);
>
> 	list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
>@@ -1164,7 +1238,6 @@ virtio_transport_send_response(struct vsock_sock *vsk,
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_RESPONSE,
>-		.type = VIRTIO_VSOCK_TYPE_STREAM,
> 		.remote_cid = le64_to_cpu(pkt->hdr.src_cid),
> 		.remote_port = le32_to_cpu(pkt->hdr.src_port),
> 		.reply = true,

Also this one.

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport
       [not found] ` <20210207151851.806233-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 14:31   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:31 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, Jeff Vander Stoep,
	stsp2, linux-kernel, virtualization, oxffffaa, netdev,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen

Please move this patch before the test and I'd change the prefix in 
"vsock_loopback" or "vsock/loopback".

Thanks,
Stefano

On Sun, Feb 07, 2021 at 06:18:48PM +0300, Arseny Krasnov wrote:
>This adds SEQPACKET ops for loopback transport
>
>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/vsock_loopback.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
>index a45f7ffca8c5..c0da94119f74 100644
>--- a/net/vmw_vsock/vsock_loopback.c
>+++ b/net/vmw_vsock/vsock_loopback.c
>@@ -89,6 +89,11 @@ static struct virtio_transport loopback_transport = {
> 		.stream_is_active         = virtio_transport_stream_is_active,
> 		.stream_allow             = virtio_transport_stream_allow,
>
>+		.seqpacket_seq_send_len	  = virtio_transport_seqpacket_seq_send_len,
>+		.seqpacket_seq_send_eor	  = virtio_transport_seqpacket_seq_send_eor,
>+		.seqpacket_seq_get_len	  = virtio_transport_seqpacket_seq_get_len,
>+		.seqpacket_dequeue        = virtio_transport_seqpacket_dequeue,
>+
> 		.notify_poll_in           = virtio_transport_notify_poll_in,
> 		.notify_poll_out          = virtio_transport_notify_poll_out,
> 		.notify_recv_init         = virtio_transport_notify_recv_init,
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API
       [not found] ` <20210207151906.806343-1-arseny.krasnov@kaspersky.com>
@ 2021-02-11 14:39   ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:39 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm, Michael S. Tsirkin, netdev, stsp2,
	linux-kernel, virtualization, oxffffaa, Stefan Hajnoczi,
	Colin Ian King, Jakub Kicinski, David S. Miller, Jorgen Hansen,
	Alexander Popov

On Sun, Feb 07, 2021 at 06:19:03PM +0300, Arseny Krasnov wrote:
>'virtio_transport_send_credit_update()' has some extra args:
>1) 'type' may be set in 'virtio_transport_send_pkt_info()' using type
>   of socket.
>2) This function is static and 'hdr' arg was always NULL.
>

Okay, I saw this patch after my previous comment.

I think this looks good, but please move this before your changes (e.g.  
before patch 'virtio/vsock: dequeue callback for SOCK_SEQPACKET').

In this way you don't need to modify 
virtio_transport_notify_buffer_size(), calling 
virtio_transport_get_type() and then remove these changes.

It's generally not a good idea to make changes in a patch and then 
remove them a few patches later in the same series. This should ring a 
bell about moving these changes before others.

Thanks,
Stefano

>Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 20 +++++---------------
> 1 file changed, 5 insertions(+), 15 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c 
>b/net/vmw_vsock/virtio_transport_common.c
>index 0aa0fd33e9d6..46308679c8a4 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -286,13 +286,10 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
> }
> EXPORT_SYMBOL_GPL(virtio_transport_put_credit);
>
>-static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
>-					       int type,
>-					       struct virtio_vsock_hdr *hdr)
>+static int virtio_transport_send_credit_update(struct vsock_sock *vsk)
> {
> 	struct virtio_vsock_pkt_info info = {
> 		.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
>-		.type = type,
> 		.vsk = vsk,
> 	};
>
>@@ -401,9 +398,7 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 	 * with different values.
> 	 */
> 	if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
>-		virtio_transport_send_credit_update(vsk,
>-						    VIRTIO_VSOCK_TYPE_STREAM,
>-						    NULL);
>+		virtio_transport_send_credit_update(vsk);
> 	}
>
> 	return total;
>@@ -525,9 +520,7 @@ size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
> 	spin_unlock_bh(&vvs->rx_lock);
>
> 	if (bytes_dropped)
>-		virtio_transport_send_credit_update(vsk,
>-						    VIRTIO_VSOCK_TYPE_SEQPACKET,
>-						    NULL);
>+		virtio_transport_send_credit_update(vsk);
>
> 	return vvs->user_read_seq_len;
> }
>@@ -624,8 +617,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>
> 	spin_unlock_bh(&vvs->rx_lock);
>
>-	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
>-					    NULL);
>+	virtio_transport_send_credit_update(vsk);
>
> 	return err;
> }
>@@ -735,15 +727,13 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
> void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
> {
> 	struct virtio_vsock_sock *vvs = vsk->trans;
>-	int type;
>
> 	if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
> 		*val = VIRTIO_VSOCK_MAX_BUF_SIZE;
>
> 	vvs->buf_alloc = *val;
>
>-	type = virtio_transport_get_type(sk_vsock(vsk));
>-	virtio_transport_send_credit_update(vsk, type, NULL);
>+	virtio_transport_send_credit_update(vsk);
> }
> EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);
>
>-- 
>2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
       [not found]   ` <8bd3789c-8df1-4383-f233-b4b854b30970@kaspersky.com>
@ 2021-02-11 14:57     ` Stefano Garzarella
       [not found]       ` <10aa4548-2455-295d-c993-30f25fba15f2@kaspersky.com>
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-11 14:57 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm@vger.kernel.org, Michael S. Tsirkin,
	netdev@vger.kernel.org, stsp2@yandex.ru,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, oxffffaa@gmail.com,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen, Alexander Popov

Hi Arseny,

On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>
>On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>> transport.
>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>> do it, two new packet operations were added: first for start of record
>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>> both operations carries metadata - to maintain boundaries and payload
>>> integrity. Metadata is introduced by adding special header with two
>>> fields - message count and message length:
>>>
>>> 	struct virtio_vsock_seq_hdr {
>>> 		__le32  msg_cnt;
>>> 		__le32  msg_len;
>>> 	} __attribute__((packed));
>>>
>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>> header to avoid touching first virtio buffer which carries header of
>>> packet, because someone could check that size of this buffer is equal
>>> to size of packet header. To send record, packet with start marker is
>>> sent first(it's header contains length of record and counter), then
>>> counter is incremented and all data is sent as usual 'RW' packets and
>>> finally SEQ_END is sent(it also carries counter of message, which is
>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>> incremented again. On receiver's side, length of record is known from
>>> packet with start record marker. To check that no packets were dropped
>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>> 1) and length of data between two markers is compared to length in
>>> SEQ_BEGIN header.
>>> 	Now as  packets of one socket are not reordered neither on
>>> vsock nor on vhost transport layers, such markers allows to restore
>>> original record on receiver's side. If user's buffer is smaller that
>>> record length, when all out of size data is dropped.
>>> 	Maximum length of datagram is not limited as in stream socket,
>>> because same credit logic is used. Difference with stream socket is
>>> that user is not woken up until whole record is received or error
>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>> 	Tests also implemented.
>>>
>>>  Arseny Krasnov (17):
>>>   af_vsock: update functions for connectible socket
>>>   af_vsock: separate wait data loop
>>>   af_vsock: separate receive data loop
>>>   af_vsock: implement SEQPACKET receive loop
>>>   af_vsock: separate wait space loop
>>>   af_vsock: implement send logic for SEQPACKET
>>>   af_vsock: rest of SEQPACKET support
>>>   af_vsock: update comments for stream sockets
>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>   virtio/vsock: fetch length for SEQPACKET record
>>>   virtio/vsock: add SEQPACKET receive logic
>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>   vsock_test: add SOCK_SEQPACKET tests
>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>   virtio/vsock: simplify credit update function API
>>>
>>>  drivers/vhost/vsock.c                   |   8 +-
>>>  include/linux/virtio_vsock.h            |  15 +
>>>  include/net/af_vsock.h                  |   9 +
>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>  tools/testing/vsock/util.c              |  32 +-
>>>  tools/testing/vsock/util.h              |   3 +
>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>
>>>  TODO:
>>>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>>    implementation RST is replied in the same way when listening port
>>>    is not found. I think that current RST is enough,because case when
>>>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>>>    no listener in both cases).

I think is fine.

>>    - virtio spec patch
>Ok

Yes, please prepare a patch to discuss the VIRTIO spec changes.

For example for 'virtio_vsock_seq_hdr', I left a comment about 'msg_cnt' 
naming that should be better to discuss with virtio guys.

Anyway, I reviewed this series and I left some comments.
I think we are in a good shape :-)

Thanks,
Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop
       [not found] ` <20210207151451.804498-1-arseny.krasnov@kaspersky.com>
  2021-02-11 11:24   ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Stefano Garzarella
@ 2021-02-11 15:11   ` Jorgen Hansen
  1 sibling, 0 replies; 18+ messages in thread
From: Jorgen Hansen @ 2021-02-11 15:11 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm@vger.kernel.org, Michael S. Tsirkin,
	netdev@vger.kernel.org, stsp2@yandex.ru,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, oxffffaa@gmail.com,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, Alexander Popov,
	David S. Miller


> On 7 Feb 2021, at 16:14, Arseny Krasnov <arseny.krasnov@kaspersky.com> wrote:
> 
> This moves wait loop for data to dedicated function, because later
> it will be used by SEQPACKET data receive loop.
> 
> Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
> ---
> net/vmw_vsock/af_vsock.c | 158 +++++++++++++++++++++------------------
> 1 file changed, 86 insertions(+), 72 deletions(-)
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index f4fabec50650..38927695786f 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> 	return err;
> }
> 
> +static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
> +			   long timeout,
> +			   struct vsock_transport_recv_notify_data *recv_data,
> +			   size_t target)
> +{
> +	const struct vsock_transport *transport;
> +	struct vsock_sock *vsk;
> +	s64 data;
> +	int err;
> +
> +	vsk = vsock_sk(sk);
> +	err = 0;
> +	transport = vsk->transport;
> +	prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
> +
> +	while ((data = vsock_stream_has_data(vsk)) == 0) {
> +		if (sk->sk_err != 0 ||
> +		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> +		    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> +			goto out;
> +		}
> +
> +		/* Don't wait for non-blocking sockets. */
> +		if (timeout == 0) {
> +			err = -EAGAIN;
> +			goto out;
> +		}
> +
> +		if (recv_data) {
> +			err = transport->notify_recv_pre_block(vsk, target, recv_data);
> +			if (err < 0)
> +				goto out;
> +		}
> +
> +		release_sock(sk);
> +		timeout = schedule_timeout(timeout);
> +		lock_sock(sk);
> +
> +		if (signal_pending(current)) {
> +			err = sock_intr_errno(timeout);
> +			goto out;
> +		} else if (timeout == 0) {
> +			err = -EAGAIN;
> +			goto out;
> +		}
> +	}
> +
> +	finish_wait(sk_sleep(sk), wait);
> +
> +	/* Invalid queue pair content. XXX This should
> +	 * be changed to a connection reset in a later
> +	 * change.
> +	 */

Since you are here, could you update this comment to something like:

/* Internal transport error when checking for available
 * data. XXX This should be changed to a connection
 * reset in a later change.
 */

> +	if (data < 0)
> +		return -ENOMEM;
> +
> +	/* Have some data, return. */
> +	if (data)
> +		return data;
> +
> +out:
> +	finish_wait(sk_sleep(sk), wait);
> +	return err;
> +}

I agree with Stefanos suggestion to get rid of the out: part  and just have the single finish_wait().

> +
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 			  int flags)
> @@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> 
> 
> 	while (1) {
> -		s64 ready;
> +		ssize_t read;
> 
> -		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
> -		ready = vsock_stream_has_data(vsk);
> -
> -		if (ready == 0) {
> -			if (sk->sk_err != 0 ||
> -			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> -			    (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -			/* Don't wait for non-blocking sockets. */
> -			if (timeout == 0) {
> -				err = -EAGAIN;
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -
> -			err = transport->notify_recv_pre_block(
> -					vsk, target, &recv_data);
> -			if (err < 0) {
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -			release_sock(sk);
> -			timeout = schedule_timeout(timeout);
> -			lock_sock(sk);
> -
> -			if (signal_pending(current)) {
> -				err = sock_intr_errno(timeout);
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			} else if (timeout == 0) {
> -				err = -EAGAIN;
> -				finish_wait(sk_sleep(sk), &wait);
> -				break;
> -			}
> -		} else {
> -			ssize_t read;
> +		err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
> +		if (err <= 0)
> +			break;

There is a small change in the behaviour here if vsock_stream_has_data(vsk)
returned something < 0. Since you just do a break, the err value can be updated
if there is an sk->sk_err, a receive shutdown has been performed or data has
already been copied. That should be ok, though.

> -			finish_wait(sk_sleep(sk), &wait);
> -
> -			if (ready < 0) {
> -				/* Invalid queue pair content. XXX This should
> -				* be changed to a connection reset in a later
> -				* change.
> -				*/
> -
> -				err = -ENOMEM;
> -				goto out;
> -			}
> -
> -			err = transport->notify_recv_pre_dequeue(
> -					vsk, target, &recv_data);
> -			if (err < 0)
> -				break;
> +		err = transport->notify_recv_pre_dequeue(vsk, target,
> +							 &recv_data);
> +		if (err < 0)
> +			break;
> 
> -			read = transport->stream_dequeue(
> -					vsk, msg,
> -					len - copied, flags);
> -			if (read < 0) {
> -				err = -ENOMEM;
> -				break;
> -			}
> +		read = transport->stream_dequeue(vsk, msg, len - copied, flags);
> +		if (read < 0) {
> +			err = -ENOMEM;
> +			break;
> +		}
> 
> -			copied += read;
> +		copied += read;
> 
> -			err = transport->notify_recv_post_dequeue(
> -					vsk, target, read,
> -					!(flags & MSG_PEEK), &recv_data);
> -			if (err < 0)
> -				goto out;
> +		err = transport->notify_recv_post_dequeue(vsk, target, read,
> +						!(flags & MSG_PEEK), &recv_data);
> +		if (err < 0)
> +			goto out;
> 
> -			if (read >= target || flags & MSG_PEEK)
> -				break;
> +		if (read >= target || flags & MSG_PEEK)
> +			break;
> 
> -			target -= read;
> -		}
> +		target -= read;
> 	}
> 
> 	if (sk->sk_err)
> -- 
> 2.25.1
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
       [not found]       ` <10aa4548-2455-295d-c993-30f25fba15f2@kaspersky.com>
@ 2021-02-12  8:07         ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2021-02-12  8:07 UTC (permalink / raw)
  To: Arseny Krasnov
  Cc: Andra Paraschiv, kvm@vger.kernel.org, Michael S. Tsirkin,
	netdev@vger.kernel.org, stsp2@yandex.ru,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, oxffffaa@gmail.com,
	Stefan Hajnoczi, Colin Ian King, Jakub Kicinski, David S. Miller,
	Jorgen Hansen, Alexander Popov

On Fri, Feb 12, 2021 at 09:11:50AM +0300, Arseny Krasnov wrote:
>
>On 11.02.2021 17:57, Stefano Garzarella wrote:
>> Hi Arseny,
>>
>> On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:
>>> On 07.02.2021 19:20, Michael S. Tsirkin wrote:
>>>> On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:
>>>>> 	This patchset impelements support of SOCK_SEQPACKET for virtio
>>>>> transport.
>>>>> 	As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>>>> do it, two new packet operations were added: first for start of record
>>>>>  and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>>>> both operations carries metadata - to maintain boundaries and payload
>>>>> integrity. Metadata is introduced by adding special header with two
>>>>> fields - message count and message length:
>>>>>
>>>>> 	struct virtio_vsock_seq_hdr {
>>>>> 		__le32  msg_cnt;
>>>>> 		__le32  msg_len;
>>>>> 	} __attribute__((packed));
>>>>>
>>>>> 	This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>>>> header to avoid touching first virtio buffer which carries header of
>>>>> packet, because someone could check that size of this buffer is equal
>>>>> to size of packet header. To send record, packet with start marker is
>>>>> sent first(it's header contains length of record and counter), then
>>>>> counter is incremented and all data is sent as usual 'RW' packets and
>>>>> finally SEQ_END is sent(it also carries counter of message, which is
>>>>> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is
>>>>> incremented again. On receiver's side, length of record is known from
>>>>> packet with start record marker. To check that no packets were dropped
>>>>> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are
>>>>> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by
>>>>> 1) and length of data between two markers is compared to length in
>>>>> SEQ_BEGIN header.
>>>>> 	Now as  packets of one socket are not reordered neither on
>>>>> vsock nor on vhost transport layers, such markers allows to restore
>>>>> original record on receiver's side. If user's buffer is smaller that
>>>>> record length, when all out of size data is dropped.
>>>>> 	Maximum length of datagram is not limited as in stream socket,
>>>>> because same credit logic is used. Difference with stream socket is
>>>>> that user is not woken up until whole record is received or error
>>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>> 	Tests also implemented.
>>>>>
>>>>>  Arseny Krasnov (17):
>>>>>   af_vsock: update functions for connectible socket
>>>>>   af_vsock: separate wait data loop
>>>>>   af_vsock: separate receive data loop
>>>>>   af_vsock: implement SEQPACKET receive loop
>>>>>   af_vsock: separate wait space loop
>>>>>   af_vsock: implement send logic for SEQPACKET
>>>>>   af_vsock: rest of SEQPACKET support
>>>>>   af_vsock: update comments for stream sockets
>>>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>>>   virtio/vsock: fetch length for SEQPACKET record
>>>>>   virtio/vsock: add SEQPACKET receive logic
>>>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>>>   vsock_test: add SOCK_SEQPACKET tests
>>>>>   loopback/vsock: setup SEQPACKET ops for transport
>>>>>   virtio/vsock: simplify credit update function API
>>>>>
>>>>>  drivers/vhost/vsock.c                   |   8 +-
>>>>>  include/linux/virtio_vsock.h            |  15 +
>>>>>  include/net/af_vsock.h                  |   9 +
>>>>>  include/uapi/linux/virtio_vsock.h       |  16 +
>>>>>  net/vmw_vsock/af_vsock.c                | 588 +++++++++++++++-------
>>>>>  net/vmw_vsock/virtio_transport.c        |   5 +
>>>>>  net/vmw_vsock/virtio_transport_common.c | 316 ++++++++++--
>>>>>  net/vmw_vsock/vsock_loopback.c          |   5 +
>>>>>  tools/testing/vsock/util.c              |  32 +-
>>>>>  tools/testing/vsock/util.h              |   3 +
>>>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>>>  11 files changed, 895 insertions(+), 228 deletions(-)
>>>>>
>>>>>  TODO:
>>>>>  - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>>>>    implementation RST is replied in the same way when listening port
>>>>>    is not found. I think that current RST is enough,because case when
>>>>>    server doesn't support SEQ_PACKET is same when listener missed(e.g.
>>>>>    no listener in both cases).
>> I think is fine.
>>
>>>>    - virtio spec patch
>>> Ok
>> Yes, please prepare a patch to discuss the VIRTIO spec changes.
>>
>> For example for 'virtio_vsock_seq_hdr', I left a comment about 'msg_cnt'
>> naming that should be better to discuss with virtio guys.
>
>Ok, i'll prepare it in v5. So I have to send it both LKML(as one of patches) and
>
>virtio mailing lists? (e.g. virtio-comment@lists.oasis-open.org)

I think you can send the VIRTIO spec patch separately from this series 
to virtio-comment, maybe CCing virtualization@lists.linux-foundation.org

But Michael could correct me :-)

>
>>
>> Anyway, I reviewed this series and I left some comments.
>> I think we are in a good shape :-)
>Great, thanks for review. I'll consider all review comments in next 
>version.

Great!

Stefano

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-02-12  8:08 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20210207151259.803917-1-arseny.krasnov@kaspersky.com>
2021-02-07 16:20 ` [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support Michael S. Tsirkin
     [not found]   ` <8bd3789c-8df1-4383-f233-b4b854b30970@kaspersky.com>
2021-02-11 14:57     ` Stefano Garzarella
     [not found]       ` <10aa4548-2455-295d-c993-30f25fba15f2@kaspersky.com>
2021-02-12  8:07         ` Stefano Garzarella
     [not found] ` <20210207151426.804348-1-arseny.krasnov@kaspersky.com>
2021-02-11 10:52   ` [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket Stefano Garzarella
     [not found] ` <20210207151451.804498-1-arseny.krasnov@kaspersky.com>
2021-02-11 11:24   ` [RFC PATCH v4 02/17] af_vsock: separate wait data loop Stefano Garzarella
2021-02-11 15:11   ` Jorgen Hansen
     [not found] ` <20210207151508.804615-1-arseny.krasnov@kaspersky.com>
2021-02-11 11:37   ` [RFC PATCH v4 03/17] af_vsock: separate receive " Stefano Garzarella
     [not found] ` <20210207151526.804741-1-arseny.krasnov@kaspersky.com>
2021-02-11 11:47   ` [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop Stefano Garzarella
     [not found] ` <20210207151545.804889-1-arseny.krasnov@kaspersky.com>
2021-02-11 12:14   ` [RFC PATCH v4 05/17] af_vsock: separate wait space loop Stefano Garzarella
     [not found] ` <20210207151600.804998-1-arseny.krasnov@kaspersky.com>
2021-02-11 12:17   ` [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET Stefano Garzarella
     [not found] ` <20210207151615.805115-1-arseny.krasnov@kaspersky.com>
2021-02-11 12:27   ` [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support Stefano Garzarella
     [not found] ` <20210207151632.805240-1-arseny.krasnov@kaspersky.com>
2021-02-11 13:19   ` [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets Stefano Garzarella
     [not found] ` <20210207151649.805359-1-arseny.krasnov@kaspersky.com>
2021-02-11 13:54   ` [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET Stefano Garzarella
2021-02-11 14:03     ` Stefano Garzarella
     [not found] ` <20210207151711.805503-1-arseny.krasnov@kaspersky.com>
2021-02-11 13:58   ` [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record Stefano Garzarella
     [not found] ` <20210207151747.805754-1-arseny.krasnov@kaspersky.com>
2021-02-11 14:29   ` [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support Stefano Garzarella
     [not found] ` <20210207151851.806233-1-arseny.krasnov@kaspersky.com>
2021-02-11 14:31   ` [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport Stefano Garzarella
     [not found] ` <20210207151906.806343-1-arseny.krasnov@kaspersky.com>
2021-02-11 14:39   ` [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API Stefano Garzarella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).