linux-kernel.vger.kernel.org archive mirror
* [PATCH net-next v3 0/3] vsock: Introduce SIOCINQ ioctl support
@ 2025-06-17  4:53 Xuewei Niu
  2025-06-17  4:53 ` [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl Xuewei Niu
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17  4:53 UTC (permalink / raw)
  To: sgarzare, mst, pabeni, jasowang, xuanzhuo, davem, netdev,
	stefanha, leonardi
  Cc: virtualization, kvm, linux-kernel, fupan.lfp, Xuewei Niu

Introduce SIOCINQ ioctl support for vsock, reporting the number of unread
bytes.

Similar to the SIOCOUTQ ioctl, the information is transport-dependent.

The first patch adds SIOCINQ ioctl support in AF_VSOCK.

The second patch wraps the ioctl into `ioctl_int()`, which implements a
retry mechanism to prevent immediate failure.

The last one adds two test cases to check the functionality. The changes
have been tested, and the results are as expected.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>

--

v1->v2:
https://lore.kernel.org/lkml/20250519070649.3063874-1-niuxuewei.nxw@antgroup.com/
- Use net-next tree.
- Reuse `rx_bytes` to count unread bytes.
- Wrap ioctl syscall with an int pointer argument to implement a retry
  mechanism.

v2->v3:
https://lore.kernel.org/netdev/20250613031152.1076725-1-niuxuewei.nxw@antgroup.com/
- Update commit messages following the guidelines
- Remove `unread_bytes` callback and reuse `vsock_stream_has_data()`
- Move the tests to the end of array
- Split the refactoring patch
- Include <sys/ioctl.h> in the util.c

Xuewei Niu (3):
  vsock: Add support for SIOCINQ ioctl
  test/vsock: Add retry mechanism to ioctl wrapper
  test/vsock: Add ioctl SIOCINQ tests

 net/vmw_vsock/af_vsock.c         | 22 +++++++++
 tools/testing/vsock/util.c       | 37 ++++++++++----
 tools/testing/vsock/util.h       |  1 +
 tools/testing/vsock/vsock_test.c | 82 ++++++++++++++++++++++++++++++++
 4 files changed, 133 insertions(+), 9 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-17  4:53 [PATCH net-next v3 0/3] vsock: Introduce SIOCINQ ioctl support Xuewei Niu
@ 2025-06-17  4:53 ` Xuewei Niu
  2025-06-17 14:39   ` Stefano Garzarella
  2025-06-17  4:53 ` [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper Xuewei Niu
  2025-06-17  4:53 ` [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests Xuewei Niu
  2 siblings, 1 reply; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17  4:53 UTC (permalink / raw)
  To: sgarzare, mst, pabeni, jasowang, xuanzhuo, davem, netdev,
	stefanha, leonardi
  Cc: virtualization, kvm, linux-kernel, fupan.lfp, Xuewei Niu

Add support for the SIOCINQ ioctl, which reports the number of unread bytes
in the socket. The value is obtained from `vsock_stream_has_data()`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
---
 net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2e7a3034e965..bae6b89bb5fb 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1389,6 +1389,28 @@ static int vsock_do_ioctl(struct socket *sock, unsigned int cmd,
 	vsk = vsock_sk(sk);
 
 	switch (cmd) {
+	case SIOCINQ: {
+		ssize_t n_bytes;
+
+		if (!vsk->transport) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		if (sock_type_connectible(sk->sk_type) &&
+		    sk->sk_state == TCP_LISTEN) {
+			ret = -EINVAL;
+			break;
+		}
+
+		n_bytes = vsock_stream_has_data(vsk);
+		if (n_bytes < 0) {
+			ret = n_bytes;
+			break;
+		}
+		ret = put_user(n_bytes, arg);
+		break;
+	}
 	case SIOCOUTQ: {
 		ssize_t n_bytes;
 
-- 
2.34.1



* [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper
  2025-06-17  4:53 [PATCH net-next v3 0/3] vsock: Introduce SIOCINQ ioctl support Xuewei Niu
  2025-06-17  4:53 ` [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl Xuewei Niu
@ 2025-06-17  4:53 ` Xuewei Niu
  2025-06-17 15:10   ` Stefano Garzarella
  2025-06-17  4:53 ` [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests Xuewei Niu
  2 siblings, 1 reply; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17  4:53 UTC (permalink / raw)
  To: sgarzare, mst, pabeni, jasowang, xuanzhuo, davem, netdev,
	stefanha, leonardi
  Cc: virtualization, kvm, linux-kernel, fupan.lfp, Xuewei Niu

Wrap the ioctl in `ioctl_int()`, which takes a pointer to the actual
int value and an expected int value. The function will not return until
either the ioctl returns the expected value or a timeout occurs, thus
avoiding immediate failure.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
---
 tools/testing/vsock/util.c | 37 ++++++++++++++++++++++++++++---------
 tools/testing/vsock/util.h |  1 +
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 0c7e9cbcbc85..ecfbe52efca2 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -16,6 +16,7 @@
 #include <unistd.h>
 #include <assert.h>
 #include <sys/epoll.h>
+#include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <linux/sockios.h>
 
@@ -97,28 +98,46 @@ void vsock_wait_remote_close(int fd)
 	close(epollfd);
 }
 
-/* Wait until transport reports no data left to be sent.
- * Return false if transport does not implement the unsent_bytes() callback.
+/* Wait until ioctl gives an expected int value.
+ * Return a negative value if the op is not supported.
  */
-bool vsock_wait_sent(int fd)
+int ioctl_int(int fd, unsigned long op, int *actual, int expected)
 {
-	int ret, sock_bytes_unsent;
+	int ret;
+	char name[32];
+
+	if (!actual) {
+		fprintf(stderr, "%s requires a non-null pointer\n", __func__);
+		exit(EXIT_FAILURE);
+	}
+
+	snprintf(name, sizeof(name), "ioctl(%lu)", op);
 
 	timeout_begin(TIMEOUT);
 	do {
-		ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent);
+		ret = ioctl(fd, op, actual);
 		if (ret < 0) {
 			if (errno == EOPNOTSUPP)
 				break;
 
-			perror("ioctl(SIOCOUTQ)");
+			perror(name);
 			exit(EXIT_FAILURE);
 		}
-		timeout_check("SIOCOUTQ");
-	} while (sock_bytes_unsent != 0);
+		timeout_check(name);
+	} while (*actual != expected);
 	timeout_end();
 
-	return !ret;
+	return ret;
+}
+
+/* Wait until transport reports no data left to be sent.
+ * Return false if transport does not implement the unsent_bytes() callback.
+ */
+bool vsock_wait_sent(int fd)
+{
+	int sock_bytes_unsent;
+
+	return !(ioctl_int(fd, SIOCOUTQ, &sock_bytes_unsent, 0));
 }
 
 /* Create socket <type>, bind to <cid, port> and return the file descriptor. */
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index 5e2db67072d5..f3fe725cdeab 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -54,6 +54,7 @@ int vsock_stream_listen(unsigned int cid, unsigned int port);
 int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
 			   struct sockaddr_vm *clientaddrp);
 void vsock_wait_remote_close(int fd);
+int ioctl_int(int fd, unsigned long op, int *actual, int expected);
 bool vsock_wait_sent(int fd);
 void send_buf(int fd, const void *buf, size_t len, int flags,
 	      ssize_t expected_ret);
-- 
2.34.1



* [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests
  2025-06-17  4:53 [PATCH net-next v3 0/3] vsock: Introduce SIOCINQ ioctl support Xuewei Niu
  2025-06-17  4:53 ` [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl Xuewei Niu
  2025-06-17  4:53 ` [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper Xuewei Niu
@ 2025-06-17  4:53 ` Xuewei Niu
  2025-06-17 15:21   ` Stefano Garzarella
  2 siblings, 1 reply; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17  4:53 UTC (permalink / raw)
  To: sgarzare, mst, pabeni, jasowang, xuanzhuo, davem, netdev,
	stefanha, leonardi
  Cc: virtualization, kvm, linux-kernel, fupan.lfp, Xuewei Niu

Add SIOCINQ ioctl tests for both SOCK_STREAM and SOCK_SEQPACKET.

The client waits for the server to send data, and checks if the SIOCINQ
ioctl value matches the data size. After consuming the data, the client
checks if the SIOCINQ value is 0.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
---
 tools/testing/vsock/vsock_test.c | 82 ++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index f669baaa0dca..66bb9fde7eca 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1305,6 +1305,58 @@ static void test_unsent_bytes_client(const struct test_opts *opts, int type)
 	close(fd);
 }
 
+static void test_unread_bytes_server(const struct test_opts *opts, int type)
+{
+	unsigned char buf[MSG_BUF_IOCTL_LEN];
+	int client_fd;
+
+	client_fd = vsock_accept(VMADDR_CID_ANY, opts->peer_port, NULL, type);
+	if (client_fd < 0) {
+		perror("accept");
+		exit(EXIT_FAILURE);
+	}
+
+	for (int i = 0; i < sizeof(buf); i++)
+		buf[i] = rand() & 0xFF;
+
+	send_buf(client_fd, buf, sizeof(buf), 0, sizeof(buf));
+	control_writeln("SENT");
+	control_expectln("RECEIVED");
+
+	close(client_fd);
+}
+
+static void test_unread_bytes_client(const struct test_opts *opts, int type)
+{
+	unsigned char buf[MSG_BUF_IOCTL_LEN];
+	int ret, fd;
+	int sock_bytes_unread;
+
+	fd = vsock_connect(opts->peer_cid, opts->peer_port, type);
+	if (fd < 0) {
+		perror("connect");
+		exit(EXIT_FAILURE);
+	}
+
+	control_expectln("SENT");
+	/* The data has arrived but has not been read. The expected is
+	 * MSG_BUF_IOCTL_LEN.
+	 */
+	ret = ioctl_int(fd, TIOCINQ, &sock_bytes_unread, MSG_BUF_IOCTL_LEN);
+	if (ret) {
+		fprintf(stderr, "Test skipped, TIOCINQ not supported.\n");
+		goto out;
+	}
+
+	recv_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
+	// All date has been consumed, so the expected is 0.
+	ioctl_int(fd, TIOCINQ, &sock_bytes_unread, 0);
+	control_writeln("RECEIVED");
+
+out:
+	close(fd);
+}
+
 static void test_stream_unsent_bytes_client(const struct test_opts *opts)
 {
 	test_unsent_bytes_client(opts, SOCK_STREAM);
@@ -1325,6 +1377,26 @@ static void test_seqpacket_unsent_bytes_server(const struct test_opts *opts)
 	test_unsent_bytes_server(opts, SOCK_SEQPACKET);
 }
 
+static void test_stream_unread_bytes_client(const struct test_opts *opts)
+{
+	test_unread_bytes_client(opts, SOCK_STREAM);
+}
+
+static void test_stream_unread_bytes_server(const struct test_opts *opts)
+{
+	test_unread_bytes_server(opts, SOCK_STREAM);
+}
+
+static void test_seqpacket_unread_bytes_client(const struct test_opts *opts)
+{
+	test_unread_bytes_client(opts, SOCK_SEQPACKET);
+}
+
+static void test_seqpacket_unread_bytes_server(const struct test_opts *opts)
+{
+	test_unread_bytes_server(opts, SOCK_SEQPACKET);
+}
+
 #define RCVLOWAT_CREDIT_UPD_BUF_SIZE	(1024 * 128)
 /* This define is the same as in 'include/linux/virtio_vsock.h':
  * it is used to decide when to send credit update message during
@@ -2051,6 +2123,16 @@ static struct test_case test_cases[] = {
 		.run_client = test_stream_nolinger_client,
 		.run_server = test_stream_nolinger_server,
 	},
+	{
+		.name = "SOCK_STREAM ioctl(SIOCINQ) functionality",
+		.run_client = test_stream_unread_bytes_client,
+		.run_server = test_stream_unread_bytes_server,
+	},
+	{
+		.name = "SOCK_SEQPACKET ioctl(SIOCINQ) functionality",
+		.run_client = test_seqpacket_unread_bytes_client,
+		.run_server = test_seqpacket_unread_bytes_server,
+	},
 	{},
 };
 
-- 
2.34.1



* Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-17  4:53 ` [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl Xuewei Niu
@ 2025-06-17 14:39   ` Stefano Garzarella
  2025-06-22 13:59     ` Xuewei Niu
  2025-06-25  8:03     ` [EXTERNAL] " Dexuan Cui
  0 siblings, 2 replies; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-17 14:39 UTC (permalink / raw)
  To: Xuewei Niu, K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	linux-hyperv
  Cc: mst, pabeni, jasowang, xuanzhuo, davem, netdev, stefanha,
	leonardi, virtualization, kvm, linux-kernel, fupan.lfp,
	Xuewei Niu

CCing hyper-v maintainers and list, since I have a question about the hyperv 
transport.

On Tue, Jun 17, 2025 at 12:53:44PM +0800, Xuewei Niu wrote:
>Add support for the SIOCINQ ioctl, which reports the number of unread bytes
>in the socket. The value is obtained from `vsock_stream_has_data()`.
>
>Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
>---
> net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 2e7a3034e965..bae6b89bb5fb 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1389,6 +1389,28 @@ static int vsock_do_ioctl(struct socket *sock, unsigned int cmd,
> 	vsk = vsock_sk(sk);
>
> 	switch (cmd) {
>+	case SIOCINQ: {
>+		ssize_t n_bytes;
>+
>+		if (!vsk->transport) {
>+			ret = -EOPNOTSUPP;
>+			break;
>+		}
>+
>+		if (sock_type_connectible(sk->sk_type) &&
>+		    sk->sk_state == TCP_LISTEN) {
>+			ret = -EINVAL;
>+			break;
>+		}
>+
>+		n_bytes = vsock_stream_has_data(vsk);

This looks better to me now. I just checked the transports: vmci and 
virtio/vhost return what we want, but for hyperv we have:

	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
	{
		struct hvsock *hvs = vsk->trans;
		s64 ret;

		if (hvs->recv_data_len > 0)
			return 1;

@Hyper-v maintainers: do you know why we don't return `recv_data_len`?
Do you think we can do that to support this new feature?

Thanks,
Stefano

>+		if (n_bytes < 0) {
>+			ret = n_bytes;
>+			break;
>+		}
>+		ret = put_user(n_bytes, arg);
>+		break;
>+	}
> 	case SIOCOUTQ: {
> 		ssize_t n_bytes;
>
>-- 
>2.34.1
>



* Re: [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper
  2025-06-17  4:53 ` [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper Xuewei Niu
@ 2025-06-17 15:10   ` Stefano Garzarella
  2025-06-17 15:23     ` Xuewei Niu
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-17 15:10 UTC (permalink / raw)
  To: Xuewei Niu
  Cc: mst, pabeni, jasowang, xuanzhuo, davem, netdev, stefanha,
	leonardi, virtualization, kvm, linux-kernel, fupan.lfp,
	Xuewei Niu

On Tue, Jun 17, 2025 at 12:53:45PM +0800, Xuewei Niu wrote:
>Wrap the ioctl in `ioctl_int()`, which takes a pointer to the actual
>int value and an expected int value. The function will not return until
>either the ioctl returns the expected value or a timeout occurs, thus
>avoiding immediate failure.
>
>Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
>---
> tools/testing/vsock/util.c | 37 ++++++++++++++++++++++++++++---------
> tools/testing/vsock/util.h |  1 +
> 2 files changed, 29 insertions(+), 9 deletions(-)
>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 0c7e9cbcbc85..ecfbe52efca2 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -16,6 +16,7 @@
> #include <unistd.h>
> #include <assert.h>
> #include <sys/epoll.h>
>+#include <sys/ioctl.h>
> #include <sys/mman.h>
> #include <linux/sockios.h>
>
>@@ -97,28 +98,46 @@ void vsock_wait_remote_close(int fd)
> 	close(epollfd);
> }
>
>-/* Wait until transport reports no data left to be sent.
>- * Return false if transport does not implement the unsent_bytes() callback.
>+/* Wait until ioctl gives an expected int value.
>+ * Return a negative value if the op is not supported.
>  */
>-bool vsock_wait_sent(int fd)
>+int ioctl_int(int fd, unsigned long op, int *actual, int expected)
> {
>-	int ret, sock_bytes_unsent;
>+	int ret;
>+	char name[32];
>+
>+	if (!actual) {
>+		fprintf(stderr, "%s requires a non-null pointer\n", __func__);
>+		exit(EXIT_FAILURE);
>+	}

I think we can skip this kind of validation in a test; it will crash 
anyway, and we don't have it in other places.

>+
>+	snprintf(name, sizeof(name), "ioctl(%lu)", op);
>
> 	timeout_begin(TIMEOUT);
> 	do {
>-		ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent);
>+		ret = ioctl(fd, op, actual);
> 		if (ret < 0) {
> 			if (errno == EOPNOTSUPP)
> 				break;
>
>-			perror("ioctl(SIOCOUTQ)");
>+			perror(name);
> 			exit(EXIT_FAILURE);
> 		}
>-		timeout_check("SIOCOUTQ");
>-	} while (sock_bytes_unsent != 0);
>+		timeout_check(name);
>+	} while (*actual != expected);
> 	timeout_end();
>
>-	return !ret;
>+	return ret;
>+}
>+
>+/* Wait until transport reports no data left to be sent.
>+ * Return false if transport does not implement the unsent_bytes() callback.
>+ */
>+bool vsock_wait_sent(int fd)
>+{
>+	int sock_bytes_unsent;
>+
>+	return !(ioctl_int(fd, SIOCOUTQ, &sock_bytes_unsent, 0));
> }
>
> /* Create socket <type>, bind to <cid, port> and return the file descriptor. */
>diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
>index 5e2db67072d5..f3fe725cdeab 100644
>--- a/tools/testing/vsock/util.h
>+++ b/tools/testing/vsock/util.h
>@@ -54,6 +54,7 @@ int vsock_stream_listen(unsigned int cid, unsigned int port);
> int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
> 			   struct sockaddr_vm *clientaddrp);
> void vsock_wait_remote_close(int fd);
>+int ioctl_int(int fd, unsigned long op, int *actual, int expected);

what about using vsock_* prefix?
nit: if not, please move after the vsock_* functions.

The rest LGTM!

Thanks,
Stefano

> bool vsock_wait_sent(int fd);
> void send_buf(int fd, const void *buf, size_t len, int flags,
> 	      ssize_t expected_ret);
>-- 
>2.34.1
>



* Re: [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests
  2025-06-17  4:53 ` [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests Xuewei Niu
@ 2025-06-17 15:21   ` Stefano Garzarella
  2025-06-17 15:31     ` Xuewei Niu
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-17 15:21 UTC (permalink / raw)
  To: Xuewei Niu
  Cc: mst, pabeni, jasowang, xuanzhuo, davem, netdev, stefanha,
	leonardi, virtualization, kvm, linux-kernel, fupan.lfp,
	Xuewei Niu

On Tue, Jun 17, 2025 at 12:53:46PM +0800, Xuewei Niu wrote:
>Add SIOCINQ ioctl tests for both SOCK_STREAM and SOCK_SEQPACKET.
>
>The client waits for the server to send data, and checks if the SIOCINQ
>ioctl value matches the data size. After consuming the data, the client
>checks if the SIOCINQ value is 0.
>
>Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
>---
> tools/testing/vsock/vsock_test.c | 82 ++++++++++++++++++++++++++++++++
> 1 file changed, 82 insertions(+)
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index f669baaa0dca..66bb9fde7eca 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -1305,6 +1305,58 @@ static void test_unsent_bytes_client(const struct test_opts *opts, int type)
> 	close(fd);
> }
>
>+static void test_unread_bytes_server(const struct test_opts *opts, int type)
>+{
>+	unsigned char buf[MSG_BUF_IOCTL_LEN];
>+	int client_fd;
>+
>+	client_fd = vsock_accept(VMADDR_CID_ANY, opts->peer_port, NULL, type);
>+	if (client_fd < 0) {
>+		perror("accept");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	for (int i = 0; i < sizeof(buf); i++)
>+		buf[i] = rand() & 0xFF;
>+
>+	send_buf(client_fd, buf, sizeof(buf), 0, sizeof(buf));
>+	control_writeln("SENT");
>+	control_expectln("RECEIVED");
>+
>+	close(client_fd);
>+}
>+
>+static void test_unread_bytes_client(const struct test_opts *opts, int type)
>+{
>+	unsigned char buf[MSG_BUF_IOCTL_LEN];
>+	int ret, fd;
>+	int sock_bytes_unread;
>+
>+	fd = vsock_connect(opts->peer_cid, opts->peer_port, type);
>+	if (fd < 0) {
>+		perror("connect");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	control_expectln("SENT");
>+	/* The data has arrived but has not been read. The expected is
>+	 * MSG_BUF_IOCTL_LEN.
>+	 */
>+	ret = ioctl_int(fd, TIOCINQ, &sock_bytes_unread, MSG_BUF_IOCTL_LEN);
>+	if (ret) {

Since we return a value != 0 only on EOPNOTSUPP, I think we can just 
return a bool indicating whether the ioctl is supported, like 
vsock_wait_sent() does.

>+		fprintf(stderr, "Test skipped, TIOCINQ not supported.\n");
>+		goto out;
>+	}
>+
>+	recv_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
>+	// All date has been consumed, so the expected is 0.

s/date/data

Please follow the style of the file (/* */ for comments)

>+	ioctl_int(fd, TIOCINQ, &sock_bytes_unread, 0);
>+	control_writeln("RECEIVED");

Why do we need this control barrier here?

Thanks,
Stefano

>+
>+out:
>+	close(fd);
>+}
>+
> static void test_stream_unsent_bytes_client(const struct test_opts *opts)
> {
> 	test_unsent_bytes_client(opts, SOCK_STREAM);
>@@ -1325,6 +1377,26 @@ static void test_seqpacket_unsent_bytes_server(const struct test_opts *opts)
> 	test_unsent_bytes_server(opts, SOCK_SEQPACKET);
> }
>
>+static void test_stream_unread_bytes_client(const struct test_opts *opts)
>+{
>+	test_unread_bytes_client(opts, SOCK_STREAM);
>+}
>+
>+static void test_stream_unread_bytes_server(const struct test_opts *opts)
>+{
>+	test_unread_bytes_server(opts, SOCK_STREAM);
>+}
>+
>+static void test_seqpacket_unread_bytes_client(const struct test_opts *opts)
>+{
>+	test_unread_bytes_client(opts, SOCK_SEQPACKET);
>+}
>+
>+static void test_seqpacket_unread_bytes_server(const struct test_opts *opts)
>+{
>+	test_unread_bytes_server(opts, SOCK_SEQPACKET);
>+}
>+
> #define RCVLOWAT_CREDIT_UPD_BUF_SIZE	(1024 * 128)
> /* This define is the same as in 'include/linux/virtio_vsock.h':
>  * it is used to decide when to send credit update message during
>@@ -2051,6 +2123,16 @@ static struct test_case test_cases[] = {
> 		.run_client = test_stream_nolinger_client,
> 		.run_server = test_stream_nolinger_server,
> 	},
>+	{
>+		.name = "SOCK_STREAM ioctl(SIOCINQ) functionality",
>+		.run_client = test_stream_unread_bytes_client,
>+		.run_server = test_stream_unread_bytes_server,
>+	},
>+	{
>+		.name = "SOCK_SEQPACKET ioctl(SIOCINQ) functionality",
>+		.run_client = test_seqpacket_unread_bytes_client,
>+		.run_server = test_seqpacket_unread_bytes_server,
>+	},
> 	{},
> };
>
>-- 
>2.34.1
>



* Re: [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper
  2025-06-17 15:10   ` Stefano Garzarella
@ 2025-06-17 15:23     ` Xuewei Niu
  0 siblings, 0 replies; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17 15:23 UTC (permalink / raw)
  To: sgarzare
  Cc: davem, fupan.lfp, jasowang, kvm, leonardi, linux-kernel, mst,
	netdev, niuxuewei.nxw, niuxuewei97, pabeni, stefanha,
	virtualization, xuanzhuo

> On Tue, Jun 17, 2025 at 12:53:45PM +0800, Xuewei Niu wrote:
> >Wrap the ioctl in `ioctl_int()`, which takes a pointer to the actual
> >int value and an expected int value. The function will not return until
> >either the ioctl returns the expected value or a timeout occurs, thus
> >avoiding immediate failure.
> >
> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
> >---
> > tools/testing/vsock/util.c | 37 ++++++++++++++++++++++++++++---------
> > tools/testing/vsock/util.h |  1 +
> > 2 files changed, 29 insertions(+), 9 deletions(-)
> >
> >diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
> >index 0c7e9cbcbc85..ecfbe52efca2 100644
> >--- a/tools/testing/vsock/util.c
> >+++ b/tools/testing/vsock/util.c
> >@@ -16,6 +16,7 @@
> > #include <unistd.h>
> > #include <assert.h>
> > #include <sys/epoll.h>
> >+#include <sys/ioctl.h>
> > #include <sys/mman.h>
> > #include <linux/sockios.h>
> >
> >@@ -97,28 +98,46 @@ void vsock_wait_remote_close(int fd)
> > 	close(epollfd);
> > }
> >
> >-/* Wait until transport reports no data left to be sent.
> >- * Return false if transport does not implement the unsent_bytes() callback.
> >+/* Wait until ioctl gives an expected int value.
> >+ * Return a negative value if the op is not supported.
> >  */
> >-bool vsock_wait_sent(int fd)
> >+int ioctl_int(int fd, unsigned long op, int *actual, int expected)
> > {
> >-	int ret, sock_bytes_unsent;
> >+	int ret;
> >+	char name[32];
> >+
> >+	if (!actual) {
> >+		fprintf(stderr, "%s requires a non-null pointer\n", __func__);
> >+		exit(EXIT_FAILURE);
> >+	}
> 
> I think we can skip this kind of validation in a test; it will crash 
> anyway, and we don't have it in other places.

Will do.

> >+	snprintf(name, sizeof(name), "ioctl(%lu)", op);
> >
> > 	timeout_begin(TIMEOUT);
> > 	do {
> >-		ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent);
> >+		ret = ioctl(fd, op, actual);
> > 		if (ret < 0) {
> > 			if (errno == EOPNOTSUPP)
> > 				break;
> >
> >-			perror("ioctl(SIOCOUTQ)");
> >+			perror(name);
> > 			exit(EXIT_FAILURE);
> > 		}
> >-		timeout_check("SIOCOUTQ");
> >-	} while (sock_bytes_unsent != 0);
> >+		timeout_check(name);
> >+	} while (*actual != expected);
> > 	timeout_end();
> >
> >-	return !ret;
> >+	return ret;
> >+}
> >+
> >+/* Wait until transport reports no data left to be sent.
> >+ * Return false if transport does not implement the unsent_bytes() callback.
> >+ */
> >+bool vsock_wait_sent(int fd)
> >+{
> >+	int sock_bytes_unsent;
> >+
> >+	return !(ioctl_int(fd, SIOCOUTQ, &sock_bytes_unsent, 0));
> > }
> >
> > /* Create socket <type>, bind to <cid, port> and return the file descriptor. */
> >diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
> >index 5e2db67072d5..f3fe725cdeab 100644
> >--- a/tools/testing/vsock/util.h
> >+++ b/tools/testing/vsock/util.h
> >@@ -54,6 +54,7 @@ int vsock_stream_listen(unsigned int cid, unsigned int port);
> > int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
> > 			   struct sockaddr_vm *clientaddrp);
> > void vsock_wait_remote_close(int fd);
> >+int ioctl_int(int fd, unsigned long op, int *actual, int expected);
> 
> what about using vsock_* prefix?
> nit: if not, please move after the vsock_* functions.

My first thought was that `ioctl_int()` doesn't take any arguments related
to vsock (e.g. cid).

I am fine with the prefix, and will add it back.

Thanks,
Xuewei

> The rest LGTM!
> 
> Thanks,
> Stefano
> 
> > bool vsock_wait_sent(int fd);
> > void send_buf(int fd, const void *buf, size_t len, int flags,
> > 	      ssize_t expected_ret);
> >-- 
> >2.34.1
> >


* Re: [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests
  2025-06-17 15:21   ` Stefano Garzarella
@ 2025-06-17 15:31     ` Xuewei Niu
  0 siblings, 0 replies; 18+ messages in thread
From: Xuewei Niu @ 2025-06-17 15:31 UTC (permalink / raw)
  To: sgarzare
  Cc: davem, fupan.lfp, jasowang, kvm, leonardi, linux-kernel, mst,
	netdev, niuxuewei.nxw, niuxuewei97, pabeni, stefanha,
	virtualization, xuanzhuo

> On Tue, Jun 17, 2025 at 12:53:46PM +0800, Xuewei Niu wrote:
> >Add SIOCINQ ioctl tests for both SOCK_STREAM and SOCK_SEQPACKET.
> >
> >The client waits for the server to send data, and checks if the SIOCINQ
> >ioctl value matches the data size. After consuming the data, the client
> >checks if the SIOCINQ value is 0.
> >
> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
> >---
> > tools/testing/vsock/vsock_test.c | 82 ++++++++++++++++++++++++++++++++
> > 1 file changed, 82 insertions(+)
> >
> >diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
> >index f669baaa0dca..66bb9fde7eca 100644
> >--- a/tools/testing/vsock/vsock_test.c
> >+++ b/tools/testing/vsock/vsock_test.c
> >@@ -1305,6 +1305,58 @@ static void test_unsent_bytes_client(const struct test_opts *opts, int type)
> > 	close(fd);
> > }
> >
> >+static void test_unread_bytes_server(const struct test_opts *opts, int type)
> >+{
> >+	unsigned char buf[MSG_BUF_IOCTL_LEN];
> >+	int client_fd;
> >+
> >+	client_fd = vsock_accept(VMADDR_CID_ANY, opts->peer_port, NULL, type);
> >+	if (client_fd < 0) {
> >+		perror("accept");
> >+		exit(EXIT_FAILURE);
> >+	}
> >+
> >+	for (int i = 0; i < sizeof(buf); i++)
> >+		buf[i] = rand() & 0xFF;
> >+
> >+	send_buf(client_fd, buf, sizeof(buf), 0, sizeof(buf));
> >+	control_writeln("SENT");
> >+	control_expectln("RECEIVED");
> >+
> >+	close(client_fd);
> >+}
> >+
> >+static void test_unread_bytes_client(const struct test_opts *opts, int type)
> >+{
> >+	unsigned char buf[MSG_BUF_IOCTL_LEN];
> >+	int ret, fd;
> >+	int sock_bytes_unread;
> >+
> >+	fd = vsock_connect(opts->peer_cid, opts->peer_port, type);
> >+	if (fd < 0) {
> >+		perror("connect");
> >+		exit(EXIT_FAILURE);
> >+	}
> >+
> >+	control_expectln("SENT");
> >+	/* The data has arrived but has not been read. The expected is
> >+	 * MSG_BUF_IOCTL_LEN.
> >+	 */
> >+	ret = ioctl_int(fd, TIOCINQ, &sock_bytes_unread, MSG_BUF_IOCTL_LEN);
> >+	if (ret) {
> 
> Since we return a value != 0 only on EOPNOTSUPP, I think we can just 
> return a bool indicating whether the ioctl is supported, like 
> vsock_wait_sent() does.

Will do.

> >+		fprintf(stderr, "Test skipped, TIOCINQ not supported.\n");
> >+		goto out;
> >+	}
> >+
> >+	recv_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
> >+	// All date has been consumed, so the expected is 0.
> 
> s/date/data
> 
> Please follow the style of the file (/* */ for comments)

Will do.

> >+	ioctl_int(fd, TIOCINQ, &sock_bytes_unread, 0);
> >+	control_writeln("RECEIVED");
> 
> Why do we need this control barrier here?

Nice catch! It is not necessary. Will remove it.

Thanks,
Xuewei

> >+
> >+out:
> >+	close(fd);
> >+}
> >+
> > static void test_stream_unsent_bytes_client(const struct test_opts *opts)
> > {
> > 	test_unsent_bytes_client(opts, SOCK_STREAM);
> >@@ -1325,6 +1377,26 @@ static void test_seqpacket_unsent_bytes_server(const struct test_opts *opts)
> > 	test_unsent_bytes_server(opts, SOCK_SEQPACKET);
> > }
> >
> >+static void test_stream_unread_bytes_client(const struct test_opts *opts)
> >+{
> >+	test_unread_bytes_client(opts, SOCK_STREAM);
> >+}
> >+
> >+static void test_stream_unread_bytes_server(const struct test_opts *opts)
> >+{
> >+	test_unread_bytes_server(opts, SOCK_STREAM);
> >+}
> >+
> >+static void test_seqpacket_unread_bytes_client(const struct test_opts *opts)
> >+{
> >+	test_unread_bytes_client(opts, SOCK_SEQPACKET);
> >+}
> >+
> >+static void test_seqpacket_unread_bytes_server(const struct test_opts *opts)
> >+{
> >+	test_unread_bytes_server(opts, SOCK_SEQPACKET);
> >+}
> >+
> > #define RCVLOWAT_CREDIT_UPD_BUF_SIZE	(1024 * 128)
> > /* This define is the same as in 'include/linux/virtio_vsock.h':
> >  * it is used to decide when to send credit update message during
> >@@ -2051,6 +2123,16 @@ static struct test_case test_cases[] = {
> > 		.run_client = test_stream_nolinger_client,
> > 		.run_server = test_stream_nolinger_server,
> > 	},
> >+	{
> >+		.name = "SOCK_STREAM ioctl(SIOCINQ) functionality",
> >+		.run_client = test_stream_unread_bytes_client,
> >+		.run_server = test_stream_unread_bytes_server,
> >+	},
> >+	{
> >+		.name = "SOCK_SEQPACKET ioctl(SIOCINQ) functionality",
> >+		.run_client = test_seqpacket_unread_bytes_client,
> >+		.run_server = test_seqpacket_unread_bytes_server,
> >+	},
> > 	{},
> > };
> >
> >-- 
> >2.34.1
> >


* Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-17 14:39   ` Stefano Garzarella
@ 2025-06-22 13:59     ` Xuewei Niu
  2025-06-23 17:01       ` Stefano Garzarella
  2025-06-25  8:03     ` [EXTERNAL] " Dexuan Cui
  1 sibling, 1 reply; 18+ messages in thread
From: Xuewei Niu @ 2025-06-22 13:59 UTC (permalink / raw)
  To: sgarzare
  Cc: davem, decui, fupan.lfp, haiyangz, jasowang, kvm, kys, leonardi,
	linux-hyperv, linux-kernel, mst, netdev, niuxuewei.nxw,
	niuxuewei97, pabeni, stefanha, virtualization, wei.liu, xuanzhuo

> CCing hyper-v maintainers and list, since I have a question about the hyperv 
> transport.
> 
> On Tue, Jun 17, 2025 at 12:53:44PM +0800, Xuewei Niu wrote:
> >Add support for the SIOCINQ ioctl, which reports the number of unread bytes
> >in the socket. The value is obtained from `vsock_stream_has_data()`.
> >
> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
> >---
> > net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> >diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> >index 2e7a3034e965..bae6b89bb5fb 100644
> >--- a/net/vmw_vsock/af_vsock.c
> >+++ b/net/vmw_vsock/af_vsock.c
> >@@ -1389,6 +1389,28 @@ static int vsock_do_ioctl(struct socket *sock, unsigned int cmd,
> > 	vsk = vsock_sk(sk);
> >
> > 	switch (cmd) {
> >+	case SIOCINQ: {
> >+		ssize_t n_bytes;
> >+
> >+		if (!vsk->transport) {
> >+			ret = -EOPNOTSUPP;
> >+			break;
> >+		}
> >+
> >+		if (sock_type_connectible(sk->sk_type) &&
> >+		    sk->sk_state == TCP_LISTEN) {
> >+			ret = -EINVAL;
> >+			break;
> >+		}
> >+
> >+		n_bytes = vsock_stream_has_data(vsk);
> 
> Now it looks better to me. I just checked the transports: vmci and virtio/vhost
> return what we want, but for hyperv we have:
> 
> 	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> 	{
> 		struct hvsock *hvs = vsk->trans;
> 		s64 ret;
> 
> 		if (hvs->recv_data_len > 0)
> 			return 1;
> 
> @Hyper-v maintainers: do you know why we don't return `recv_data_len`?
> Do you think we can do that to support this new feature?

Hi Hyper-v maintainers, could you please take a look at this?

Hi Stefano, if no response, can I fix this issue in the next version?

Thanks,
Xuewei
 
> Thanks,
> Stefano
> 
> >+		if (n_bytes < 0) {
> >+			ret = n_bytes;
> >+			break;
> >+		}
> >+		ret = put_user(n_bytes, arg);
> >+		break;
> >+	}
> > 	case SIOCOUTQ: {
> > 		ssize_t n_bytes;
> >
> >-- 
> >2.34.1
> >


* Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-22 13:59     ` Xuewei Niu
@ 2025-06-23 17:01       ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-23 17:01 UTC (permalink / raw)
  To: Xuewei Niu
  Cc: davem, decui, fupan.lfp, haiyangz, jasowang, kvm, kys, leonardi,
	linux-hyperv, linux-kernel, mst, netdev, niuxuewei.nxw, pabeni,
	stefanha, virtualization, wei.liu, xuanzhuo

On Sun, Jun 22, 2025 at 09:59:10PM +0800, Xuewei Niu wrote:
>> CCing hyper-v maintainers and list since I have a question about the hyperv
>> transport.
>>
>> On Tue, Jun 17, 2025 at 12:53:44PM +0800, Xuewei Niu wrote:
>> >Add support for the SIOCINQ ioctl, indicating the number of unread bytes in
>> >the socket. The value is obtained from `vsock_stream_has_data()`.
>> >
>> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
>> >---
>> > net/vmw_vsock/af_vsock.c | 22 ++++++++++++++++++++++
>> > 1 file changed, 22 insertions(+)
>> >
>> >diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> >index 2e7a3034e965..bae6b89bb5fb 100644
>> >--- a/net/vmw_vsock/af_vsock.c
>> >+++ b/net/vmw_vsock/af_vsock.c
>> >@@ -1389,6 +1389,28 @@ static int vsock_do_ioctl(struct socket *sock, unsigned int cmd,
>> > 	vsk = vsock_sk(sk);
>> >
>> > 	switch (cmd) {
>> >+	case SIOCINQ: {
>> >+		ssize_t n_bytes;
>> >+
>> >+		if (!vsk->transport) {
>> >+			ret = -EOPNOTSUPP;
>> >+			break;
>> >+		}
>> >+
>> >+		if (sock_type_connectible(sk->sk_type) &&
>> >+		    sk->sk_state == TCP_LISTEN) {
>> >+			ret = -EINVAL;
>> >+			break;
>> >+		}
>> >+
>> >+		n_bytes = vsock_stream_has_data(vsk);
>>
>> Now it looks better to me. I just checked the transports: vmci and virtio/vhost
>> return what we want, but for hyperv we have:
>>
>> 	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
>> 	{
>> 		struct hvsock *hvs = vsk->trans;
>> 		s64 ret;
>>
>> 		if (hvs->recv_data_len > 0)
>> 			return 1;
>>
>> @Hyper-v maintainers: do you know why we don't return `recv_data_len`?
>> Do you think we can do that to support this new feature?
>
>Hi Hyper-v maintainers, could you please take a look at this?
>
>Hi Stefano, if no response, can I fix this issue in the next version?

Yep, but let's wait a little bit more.

In that case, please do it in a separate patch (same series is fine) 
that we can easily revert/fix if they will find issues later.

Thanks,
Stefano

>
>Thanks,
>Xuewei
>
>> Thanks,
>> Stefano
>>
>> >+		if (n_bytes < 0) {
>> >+			ret = n_bytes;
>> >+			break;
>> >+		}
>> >+		ret = put_user(n_bytes, arg);
>> >+		break;
>> >+	}
>> > 	case SIOCOUTQ: {
>> > 		ssize_t n_bytes;
>> >
>> >--
>> >2.34.1
>> >
>



* RE: [EXTERNAL] Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-17 14:39   ` Stefano Garzarella
  2025-06-22 13:59     ` Xuewei Niu
@ 2025-06-25  8:03     ` Dexuan Cui
  2025-06-25 13:32       ` Stefano Garzarella
  1 sibling, 1 reply; 18+ messages in thread
From: Dexuan Cui @ 2025-06-25  8:03 UTC (permalink / raw)
  To: Stefano Garzarella, Xuewei Niu, KY Srinivasan, Haiyang Zhang,
	Wei Liu, linux-hyperv@vger.kernel.org
  Cc: mst@redhat.com, pabeni@redhat.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, davem@davemloft.net,
	netdev@vger.kernel.org, stefanha@redhat.com, leonardi@redhat.com,
	virtualization@lists.linux.dev, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, fupan.lfp@antgroup.com, Xuewei Niu

> From: Stefano Garzarella <sgarzare@redhat.com>
> Sent: Tuesday, June 17, 2025 7:39 AM
>  ...
> Now it looks better to me. I just checked the transports: vmci and virtio/vhost
> return what we want, but for hyperv we have:
> 
> 	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> 	{
> 		struct hvsock *hvs = vsk->trans;
> 		s64 ret;
> 
> 		if (hvs->recv_data_len > 0)
> 			return 1;
> 
> @Hyper-v maintainers: do you know why we don't return `recv_data_len`?

Sorry for the late response!  This is the complete code of the function:

static s64 hvs_stream_has_data(struct vsock_sock *vsk)
{
        struct hvsock *hvs = vsk->trans;
        s64 ret;

        if (hvs->recv_data_len > 0)
                return 1;

        switch (hvs_channel_readable_payload(hvs->chan)) {
        case 1:
                ret = 1;
                break;
        case 0:
                vsk->peer_shutdown |= SEND_SHUTDOWN;
                ret = 0;
                break;
        default: /* -1 */
                ret = 0;
                break;
        }

        return ret;
}

If (hvs->recv_data_len > 0), I think we can return hvs->recv_data_len here.

If hvs->recv_data_len is 0, and hvs_channel_readable_payload(hvs->chan)
returns 1, we should not return hvs->recv_data_len (which is 0 here), and it's
not easy to find how many bytes of payload in total are available right now:
each host-to-guest "packet" in the VMBus channel ringbuffer has a header
(which is not part of the payload data) and a trailing padding field, and we
would have to iterate on all the "packets" (or at least the next
"packet"?) to find the exact bytes of pending payload. Please see
hvs_stream_dequeue() for details.

Ideally hvs_stream_has_data() should return the exact length of pending
readable payload, but when the hv_sock code was written in 2017, 
vsock_stream_has_data() -> ... -> hvs_stream_has_data() basically only needs
to know whether there is any data or not, i.e. it's kind of a boolean variable, so
hvs_stream_has_data() was written to return 1 or 0 for simplicity. :-)

I can post the patch below (not tested yet) to fix hvs_stream_has_data() by
returning the payload length of the next single "packet".  Does it look good
to you?

--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -694,15 +694,25 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
 static s64 hvs_stream_has_data(struct vsock_sock *vsk)
 {
        struct hvsock *hvs = vsk->trans;
+       bool need_refill = !hvs->recv_desc;
        s64 ret;

        if (hvs->recv_data_len > 0)
-               return 1;
+               return hvs->recv_data_len;

        switch (hvs_channel_readable_payload(hvs->chan)) {
        case 1:
-               ret = 1;
-               break;
+               if (!need_refill)
+                       return -EIO;
+
+               hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
+               if (!hvs->recv_desc)
+                       return -ENOBUFS;
+
+               ret = hvs_update_recv_data(hvs);
+               if (ret)
+                       return ret;
+               return hvs->recv_data_len;
        case 0:
                vsk->peer_shutdown |= SEND_SHUTDOWN;
                ret = 0;

Thanks,
Dexuan



* Re: [EXTERNAL] Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-25  8:03     ` [EXTERNAL] " Dexuan Cui
@ 2025-06-25 13:32       ` Stefano Garzarella
  2025-06-25 16:41         ` Dexuan Cui
  2025-06-26  5:02         ` Xuewei Niu
  0 siblings, 2 replies; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-25 13:32 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Xuewei Niu, KY Srinivasan, Haiyang Zhang, Wei Liu,
	linux-hyperv@vger.kernel.org, mst@redhat.com, pabeni@redhat.com,
	jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	davem@davemloft.net, netdev@vger.kernel.org, stefanha@redhat.com,
	leonardi@redhat.com, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	fupan.lfp@antgroup.com, Xuewei Niu

On Wed, Jun 25, 2025 at 08:03:00AM +0000, Dexuan Cui wrote:
>> From: Stefano Garzarella <sgarzare@redhat.com>
>> Sent: Tuesday, June 17, 2025 7:39 AM
>>  ...
>> Now it looks better to me. I just checked the transports: vmci and virtio/vhost
>> return what we want, but for hyperv we have:
>>
>> 	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
>> 	{
>> 		struct hvsock *hvs = vsk->trans;
>> 		s64 ret;
>>
>> 		if (hvs->recv_data_len > 0)
>> 			return 1;
>>
>> @Hyper-v maintainers: do you know why we don't return `recv_data_len`?
>
>Sorry for the late response!  This is the complete code of the function:
>
>static s64 hvs_stream_has_data(struct vsock_sock *vsk)
>{
>        struct hvsock *hvs = vsk->trans;
>        s64 ret;
>
>        if (hvs->recv_data_len > 0)
>                return 1;
>
>        switch (hvs_channel_readable_payload(hvs->chan)) {
>        case 1:
>                ret = 1;
>                break;
>        case 0:
>                vsk->peer_shutdown |= SEND_SHUTDOWN;
>                ret = 0;
>                break;
>        default: /* -1 */
>                ret = 0;
>                break;
>        }
>
>        return ret;
>}
>
>If (hvs->recv_data_len > 0), I think we can return hvs->recv_data_len here.
>
>If hvs->recv_data_len is 0, and hvs_channel_readable_payload(hvs->chan)
>returns 1, we should not return hvs->recv_data_len (which is 0 here), 
>and it's
>not easy to find how many bytes of payload in total are available right now:
>each host-to-guest "packet" in the VMBus channel ringbuffer has a header
>(which is not part of the payload data) and a trailing padding field, and we
>would have to iterate on all the "packets" (or at least the next
>"packet"?) to find the exact bytes of pending payload. Please see
>hvs_stream_dequeue() for details.
>
>Ideally hvs_stream_has_data() should return the exact length of pending
>readable payload, but when the hv_sock code was written in 2017,
>vsock_stream_has_data() -> ... -> hvs_stream_has_data() basically only needs
>to know whether there is any data or not, i.e. it's kind of a boolean variable, so
>hvs_stream_has_data() was written to return 1 or 0 for simplicity. :-)

Yeah, I see, thanks for the details! :-)

>
>I can post the patch below (not tested yet) to fix hvs_stream_has_data() by
>returning the payload length of the next single "packet".  Does it look good
>to you?

Yep, LGTM! Can be a best effort IMO.

Maybe when you have it tested, post it here as a proper patch, and Xuewei 
can include it in the next version of this series (of course with you as 
author, etc.). This way it will be easy to test/merge, since they are 
related.

@Xuewei @Dexuan Is it okay for you?

Thanks,
Stefano

>
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -694,15 +694,25 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
> static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> {
>        struct hvsock *hvs = vsk->trans;
>+       bool need_refill = !hvs->recv_desc;
>        s64 ret;
>
>        if (hvs->recv_data_len > 0)
>-               return 1;
>+               return hvs->recv_data_len;
>
>        switch (hvs_channel_readable_payload(hvs->chan)) {
>        case 1:
>-               ret = 1;
>-               break;
>+               if (!need_refill)
>+                       return -EIO;
>+
>+               hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
>+               if (!hvs->recv_desc)
>+                       return -ENOBUFS;
>+
>+               ret = hvs_update_recv_data(hvs);
>+               if (ret)
>+                       return ret;
>+               return hvs->recv_data_len;
>        case 0:
>                vsk->peer_shutdown |= SEND_SHUTDOWN;
>                ret = 0;
>
>Thanks,
>Dexuan
>



* RE: [EXTERNAL] Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-25 13:32       ` Stefano Garzarella
@ 2025-06-25 16:41         ` Dexuan Cui
  2025-06-26  5:02         ` Xuewei Niu
  1 sibling, 0 replies; 18+ messages in thread
From: Dexuan Cui @ 2025-06-25 16:41 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Xuewei Niu, KY Srinivasan, Haiyang Zhang, Wei Liu,
	linux-hyperv@vger.kernel.org, mst@redhat.com, pabeni@redhat.com,
	jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	davem@davemloft.net, netdev@vger.kernel.org, stefanha@redhat.com,
	leonardi@redhat.com, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	fupan.lfp@antgroup.com, Xuewei Niu

> From: Stefano Garzarella <sgarzare@redhat.com>
> Sent: Wednesday, June 25, 2025 6:32 AM
> ...
> >I can post the patch below (not tested yet) to fix hvs_stream_has_data() by
> >returning the payload length of the next single "packet".  Does it look good
> >to you?
> 
> Yep, LGTM! Can be a best effort IMO.
Thanks! I'll test and post it later today.
 
> Maybe when you have it tested, post it here as a proper patch, and Xuewei
> can include it in the next version of this series (of course with you as
> author, etc.). This way it will be easy to test/merge, since they are
> related.
> 
> @Xuewei @Dexuan Is it okay for you?
> 
> Thanks,
> Stefano

This is a good idea to me. I'll post a patch later today.

Thanks,
Dexuan


* Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-25 13:32       ` Stefano Garzarella
  2025-06-25 16:41         ` Dexuan Cui
@ 2025-06-26  5:02         ` Xuewei Niu
  2025-06-27  8:50           ` [EXTERNAL] " Dexuan Cui
  1 sibling, 1 reply; 18+ messages in thread
From: Xuewei Niu @ 2025-06-26  5:02 UTC (permalink / raw)
  To: sgarzare
  Cc: davem, decui, fupan.lfp, haiyangz, jasowang, kvm, kys, leonardi,
	linux-hyperv, linux-kernel, mst, netdev, niuxuewei.nxw,
	niuxuewei97, pabeni, stefanha, virtualization, wei.liu, xuanzhuo

> On Wed, Jun 25, 2025 at 08:03:00AM +0000, Dexuan Cui wrote:
> >> From: Stefano Garzarella <sgarzare@redhat.com>
> >> Sent: Tuesday, June 17, 2025 7:39 AM
> >>  ...
> >> Now it looks better to me. I just checked the transports: vmci and virtio/vhost
> >> return what we want, but for hyperv we have:
> >>
> >> 	static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> >> 	{
> >> 		struct hvsock *hvs = vsk->trans;
> >> 		s64 ret;
> >>
> >> 		if (hvs->recv_data_len > 0)
> >> 			return 1;
> >>
> >> @Hyper-v maintainers: do you know why we don't return `recv_data_len`?
> >
> >Sorry for the late response!  This is the complete code of the function:
> >
> >static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> >{
> >        struct hvsock *hvs = vsk->trans;
> >        s64 ret;
> >
> >        if (hvs->recv_data_len > 0)
> >                return 1;
> >
> >        switch (hvs_channel_readable_payload(hvs->chan)) {
> >        case 1:
> >                ret = 1;
> >                break;
> >        case 0:
> >                vsk->peer_shutdown |= SEND_SHUTDOWN;
> >                ret = 0;
> >                break;
> >        default: /* -1 */
> >                ret = 0;
> >                break;
> >        }
> >
> >        return ret;
> >}
> >
> >If (hvs->recv_data_len > 0), I think we can return hvs->recv_data_len here.
> >
> >If hvs->recv_data_len is 0, and hvs_channel_readable_payload(hvs->chan)
> >returns 1, we should not return hvs->recv_data_len (which is 0 here), 
> >and it's
> >not easy to find how many bytes of payload in total are available right now:
> >each host-to-guest "packet" in the VMBus channel ringbuffer has a header
> >(which is not part of the payload data) and a trailing padding field, and we
> >would have to iterate on all the "packets" (or at least the next
> >"packet"?) to find the exact bytes of pending payload. Please see
> >hvs_stream_dequeue() for details.
> >
> >Ideally hvs_stream_has_data() should return the exact length of pending
> >readable payload, but when the hv_sock code was written in 2017,
> >vsock_stream_has_data() -> ... -> hvs_stream_has_data() basically only needs
> >to know whether there is any data or not, i.e. it's kind of a boolean variable, so
> >hvs_stream_has_data() was written to return 1 or 0 for simplicity. :-)
> 
> Yeah, I see, thanks for the details! :-)
> 
> >
> >I can post the patch below (not tested yet) to fix hvs_stream_has_data() by
> >returning the payload length of the next single "packet".  Does it look good
> >to you?
> 
> Yep, LGTM! Can be a best effort IMO.
> 
> Maybe when you have it tested, post it here as a proper patch, and Xuewei 
> can include it in the next version of this series (of course with you as 
> author, etc.). This way it will be easy to test/merge, since they are 
> related.
> 
> @Xuewei @Dexuan Is it okay for you?

Yeah, sounds good to me!

Thanks,
Xuewei

> Thanks,
> Stefano
> 
> >
> >--- a/net/vmw_vsock/hyperv_transport.c
> >+++ b/net/vmw_vsock/hyperv_transport.c
> >@@ -694,15 +694,25 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
> > static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> > {
> >        struct hvsock *hvs = vsk->trans;
> >+       bool need_refill = !hvs->recv_desc;
> >        s64 ret;
> >
> >        if (hvs->recv_data_len > 0)
> >-               return 1;
> >+               return hvs->recv_data_len;
> >
> >        switch (hvs_channel_readable_payload(hvs->chan)) {
> >        case 1:
> >-               ret = 1;
> >-               break;
> >+               if (!need_refill)
> >+                       return -EIO;
> >+
> >+               hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
> >+               if (!hvs->recv_desc)
> >+                       return -ENOBUFS;
> >+
> >+               ret = hvs_update_recv_data(hvs);
> >+               if (ret)
> >+                       return ret;
> >+               return hvs->recv_data_len;
> >        case 0:
> >                vsk->peer_shutdown |= SEND_SHUTDOWN;
> >                ret = 0;
> >
> >Thanks,
> >Dexuan


* RE: [EXTERNAL] Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-26  5:02         ` Xuewei Niu
@ 2025-06-27  8:50           ` Dexuan Cui
  2025-06-27 11:01             ` Stefano Garzarella
  2025-06-27 11:42             ` Xuewei Niu
  0 siblings, 2 replies; 18+ messages in thread
From: Dexuan Cui @ 2025-06-27  8:50 UTC (permalink / raw)
  To: Xuewei Niu, sgarzare@redhat.com
  Cc: davem@davemloft.net, fupan.lfp@antgroup.com, Haiyang Zhang,
	jasowang@redhat.com, kvm@vger.kernel.org, KY Srinivasan,
	leonardi@redhat.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com,
	netdev@vger.kernel.org, niuxuewei.nxw@antgroup.com,
	pabeni@redhat.com, stefanha@redhat.com,
	virtualization@lists.linux.dev, wei.liu@kernel.org,
	xuanzhuo@linux.alibaba.com

> From: Xuewei Niu <niuxuewei97@gmail.com>
> Sent: Wednesday, June 25, 2025 10:02 PM
> > ...
> > Maybe when you have it tested, post it here as a proper patch, and Xuewei
> > can include it in the next version of this series (of course with you as
> > author, etc.). This way it will be easy to test/merge, since they are
> > related.
> >
> > @Xuewei @Dexuan Is it okay for you?
> 
> Yeah, sounds good to me!
> 
> Thanks,
> Xuewei

Hi Xuewei, Stefano, I posted the patch here:
https://lore.kernel.org/virtualization/1751013889-4951-1-git-send-email-decui@microsoft.com/T/#u

Xuewei, please help to re-post this patch with the next version of your patchset.
Feel free to add your Signed-off-by, if you need. 

Thanks,
Dexuan


* Re: [EXTERNAL] Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-27  8:50           ` [EXTERNAL] " Dexuan Cui
@ 2025-06-27 11:01             ` Stefano Garzarella
  2025-06-27 11:42             ` Xuewei Niu
  1 sibling, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2025-06-27 11:01 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Xuewei Niu, davem@davemloft.net, fupan.lfp@antgroup.com,
	Haiyang Zhang, jasowang@redhat.com, kvm@vger.kernel.org,
	KY Srinivasan, leonardi@redhat.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com,
	netdev@vger.kernel.org, niuxuewei.nxw@antgroup.com,
	pabeni@redhat.com, stefanha@redhat.com,
	virtualization@lists.linux.dev, wei.liu@kernel.org,
	xuanzhuo@linux.alibaba.com

On Fri, Jun 27, 2025 at 08:50:46AM +0000, Dexuan Cui wrote:
>> From: Xuewei Niu <niuxuewei97@gmail.com>
>> Sent: Wednesday, June 25, 2025 10:02 PM
>> > ...
>> > Maybe when you have it tested, post it here as a proper patch, and Xuewei
>> > can include it in the next version of this series (of course with you as
>> > author, etc.). This way it will be easy to test/merge, since they are
>> > related.
>> >
>> > @Xuewei @Dexuan Is it okay for you?
>>
>> Yeah, sounds good to me!
>>
>> Thanks,
>> Xuewei
>
>Hi Xuewei, Stefano, I posted the patch here:
>https://lore.kernel.org/virtualization/1751013889-4951-1-git-send-email-decui@microsoft.com/T/#u

Great, thanks!

>
>Xuewei, please help to re-post this patch with the next version of your patchset.
>Feel free to add your Signed-off-by, if you need.
>
>Thanks,
>Dexuan
>



* Re: [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl
  2025-06-27  8:50           ` [EXTERNAL] " Dexuan Cui
  2025-06-27 11:01             ` Stefano Garzarella
@ 2025-06-27 11:42             ` Xuewei Niu
  1 sibling, 0 replies; 18+ messages in thread
From: Xuewei Niu @ 2025-06-27 11:42 UTC (permalink / raw)
  To: decui
  Cc: davem, fupan.lfp, haiyangz, jasowang, kvm, kys, leonardi,
	linux-hyperv, linux-kernel, mst, netdev, niuxuewei.nxw,
	niuxuewei97, pabeni, sgarzare, stefanha, virtualization, wei.liu,
	xuanzhuo

> > From: Xuewei Niu <niuxuewei97@gmail.com>
> > Sent: Wednesday, June 25, 2025 10:02 PM
> > > ...
> > > Maybe when you have it tested, post it here as a proper patch, and Xuewei
> > > can include it in the next version of this series (of course with you as
> > > author, etc.). This way it will be easy to test/merge, since they are
> > > related.
> > >
> > > @Xuewei @Dexuan Is it okay for you?
> > 
> > Yeah, sounds good to me!
> > 
> > Thanks,
> > Xuewei
> 
> Hi Xuewei, Stefano, I posted the patch here:
> https://lore.kernel.org/virtualization/1751013889-4951-1-git-send-email-decui@microsoft.com/T/#u
> 
> Xuewei, please help to re-post this patch with the next version of your patchset.
> Feel free to add your Signed-off-by, if you need. 

I'll update my patchset and send it out. Thanks for your work!

Thanks,
Xuewei

> Thanks,
> Dexuan


end of thread, other threads:[~2025-06-27 11:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-17  4:53 [PATCH net-next v3 0/3] vsock: Introduce SIOCINQ ioctl support Xuewei Niu
2025-06-17  4:53 ` [PATCH net-next v3 1/3] vsock: Add support for SIOCINQ ioctl Xuewei Niu
2025-06-17 14:39   ` Stefano Garzarella
2025-06-22 13:59     ` Xuewei Niu
2025-06-23 17:01       ` Stefano Garzarella
2025-06-25  8:03     ` [EXTERNAL] " Dexuan Cui
2025-06-25 13:32       ` Stefano Garzarella
2025-06-25 16:41         ` Dexuan Cui
2025-06-26  5:02         ` Xuewei Niu
2025-06-27  8:50           ` [EXTERNAL] " Dexuan Cui
2025-06-27 11:01             ` Stefano Garzarella
2025-06-27 11:42             ` Xuewei Niu
2025-06-17  4:53 ` [PATCH net-next v3 2/3] test/vsock: Add retry mechanism to ioctl wrapper Xuewei Niu
2025-06-17 15:10   ` Stefano Garzarella
2025-06-17 15:23     ` Xuewei Niu
2025-06-17  4:53 ` [PATCH net-next v3 3/3] test/vsock: Add ioctl SIOCINQ tests Xuewei Niu
2025-06-17 15:21   ` Stefano Garzarella
2025-06-17 15:31     ` Xuewei Niu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).