public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy
@ 2026-04-07  9:13 Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07  9:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Arseniy Krasnov
  Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi

`virtio_transport_stream_do_peek`, when calculating the number of bytes to
copy, didn't consider the `offset`, caused by partial reads that happened
before.
This might cause out-of-bounds read that lead to an EFAULT.
More details in the commits.

Commit 1 introduces the fix
Commit 2 introduces some preliminary work for adding a test
Commit 3 introduces a test that checks for this bug to avoid future
regressions.

Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
---
Changes in v2:
- Addressed reviewers comment
    - Test now uses the recv_buf utils.
    - Removed unnecessary barrier
    - Checkpatch warnings.
- Added new commit that allows to use recv_buf with MSG_PEEK
- Picked up RoBs
- Link to v1: https://lore.kernel.org/r/20260402-fix_peek-v1-0-ad274fcef77b@redhat.com

---
Luigi Leonardi (3):
      vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating bytes to copy
      vsock/test: handle MSG_PEEK in `recv_buf`
      vsock/test: add MSG_PEEK after partial recv test

 net/vmw_vsock/virtio_transport_common.c |  5 ++-
 tools/testing/vsock/util.c              |  8 +++++
 tools/testing/vsock/vsock_test.c        | 56 ++++++++++++++++++++++++++-------
 3 files changed, 54 insertions(+), 15 deletions(-)
---
base-commit: bfe62a454542cfad3379f6ef5680b125f41e20f4
change-id: 20260401-fix_peek-6837b83469e3

Best regards,
-- 
Luigi Leonardi <leonardi@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating bytes to copy
  2026-04-07  9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
@ 2026-04-07  9:13 ` Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
  2 siblings, 0 replies; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07  9:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Arseniy Krasnov
  Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi

`virtio_transport_stream_do_peek()` does not account for the skb offset
when computing the number of bytes to copy.

This means that, after a partial recv() that advances the offset, a peek
requesting more bytes than are available in the sk_buff causes
`skb_copy_datagram_iter()` to go past the valid payload, resulting in
a -EFAULT.

The dequeue path already handles this correctly.
Apply the same logic to the peek path.

Fixes: 0df7cd3c13e4 ("vsock/virtio/vhost: read data from non-linear skb")
Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
---
 net/vmw_vsock/virtio_transport_common.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 8a9fb23c6e85..4b65bfe5d875 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -547,9 +547,8 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk,
 	skb_queue_walk(&vvs->rx_queue, skb) {
 		size_t bytes;
 
-		bytes = len - total;
-		if (bytes > skb->len)
-			bytes = skb->len;
+		bytes = min_t(size_t, len - total,
+			      skb->len - VIRTIO_VSOCK_SKB_CB(skb)->offset);
 
 		spin_unlock_bh(&vvs->rx_lock);
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf`
  2026-04-07  9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
@ 2026-04-07  9:13 ` Luigi Leonardi
  2026-04-08 10:03   ` Stefano Garzarella
  2026-04-07  9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
  2 siblings, 1 reply; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07  9:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Arseniy Krasnov
  Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi

`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
`recv` until all requested bytes are available or an error occurs.

The problem is how it calculates the amount of bytes read: MSG_PEEK
doesn't consume any bytes, will re-read the same bytes from the buffer
head, so, summing the return value every time is wrong.

Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
requested amount is more than the bytes available, the loop will never
terminate, because `recv` will never return EOF. For this reason we need
to compare the amount of read bytes with the number of bytes expected.

Add a check, and if the MSG_PEEK flag is present, update the counter of
read bytes differently, and break if we read the expected amount.

This allows us to simplify the `test_stream_credit_update_test`, by
reusing `recv_buf`.

Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
---
 tools/testing/vsock/util.c       |  8 ++++++++
 tools/testing/vsock/vsock_test.c | 13 +------------
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 9430ef5b8bc3..f12425ca99ed 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -399,6 +399,14 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
 		if (ret == 0 || (ret < 0 && errno != EINTR))
 			break;
 
+		if (flags & MSG_PEEK) {
+			if (ret == expected_ret) {
+				nread = ret;
+				break;
+			}
+			continue;
+		}
+
 		nread += ret;
 	} while (nread < len);
 	timeout_end();
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 5bd20ccd9335..bdb0754965df 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
 	}
 
 	/* Wait until there will be 128KB of data in rx queue. */
-	while (1) {
-		ssize_t res;
-
-		res = recv(fd, buf, buf_size, MSG_PEEK);
-		if (res == buf_size)
-			break;
-
-		if (res <= 0) {
-			fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
-			exit(EXIT_FAILURE);
-		}
-	}
+	recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);
 
 	/* There is 128KB of data in the socket's rx queue, dequeue first
 	 * 64KB, credit update is sent if 'low_rx_bytes_test' == true.

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test
  2026-04-07  9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
  2026-04-07  9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
@ 2026-04-07  9:13 ` Luigi Leonardi
  2026-04-08 10:18   ` Stefano Garzarella
  2 siblings, 1 reply; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07  9:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Arseniy Krasnov
  Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi

Add a test that verifies MSG_PEEK works correctly after a partial
recv().

This is to test a bug that was present in the
`virtio_transport_stream_do_peek()` when computing the number of bytes to
copy: After a partial read, the peek function didn't take into
consideration the number of bytes that were already read. So peeking the
whole buffer would cause a out-of-bounds read, that resulted in a -EFAULT.

This test does exactly this: do a partial recv on a buffer, then try to
peek the whole buffer content.

Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
---
 tools/testing/vsock/vsock_test.c | 43 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index bdb0754965df..d38a90a86f34 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -346,6 +346,44 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
 	return test_msg_peek_server(opts, false);
 }
 
+static void test_stream_peek_after_recv_client(const struct test_opts *opts)
+{
+	unsigned char buf[MSG_PEEK_BUF_LEN];
+	int fd;
+
+	fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
+	if (fd < 0) {
+		perror("connect");
+		exit(EXIT_FAILURE);
+	}
+
+	memset(buf, 0, sizeof(buf));
+
+	send_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
+
+	close(fd);
+}
+
+static void test_stream_peek_after_recv_server(const struct test_opts *opts)
+{
+	unsigned char buf[MSG_PEEK_BUF_LEN];
+	int fd;
+
+	fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
+	if (fd < 0) {
+		perror("accept");
+		exit(EXIT_FAILURE);
+	}
+
+	/* Partial recv to advance offset within the skb */
+	recv_buf(fd, buf, 1, 0, 1);
+
+	/* Ask more bytes than available */
+	recv_buf(fd, buf, sizeof(buf), MSG_PEEK, sizeof(buf) - 1);
+
+	close(fd);
+}
+
 #define SOCK_BUF_SIZE (2 * 1024 * 1024)
 #define SOCK_BUF_SIZE_SMALL (64 * 1024)
 #define MAX_MSG_PAGES 4
@@ -2509,6 +2547,11 @@ static struct test_case test_cases[] = {
 		.run_client = test_stream_tx_credit_bounds_client,
 		.run_server = test_stream_tx_credit_bounds_server,
 	},
+	{
+		.name = "SOCK_STREAM MSG_PEEK after partial recv",
+		.run_client = test_stream_peek_after_recv_client,
+		.run_server = test_stream_peek_after_recv_server,
+	},
 	{},
 };
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf`
  2026-04-07  9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
@ 2026-04-08 10:03   ` Stefano Garzarella
  0 siblings, 0 replies; 6+ messages in thread
From: Stefano Garzarella @ 2026-04-08 10:03 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
	netdev, linux-kernel

On Tue, Apr 07, 2026 at 11:13:56AM +0200, Luigi Leonardi wrote:
>`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
>`recv` until all requested bytes are available or an error occurs.
>
>The problem is how it calculates the amount of bytes read: MSG_PEEK
>doesn't consume any bytes, will re-read the same bytes from the buffer
>head, so, summing the return value every time is wrong.
>
>Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
>requested amount is more than the bytes available, the loop will never
>terminate, because `recv` will never return EOF. For this reason we need
>to compare the amount of read bytes with the number of bytes expected.
>
>Add a check, and if the MSG_PEEK flag is present, update the counter of
>read bytes differently, and break if we read the expected amount.
>
>This allows us to simplify the `test_stream_credit_update_test`, by
>reusing `recv_buf`.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/util.c       |  8 ++++++++
> tools/testing/vsock/vsock_test.c | 13 +------------
> 2 files changed, 9 insertions(+), 12 deletions(-)
>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 9430ef5b8bc3..f12425ca99ed 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -399,6 +399,14 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
> 		if (ret == 0 || (ret < 0 && errno != EINTR))
> 			break;
>

We should add a comment here or even better in the documentation on top 
of the function to explain better this behaviour for future reference.

>+		if (flags & MSG_PEEK) {
>+			if (ret == expected_ret) {
>+				nread = ret;
>+				break;
>+			}

Not that I expect this to happen often, but I wonder if it would be 
better to add a `timeout_usleep()` here to avoid using up too many CPU 
cycles.

>+			continue;
>+		}
>+
> 		nread += ret;
> 	} while (nread < len);
> 	timeout_end();
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index 5bd20ccd9335..bdb0754965df 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
> 	}
>
> 	/* Wait until there will be 128KB of data in rx queue. */
>-	while (1) {
>-		ssize_t res;
>-
>-		res = recv(fd, buf, buf_size, MSG_PEEK);
>-		if (res == buf_size)
>-			break;
>-
>-		if (res <= 0) {
>-			fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
>-			exit(EXIT_FAILURE);
>-		}
>-	}
>+	recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);

Not sure about this change in this patch, but since they are just tests 
it could be fine.

mmm, on a second thought, we already used recv_buf() with MSG_PEEK in 
several other tests, so yep, better to add this change, but I'd make 
more clear in the commit title/description that this is at the end a fix 
for other tests using MSG_PEEK with recv_buf(), I mean something like 
this:

     vsock/test: fix MSG_PEEK handling in recv_buf()

And also pointing out that other tests already used MSG_PEEK with 
recv_buf(), but it can be broken in some cases.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test
  2026-04-07  9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
@ 2026-04-08 10:18   ` Stefano Garzarella
  0 siblings, 0 replies; 6+ messages in thread
From: Stefano Garzarella @ 2026-04-08 10:18 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
	netdev, linux-kernel

On Tue, Apr 07, 2026 at 11:13:57AM +0200, Luigi Leonardi wrote:
>Add a test that verifies MSG_PEEK works correctly after a partial
>recv().
>
>This is to test a bug that was present in the
>`virtio_transport_stream_do_peek()` when computing the number of bytes to
>copy: After a partial read, the peek function didn't take into
>consideration the number of bytes that were already read. So peeking the
>whole buffer would cause a out-of-bounds read, that resulted in a -EFAULT.
>
>This test does exactly this: do a partial recv on a buffer, then try to
>peek the whole buffer content.
>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/vsock_test.c | 43 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index bdb0754965df..d38a90a86f34 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -346,6 +346,44 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
> 	return test_msg_peek_server(opts, false);
> }
>
>+static void test_stream_peek_after_recv_client(const struct test_opts *opts)
>+{
>+	unsigned char buf[MSG_PEEK_BUF_LEN];

What about:

	unsigned char buf[MSG_PEEK_BUF_LEN] = { 0 };

so we can remove the memset() call?

>+	int fd;
>+
>+	fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
>+	if (fd < 0) {
>+		perror("connect");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	memset(buf, 0, sizeof(buf));
>+
>+	send_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
>+
>+	close(fd);
>+}
>+
>+static void test_stream_peek_after_recv_server(const struct test_opts *opts)
>+{
>+	unsigned char buf[MSG_PEEK_BUF_LEN];
>+	int fd;
>+
>+	fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
>+	if (fd < 0) {
>+		perror("accept");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	/* Partial recv to advance offset within the skb */
>+	recv_buf(fd, buf, 1, 0, 1);
>+
>+	/* Ask more bytes than available */
>+	recv_buf(fd, buf, sizeof(buf), MSG_PEEK, sizeof(buf) - 1);

What about checking also what we read like we do in 
test_msg_peek_server() ?

Not a strong opinion, but if we go in that direction, maybe we can just 
reuse test_stream_msg_peek_client() also for this test case.

Thanks,
Stefano

>+
>+	close(fd);
>+}
>+
> #define SOCK_BUF_SIZE (2 * 1024 * 1024)
> #define SOCK_BUF_SIZE_SMALL (64 * 1024)
> #define MAX_MSG_PAGES 4
>@@ -2509,6 +2547,11 @@ static struct test_case test_cases[] = {
> 		.run_client = test_stream_tx_credit_bounds_client,
> 		.run_server = test_stream_tx_credit_bounds_server,
> 	},
>+	{
>+		.name = "SOCK_STREAM MSG_PEEK after partial recv",
>+		.run_client = test_stream_peek_after_recv_client,
>+		.run_server = test_stream_peek_after_recv_server,
>+	},
> 	{},
> };
>
>
>-- 
>2.53.0
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-04-08 10:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-07  9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
2026-04-07  9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
2026-04-07  9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
2026-04-08 10:03   ` Stefano Garzarella
2026-04-07  9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
2026-04-08 10:18   ` Stefano Garzarella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox