* [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating bytes to copy
2026-04-07 9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
@ 2026-04-07 9:13 ` Luigi Leonardi
2026-04-07 9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
2026-04-07 9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
2 siblings, 0 replies; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07 9:13 UTC (permalink / raw)
To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Arseniy Krasnov
Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi
`virtio_transport_stream_do_peek()` does not account for the skb offset
when computing the number of bytes to copy.
This means that, after a partial recv() that advances the offset, a peek
requesting more bytes than are available in the sk_buff causes
`skb_copy_datagram_iter()` to go past the valid payload, resulting in
a -EFAULT.
The dequeue path already handles this correctly.
Apply the same logic to the peek path.
Fixes: 0df7cd3c13e4 ("vsock/virtio/vhost: read data from non-linear skb")
Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
---
net/vmw_vsock/virtio_transport_common.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 8a9fb23c6e85..4b65bfe5d875 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -547,9 +547,8 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk,
skb_queue_walk(&vvs->rx_queue, skb) {
size_t bytes;
- bytes = len - total;
- if (bytes > skb->len)
- bytes = skb->len;
+ bytes = min_t(size_t, len - total,
+ skb->len - VIRTIO_VSOCK_SKB_CB(skb)->offset);
spin_unlock_bh(&vvs->rx_lock);
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf`
2026-04-07 9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
2026-04-07 9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
@ 2026-04-07 9:13 ` Luigi Leonardi
2026-04-08 10:03 ` Stefano Garzarella
2026-04-07 9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
2 siblings, 1 reply; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07 9:13 UTC (permalink / raw)
To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Arseniy Krasnov
Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi
`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
`recv` until all requested bytes are available or an error occurs.
The problem is how it calculates the amount of bytes read: MSG_PEEK
doesn't consume any bytes, will re-read the same bytes from the buffer
head, so, summing the return value every time is wrong.
Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
requested amount is more than the bytes available, the loop will never
terminate, because `recv` will never return EOF. For this reason we need
to compare the amount of read bytes with the number of bytes expected.
Add a check, and if the MSG_PEEK flag is present, update the counter of
read bytes differently, and break if we read the expected amount.
This allows us to simplify the `test_stream_credit_update_test`, by
reusing `recv_buf`.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
---
tools/testing/vsock/util.c | 8 ++++++++
tools/testing/vsock/vsock_test.c | 13 +------------
2 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 9430ef5b8bc3..f12425ca99ed 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -399,6 +399,14 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
if (ret == 0 || (ret < 0 && errno != EINTR))
break;
+ if (flags & MSG_PEEK) {
+ if (ret == expected_ret) {
+ nread = ret;
+ break;
+ }
+ continue;
+ }
+
nread += ret;
} while (nread < len);
timeout_end();
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 5bd20ccd9335..bdb0754965df 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
}
/* Wait until there will be 128KB of data in rx queue. */
- while (1) {
- ssize_t res;
-
- res = recv(fd, buf, buf_size, MSG_PEEK);
- if (res == buf_size)
- break;
-
- if (res <= 0) {
- fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
- exit(EXIT_FAILURE);
- }
- }
+ recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);
/* There is 128KB of data in the socket's rx queue, dequeue first
* 64KB, credit update is sent if 'low_rx_bytes_test' == true.
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf`
2026-04-07 9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
@ 2026-04-08 10:03 ` Stefano Garzarella
0 siblings, 0 replies; 6+ messages in thread
From: Stefano Garzarella @ 2026-04-08 10:03 UTC (permalink / raw)
To: Luigi Leonardi
Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
netdev, linux-kernel
On Tue, Apr 07, 2026 at 11:13:56AM +0200, Luigi Leonardi wrote:
>`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
>`recv` until all requested bytes are available or an error occurs.
>
>The problem is how it calculates the amount of bytes read: MSG_PEEK
>doesn't consume any bytes, will re-read the same bytes from the buffer
>head, so, summing the return value every time is wrong.
>
>Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
>requested amount is more than the bytes available, the loop will never
>terminate, because `recv` will never return EOF. For this reason we need
>to compare the amount of read bytes with the number of bytes expected.
>
>Add a check, and if the MSG_PEEK flag is present, update the counter of
>read bytes differently, and break if we read the expected amount.
>
>This allows us to simplify the `test_stream_credit_update_test`, by
>reusing `recv_buf`.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/util.c | 8 ++++++++
> tools/testing/vsock/vsock_test.c | 13 +------------
> 2 files changed, 9 insertions(+), 12 deletions(-)
>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 9430ef5b8bc3..f12425ca99ed 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -399,6 +399,14 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
> if (ret == 0 || (ret < 0 && errno != EINTR))
> break;
>
We should add a comment here or even better in the documentation on top
of the function to explain better this behaviour for future reference.
>+ if (flags & MSG_PEEK) {
>+ if (ret == expected_ret) {
>+ nread = ret;
>+ break;
>+ }
Not that I expect this to happen often, but I wonder if it would be
better to add a `timeout_usleep()` here to avoid using up too many CPU
cycles.
>+ continue;
>+ }
>+
> nread += ret;
> } while (nread < len);
> timeout_end();
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index 5bd20ccd9335..bdb0754965df 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
> }
>
> /* Wait until there will be 128KB of data in rx queue. */
>- while (1) {
>- ssize_t res;
>-
>- res = recv(fd, buf, buf_size, MSG_PEEK);
>- if (res == buf_size)
>- break;
>-
>- if (res <= 0) {
>- fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
>- exit(EXIT_FAILURE);
>- }
>- }
>+ recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);
Not sure about this change in this patch, but since they are just tests
it could be fine.
mmm, on a second thought, we already used recv_buf() with MSG_PEEK in
several other tests, so yep, better to add this change, but I'd make
more clear in the commit title/description that this is at the end a fix
for other tests using MSG_PEEK with recv_buf(), I mean something like
this:
vsock/test: fix MSG_PEEK handling in recv_buf()
And also pointing out that other tests already used MSG_PEEK with
recv_buf(), but it can be broken in some cases.
Thanks,
Stefano
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test
2026-04-07 9:13 [PATCH net v2 0/3] vsock/virtio: fix MSG_PEEK calculation on bytes to copy Luigi Leonardi
2026-04-07 9:13 ` [PATCH net v2 1/3] vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating " Luigi Leonardi
2026-04-07 9:13 ` [PATCH net v2 2/3] vsock/test: handle MSG_PEEK in `recv_buf` Luigi Leonardi
@ 2026-04-07 9:13 ` Luigi Leonardi
2026-04-08 10:18 ` Stefano Garzarella
2 siblings, 1 reply; 6+ messages in thread
From: Luigi Leonardi @ 2026-04-07 9:13 UTC (permalink / raw)
To: Stefan Hajnoczi, Stefano Garzarella, Michael S. Tsirkin,
Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Arseniy Krasnov
Cc: kvm, virtualization, netdev, linux-kernel, Luigi Leonardi
Add a test that verifies MSG_PEEK works correctly after a partial
recv().
This is to test a bug that was present in the
`virtio_transport_stream_do_peek()` when computing the number of bytes to
copy: After a partial read, the peek function didn't take into
consideration the number of bytes that were already read. So peeking the
whole buffer would cause a out-of-bounds read, that resulted in a -EFAULT.
This test does exactly this: do a partial recv on a buffer, then try to
peek the whole buffer content.
Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
---
tools/testing/vsock/vsock_test.c | 43 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index bdb0754965df..d38a90a86f34 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -346,6 +346,44 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
return test_msg_peek_server(opts, false);
}
+static void test_stream_peek_after_recv_client(const struct test_opts *opts)
+{
+ unsigned char buf[MSG_PEEK_BUF_LEN];
+ int fd;
+
+ fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(buf, 0, sizeof(buf));
+
+ send_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
+
+ close(fd);
+}
+
+static void test_stream_peek_after_recv_server(const struct test_opts *opts)
+{
+ unsigned char buf[MSG_PEEK_BUF_LEN];
+ int fd;
+
+ fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Partial recv to advance offset within the skb */
+ recv_buf(fd, buf, 1, 0, 1);
+
+ /* Ask more bytes than available */
+ recv_buf(fd, buf, sizeof(buf), MSG_PEEK, sizeof(buf) - 1);
+
+ close(fd);
+}
+
#define SOCK_BUF_SIZE (2 * 1024 * 1024)
#define SOCK_BUF_SIZE_SMALL (64 * 1024)
#define MAX_MSG_PAGES 4
@@ -2509,6 +2547,11 @@ static struct test_case test_cases[] = {
.run_client = test_stream_tx_credit_bounds_client,
.run_server = test_stream_tx_credit_bounds_server,
},
+ {
+ .name = "SOCK_STREAM MSG_PEEK after partial recv",
+ .run_client = test_stream_peek_after_recv_client,
+ .run_server = test_stream_peek_after_recv_server,
+ },
{},
};
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test
2026-04-07 9:13 ` [PATCH net v2 3/3] vsock/test: add MSG_PEEK after partial recv test Luigi Leonardi
@ 2026-04-08 10:18 ` Stefano Garzarella
0 siblings, 0 replies; 6+ messages in thread
From: Stefano Garzarella @ 2026-04-08 10:18 UTC (permalink / raw)
To: Luigi Leonardi
Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
netdev, linux-kernel
On Tue, Apr 07, 2026 at 11:13:57AM +0200, Luigi Leonardi wrote:
>Add a test that verifies MSG_PEEK works correctly after a partial
>recv().
>
>This is to test a bug that was present in the
>`virtio_transport_stream_do_peek()` when computing the number of bytes to
>copy: After a partial read, the peek function didn't take into
>consideration the number of bytes that were already read. So peeking the
>whole buffer would cause a out-of-bounds read, that resulted in a -EFAULT.
>
>This test does exactly this: do a partial recv on a buffer, then try to
>peek the whole buffer content.
>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/vsock_test.c | 43 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index bdb0754965df..d38a90a86f34 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -346,6 +346,44 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
> return test_msg_peek_server(opts, false);
> }
>
>+static void test_stream_peek_after_recv_client(const struct test_opts *opts)
>+{
>+ unsigned char buf[MSG_PEEK_BUF_LEN];
What about:
unsigned char buf[MSG_PEEK_BUF_LEN] = { 0 };
so we can remove the memset() call?
>+ int fd;
>+
>+ fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
>+ if (fd < 0) {
>+ perror("connect");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ memset(buf, 0, sizeof(buf));
>+
>+ send_buf(fd, buf, sizeof(buf), 0, sizeof(buf));
>+
>+ close(fd);
>+}
>+
>+static void test_stream_peek_after_recv_server(const struct test_opts *opts)
>+{
>+ unsigned char buf[MSG_PEEK_BUF_LEN];
>+ int fd;
>+
>+ fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
>+ if (fd < 0) {
>+ perror("accept");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ /* Partial recv to advance offset within the skb */
>+ recv_buf(fd, buf, 1, 0, 1);
>+
>+ /* Ask more bytes than available */
>+ recv_buf(fd, buf, sizeof(buf), MSG_PEEK, sizeof(buf) - 1);
What about checking also what we read like we do in
test_msg_peek_server() ?
Not a strong opinion, but if we go in that direction, maybe we can just
reuse test_stream_msg_peek_client() also for this test case.
Thanks,
Stefano
>+
>+ close(fd);
>+}
>+
> #define SOCK_BUF_SIZE (2 * 1024 * 1024)
> #define SOCK_BUF_SIZE_SMALL (64 * 1024)
> #define MAX_MSG_PAGES 4
>@@ -2509,6 +2547,11 @@ static struct test_case test_cases[] = {
> .run_client = test_stream_tx_credit_bounds_client,
> .run_server = test_stream_tx_credit_bounds_server,
> },
>+ {
>+ .name = "SOCK_STREAM MSG_PEEK after partial recv",
>+ .run_client = test_stream_peek_after_recv_client,
>+ .run_server = test_stream_peek_after_recv_server,
>+ },
> {},
> };
>
>
>--
>2.53.0
>
^ permalink raw reply [flat|nested] 6+ messages in thread