Netdev List
* [PATCH net v3 0/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc
@ 2026-05-13 10:54 Stefano Garzarella
  2026-05-13 10:54 ` [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow Stefano Garzarella
  2026-05-13 10:54 ` [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
  0 siblings, 2 replies; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-13 10:54 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Michael S. Tsirkin, Eugenio Pérez, linux-kernel,
	Simon Horman, Paolo Abeni, Jakub Kicinski, Stefano Garzarella,
	Jason Wang, kvm, Stefan Hajnoczi, virtualization, Eric Dumazet,
	David S. Miller

The code is exactly the same as in v2, but split into two patches as
Michael suggested in v2:
- Patch 1 resets the connection when we can no longer queue packets;
  this prevents silent data loss, and both peers are notified.
- Patch 2 increases the total budget to `buf_alloc * 2` for payload
  plus skb overhead similar to how SO_RCVBUF is doubled to reserve
  space for sk_buff metadata. This preserves the full buf_alloc for
  payload under normal operation, while still bounding the skb queue
  growth.

v3:
- Split in 2 patches [MST]

v2: https://lore.kernel.org/netdev/20260512080737.36787-1-sgarzare@redhat.com/
- Close the connection when we can no longer queue new packets instead
  of losing data.
- No longer announce the reduced buf_alloc to avoid violating the
  spec. [MST]

v1: https://lore.kernel.org/netdev/20260508092330.69690-1-sgarzare@redhat.com/

Stefano Garzarella (2):
  vsock/virtio: reset connection on receiving queue overflow
  vsock/virtio: fix skb overhead accounting to preserve full buf_alloc

 net/vmw_vsock/virtio_transport_common.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

-- 
2.54.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow
  2026-05-13 10:54 [PATCH net v3 0/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
@ 2026-05-13 10:54 ` Stefano Garzarella
  2026-05-14 14:57   ` Stefano Garzarella
  2026-05-13 10:54 ` [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
  1 sibling, 1 reply; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-13 10:54 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Michael S. Tsirkin, Eugenio Pérez, linux-kernel,
	Simon Horman, Paolo Abeni, Jakub Kicinski, Stefano Garzarella,
	Jason Wang, kvm, Stefan Hajnoczi, virtualization, Eric Dumazet,
	David S. Miller

From: Stefano Garzarella <sgarzare@redhat.com>

When there is no more space to queue an incoming packet, the packet is
silently dropped. This causes data loss without any notification to
either peer, since there is no retransmission.

Under normal circumstances, this should never happen. However, it could
happen if the other peer doesn't respect the credit, or if the skb
overhead, which we recently began to take into account with commit
059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
is too high.

Fix this by resetting the connection and setting the local socket error
to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
packet, so both peers are explicitly notified of the failure rather than
silently losing data.

Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 989cc252d3d3..4a4ac69d1ad1 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk,
 	return err;
 }
 
-static void
+static bool
 virtio_transport_recv_enqueue(struct vsock_sock *vsk,
 			      struct sk_buff *skb)
 {
@@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
 	spin_lock_bh(&vvs->rx_lock);
 
 	can_enqueue = virtio_transport_inc_rx_pkt(vvs, len);
-	if (!can_enqueue) {
-		free_pkt = true;
+	if (!can_enqueue)
 		goto out;
-	}
 
 	if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM)
 		vvs->msg_count++;
@@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
 	spin_unlock_bh(&vvs->rx_lock);
 	if (free_pkt)
 		kfree_skb(skb);
+
+	return can_enqueue;
 }
 
 static int
@@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk,
 
 	switch (le16_to_cpu(hdr->op)) {
 	case VIRTIO_VSOCK_OP_RW:
-		virtio_transport_recv_enqueue(vsk, skb);
+		if (!virtio_transport_recv_enqueue(vsk, skb)) {
+			/* There is no more space to queue the packet, so let's
+			 * close the connection; otherwise, we'll lose data.
+			 */
+			(void)virtio_transport_reset(vsk, skb);
+			sk->sk_state = TCP_CLOSE;
+			sk->sk_err = ENOBUFS;
+			sk_error_report(sk);
+			break;
+		}
 		vsock_data_ready(sk);
 		return err;
 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc
  2026-05-13 10:54 [PATCH net v3 0/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
  2026-05-13 10:54 ` [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow Stefano Garzarella
@ 2026-05-13 10:54 ` Stefano Garzarella
  2026-05-14 14:44   ` Stefano Garzarella
  1 sibling, 1 reply; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-13 10:54 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Michael S. Tsirkin, Eugenio Pérez, linux-kernel,
	Simon Horman, Paolo Abeni, Jakub Kicinski, Stefano Garzarella,
	Jason Wang, kvm, Stefan Hajnoczi, virtualization, Eric Dumazet,
	David S. Miller

From: Stefano Garzarella <sgarzare@redhat.com>

After commit 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb
queue"), virtio_transport_inc_rx_pkt() subtracts per-skb overhead from
buf_alloc when checking whether a new packet fits. This reduces the
effective receive buffer below what the user configured via
SO_VM_SOCKETS_BUFFER_SIZE, causing legitimate data packets to be
silently dropped and applications that rely on the full buffer size
to deadlock.

Also, the reduced space is not communicated to the remote peer, so
its credit calculation assumes more space than the receiver will
actually accept, causing data loss (there is no retransmission).

With the current accounting, we see failures in
tools/testing/vsock/vsock_test.c: test 18 sometimes fails, while
test 22 always fails:
    18 - SOCK_STREAM MSG_ZEROCOPY...hash mismatch

    22 - SOCK_STREAM virtio credit update + SO_RCVLOWAT...send failed:
    Resource temporarily unavailable

Fix this by using `buf_alloc * 2` as the total budget for payload plus
skb overhead in virtio_transport_inc_rx_pkt(), similar to how SO_RCVBUF
is doubled to reserve space for sk_buff metadata. This preserves the
full buf_alloc for payload under normal operation, while still bounding
the skb queue growth.

With this patch, all tests in tools/testing/vsock/vsock_test.c are
now passing again.

Fixes: 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport_common.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 4a4ac69d1ad1..e22117bf5dcd 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -434,7 +434,10 @@ static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
 {
 	u64 skb_overhead = (skb_queue_len(&vvs->rx_queue) + 1) * SKB_TRUESIZE(0);
 
-	if (skb_overhead + vvs->buf_used + len > vvs->buf_alloc)
+	/* Use buf_alloc * 2 as total budget (payload + overhead), similar to
+	 * how SO_RCVBUF is doubled to reserve space for sk_buff metadata.
+	 */
+	if (skb_overhead + vvs->buf_used + len > (u64)vvs->buf_alloc * 2)
 		return false;
 
 	vvs->rx_bytes += len;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc
  2026-05-13 10:54 ` [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
@ 2026-05-14 14:44   ` Stefano Garzarella
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-14 14:44 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Michael S. Tsirkin, Eugenio Pérez, linux-kernel,
	Simon Horman, Paolo Abeni, Jakub Kicinski, Jason Wang, kvm,
	Stefan Hajnoczi, virtualization, Eric Dumazet, David S. Miller

On Wed, May 13, 2026 at 12:54:17PM +0200, Stefano Garzarella wrote:
>From: Stefano Garzarella <sgarzare@redhat.com>
>
>After commit 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb
>queue"), virtio_transport_inc_rx_pkt() subtracts per-skb overhead from
>buf_alloc when checking whether a new packet fits. This reduces the
>effective receive buffer below what the user configured via
>SO_VM_SOCKETS_BUFFER_SIZE, causing legitimate data packets to be
>silently dropped and applications that rely on the full buffer size
>to deadlock.
>
>Also, the reduced space is not communicated to the remote peer, so
>its credit calculation assumes more space than the receiver will
>actually accept, causing data loss (there is no retransmission).
>
>With the current accounting, we see failures in
>tools/testing/vsock/vsock_test.c: test 18 sometimes fails, while
>test 22 always fails:
>    18 - SOCK_STREAM MSG_ZEROCOPY...hash mismatch
>
>    22 - SOCK_STREAM virtio credit update + SO_RCVLOWAT...send failed:
>    Resource temporarily unavailable
>
>Fix this by using `buf_alloc * 2` as the total budget for payload plus
>skb overhead in virtio_transport_inc_rx_pkt(), similar to how SO_RCVBUF
>is doubled to reserve space for sk_buff metadata. This preserves the
>full buf_alloc for payload under normal operation, while still bounding
>the skb queue growth.
>
>With this patch, all tests in tools/testing/vsock/vsock_test.c are
>now passing again.
>
>Fixes: 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue")
>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 4a4ac69d1ad1..e22117bf5dcd 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -434,7 +434,10 @@ static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
> {
> 	u64 skb_overhead = (skb_queue_len(&vvs->rx_queue) + 1) * SKB_TRUESIZE(0);
>
>-	if (skb_overhead + vvs->buf_used + len > vvs->buf_alloc)
>+	/* Use buf_alloc * 2 as total budget (payload + overhead), similar to
>+	 * how SO_RCVBUF is doubled to reserve space for sk_buff metadata.
>+	 */
>+	if (skb_overhead + vvs->buf_used + len > (u64)vvs->buf_alloc * 2)
> 		return false;

sashiko reported a potential overflow here:
https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com

I'll check the credit and overflow separately to ensure both are 
correct.

The portion relating to the user setting a buffer that is too small is a 
pre-existing issue and IMO an edge case that we can ignore.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow
  2026-05-13 10:54 ` [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow Stefano Garzarella
@ 2026-05-14 14:57   ` Stefano Garzarella
  2026-05-14 15:16     ` Michael S. Tsirkin
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-14 14:57 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Michael S. Tsirkin, Eugenio Pérez, linux-kernel,
	Simon Horman, Paolo Abeni, Jakub Kicinski, Jason Wang, kvm,
	Stefan Hajnoczi, virtualization, Eric Dumazet, David S. Miller

On Wed, May 13, 2026 at 12:54:16PM +0200, Stefano Garzarella wrote:
>From: Stefano Garzarella <sgarzare@redhat.com>
>
>When there is no more space to queue an incoming packet, the packet is
>silently dropped. This causes data loss without any notification to
>either peer, since there is no retransmission.
>
>Under normal circumstances, this should never happen. However, it could
>happen if the other peer doesn't respect the credit, or if the skb
>overhead, which we recently began to take into account with commit
>059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
>is too high.
>
>Fix this by resetting the connection and setting the local socket error
>to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
>packet, so both peers are explicitly notified of the failure rather than
>silently losing data.
>
>Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 989cc252d3d3..4a4ac69d1ad1 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk,
> 	return err;
> }
>
>-static void
>+static bool
> virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> 			      struct sk_buff *skb)
> {
>@@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> 	spin_lock_bh(&vvs->rx_lock);
>
> 	can_enqueue = virtio_transport_inc_rx_pkt(vvs, len);
>-	if (!can_enqueue) {
>-		free_pkt = true;
>+	if (!can_enqueue)
> 		goto out;
>-	}
>
> 	if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM)
> 		vvs->msg_count++;
>@@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> 	spin_unlock_bh(&vvs->rx_lock);
> 	if (free_pkt)
> 		kfree_skb(skb);
>+
>+	return can_enqueue;
> }
>
> static int
>@@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk,
>
> 	switch (le16_to_cpu(hdr->op)) {
> 	case VIRTIO_VSOCK_OP_RW:
>-		virtio_transport_recv_enqueue(vsk, skb);
>+		if (!virtio_transport_recv_enqueue(vsk, skb)) {
>+			/* There is no more space to queue the packet, so let's
>+			 * close the connection; otherwise, we'll lose data.
>+			 */
>+			(void)virtio_transport_reset(vsk, skb);
>+			sk->sk_state = TCP_CLOSE;
>+			sk->sk_err = ENOBUFS;
>+			sk_error_report(sk);

sashiko reported some issues related to setting TCP_CLOSE state and not 
removing the socket from the connect table:
https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com

I'll change this by calling virtio_transport_do_close() and 
vsock_remove_sock() in the next version.

Stefano

>+			break;
>+		}
> 		vsock_data_ready(sk);
> 		return err;
> 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
>-- 
>2.54.0
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow
  2026-05-14 14:57   ` Stefano Garzarella
@ 2026-05-14 15:16     ` Michael S. Tsirkin
  2026-05-14 16:45       ` Stefano Garzarella
  0 siblings, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2026-05-14 15:16 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, Xuan Zhuo, Eugenio Pérez, linux-kernel, Simon Horman,
	Paolo Abeni, Jakub Kicinski, Jason Wang, kvm, Stefan Hajnoczi,
	virtualization, Eric Dumazet, David S. Miller

On Thu, May 14, 2026 at 04:57:16PM +0200, Stefano Garzarella wrote:
> On Wed, May 13, 2026 at 12:54:16PM +0200, Stefano Garzarella wrote:
> > From: Stefano Garzarella <sgarzare@redhat.com>
> > 
> > When there is no more space to queue an incoming packet, the packet is
> > silently dropped. This causes data loss without any notification to
> > either peer, since there is no retransmission.
> > 
> > Under normal circumstances, this should never happen. However, it could
> > happen if the other peer doesn't respect the credit, or if the skb
> > overhead, which we recently began to take into account with commit
> > 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
> > is too high.
> > 
> > Fix this by resetting the connection and setting the local socket error
> > to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
> > packet, so both peers are explicitly notified of the failure rather than
> > silently losing data.
> > 
> > Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
> > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > ---
> > net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++-----
> > 1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > index 989cc252d3d3..4a4ac69d1ad1 100644
> > --- a/net/vmw_vsock/virtio_transport_common.c
> > +++ b/net/vmw_vsock/virtio_transport_common.c
> > @@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk,
> > 	return err;
> > }
> > 
> > -static void
> > +static bool
> > virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > 			      struct sk_buff *skb)
> > {
> > @@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > 	spin_lock_bh(&vvs->rx_lock);
> > 
> > 	can_enqueue = virtio_transport_inc_rx_pkt(vvs, len);
> > -	if (!can_enqueue) {
> > -		free_pkt = true;
> > +	if (!can_enqueue)
> > 		goto out;
> > -	}
> > 
> > 	if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM)
> > 		vvs->msg_count++;
> > @@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > 	spin_unlock_bh(&vvs->rx_lock);
> > 	if (free_pkt)
> > 		kfree_skb(skb);
> > +
> > +	return can_enqueue;
> > }
> > 
> > static int
> > @@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk,
> > 
> > 	switch (le16_to_cpu(hdr->op)) {
> > 	case VIRTIO_VSOCK_OP_RW:
> > -		virtio_transport_recv_enqueue(vsk, skb);
> > +		if (!virtio_transport_recv_enqueue(vsk, skb)) {
> > +			/* There is no more space to queue the packet, so let's
> > +			 * close the connection; otherwise, we'll lose data.
> > +			 */
> > +			(void)virtio_transport_reset(vsk, skb);
> > +			sk->sk_state = TCP_CLOSE;
> > +			sk->sk_err = ENOBUFS;
> > +			sk_error_report(sk);
> 
> sashiko reported some issues related to setting TCP_CLOSE state and not
> removing the socket from the connect table:
> https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com
> 
> I'll change this by calling virtio_transport_do_close() and
> vsock_remove_sock() in the next version.
> 
> Stefano
> 
> > +			break;
> > +		}
> > 		vsock_data_ready(sk);
> > 		return err;
> > 	case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
> > -- 
> > 2.54.0
> > 


And so the bag of hacks grows. I feel this is energy not well spent.
Please, let us fix this properly *first*. And then worry about how to
backport.  Maybe it will not be so terrible to backport after all.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow
  2026-05-14 15:16     ` Michael S. Tsirkin
@ 2026-05-14 16:45       ` Stefano Garzarella
  2026-05-14 17:44         ` Michael S. Tsirkin
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Garzarella @ 2026-05-14 16:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, Xuan Zhuo, Eugenio Pérez, linux-kernel, Simon Horman,
	Paolo Abeni, Jakub Kicinski, Jason Wang, kvm, Stefan Hajnoczi,
	virtualization, Eric Dumazet, David S. Miller

On Thu, 14 May 2026 at 17:16, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, May 14, 2026 at 04:57:16PM +0200, Stefano Garzarella wrote:
> > On Wed, May 13, 2026 at 12:54:16PM +0200, Stefano Garzarella wrote:
> > > From: Stefano Garzarella <sgarzare@redhat.com>
> > >
> > > When there is no more space to queue an incoming packet, the packet is
> > > silently dropped. This causes data loss without any notification to
> > > either peer, since there is no retransmission.
> > >
> > > Under normal circumstances, this should never happen. However, it could
> > > happen if the other peer doesn't respect the credit, or if the skb
> > > overhead, which we recently began to take into account with commit
> > > 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
> > > is too high.
> > >
> > > Fix this by resetting the connection and setting the local socket error
> > > to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
> > > packet, so both peers are explicitly notified of the failure rather than
> > > silently losing data.
> > >
> > > Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > ---
> > > net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++-----
> > > 1 file changed, 14 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > index 989cc252d3d3..4a4ac69d1ad1 100644
> > > --- a/net/vmw_vsock/virtio_transport_common.c
> > > +++ b/net/vmw_vsock/virtio_transport_common.c
> > > @@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk,
> > >     return err;
> > > }
> > >
> > > -static void
> > > +static bool
> > > virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > >                           struct sk_buff *skb)
> > > {
> > > @@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > >     spin_lock_bh(&vvs->rx_lock);
> > >
> > >     can_enqueue = virtio_transport_inc_rx_pkt(vvs, len);
> > > -   if (!can_enqueue) {
> > > -           free_pkt = true;
> > > +   if (!can_enqueue)
> > >             goto out;
> > > -   }
> > >
> > >     if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM)
> > >             vvs->msg_count++;
> > > @@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > >     spin_unlock_bh(&vvs->rx_lock);
> > >     if (free_pkt)
> > >             kfree_skb(skb);
> > > +
> > > +   return can_enqueue;
> > > }
> > >
> > > static int
> > > @@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk,
> > >
> > >     switch (le16_to_cpu(hdr->op)) {
> > >     case VIRTIO_VSOCK_OP_RW:
> > > -           virtio_transport_recv_enqueue(vsk, skb);
> > > +           if (!virtio_transport_recv_enqueue(vsk, skb)) {
> > > +                   /* There is no more space to queue the packet, so let's
> > > +                    * close the connection; otherwise, we'll lose data.
> > > +                    */
> > > +                   (void)virtio_transport_reset(vsk, skb);
> > > +                   sk->sk_state = TCP_CLOSE;
> > > +                   sk->sk_err = ENOBUFS;
> > > +                   sk_error_report(sk);
> >
> > sashiko reported some issues related to setting TCP_CLOSE state and not
> > removing the socket from the connect table:
> > https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com
> >
> > I'll change this by calling virtio_transport_do_close() and
> > vsock_remove_sock() in the next version.
> >
> > Stefano
> >
> > > +                   break;
> > > +           }
> > >             vsock_data_ready(sk);
> > >             return err;
> > >     case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
> > > --
> > > 2.54.0
> > >
>
>
> And so the bag of hacks grows. I feel this is energy not well spent.
> Please, let us fix this properly *first*. And then worry about how to
> backport.  Maybe it will not be so terrible to backport after all.
>

TBH I don't think this is a hack, but an issue we should fix in any case.
Regarding the second patch, I see your point, but it's a big change
that worries me. I'd like some more time to fix it properly without
rushing, rather than staying calm without realizing that userspace is
broken, as it is now without this series :-(

That said, evaluating further, I think we have a similar issue with
STREAM on the host side as well, where the skb usually doesn't free
space, so we need a merge strategy there too.

So, I'd like to have time to fix both definitively. If you have time and
want to go ahead, please do.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow
  2026-05-14 16:45       ` Stefano Garzarella
@ 2026-05-14 17:44         ` Michael S. Tsirkin
  0 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2026-05-14 17:44 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, Xuan Zhuo, Eugenio Pérez, linux-kernel, Simon Horman,
	Paolo Abeni, Jakub Kicinski, Jason Wang, kvm, Stefan Hajnoczi,
	virtualization, Eric Dumazet, David S. Miller

On Thu, May 14, 2026 at 06:45:00PM +0200, Stefano Garzarella wrote:
> On Thu, 14 May 2026 at 17:16, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, May 14, 2026 at 04:57:16PM +0200, Stefano Garzarella wrote:
> > > On Wed, May 13, 2026 at 12:54:16PM +0200, Stefano Garzarella wrote:
> > > > From: Stefano Garzarella <sgarzare@redhat.com>
> > > >
> > > > When there is no more space to queue an incoming packet, the packet is
> > > > silently dropped. This causes data loss without any notification to
> > > > either peer, since there is no retransmission.
> > > >
> > > > Under normal circumstances, this should never happen. However, it could
> > > > happen if the other peer doesn't respect the credit, or if the skb
> > > > overhead, which we recently began to take into account with commit
> > > > 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
> > > > is too high.
> > > >
> > > > Fix this by resetting the connection and setting the local socket error
> > > > to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
> > > > packet, so both peers are explicitly notified of the failure rather than
> > > > silently losing data.
> > > >
> > > > Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
> > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > ---
> > > > net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++-----
> > > > 1 file changed, 14 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > > index 989cc252d3d3..4a4ac69d1ad1 100644
> > > > --- a/net/vmw_vsock/virtio_transport_common.c
> > > > +++ b/net/vmw_vsock/virtio_transport_common.c
> > > > @@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk,
> > > >     return err;
> > > > }
> > > >
> > > > -static void
> > > > +static bool
> > > > virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > > >                           struct sk_buff *skb)
> > > > {
> > > > @@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > > >     spin_lock_bh(&vvs->rx_lock);
> > > >
> > > >     can_enqueue = virtio_transport_inc_rx_pkt(vvs, len);
> > > > -   if (!can_enqueue) {
> > > > -           free_pkt = true;
> > > > +   if (!can_enqueue)
> > > >             goto out;
> > > > -   }
> > > >
> > > >     if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM)
> > > >             vvs->msg_count++;
> > > > @@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> > > >     spin_unlock_bh(&vvs->rx_lock);
> > > >     if (free_pkt)
> > > >             kfree_skb(skb);
> > > > +
> > > > +   return can_enqueue;
> > > > }
> > > >
> > > > static int
> > > > @@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk,
> > > >
> > > >     switch (le16_to_cpu(hdr->op)) {
> > > >     case VIRTIO_VSOCK_OP_RW:
> > > > -           virtio_transport_recv_enqueue(vsk, skb);
> > > > +           if (!virtio_transport_recv_enqueue(vsk, skb)) {
> > > > +                   /* There is no more space to queue the packet, so let's
> > > > +                    * close the connection; otherwise, we'll lose data.
> > > > +                    */
> > > > +                   (void)virtio_transport_reset(vsk, skb);
> > > > +                   sk->sk_state = TCP_CLOSE;
> > > > +                   sk->sk_err = ENOBUFS;
> > > > +                   sk_error_report(sk);
> > >
> > > sashiko reported some issues related to setting TCP_CLOSE state and not
> > > removing the socket from the connect table:
> > > https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com
> > >
> > > I'll change this by calling virtio_transport_do_close() and
> > > vsock_remove_sock() in the next version.
> > >
> > > Stefano
> > >
> > > > +                   break;
> > > > +           }
> > > >             vsock_data_ready(sk);
> > > >             return err;
> > > >     case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
> > > > --
> > > > 2.54.0
> > > >
> >
> >
> > And so the bag of hacks grows. I feel this is energy not well spent.
> > Please, let us fix this properly *first*. And then worry about how to
> > backport.  Maybe it will not be so terrible to backport after all.
> >
> 
> TBH I don't think this is a hack, but an issue we should fix in any case.
> Regarding the second patch, I see your point, but it's a big change
> that worries me. I'd like some more time to fix it properly without
> rushing, rather than staying calm without realizing that userspace is
> broken, as it is now without this series :-(
>
> That said, evaluating further, I think we have a similar issue with
> STREAM on the host side as well, where the skb usually doesn't free
> space, so we need a merge strategy there too.
>
> So, I'd like to have time to fix both definitively. If you have time and
> want to go ahead, please do.
> 
> Thanks,
> Stefano

Well, my patch was a start; we just need a strategy for how to avoid
copying everything, right?

-- 
MST


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-05-14 17:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-13 10:54 [PATCH net v3 0/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
2026-05-13 10:54 ` [PATCH net v3 1/2] vsock/virtio: reset connection on receiving queue overflow Stefano Garzarella
2026-05-14 14:57   ` Stefano Garzarella
2026-05-14 15:16     ` Michael S. Tsirkin
2026-05-14 16:45       ` Stefano Garzarella
2026-05-14 17:44         ` Michael S. Tsirkin
2026-05-13 10:54 ` [PATCH net v3 2/2] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
2026-05-14 14:44   ` Stefano Garzarella
