Netdev List
 help / color / mirror / Atom feed
From: Stefano Garzarella <sgarzare@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: netdev@vger.kernel.org, "Eric Dumazet" <edumazet@google.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	virtualization@lists.linux.dev,
	"David S. Miller" <davem@davemloft.net>,
	"Jason Wang" <jasowang@redhat.com>,
	"Simon Horman" <horms@kernel.org>,
	linux-kernel@vger.kernel.org, "Paolo Abeni" <pabeni@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	kvm@vger.kernel.org, "Jakub Kicinski" <kuba@kernel.org>,
	"Eugenio Pérez" <eperezma@redhat.com>
Subject: Re: [PATCH net] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc
Date: Fri, 8 May 2026 12:38:32 +0200	[thread overview]
Message-ID: <af28MAV0LF4qHVmB@sgarzare-redhat> (raw)
In-Reply-To: <20260508063104-mutt-send-email-mst@kernel.org>

On Fri, May 08, 2026 at 06:33:13AM -0400, Michael S. Tsirkin wrote:
>On Fri, May 08, 2026 at 12:01:50PM +0200, Stefano Garzarella wrote:
>> On Fri, May 08, 2026 at 05:53:07AM -0400, Michael S. Tsirkin wrote:
>> > On Fri, May 08, 2026 at 11:23:30AM +0200, Stefano Garzarella wrote:
>> > > From: Stefano Garzarella <sgarzare@redhat.com>
>> > >
>> > > After commit 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb
>> > > queue"), virtio_transport_inc_rx_pkt() subtracts per-skb overhead from
>> > > buf_alloc when checking whether a new packet fits. This reduces the
>> > > effective receive buffer below what the user configured via
>> > > SO_VM_SOCKETS_BUFFER_SIZE, causing legitimate data packets to be
>> > > silently dropped and applications that rely on the full buffer size
>> > > to deadlock.
>> > >
>> > > Also, the reduced space is not communicated to the remote peer, so
>> > > its credit calculation accounts more credit than the receiver will
>> > > actually accept, causing data loss (there is no retransmission).
>> > >
>> > > This also causes failures in tools/testing/vsock/vsock_test.c.
>> > > Test 18 sometimes fails, while test 22 always fails in this way:
>> > >     18 - SOCK_STREAM MSG_ZEROCOPY...hash mismatch
>> > >
>> > >     22 - SOCK_STREAM virtio credit update + SO_RCVLOWAT...send failed:
>> > >     Resource temporarily unavailable
>> > >
>> > > Fix this by introducing virtio_transport_rx_buf_size() to calculate the
>> > > size of the RX buffer based on the overhead. Using it in the acceptance
>> > > check, the advertised buf_alloc, and the credit update decision.
>> > > Use buf_alloc * 2 as total budget (payload + overhead), similar to how
>> > > SO_RCVBUF is doubled to reserve space for sk_buff metadata.
>> > > The function returns buf_alloc as long as overhead fits within the
>> > > reservation, then gradually reduces toward 0 as overhead exceeds
>> > > buf_alloc (e.g. under small-packet flooding), informing the peer to
>> > > slow down.
>> > >
>> > > Fixes: 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue")
>> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>> >
>> >
>> > unfortunately, this is a bit of a spec violation and there is no guarantee
>> > it helps.
>>
>> Loosing data like we are doing in 059b7dbd20a6 is even worse IMHO.
>>
>> >
>> > a spec violation because the spec says:
>> > Only payload bytes are counted and header bytes are not
>> > included
>> >
>> > and the implication is that a side can not reduce its own buf_alloc.
>> >
>> > no guarantee because the other side is not required to process your
>> > packets, so it might not see your buf alloc reduction.
>> >
>> > as designed in the current spec, you can only increase your buf alloc,
>> > not decrease it.
>>
>> We never enforced this, currently an user can reduce it by
>> SO_VM_SOCKETS_BUFFER_SIZE and we haven't blocked it since virtio-vsock was
>> introduced, should we update the spec?
>
>
>it's not that we need to enforce it, it's that all synchronization
>assumes this. as in, anyone can use an old copy until they run out
>of credits.
>
>
>> >
>> > what can be done:
>> > - more efficient storage for small packets (poc i posted)
>> > - reduce buf alloc ahead of the time
>>
>> That's basically what I'm doing here: I'm using twice the size of
>> `buf_alloc` (just like `SO_RCVBUF` does for other socket types) and telling
>> the other peer just `buf_alloc`.
>>
>> But then, somehow, we have to let the other person know that we're running
>> out of space. With this patch that only happens when the other peer isn't
>> behaving properly, sending so many small packets that the overhead exceeds
>> `buf_alloc`.
>>
>> Stefano
>
>what is "not proper" here, it is up to the application what to send.

Sure, but here we're just slowing down the application by telling it we 
don't have any more space.

Again, without this patch we are just dropping data, which IMO is even 
worse.

So I think we should merge this for now, while we handle better the EOM.
If you prefer, I can drop the part where we reduce the buf_alloc 
advertised to the other peer, but at least we should drop data after 
`buf_alloc * 2` IMO.


Stefano


  reply	other threads:[~2026-05-08 10:38 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08  9:23 [PATCH net] vsock/virtio: fix skb overhead accounting to preserve full buf_alloc Stefano Garzarella
2026-05-08  9:53 ` Michael S. Tsirkin
2026-05-08 10:01   ` Stefano Garzarella
2026-05-08 10:33     ` Michael S. Tsirkin
2026-05-08 10:38       ` Stefano Garzarella [this message]
2026-05-11 10:54         ` Stefano Garzarella
2026-05-11 12:46           ` Michael S. Tsirkin
2026-05-11 13:17             ` Stefano Garzarella
2026-05-11 23:48               ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af28MAV0LF4qHVmB@sgarzare-redhat \
    --to=sgarzare@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eperezma@redhat.com \
    --cc=horms@kernel.org \
    --cc=jasowang@redhat.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox