netdev.vger.kernel.org archive mirror
From: Paolo Abeni <pabeni@redhat.com>
To: Melbin K Mathew <mlbnkm1@gmail.com>,
	stefanha@redhat.com, sgarzare@redhat.com
Cc: kvm@vger.kernel.org, netdev@vger.kernel.org,
	virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	mst@redhat.com, jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	eperezma@redhat.com, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, horms@kernel.org
Subject: Re: [PATCH net v4 2/4] vsock/virtio: cap TX credit to local buffer size
Date: Sat, 27 Dec 2025 17:00:41 +0100	[thread overview]
Message-ID: <d4a97f0e-a858-4958-a10f-bf91062e5df9@redhat.com> (raw)
In-Reply-To: <20251217181206.3681159-3-mlbnkm1@gmail.com>

On 12/17/25 7:12 PM, Melbin K Mathew wrote:
> The virtio vsock transport derives its TX credit directly from
> peer_buf_alloc, which is set from the remote endpoint's
> SO_VM_SOCKETS_BUFFER_SIZE value.
> 
> On the host side this means that the amount of data we are willing to
> queue for a connection is scaled by a guest-chosen buffer size, rather
> than the host's own vsock configuration. A malicious guest can advertise
> a large buffer and read slowly, causing the host to allocate a
> correspondingly large amount of sk_buff memory.
> 
> Introduce a small helper, virtio_transport_tx_buf_alloc(), that
> returns min(peer_buf_alloc, buf_alloc), and use it wherever we consume
> peer_buf_alloc:
> 
>   - virtio_transport_get_credit()
>   - virtio_transport_has_space()
>   - virtio_transport_seqpacket_enqueue()
> 
> This ensures the effective TX window is bounded by both the peer's
> advertised buffer and our own buf_alloc (already clamped to
> buffer_max_size via SO_VM_SOCKETS_BUFFER_MAX_SIZE), so a remote guest
> cannot force the host to queue more data than allowed by the host's own
> vsock settings.
> 
> On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with
> 32 guest vsock connections advertising 2 GiB each and reading slowly
> drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB; the system only
> recovered after killing the QEMU process.
> 
> With this patch applied:
> 
>   Before:
>     MemFree:        ~61.6 GiB
>     Slab:           ~142 MiB
>     SUnreclaim:     ~117 MiB
> 
>   After 32 high-credit connections:
>     MemFree:        ~61.5 GiB
>     Slab:           ~178 MiB
>     SUnreclaim:     ~152 MiB
> 
> Only ~35 MiB increase in Slab/SUnreclaim, no host OOM, and the guest
> remains responsive.
> 
> Compatibility with non-virtio transports:
> 
>   - VMCI uses the AF_VSOCK buffer knobs to size its queue pairs per
>     socket based on the local vsk->buffer_* values; the remote side
>     cannot enlarge those queues beyond what the local endpoint
>     configured.
> 
>   - Hyper-V's vsock transport uses fixed-size VMBus ring buffers and
>     an MTU bound; there is no peer-controlled credit field comparable
>     to peer_buf_alloc, and the remote endpoint cannot drive in-flight
>     kernel memory above those ring sizes.
> 
>   - The loopback path reuses virtio_transport_common.c, so it
>     naturally follows the same semantics as the virtio transport.
> 
> This change is limited to virtio_transport_common.c and thus affects
> virtio and loopback, bringing them in line with the "remote window
> intersected with local policy" behaviour that VMCI and Hyper-V already
> effectively have.
> 
> Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
> Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>

This does not apply cleanly to net. On top of addressing Stefano's requests, please rebase.

Thanks,

Paolo



Thread overview: 11+ messages
2025-12-17 18:12 [PATCH net v4 0/4] vsock/virtio: fix TX credit handling Melbin K Mathew
2025-12-17 18:12 ` [PATCH net v4 1/4] vsock/virtio: fix potential underflow in virtio_transport_get_credit() Melbin K Mathew
2025-12-18  9:19   ` Stefano Garzarella
2025-12-17 18:12 ` [PATCH net v4 2/4] vsock/virtio: cap TX credit to local buffer size Melbin K Mathew
2025-12-18  9:24   ` Stefano Garzarella
2025-12-27 16:00   ` Paolo Abeni [this message]
2025-12-17 18:12 ` [PATCH net v4 3/4] vsock/test: fix seqpacket message bounds test Melbin K Mathew
2025-12-18  9:14   ` Stefano Garzarella
2025-12-17 18:12 ` [PATCH net v4 4/4] vsock/test: add stream TX credit " Melbin K Mathew
2025-12-18  9:45   ` Stefano Garzarella
2025-12-18  9:18 ` [PATCH net v4 0/4] vsock/virtio: fix TX credit handling Stefano Garzarella
