From: "Michael S. Tsirkin" <mst@redhat.com>
To: Melbin K Mathew <mlbnkm1@gmail.com>
Cc: netdev@vger.kernel.org, virtualization@lists.linux.dev,
linux-kernel@vger.kernel.org, sgarzare@redhat.com,
stefanha@redhat.com, jasowang@redhat.com
Subject: Re: [PATCH net] vsock/virtio: cap TX credit to local buffer size
Date: Wed, 10 Dec 2025 08:47:25 -0500 [thread overview]
Message-ID: <20251210084318-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20251210133259.16238-1-mlbnkm1@gmail.com>
thanks for the patch! yet something to improve:
On Wed, Dec 10, 2025 at 02:32:59PM +0100, Melbin K Mathew wrote:
> The virtio vsock transport currently derives its TX credit directly
> from peer_buf_alloc, which is set from the remote endpoint's
> SO_VM_SOCKETS_BUFFER_SIZE value.
>
> On the host side this means that the amount of data we are willing to
> queue for a connection is scaled by a guest-chosen buffer size,
> rather than the host's own vsock configuration. A malicious guest can
> advertise a large buffer and read slowly, causing the host to allocate
> a correspondingly large amount of sk_buff memory.
>
> Introduce a small helper, virtio_transport_peer_buf_alloc(), that
> returns min(peer_buf_alloc, buf_alloc), and use it wherever we consume
> peer_buf_alloc:
>
> - virtio_transport_get_credit()
> - virtio_transport_has_space()
> - virtio_transport_seqpacket_enqueue()
>
> This ensures the effective TX window is bounded by both the peer's
> advertised buffer and our own buf_alloc (already clamped to
> buffer_max_size via SO_VM_SOCKETS_BUFFER_MAX_SIZE), so a remote guest
> cannot force the host to queue more data than allowed by the host's
> own vsock settings.
>
> On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with
> 32 guest vsock connections advertising 2 GiB each and reading slowly
> drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB and the system only
> recovered after killing the QEMU process.
>
> With this patch applied, rerunning the same PoC yields:
>
> Before:
> MemFree: ~61.6 GiB
> MemAvailable: ~62.3 GiB
> Slab: ~142 MiB
> SUnreclaim: ~117 MiB
>
> After 32 high-credit connections:
> MemFree: ~61.5 GiB
> MemAvailable: ~62.3 GiB
> Slab: ~178 MiB
> SUnreclaim: ~152 MiB
>
> i.e. only ~35 MiB increase in Slab/SUnreclaim, no host OOM, and the
> guest remains responsive.
>
what is missing here, is how do non-virtio transports behave?
because I think we want transports to be compatible.
> Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
that commit does not even include the patched file.
how can it be the right commit to fix?
> Reported-by: Melbin K Mathew <mlbnkm1@gmail.com>
> Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
this is the fix suggested by Stefano, right?
maybe mention this.
> ---
> net/vmw_vsock/virtio_transport_common.c | 27 ++++++++++++++++++++++---
> 1 file changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index dcc8a1d58..f5afedf01 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -491,6 +491,25 @@ void virtio_transport_consume_skb_sent(struct sk_buff *skb, bool consume)
> }
> EXPORT_SYMBOL_GPL(virtio_transport_consume_skb_sent);
>
> +/*
> + * Return the effective peer buffer size for TX credit computation.
> + *
> + * The peer advertises its receive buffer via peer_buf_alloc, but we
> + * cap that to our local buf_alloc (derived from
> + * SO_VM_SOCKETS_BUFFER_SIZE and already clamped to buffer_max_size)
> + * so that a remote endpoint cannot force us to queue more data than
> + * our own configuration allows.
> + */
> +static u32 virtio_transport_peer_buf_alloc(struct virtio_vsock_sock *vvs)
> +{
> + u32 peer = vvs->peer_buf_alloc;
> + u32 local = vvs->buf_alloc;
> +
> + if (peer > local)
> + return local;
> + return peer;
is this just
return min(vvs->peer_buf_alloc, vvs->buf_alloc)
?
> +}
> +
> u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 credit)
> {
> u32 ret;
> @@ -499,7 +518,8 @@ u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 credit)
> return 0;
>
> spin_lock_bh(&vvs->tx_lock);
> - ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
> + ret = virtio_transport_peer_buf_alloc(vvs) -
> + (vvs->tx_cnt - vvs->peer_fwd_cnt);
> if (ret > credit)
> ret = credit;
> vvs->tx_cnt += ret;
> @@ -831,7 +851,7 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
>
> spin_lock_bh(&vvs->tx_lock);
>
> - if (len > vvs->peer_buf_alloc) {
> + if (len > virtio_transport_peer_buf_alloc(vvs)) {
> spin_unlock_bh(&vvs->tx_lock);
> return -EMSGSIZE;
> }
> @@ -882,7 +902,8 @@ static s64 virtio_transport_has_space(struct vsock_sock *vsk)
> struct virtio_vsock_sock *vvs = vsk->trans;
> s64 bytes;
>
> - bytes = (s64)vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
> + bytes = (s64)virtio_transport_peer_buf_alloc(vvs) -
> + (vvs->tx_cnt - vvs->peer_fwd_cnt);
> if (bytes < 0)
> bytes = 0;
>
> --
> 2.34.1
next prev parent reply other threads:[~2025-12-10 13:47 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-10 13:32 [PATCH net] vsock/virtio: cap TX credit to local buffer size Melbin K Mathew
2025-12-10 13:47 ` Michael S. Tsirkin [this message]
2025-12-10 14:22 ` Stefano Garzarella
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251210084318-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=jasowang@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mlbnkm1@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=sgarzare@redhat.com \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.