From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman, patches@lists.linux.dev, Stefano Garzarella,
 Melbin K Mathew, Luigi Leonardi, "Michael S. Tsirkin", Paolo Abeni,
 Sasha Levin
Subject: [PATCH 6.18 145/227] vsock/virtio: cap TX credit to local buffer size
Date: Wed, 28 Jan 2026 16:23:10 +0100
Message-ID: <20260128145349.676299117@linuxfoundation.org>
In-Reply-To: <20260128145344.331957407@linuxfoundation.org>
References: <20260128145344.331957407@linuxfoundation.org>

6.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Melbin K Mathew

[ Upstream commit 8ee784fdf006cbe8739cfa093f54d326cbf54037 ]

The virtio transport derives its TX credit directly from peer_buf_alloc,
which is set from the remote endpoint's SO_VM_SOCKETS_BUFFER_SIZE value.
On the host side this means that the amount of data we are willing to
queue for a connection is scaled by a guest-chosen buffer size, rather
than by the host's own vsock configuration.

A malicious guest can advertise a large buffer and read slowly, causing
the host to allocate a correspondingly large amount of sk_buff memory.
The same thing would happen in the guest with a malicious host, since
both sides of the virtio transport share the same code base.
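
For illustration, the credit a peer advertises comes from ordinary
AF_VSOCK socket options. The following is a minimal, hypothetical
guest-side sketch (the destination port is made up and error handling
is omitted; this is not the PoC referenced below): the value set via
SO_VM_SOCKETS_BUFFER_SIZE, clamped to SO_VM_SOCKETS_BUFFER_MAX_SIZE,
is what the remote end stores as peer_buf_alloc.

  #include <sys/socket.h>
  #include <linux/vm_sockets.h>
  #include <unistd.h>

  int main(void)
  {
          unsigned long long size = 2ULL << 30;  /* advertise 2 GiB */
          struct sockaddr_vm addr = {
                  .svm_family = AF_VSOCK,
                  .svm_cid    = VMADDR_CID_HOST,
                  .svm_port   = 1234,    /* hypothetical host service */
          };
          int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

          /* Raise the ceiling first: SO_VM_SOCKETS_BUFFER_SIZE is
           * clamped to the socket's buffer_max_size at set time.
           */
          setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
                     &size, sizeof(size));
          setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
                     &size, sizeof(size));
          connect(fd, (struct sockaddr *)&addr, sizeof(addr));

          /* Read slowly (or never) while the peer keeps sending;
           * before this patch, the sender's TX window for this one
           * connection was the full 2 GiB.
           */
          pause();
          return 0;
  }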
Introduce a small helper, virtio_transport_tx_buf_size(), that returns
min(peer_buf_alloc, buf_alloc), and use it wherever we consume
peer_buf_alloc. This ensures the effective TX window is bounded by both
the peer's advertised buffer and our own buf_alloc (already clamped to
buffer_max_size via SO_VM_SOCKETS_BUFFER_MAX_SIZE), so a remote peer
cannot force the other side to queue more data than its own vsock
settings allow.

On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with 32
guest vsock connections, each advertising 2 GiB and reading slowly,
drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB; the system only
recovered after the QEMU process was killed. That said, if QEMU's
memory is capped with cgroups, the memory consumed stays within that
cap.

With this patch applied:

  Before:
    MemFree:    ~61.6 GiB
    Slab:       ~142 MiB
    SUnreclaim: ~117 MiB

  After 32 high-credit connections:
    MemFree:    ~61.5 GiB
    Slab:       ~178 MiB
    SUnreclaim: ~152 MiB

Only ~35 MiB of Slab/SUnreclaim growth, no host OOM, and the guest
remains responsive.

Compatibility with non-virtio transports:

- VMCI uses the AF_VSOCK buffer knobs to size its queue pairs per
  socket based on the local vsk->buffer_* values; the remote side
  cannot enlarge those queues beyond what the local endpoint
  configured.

- Hyper-V's vsock transport uses fixed-size VMBus ring buffers and an
  MTU bound; there is no peer-controlled credit field comparable to
  peer_buf_alloc, and the remote endpoint cannot drive in-flight
  kernel memory above those ring sizes.

- The loopback path reuses virtio_transport_common.c, so it naturally
  follows the same semantics as the virtio transport.

This change is limited to virtio_transport_common.c and thus affects
virtio-vsock, vhost-vsock, and loopback, bringing them in line with the
"remote window intersected with local policy" behaviour that VMCI and
Hyper-V already effectively have.

Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Suggested-by: Stefano Garzarella
Signed-off-by: Melbin K Mathew
[Stefano: small adjustments after changing the previous patch]
[Stefano: tweak the commit message]
Signed-off-by: Stefano Garzarella
Reviewed-by: Luigi Leonardi
Link: https://patch.msgid.link/20260121093628.9941-4-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin
Signed-off-by: Paolo Abeni
Signed-off-by: Sasha Levin
---
 net/vmw_vsock/virtio_transport_common.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 6175124d63d34..d3e26025ef589 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -821,6 +821,15 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
 }
 EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
 
+static u32 virtio_transport_tx_buf_size(struct virtio_vsock_sock *vvs)
+{
+	/* The peer advertises its receive buffer via peer_buf_alloc, but we
+	 * cap it to our local buf_alloc so a remote peer cannot force us to
+	 * queue more data than our own buffer configuration allows.
+	 */
+	return min(vvs->peer_buf_alloc, vvs->buf_alloc);
+}
+
 int
 virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
@@ -830,7 +839,7 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
 
 	spin_lock_bh(&vvs->tx_lock);
 
-	if (len > vvs->peer_buf_alloc) {
+	if (len > virtio_transport_tx_buf_size(vvs)) {
 		spin_unlock_bh(&vvs->tx_lock);
 		return -EMSGSIZE;
 	}
@@ -884,7 +893,8 @@ static s64 virtio_transport_has_space(struct virtio_vsock_sock *vvs)
 	 * we have bytes in flight (tx_cnt - peer_fwd_cnt), the subtraction
 	 * does not underflow.
 	 */
-	bytes = (s64)vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
+	bytes = (s64)virtio_transport_tx_buf_size(vvs) -
+		(vvs->tx_cnt - vvs->peer_fwd_cnt);
 
 	if (bytes < 0)
 		bytes = 0;
-- 
2.51.0