Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Andrey Drobyshev" <andrey.drobyshev@virtuozzo.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v2 2/4] vhost/vsock: suppress EHOSTUNREACH fast-fail during CPR pause
Date: Tue, 23 Jun 2026 17:58:54 +0000	[thread overview]
Message-ID: <20260623175854.DB6601F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260622175808.508084-3-andrey.drobyshev@virtuozzo.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [Medium] The `started` flag is never cleared on VM reboot, permanently disabling the originally intended fast-fail behavior for all subsequent boots.

Pre-existing issues:
- [High] The patch exposes an unbounded memory leak and Local DoS vulnerability by queueing host-initiated packets indefinitely when the guest is paused.
--

commit c58db98ab34af8efb673c49c85406f685bf9805b
Author: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

vhost/vsock: suppress EHOSTUNREACH fast-fail during CPR pause

This patch alters the fast-fail condition in vhost_transport_send_pkt() to
check a new 'started' flag rather than checking if the device backends are
dropped. This ensures that host-initiated packets are queued rather than
fast-failed when a guest is temporarily paused during checkpoint-restore.

> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index b12221ce6faf2..bec6bcfd885fa 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
[ ... ]
> @@ -302,17 +303,12 @@ vhost_transport_send_pkt(struct sk_buff *skb, struct net *net)
>  		return -ENODEV;
>  	}
>  
> -	/* Fast-fail if the guest hasn't enabled the RX vq yet. Queuing the packet
> -	 * and making the caller wait is pointless: even if the guest manages to init
> -	 * within the timeout, it'll immediately reply with RST, because there's no
> -	 * listener on the port yet.
> -	 *
> -	 * vhost_vq_get_backend() without vq->mutex is acceptable here: locking
> -	 * the mutex would be too expensive in this hot path, and we already have
> -	 * all the outcomes covered: if the backend becomes NULL right after the check,
> -	 * vhost_transport_do_send_pkt() will check it under the mutex anyway.
> +	/* Fast-fail until the guest first enables the device (SET_RUNNING(1)).
> +	 * Before that there is no listener, so queuing is pointless. 'started'
> +	 * is never cleared, so once we're up we keep queuing across later
> +	 * stop / CPR-pause windows.
>  	 */
> -	if (unlikely(!data_race(vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX])))) {
> +	if (unlikely(!READ_ONCE(vsock->started))) {

[Severity: High]
This is a pre-existing issue, but does changing the fast-fail condition to
rely on a flag that is never cleared expose an unbounded memory leak when a
VM is paused?

If a guest is paused, the device backend is stopped, but the started flag
remains true. A local user could repeatedly open a vsock socket, call
connect() with O_NONBLOCK to the paused guest's CID, and immediately close
the socket.

The close system call invokes virtio_transport_release(), which drops the
file descriptor but does not call cancel_pkt() to purge the pending
VIRTIO_VSOCK_OP_REQUEST packet from the transport's queue:

net/vmw_vsock/virtio_transport_common.c:virtio_transport_release() {
    ...
    if (remove_sock) {
        sock_set_flag(sk, SOCK_DONE);
        virtio_transport_remove_sock(vsk);
    }
}

Since skb_set_owner_w() binds the queued SKB to the socket's sk_wmem_alloc,
the orphaned socket and the SKB are kept alive in kernel memory. Because
send_pkt_queue is unbounded and there is no orphan socket limit, could this
allow a user to indefinitely leak kernel memory and exhaust host resources?

>  		rcu_read_unlock();
>  		kfree_skb(skb);
>  		return -EHOSTUNREACH;
[ ... ]
> @@ -640,6 +636,11 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
>  		mutex_unlock(&vq->mutex);
>  	}
>  
> +	/* Set 'started' flag on the first start; never cleared, so send_pkt
> +	 * keeps queuing (instead of fast-failing) on later stop / CPR pauses.
> +	 */
> +	WRITE_ONCE(vsock->started, true);
> +

[Severity: Medium]
Does leaving this flag permanently set disable the originally intended
fast-fail behavior for all subsequent VM boots?

During a VM reboot, QEMU resets the vhost device via the VHOST_RESET_OWNER
ioctl, which clears the backends via vhost_vsock_drop_backends(), but
typically keeps the /dev/vhost-vsock file descriptor open.

Since the started flag is never reset to false during VHOST_RESET_OWNER,
host connection attempts during the rebooted VM's early boot phase will
bypass the fast-fail check and be queued indefinitely until the VM fully
boots.

Could this contradict the commit message's goal to keep the behavior
originally intended, which is to fast-fail if there is no listener yet?

>  	/* Some packets may have been queued before the device was started,
>  	 * let's kick the send worker to send them.
>  	 */

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622175808.508084-1-andrey.drobyshev@virtuozzo.com?part=2

  reply	other threads:[~2026-06-23 17:58 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-22 17:58 [PATCH v2 0/4] vhost/vsock: add support for VHOST_RESET_OWNER and CPR migration Andrey Drobyshev
2026-06-22 17:58 ` [PATCH v2 1/4] vhost/vsock: split out vhost_vsock_drop_backends helper Andrey Drobyshev
2026-06-22 17:58 ` [PATCH v2 2/4] vhost/vsock: suppress EHOSTUNREACH fast-fail during CPR pause Andrey Drobyshev
2026-06-23 17:58   ` sashiko-bot [this message]
2026-06-22 17:58 ` [PATCH v2 3/4] vhost/vsock: re-scan TX virtqueue on device start Andrey Drobyshev
2026-06-22 17:58 ` [PATCH v2 4/4] vhost/vsock: add VHOST_RESET_OWNER ioctl Andrey Drobyshev
2026-06-23 17:58   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623175854.DB6601F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=andrey.drobyshev@virtuozzo.com \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox