Kernel KVM virtualization development
From: sashiko-bot@kernel.org
To: "Andrey Drobyshev" <andrey.drobyshev@virtuozzo.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH 4/4] vhost/vsock: re-scan TX virtqueue on device start
Date: Sat, 13 Jun 2026 16:57:44 +0000
Message-ID: <20260613165744.7359A1F000E9@smtp.kernel.org>
In-Reply-To: <20260612165718.433546-5-andrey.drobyshev@virtuozzo.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:

Pre-existing issues:
- [High] Host memory allocation failure in the TX handler raises a virtqueue error and leaks the descriptor, which could cause a guest vsock DoS.
- [High] TOCTOU race in `vhost_transport_send_pkt()` can erroneously drop packets during QEMU CPR live migration.
--

commit 917560a155f52870e04784955e886f79d44169bb
Author: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

vhost/vsock: re-scan TX virtqueue on device start

This commit addresses a deadlock during QEMU CPR live-update where a TX kick
is lost because the backend is temporarily NULL. It updates vhost_vsock_start()
to queue a TX poll and rescan buffers enqueued while paused.
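For readers unfamiliar with the failure mode, the lost-kick scenario described
above can be sketched as a small user-space model. This is only an
illustration under stated assumptions: struct model_dev, model_tx_kick(),
model_guest_send() and model_dev_start() are hypothetical names and are not
part of drivers/vhost/vsock.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	bool backend_set;	/* false while the device is paused */
	size_t pending;		/* buffers the guest has placed in the TX queue */
	size_t processed;	/* buffers the host has consumed */
};

/* Kick handler: returns early while the backend is unset, so the kick is lost. */
static void model_tx_kick(struct model_dev *d)
{
	if (!d->backend_set)
		return;
	d->processed += d->pending;
	d->pending = 0;
}

/* The guest enqueues one buffer and kicks exactly once. */
static void model_guest_send(struct model_dev *d)
{
	d->pending++;
	model_tx_kick(d);
}

/* Device start: optionally re-scan the TX queue for buffers queued while paused. */
static void model_dev_start(struct model_dev *d, bool rescan_tx)
{
	d->backend_set = true;
	if (rescan_tx)
		model_tx_kick(d);
}
```

In this model, a buffer sent while backend_set is false stays pending after
model_dev_start(d, false), because the guest has no reason to kick again. With
rescan_tx set, the start path picks it up, which is the behaviour the commit
message describes.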

> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index bcaba36becd79..1fcfe71d18bec 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -655,6 +655,12 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
>  	 */
>  	vhost_vq_work_queue(&vsock->vqs[VSOCK_VQ_RX], &vsock->send_pkt_work);
>  
> +	/*
> +	 * Some packets might've also been queued in TX VQ.  Re-scan it here,
> +	 * mirroring the RX send-worker kick above.
> +	 */
> +	vhost_poll_queue(&vsock->vqs[VSOCK_VQ_TX].poll);
> +
>  	mutex_unlock(&vsock->dev.mutex);
>  	return 0;

[Severity: High]
This is a pre-existing issue, but does the allocation-failure path in the TX
kick handler return the descriptor to the guest?

In vhost_vsock_handle_tx_kick(), if vhost_vsock_alloc_skb() fails due to
transient host memory pressure:

drivers/vhost/vsock.c:vhost_vsock_handle_tx_kick() {
        ...
		skb = vhost_vsock_alloc_skb(vq, out, in);
		if (!skb) {
			vq_err(vq, "Faulted on pkt\n");
			continue;
		}
        ...
}

It looks like the continue skips the vhost_add_used() call, so the descriptor
is never returned to the guest, and vq_err() reports a virtqueue error even
though the guest did nothing wrong.

Should a transient host memory allocation failure drop the packet and cleanly
return or discard the descriptor instead of raising a virtqueue error?
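
To make the descriptor accounting concrete, here is a minimal user-space
sketch. It is a model only: struct model_vq, model_alloc_pkt(),
model_add_used() and model_handle_tx_kick() are hypothetical stand-ins, not
vhost APIs, and whether completing the descriptor is the right fix for the
real handler is a question for the maintainers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define RING_SIZE 8

struct model_vq {
	int avail[RING_SIZE];	/* descriptor heads posted by the guest */
	size_t avail_cnt;
	size_t next_avail;	/* next head the host will consume */
	int used[RING_SIZE];	/* descriptor heads handed back to the guest */
	size_t used_cnt;
	size_t delivered;	/* packets successfully processed */
};

/* Stand-in for a packet allocation that can be forced to fail. */
static void *model_alloc_pkt(bool fail)
{
	return fail ? NULL : malloc(16);
}

static void model_add_used(struct model_vq *vq, int head)
{
	vq->used[vq->used_cnt++] = head;
}

/*
 * Drain the available ring. Bit i of fail_mask makes the allocation for
 * head i fail. With complete_on_failure == false this mirrors the quoted
 * `continue` path; with true, the failed descriptor is still handed back.
 */
static void model_handle_tx_kick(struct model_vq *vq, unsigned int fail_mask,
				 bool complete_on_failure)
{
	while (vq->next_avail < vq->avail_cnt) {
		int head = vq->avail[vq->next_avail++];
		void *pkt = model_alloc_pkt(fail_mask & (1u << head));

		if (!pkt) {
			if (complete_on_failure)
				model_add_used(vq, head);
			continue;
		}
		vq->delivered++;
		free(pkt);
		model_add_used(vq, head);
	}
}

/* Descriptors consumed by the host but never returned to the guest. */
static size_t model_leaked(const struct model_vq *vq)
{
	return vq->next_avail - vq->used_cnt;
}
```

In this model each failed allocation on the `continue` path costs the guest
one descriptor for good, so repeated failures would eventually leave it with
no TX descriptors to post.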


[Severity: High]
This isn't a bug introduced by this patch, but is there a race condition
during live migration in vhost_transport_send_pkt() that could drop packets?

Looking at vhost_transport_send_pkt():

drivers/vhost/vsock.c:vhost_transport_send_pkt() {
        ...
	if (unlikely(!data_race(vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX])))) {
		smp_rmb();	/* pairs with smp_wmb() in start/drop_backends */
		if (!READ_ONCE(vsock->cpr_paused)) {
			rcu_read_unlock();
			kfree_skb(skb);
			return -EHOSTUNREACH;
		}
	}
        ...
}

If a reader observes the backend is NULL, gets preempted, and then the writer
in vhost_vsock_start() sets the backend to non-NULL and cpr_paused to false,
could the reader wake up and incorrectly conclude the device is permanently
uninitialized?

If this sequence occurs, vhost_transport_send_pkt() returns -EHOSTUNREACH and
drops the packet even though the device is now running. Would it be safer to
re-verify the backend is still NULL after observing cpr_paused == false?
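
The interleaving can be replayed deterministically in a single-threaded
user-space model. This is a sketch under assumptions: struct model_vsock,
model_vsock_start() and model_send_pkt() are hypothetical names, the preempt
callback merely emulates the reader being descheduled between the two reads,
and the memory barriers of the real code are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model; not kernel code, no barriers. */
struct model_vsock {
	void *backend;		/* NULL while the device is stopped or paused */
	bool cpr_paused;	/* true while a CPR pause is in progress */
};

/* Writer side: publish a backend, then clear the pause flag. */
static void model_vsock_start(struct model_vsock *v)
{
	static int dummy_backend;

	v->backend = &dummy_backend;
	v->cpr_paused = false;
}

/*
 * Reader side. `preempt`, if non-NULL, runs between the backend read and
 * the cpr_paused read. Returns 0 if the packet would be queued and
 * -EHOSTUNREACH if it would be dropped.
 */
static int model_send_pkt(struct model_vsock *v,
			  void (*preempt)(struct model_vsock *),
			  bool recheck_backend)
{
	if (!v->backend) {
		if (preempt)
			preempt(v);
		if (!v->cpr_paused) {
			/* Suggested extra step: the backend may have appeared. */
			if (recheck_backend && v->backend)
				return 0;
			return -EHOSTUNREACH;
		}
	}
	return 0;
}
```

Without the re-check, a start that lands between the two reads makes the model
return -EHOSTUNREACH for a device that is in fact running. With the re-check
the packet is queued, while a device that never had a backend still fails
fast.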

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612165718.433546-1-andrey.drobyshev@virtuozzo.com?part=4


Thread overview: 8+ messages
2026-06-12 16:57 [PATCH 0/4] vhost/vsock: add support for VHOST_RESET_OWNER and CPR migration Andrey Drobyshev
2026-06-12 16:57 ` [PATCH 1/4] vhost/vsock: split out vhost_vsock_drop_backends helper Andrey Drobyshev
2026-06-12 16:57 ` [PATCH 2/4] vhost/vsock: add VHOST_RESET_OWNER ioctl Andrey Drobyshev
2026-06-13 16:57   ` sashiko-bot
2026-06-12 16:57 ` [PATCH 3/4] vhost/vsock: suppress EHOSTUNREACH fast-fail during CPR pause Andrey Drobyshev
2026-06-13 16:57   ` sashiko-bot
2026-06-12 16:57 ` [PATCH 4/4] vhost/vsock: re-scan TX virtqueue on device start Andrey Drobyshev
2026-06-13 16:57   ` sashiko-bot [this message]
