From: Jon Kohler <jon@nutanix.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
kvm@vger.kernel.org, virtualization@lists.linux.dev,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Jon Kohler <jon@nutanix.com>
Subject: [PATCH net-next] vhost/net: check peek_head_len after signal to guest to avoid delays
Date: Tue, 25 Nov 2025 11:00:33 -0700
Message-ID: <20251125180034.1167847-1-jon@nutanix.com>

In non-busypoll handle_rx paths, if peek_head_len returns 0, the RX
loop breaks, the RX wait queue is re-enabled, and vhost_net_signal_used
is called to flush done_idx and notify the guest if needed.

However, signaling the guest can take non-trivial time. During this
window, additional RX payloads may arrive on rx_ring without further
kicks. These new payloads will sit unprocessed until another kick
arrives, increasing latency. In high-rate UDP RX workloads, this was
observed to occur over 20k times per second.

To minimize this window and improve opportunities to process packets
promptly, immediately call peek_head_len after signaling. If new packets
are found, treat it as a busy poll interrupt and requeue handle_rx,
improving fairness to TX handlers and other pending CPU work. This also
helps suppress unnecessary thread wakeups, reducing waker CPU demand.

Signed-off-by: Jon Kohler <jon@nutanix.com>
---
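The flush-then-recheck idea from the commit message can be sketched in
user space as below. This is only a toy model under stated assumptions,
not the kernel code: toy_ring, toy_peek, toy_signal_used and toy_rx_peek
are hypothetical names that do not exist in drivers/vhost/net.c, and the
"packets arrive while signaling" behavior is simulated with a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in drivers/vhost/net.c. */
struct toy_ring {
	unsigned int pending;          /* packets waiting to be received */
	unsigned int done_idx;         /* completed buffers not yet signaled */
	unsigned int arrive_on_signal; /* packets that land during the signal */
	unsigned int signals;          /* guest notifications sent so far */
};

/* Stand-in for peek_head_len(): non-zero means there is RX work. */
static unsigned int toy_peek(const struct toy_ring *r)
{
	return r->pending;
}

/*
 * Stand-in for vhost_net_signal_used(): flushes done_idx and notifies.
 * The slow notification window is modeled by letting queued arrivals
 * become visible only after the flush.
 */
static void toy_signal_used(struct toy_ring *r)
{
	r->done_idx = 0;
	r->signals++;
	r->pending += r->arrive_on_signal;
	r->arrive_on_signal = 0;
}

/*
 * Mirrors the shape of the added hunk: when idle with unflushed heads,
 * signal first, then peek once more. If work showed up meanwhile, ask
 * the caller to requeue (the patch sets *busyloop_intr for this).
 * The return value is still the length seen by the first peek.
 */
static unsigned int toy_rx_peek(struct toy_ring *r, bool *requeue)
{
	unsigned int len;

	assert(r != NULL && requeue != NULL);

	len = toy_peek(r);
	if (!len && r->done_idx) {
		toy_signal_used(r);
		if (toy_peek(r))
			*requeue = true;
	}
	return len;
}
```

With done_idx = 3 and two packets arriving during the signal,
toy_rx_peek() returns 0 but sets *requeue, so a caller modeled on
handle_rx would reschedule itself instead of waiting for the next kick.
With nothing to flush, no signal is sent and no requeue is requested.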
drivers/vhost/net.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 35ded4330431..04cb5f1dc6e4 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1015,6 +1015,27 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
 	struct vhost_virtqueue *tvq = &tnvq->vq;
 	int len = peek_head_len(rnvq, sk);
 
+	if (!len && rnvq->done_idx) {
+		/* When idle, flush signal first, which can take some
+		 * time for ring management and guest notification.
+		 * Afterwards, check one last time for work, as the ring
+		 * may have received new work during the notification
+		 * window.
+		 */
+		vhost_net_signal_used(rnvq, *count);
+		*count = 0;
+		if (peek_head_len(rnvq, sk)) {
+			/* More work came in during the notification
+			 * window. To be fair to the TX handler and other
+			 * potentially pending work items, pretend like
+			 * this was a busy poll interruption so that
+			 * the RX handler will be rescheduled and try
+			 * again.
+			 */
+			*busyloop_intr = true;
+		}
+	}
+
 	if (!len && rvq->busyloop_timeout) {
 		/* Flush batched heads first */
 		vhost_net_signal_used(rnvq, *count);
--
2.43.0