From: Stefano Garzarella <sgarzare@redhat.com>
To: Laurence Rowe <laurencerowe@gmail.com>
Cc: virtualization@lists.linux.dev, netdev@vger.kernel.org
Subject: Re: [PATCH] vsock: avoid timeout for non-blocking accept() with empty backlog
Date: Thu, 2 Apr 2026 14:02:31 +0200
Message-ID: <ac47PYPeL0IyRtY6@sgarzare-redhat>
In-Reply-To: <20260402044637.73531-1-laurencerowe@gmail.com>
On Wed, Apr 01, 2026 at 09:46:37PM -0700, Laurence Rowe wrote:
>A common pattern in epoll network servers is to eagerly accept all
>pending connections from the non-blocking listening socket after
>epoll_wait indicates the socket is ready by calling accept in a loop
>until EAGAIN is returned indicating that the backlog is empty.
>
>Scheduling a timeout for a non-blocking accept with an empty backlog
>meant AF_VSOCK sockets used by epoll network servers incurred hundreds
>of microseconds of additional latency per accept loop compared to
>AF_INET or AF_UNIX sockets.
Not related to this patch, but should we do something similar (in
another patch) in vsock_connect() too, or does it not matter since
connect() is usually blocking?
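For reference, the accept loop described in the commit message can be
sketched roughly as below. This is a minimal userspace sketch, not code
from the patch or from any particular server: drain_accept() is a
hypothetical name, and a real server would hand each fd to a connection
handler instead of closing it. It would be called each time epoll_wait()
reports the listening socket as readable.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Accept every pending connection on a non-blocking listening socket.
 * Returns the number of connections accepted, or -1 on an unexpected
 * error. Hypothetical helper: a real server would pass each fd to its
 * connection handler; this sketch simply closes it.
 */
static int drain_accept(int listen_fd)
{
	int accepted = 0;

	assert(listen_fd >= 0);

	for (;;) {
		int fd = accept(listen_fd, NULL, NULL);

		if (fd >= 0) {
			close(fd);
			accepted++;
			continue;
		}
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return accepted;	/* backlog is empty */
		return -1;
	}
}
```

The last accept() call in each loop always hits the empty-backlog path,
which is the case the patch is concerned with.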
>
>Signed-off-by: Laurence Rowe <laurencerowe@gmail.com>
>---
>
>This fixes the observed issue for me:
>
>1. With loopback vsock on the host running Linux v6.19.10 built with
>config-6.17.0-19-generic from Ubuntu 24.04 and make olddefconfig.
>
>2. With Firecracker guests with current torvalds/master, v6.19.10, and
>amazonlinux/microvm-kernel-6.1.166-24.303.amzn2023 used in Firecracker
>CI and examples. (Firecracker guest vsocks are unix sockets on the host
>side so this fix works there with just a fixed guest kernel.)
>
>I struggled to build a generic 6.1.166 kernel that worked as a
>Firecracker guest but the patch applies (conflict due to change of
>`flags` to `arg->flags` in surrounding context) so I believe it should
>work for generic v6.1.166 kernel.
>
>Alternatively a minimal version of this fix is to just wrap the
>`schedule_timeout` in an `if (timeout != 0)` but that leaves an
>unnecessary additional `lock_sock` call.
>
>There are ftrace's and reproduction tools at:
>https://github.com/lrowe/linux-vsock-accept-timeout-investigation
>---
> net/vmw_vsock/af_vsock.c | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 2f7d94d682..483889b6d8 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1850,11 +1850,11 @@ static int vsock_accept(struct socket *sock, struct socket *newsock,
> * created upon connection establishment.
> */
> timeout = sock_rcvtimeo(listener, arg->flags & O_NONBLOCK);
>- prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
>
> while ((connected = vsock_dequeue_accept(listener)) == NULL &&
>- listener->sk_err == 0) {
>+ listener->sk_err == 0 && timeout != 0) {
> release_sock(listener);
>+ prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
Is it okay to move prepare_to_wait() after `release_sock(listener)`?
I'm worried that we could miss a wakeup. BTW, if this change is okay, we
should at least document it in the commit description.
> timeout = schedule_timeout(timeout);
> finish_wait(sk_sleep(listener), &wait);
> lock_sock(listener);
>@@ -1862,17 +1862,15 @@ static int vsock_accept(struct socket *sock, struct socket *newsock,
> if (signal_pending(current)) {
> err = sock_intr_errno(timeout);
> goto out;
>- } else if (timeout == 0) {
>- err = -EAGAIN;
>- goto out;
> }
>-
>- prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
> }
>- finish_wait(sk_sleep(listener), &wait);
>
>- if (listener->sk_err)
>+ if (listener->sk_err) {
> err = -listener->sk_err;
>+ } else if (timeout == 0 && connected == NULL) {
From checkpatch:
CHECK: Comparison to NULL could be written "!connected"
#58: FILE: net/vmw_vsock/af_vsock.c:1870:
+ } else if (timeout == 0 && connected == NULL) {
>+ err = -EAGAIN;
>+ goto out;
>+ }
What about simplifying this to (not a strong opinion):
	} else if (connected == NULL) {
		err = -EAGAIN;
	}
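The reasoning behind dropping the `timeout == 0` test: the loop ends
normally only when a connection was dequeued, `sk_err` is set, or the
timeout reached zero, so with no error and no connection the timeout
must already be zero. A minimal userspace model of that argument
follows. It is a sketch only: all function names are hypothetical, it
models just the value of `err` (not the `goto out` or the rest of
vsock_accept()), and it assumes `err` is 0 when nothing went wrong.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Error selection as written in the patch. */
static int err_patch(bool connected, int sk_err, long timeout)
{
	if (sk_err)
		return -sk_err;
	else if (timeout == 0 && !connected)
		return -EAGAIN;
	return 0;
}

/* Error selection with the suggested simplification. */
static int err_simplified(bool connected, int sk_err, long timeout)
{
	(void)timeout;	/* no longer consulted */
	if (sk_err)
		return -sk_err;
	else if (!connected)
		return -EAGAIN;
	return 0;
}

/* The patched while loop falls through only when this holds. */
static bool loop_exited(bool connected, int sk_err, long timeout)
{
	return connected || sk_err != 0 || timeout == 0;
}

/* Returns true if both variants agree on every reachable exit state. */
static bool variants_agree(void)
{
	for (int c = 0; c <= 1; c++) {
		for (int e = 0; e <= 1; e++) {
			for (long t = 0; t <= 1; t++) {
				int sk_err = e ? ECONNRESET : 0;

				if (!loop_exited(c, sk_err, t))
					continue;	/* unreachable after the loop */
				if (err_patch(c, sk_err, t) !=
				    err_simplified(c, sk_err, t))
					return false;
			}
		}
	}
	return true;
}
```

The only state where the two variants would differ (no connection, no
error, timeout remaining) is the one in which the loop keeps running.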
Also, the checks at
https://patchwork.kernel.org/project/netdevbpf/patch/20260402044637.73531-1-laurencerowe@gmail.com/
suggest specifying a target tree (net-next, I think, for this change)
and CCing the other maintainers (scripts/get_maintainer.pl can help).
Thanks,
Stefano