From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Nir Soffer <nirsof@gmail.com>
Cc: qemu-devel@nongnu.org,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Richard Jones" <rjones@redhat.com>,
	"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v2] io: Increase unix socket buffers size on macOS
Date: Tue, 22 Apr 2025 11:42:29 +0100
Message-ID: <aAdylVf7RZVaTee3@redhat.com>
In-Reply-To: <20250419231218.67636-1-nirsof@gmail.com>

On Sun, Apr 20, 2025 at 02:12:18AM +0300, Nir Soffer wrote:
> On macOS we need to increase unix socket buffers size on the client and
> server to get good performance. We set the socket buffers on macOS after
> connecting or accepting a client connection.
> 
> Testing with qemu-nbd shows that reading an image with qemu-img convert
> from qemu-nbd is *11.4 times faster* and qemu-img cpu usage is *8.3 times
> lower*.
> 
> | qemu-img | qemu-nbd | time   | user   | system |
> |----------|----------|--------|--------|--------|
> | before   | before   | 12.957 |  2.643 |  5.777 |
> | after    | before   | 12.803 |  2.632 |  5.742 |
> | before   | after    |  1.139 |  0.074 |  0.905 |
> | after    | after    |  1.179 |  0.077 |  0.931 |
> 
> For testing buffers size I built qemu-nbd and qemu-img with send buffer
> size from 64k to 2m. In this test 256k send buffer and 1m receive buffer
> are optimal.
> 
> | send buffer | recv buffer | time   | user   | system |
> |-------------|-------------|--------|--------|--------|
> |         64k |        256k |  2.233 |  0.290 |  1.408 |
> |        128k |        512k |  1.189 |  0.103 |  0.841 |
> |        256k |       1024k |  1.121 |  0.085 |  0.813 |
> |        512k |       2048k |  1.172 |  0.081 |  0.953 |
> |       1024k |       4096k |  1.160 |  0.072 |  0.907 |
> |       2048k |       8192k |  1.309 |  0.056 |  0.960 |
> 
> Using null-co driver is useful to focus on the read part, but in the
> real world we do something with the read data. I tested real world usage
> with nbdcopy and blksum.
> 
> I tested computing a hash of the image using nbdcopy, using 4 NBD
> connections and 256k request size. In this test 1m send buffer size and
> 4m receive buffer size are optimal.
> 
> | send buffer | recv buffer | time   | user   | system |
> |-------------|-------------|--------|--------|--------|
> |         64k |        256k |  2.832 |  4.866 |  2.550 |
> |        128k |        512k |  2.429 |  4.762 |  2.037 |
> |        256k |       1024k |  2.158 |  4.724 |  1.813 |
> |        512k |       2048k |  1.777 |  4.632 |  1.790 |
> |       1024k |       4096k |  1.657 |  4.466 |  1.812 |
> |       2048k |       8192k |  1.782 |  4.570 |  1.912 |
> 
> I tested creating a hash of the image with blksum, using one NBD
> connection and 256k read size. In this test 2m send buffer and 8m
> receive buffer are optimal.
> 
> | send buffer | recv buffer | time   | user   | system |
> |-------------|-------------|--------|--------|--------|
> |         64k |        256k |  4.233 |  5.242 |  2.632 |
> |        128k |        512k |  3.329 |  4.915 |  2.015 |
> |        256k |       1024k |  2.071 |  4.647 |  1.474 |
> |        512k |       2048k |  1.980 |  4.554 |  1.432 |
> |       1024k |       4096k |  2.058 |  4.553 |  1.497 |
> |       2048k |       8192k |  1.972 |  4.539 |  1.497 |
> 
> In the real world tests larger buffers are optimal, so I picked send
> buffer of 1m and receive buffer of 4m.

IIUC all your test scenarios use a recv buffer 4x the size of the send buffer.

Do you have any link / reference for the idea that we should be using
this 4x size multiplier? This feels rather peculiar as a rule.

Can you show a test result matrix for incrementing the send and recv
buffer sizes independently?
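
To make the request concrete, here is a minimal sketch of the kind of grid
I mean. It is an assumption-laden illustration, not QEMU code: the helper
names try_buffer_sizes() and print_buffer_grid() are hypothetical, and it
uses a plain socketpair() rather than qemu-nbd. It only varies the two sizes
independently and reports what the kernel actually applied (which may differ
from the requested value); timing each combination with the real workload
would still have to be added on top.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): request send and receive
 * buffer sizes independently on one socket, then read back the values the
 * kernel applied. Returns 0 on success, -1 if any call failed.
 */
static int try_buffer_sizes(int fd, int sndbuf, int rcvbuf,
                            int *eff_snd, int *eff_rcv)
{
    socklen_t len;

    assert(sndbuf > 0 && rcvbuf > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0) {
        return -1;
    }

    len = sizeof(*eff_snd);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, eff_snd, &len) < 0) {
        return -1;
    }
    len = sizeof(*eff_rcv);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, eff_rcv, &len) < 0) {
        return -1;
    }
    return 0;
}

/*
 * Hypothetical driver: walk the full send x recv grid on fresh AF_UNIX
 * socket pairs and print the effective sizes. Returns the number of
 * combinations the kernel accepted, or -1 if socketpair() failed.
 */
static int print_buffer_grid(const int *sizes, int n)
{
    int accepted = 0;

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int sv[2], snd, rcv;

            if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
                return -1;
            }
            if (try_buffer_sizes(sv[0], sizes[i], sizes[j],
                                 &snd, &rcv) == 0) {
                printf("send=%d recv=%d -> effective send=%d recv=%d\n",
                       sizes[i], sizes[j], snd, rcv);
                accepted++;
            }
            close(sv[0]);
            close(sv[1]);
        }
    }
    return accepted;
}
```

Calling print_buffer_grid() with the six sizes from your tables (64k to 2m
for send, 256k to 8m for recv, merged into one list) would enumerate every
pairing rather than only the 4x diagonal.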

> 
> This will improve other usage of unix domain sockets on macOS. I tested
> only reading from qemu-nbd.
> 
> The same change for libnbd:
> https://gitlab.com/nbdkit/libnbd/-/merge_requests/21
> 
> Signed-off-by: Nir Soffer <nirsof@gmail.com>
> ---
>  io/channel-socket.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> Changes since v1:
> - Add UNIX_SOCKET_*_BUFFER_SIZE macros (Philippe)
> - Handle both server and client sockets
> - Add qio_channel_socket_set_buffers() helper for cleaner code
> - Add tests results for qemu-img convert
> - Add tests results for different buffer sizes
> - Link to same change in libnbd
> 
> v1 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg03081.html
> 
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 608bcf066e..635c5c973d 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -21,6 +21,7 @@
>  #include "qapi/error.h"
>  #include "qapi/qapi-visit-sockets.h"
>  #include "qemu/module.h"
> +#include "qemu/units.h"
>  #include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "io/channel-watch.h"
> @@ -37,6 +38,33 @@
>  
>  #define SOCKET_MAX_FDS 16
>  
> +/*
> + * Apple recommends sizing the receive buffer at 4 times the size of the send
> + * buffer. Testing shows that 1 MiB send buffer and 4 MiB receive buffer gives
> + * best throughput and lowest cpu usage.
> + */
> +#ifdef __APPLE__
> +#define UNIX_SOCKET_SEND_BUFFER_SIZE (1 * MiB)
> +#define UNIX_SOCKET_RECV_BUFFER_SIZE (4 * UNIX_SOCKET_SEND_BUFFER_SIZE)
> +#endif /* __APPLE__ */
> +
> +static void qio_channel_socket_set_buffers(QIOChannelSocket *ioc)
> +{
> +#ifdef __APPLE__
> +    if (ioc->localAddr.ss_family == AF_UNIX) {
> +        int value;
> +
> +        /* This is a performance optimization; don't fail on errors. */
> +
> +        value = UNIX_SOCKET_SEND_BUFFER_SIZE;
> +        setsockopt(ioc->fd, SOL_SOCKET, SO_SNDBUF, &value, sizeof(value));
> +
> +        value = UNIX_SOCKET_RECV_BUFFER_SIZE;
> +        setsockopt(ioc->fd, SOL_SOCKET, SO_RCVBUF, &value, sizeof(value));
> +    }
> +#endif /* __APPLE__ */
> +}
> +
>  SocketAddress *
>  qio_channel_socket_get_local_address(QIOChannelSocket *ioc,
>                                       Error **errp)
> @@ -174,6 +202,8 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
>      }
>  #endif
>  
> +    qio_channel_socket_set_buffers(ioc);
> +
>      qio_channel_set_feature(QIO_CHANNEL(ioc),
>                              QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
>  
> @@ -410,6 +440,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
>      }
>  #endif /* WIN32 */
>  
> +    qio_channel_socket_set_buffers(cioc);
> +
>      qio_channel_set_feature(QIO_CHANNEL(cioc),
>                              QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
>  
> -- 
> 2.39.5 (Apple Git-154)
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


