From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Nir Soffer <nirsof@gmail.com>
Cc: qemu-devel@nongnu.org, "Richard Jones" <rjones@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v3 1/2] io: Increase unix socket buffers size on macOS
Date: Wed, 7 May 2025 19:23:08 +0100
Message-ID: <aBulDPx3h02280i2@redhat.com>
In-Reply-To: <FAF66BF6-176E-43A8-B097-85960D81ADCE@gmail.com>
On Wed, May 07, 2025 at 08:17:19PM +0300, Nir Soffer wrote:
>
>
> > On 7 May 2025, at 19:37, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Sun, Apr 27, 2025 at 07:50:28PM +0300, Nir Soffer wrote:
> >> On macOS we need to increase the unix stream socket buffer size on
> >> both the client and the server to get good performance. We set the
> >> socket buffers on macOS after connecting or accepting a client
> >> connection. Unix datagram sockets need a different configuration,
> >> which can be added later.
> >>
> >> Testing shows that setting the socket receive buffer size (SO_RCVBUF)
> >> has no effect on performance, so we set only the send buffer size
> >> (SO_SNDBUF). macOS appears to behave like Linux here, but this
> >> behavior is not documented.
> >>
> >> Testing shows that the optimal buffer size is between 512 KiB and
> >> 4 MiB, depending on the test case. The differences within that range
> >> are very small, so I chose 2 MiB.
> >>
> >> I tested reading from and writing to qemu-nbd with qemu-img convert,
> >> and computing a blkhash with nbdcopy and blksum.
> >>
> >> To focus on the NBD communication and get less noisy results, I
> >> tested reading from and writing to the null-co driver. I added a
> >> read-pattern option to the null-co driver to return data filled with
> >> 0xff:
> >>
> >> NULL="json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '10g', 'read-pattern': -1}}"
> >>
> >> For testing different buffer sizes I added an environment variable
> >> that overrides the socket buffer size (a sketch is shown below).
> >>
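> >> A minimal sketch of such an override (the variable name and the
> >> parsing are illustrative, not the actual test code):
> >>
> >>     const char *env = getenv("QEMU_SOCKET_BUFFER_SIZE");
> >>     if (env != NULL) {
> >>         buffer_size = atoi(env);  /* override the compiled-in default */
> >>     }
> >>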
> >> Reading from qemu-nbd via qemu-img convert. In this test a buffer
> >> size of 2 MiB is optimal (12.6 times faster than the default).
> >>
> >> qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
> >> qemu-img convert -f raw -O raw -W -n "nbd+unix:///?socket=/tmp/nbd.sock" "$NULL"
> >>
> >> | buffer size | time (s) | user (s) | system (s) |
> >> |-------------|----------|----------|------------|
> >> | default     |   13.361 |    2.653 |      5.702 |
> >> | 65536       |    2.283 |    0.204 |      1.318 |
> >> | 131072      |    1.673 |    0.062 |      1.008 |
> >> | 262144      |    1.592 |    0.053 |      0.952 |
> >> | 524288      |    1.496 |    0.049 |      0.887 |
> >> | 1048576     |    1.234 |    0.047 |      0.738 |
> >> | 2097152     |    1.060 |    0.080 |      0.602 |
> >> | 4194304     |    1.061 |    0.076 |      0.604 |
> >>
> >> Writing to qemu-nbd via qemu-img convert. In this test a buffer size
> >> of 2 MiB is optimal (9.2 times faster than the default).
> >>
> >> qemu-nbd -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
> >> qemu-img convert -f raw -O raw -W -n "$NULL" "nbd+unix:///?socket=/tmp/nbd.sock"
> >>
> >> | buffer size | time (s) | user (s) | system (s) |
> >> |-------------|----------|----------|------------|
> >> | default     |    8.063 |    2.522 |      4.184 |
> >> | 65536       |    1.472 |    0.430 |      0.867 |
> >> | 131072      |    1.071 |    0.297 |      0.654 |
> >> | 262144      |    1.012 |    0.239 |      0.587 |
> >> | 524288      |    0.970 |    0.201 |      0.514 |
> >> | 1048576     |    0.895 |    0.184 |      0.454 |
> >> | 2097152     |    0.877 |    0.174 |      0.440 |
> >> | 4194304     |    0.944 |    0.231 |      0.535 |
> >>
> >> Computing a blkhash with nbdcopy, using 4 NBD connections and a
> >> 256 KiB request size. In this test a buffer size of 4 MiB is optimal
> >> (5.1 times faster than the default).
> >>
> >> qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
> >> nbdcopy --blkhash "nbd+unix:///?socket=/tmp/nbd.sock" null:
> >>
> >> | buffer size | time (s) | user (s) | system (s) |
> >> |-------------|----------|----------|------------|
> >> | default     |    8.624 |    5.727 |      6.507 |
> >> | 65536       |    2.563 |    4.760 |      2.498 |
> >> | 131072      |    1.903 |    4.559 |      2.093 |
> >> | 262144      |    1.759 |    4.513 |      1.935 |
> >> | 524288      |    1.729 |    4.489 |      1.924 |
> >> | 1048576     |    1.696 |    4.479 |      1.884 |
> >> | 2097152     |    1.710 |    4.480 |      1.763 |
> >> | 4194304     |    1.687 |    4.479 |      1.712 |
> >>
> >> Computing a blkhash with blksum, using 1 NBD connection and a 256 KiB
> >> read size. In this test a buffer size of 512 KiB is optimal (10.3
> >> times faster than the default).
> >>
> >> qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
> >> blksum "nbd+unix:///?socket=/tmp/nbd.sock"
> >>
> >> | buffer size | time (s) | user (s) | system (s) |
> >> |-------------|----------|----------|------------|
> >> | default     |   13.085 |    5.664 |      6.461 |
> >> | 65536       |    3.299 |    5.106 |      2.515 |
> >> | 131072      |    2.396 |    4.989 |      2.069 |
> >> | 262144      |    1.607 |    4.724 |      1.555 |
> >> | 524288      |    1.271 |    4.528 |      1.224 |
> >> | 1048576     |    1.294 |    4.565 |      1.333 |
> >> | 2097152     |    1.299 |    4.569 |      1.344 |
> >> | 4194304     |    1.291 |    4.559 |      1.327 |
> >>
> >> Signed-off-by: Nir Soffer <nirsof@gmail.com>
> >> ---
> >> io/channel-socket.c | 32 ++++++++++++++++++++++++++++++++
> >> 1 file changed, 32 insertions(+)
> >>
> >> diff --git a/io/channel-socket.c b/io/channel-socket.c
> >> index 608bcf066e..06901ab694 100644
> >> --- a/io/channel-socket.c
> >> +++ b/io/channel-socket.c
> >> @@ -21,6 +21,7 @@
> >>  #include "qapi/error.h"
> >>  #include "qapi/qapi-visit-sockets.h"
> >>  #include "qemu/module.h"
> >> +#include "qemu/units.h"
> >>  #include "io/channel-socket.h"
> >>  #include "io/channel-util.h"
> >>  #include "io/channel-watch.h"
> >> @@ -37,6 +38,33 @@
> >> 
> >>  #define SOCKET_MAX_FDS 16
> >>
> >> +/*
> >> + * Testing shows that a 2 MiB send buffer gives the best throughput and lowest
> >> + * CPU usage. Changing the receive buffer size has no effect on performance.
> >> + */
> >> +#ifdef __APPLE__
> >> +#define UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE (2 * MiB)
> >> +#endif /* __APPLE__ */
> >> +
> >> +static void qio_channel_socket_set_buffers(QIOChannelSocket *ioc)
> >> +{
> >> +    if (ioc->localAddr.ss_family == AF_UNIX) {
> >> +        int type;
> >> +        socklen_t type_len = sizeof(type);
> >> +
> >> +        if (getsockopt(ioc->fd, SOL_SOCKET, SO_TYPE, &type, &type_len) == -1) {
> >> +            return;
> >> +        }
> >> +
> >> +#ifdef UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE
> >> +        if (type == SOCK_STREAM) {
> >> +            const int value = UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE;
> >> +            setsockopt(ioc->fd, SOL_SOCKET, SO_SNDBUF, &value, sizeof(value));
> >> +        }
> >> +#endif /* UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE */
> >> +    }
> >> +}
> >
> > While I'm not doubting your benchmark results, I'm a little uneasy about
> > setting this unconditionally for *all* UNIX sockets QEMU creates. The
> > benchmarks show NBD benefits from this, but I'm not convinced that all
> > the other scenarios QEMU creates UNIX sockets for justify it.
> >
> > On Linux, whatever value you set with SO_SNDBUF appears to get doubled
> > internally by the kernel.
> >
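> > For instance, on a connected UNIX socket fd (an illustrative check,
> > not from the patch):
> >
> >     int set = 2 * 1024 * 1024;                /* request 2 MiB */
> >     setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &set, sizeof(set));
> >
> >     int got = 0;
> >     socklen_t len = sizeof(got);
> >     getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &got, &len);
> >     /* On Linux "got" is now 4 MiB: the kernel doubles the requested
> >      * value to allow space for bookkeeping overhead (socket(7)). */
> >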
> > IOW, this is adding 4 MB fixed overhead for every UNIX socket that
> > QEMU creates. It doesn't take many UNIX sockets in QEMU for that to
> > become a significant amount of extra memory overhead on a host.
> >
> > I'm thinking we might be better off with a helper
> >
> > qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc, size_t size)
> >
> > that we call from the NBD code, to limit the impact. Also I think this
> > helper ought not to filter on AF_UNIX - the caller can see the socket
> > type via qio_channel_socket_get_local_address if it does not already
> > have a record of the address, and selectively set the buffer size.
>
> So you suggest moving UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE to the NBD
> code as well?
>
> If we use this only for NBD this is fine, but once we add another
> caller we will have to duplicate the code selecting the right size for
> the OS. But I guess we can reconsider this when we have that problem.
Yeah, let's worry about that aspect another day and focus on NBD.
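
For reference, a minimal sketch of the helper I have in mind (the
naming and error handling are illustrative, not final):

    static void
    qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc, size_t size,
                                       Error **errp)
    {
        int value = size;

        if (setsockopt(ioc->fd, SOL_SOCKET, SO_SNDBUF,
                       &value, sizeof(value)) < 0) {
            error_setg_errno(errp, errno,
                             "Unable to set socket send buffer size");
        }
    }

The NBD code would then call it after connect/accept with the
macOS-specific size, e.g. (call site illustrative):

    #ifdef __APPLE__
        qio_channel_socket_set_send_buffer(sioc, 2 * MiB, NULL);
    #endif

so the extra memory overhead is only paid for NBD connections.
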
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|