From: Nir Soffer <nirsof@gmail.com>
To: qemu-devel@nongnu.org
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Richard Jones" <rjones@redhat.com>,
"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
qemu-block@nongnu.org, "Nir Soffer" <nirsof@gmail.com>
Subject: [PATCH v4 3/3] nbd: Set unix socket send buffer on Linux
Date: Sat, 17 May 2025 23:11:54 +0300 [thread overview]
Message-ID: <20250517201154.88456-4-nirsof@gmail.com> (raw)
In-Reply-To: <20250517201154.88456-1-nirsof@gmail.com>
Like macOS we have similar issue on Linux. For TCP socket the send
buffer size is 2626560 bytes (~2.5 MiB) and we get good performance.
However for unix socket the default and maximum buffer size is 212992
bytes (208 KiB) and we see poor performance when using one NBD
connection, up to 4 times slower than macOS on the same machine.
Tracing shows that for every 2 MiB payload (qemu uses 2 MiB io size), we
do 1 recvmsg call with TCP socket, and 10 recvmsg calls with unix
socket.
Fixing this issue requires changing the maximum send buffer size (the
receive buffer size is ignored). This can be done using:
$ cat /etc/sysctl.d/net-mem-max.conf
net.core.wmem_max = 2097152
$ sudo sysctl -p /etc/sysctl.d/net-mem-max.conf
With this we can set the socket buffer size to 2 MiB. With the defaults
the value requested by qemu is clipped to the maximum size and has no
effect.
I tested on 2 machines:
- Fedora 42 VM on MacBook Pro M2 Max
- Dell PowerEdge R640 (Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz)
On the older Dell machine we see very little improvement, up to 1.03
higher throughput. On the M2 machine we see up to 2.67 times higher
throughput. The following results are from the M2 machine.
Reading from qemu-nbd with qemu-img convert. In this test buffer size of
4m is optimal (2.28 times faster).
| buffer size | time | user | system |
|-------------|---------|---------|---------|
| default | 4.292 | 0.243 | 1.604 |
| 524288 | 2.167 | 0.058 | 1.288 |
| 1048576 | 2.041 | 0.060 | 1.238 |
| 2097152 | 1.884 | 0.060 | 1.191 |
| 4194304 | 1.881 | 0.054 | 1.196 |
Writing to qemu-nbd with qemu-img convert. In this test buffer size of
1m is optimal (2.67 times faster).
| buffer size | time | user | system |
|-------------|---------|---------|---------|
| default | 3.113 | 0.334 | 1.094 |
| 524288 | 1.173 | 0.179 | 0.654 |
| 1048576 | 1.164 | 0.164 | 0.670 |
| 2097152 | 1.227 | 0.197 | 0.663 |
| 4194304 | 1.227 | 0.198 | 0.666 |
Computing a blkhash with nbdcopy. In this test buffer size of 512k is
optimal (1.19 times faster).
| buffer size | time | user | system |
|-------------|---------|---------|---------|
| default | 2.140 | 4.483 | 2.681 |
| 524288 | 1.794 | 4.467 | 2.572 |
| 1048576 | 1.807 | 4.447 | 2.644 |
| 2097152 | 1.822 | 4.461 | 2.698 |
| 4194304 | 1.827 | 4.465 | 2.700 |
Computing a blkhash with blksum. In this test buffer size of 4m is
optimal (2.65 times faster).
| buffer size | time | user | system |
|-------------|---------|---------|---------|
| default | 3.582 | 4.595 | 2.392 |
| 524288 | 1.499 | 4.384 | 1.482 |
| 1048576 | 1.377 | 4.381 | 1.345 |
| 2097152 | 1.388 | 4.389 | 1.354 |
| 4194304 | 1.352 | 4.395 | 1.302 |
Signed-off-by: Nir Soffer <nirsof@gmail.com>
---
nbd/common.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/nbd/common.c b/nbd/common.c
index 9436e9d1d1..2a133a66c3 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -271,8 +271,9 @@ const char *nbd_mode_lookup(NBDMode mode)
/*
* Testing shows that 2m send buffer is optimal. Changing the receive buffer
* size has no effect on performance.
+ * On Linux we need to increase net.core.wmem_max to make this effective.
*/
-#if defined(__APPLE__)
+#if defined(__APPLE__) || defined(__linux__)
#define UNIX_STREAM_SOCKET_SEND_BUFFER_SIZE (2 * MiB)
#endif
--
2.39.5 (Apple Git-154)
next prev parent reply other threads:[~2025-05-17 20:13 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-17 20:11 [PATCH v4 0/3] nbd: Increase unix socket buffer size Nir Soffer
2025-05-17 20:11 ` [PATCH v4 1/3] io: Add helper for setting socket send " Nir Soffer
2025-05-19 10:46 ` Daniel P. Berrangé
2025-05-19 11:20 ` Richard W.M. Jones
2025-05-19 20:11 ` Eric Blake
2025-05-17 20:11 ` [PATCH v4 2/3] nbd: Set unix socket send buffer on macOS Nir Soffer
2025-05-19 10:47 ` Daniel P. Berrangé
2025-05-17 20:11 ` Nir Soffer [this message]
2025-05-19 10:47 ` [PATCH v4 3/3] nbd: Set unix socket send buffer on Linux Daniel P. Berrangé
2025-05-19 20:12 ` [PATCH v4 0/3] nbd: Increase unix socket buffer size Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250517201154.88456-4-nirsof@gmail.com \
--to=nirsof@gmail.com \
--cc=berrange@redhat.com \
--cc=eblake@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=rjones@redhat.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).