* [PATCH v6 0/6] MSG_ZEROCOPY + multifd
@ 2021-12-09  9:39 Leonardo Bras
  2021-12-09  9:39 ` [PATCH v6 1/6] QIOChannel: Add io_writev_zero_copy & io_flush_zero_copy callbacks Leonardo Bras
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Leonardo Bras @ 2021-12-09  9:39 UTC
  To: Daniel P. Berrangé, Juan Quintela, Dr. David Alan Gilbert,
	Eric Blake, Markus Armbruster
  Cc: Leonardo Bras, qemu-devel

This patch series enables MSG_ZEROCOPY in QIOChannel and uses it to
improve multifd migration performance by reducing CPU usage.

Patch #1 creates new callbacks for QIOChannel, allowing the implementation
of zero copy writing.
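
As a rough illustration of the callback pair described for patch #1, here is
a minimal sketch. All names (DemoChannel, DemoChannelOps, demo_*) are
hypothetical and do not reproduce QEMU's QIOChannel definitions; the error
return when a callback is missing follows the "return error instead of
falling back" behaviour listed in the v3 changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

typedef struct DemoChannel DemoChannel;

/* Hypothetical callback table; names are illustrative, not QEMU's. */
typedef struct DemoChannelOps {
    /* Optional: queue iov for sending without copying it. */
    ssize_t (*writev_zero_copy)(DemoChannel *c, const struct iovec *iov,
                                size_t niov);
    /* Optional: wait until every queued zero-copy write has completed. */
    int (*flush_zero_copy)(DemoChannel *c);
} DemoChannelOps;

struct DemoChannel {
    const DemoChannelOps *ops;
};

/* Returns bytes queued, or -1 if the channel lacks zero-copy support. */
ssize_t demo_writev_zero_copy(DemoChannel *c, const struct iovec *iov,
                              size_t niov)
{
    assert(c != NULL && c->ops != NULL);
    if (!c->ops->writev_zero_copy || !c->ops->flush_zero_copy) {
        return -1; /* report the error; no silent fallback to a copy */
    }
    return c->ops->writev_zero_copy(c, iov, niov);
}

/* Returns the callback's result, or -1 if the channel lacks support. */
int demo_flush_zero_copy(DemoChannel *c)
{
    assert(c != NULL && c->ops != NULL);
    if (!c->ops->writev_zero_copy || !c->ops->flush_zero_copy) {
        return -1;
    }
    return c->ops->flush_zero_copy(c);
}

/* Stub callbacks so the sketch can be exercised without a socket. */
ssize_t demo_stub_writev(DemoChannel *c, const struct iovec *iov, size_t niov)
{
    ssize_t total = 0;
    (void)c;
    for (size_t i = 0; i < niov; i++) {
        total += (ssize_t)iov[i].iov_len;
    }
    return total;
}

int demo_stub_flush(DemoChannel *c)
{
    (void)c;
    return 0;
}
```

Requiring both callbacks together reflects that a zero-copy write is only
useful if the caller can later learn when the buffers may be reused.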

Patch #2 reworks qio_channel_socket_writev() so that it accepts flags,
which are later passed to sendmsg().

Patch #3 implements writev_zero_copy and flush_zero_copy on QIOChannelSocket,
making use of MSG_ZEROCOPY on Linux.
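
For readers unfamiliar with the kernel interface behind patches #2 and #3,
the following is a minimal sketch of the two Linux calls involved: opting a
socket in with SO_ZEROCOPY, then passing MSG_ZEROCOPY to sendmsg(). The
demo_* function names are hypothetical and are not the functions added by
this series; the fallback #defines use the generic Linux values and are only
there for older libc headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60           /* generic Linux value (asm-generic) */
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000   /* Linux value */
#endif

/* Opt the socket in to zero copy; returns 0 on success, -1 on error. */
int demo_enable_zero_copy(int fd)
{
    int one = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) < 0) {
        return -1;
    }
    return 0;
}

/*
 * Send an iovec, forwarding caller-chosen flags to sendmsg().
 * Pass 0 for a normal copying send or MSG_ZEROCOPY for a zero-copy one.
 */
ssize_t demo_writev_flags(int fd, const struct iovec *iov, size_t niov,
                          int flags)
{
    struct msghdr msg = { 0 };

    assert(iov != NULL || niov == 0);
    msg.msg_iov = (struct iovec *)iov;  /* sendmsg() does not modify it */
    msg.msg_iovlen = niov;
    return sendmsg(fd, &msg, flags);
}
```

With MSG_ZEROCOPY the kernel may still be reading the buffers after
sendmsg() returns, so they must not be modified until a completion
notification has been received on the socket's error queue.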

Patch #4 adds a "zero_copy" migration property, which is only available with
CONFIG_LINUX and is compiled out on any other platform.
This migration property has to be enabled before multifd migration starts.

Patch #5 adds a helper function that reports whether TLS is going to be used.
This helper is used later in patch #6.

Patch #6 makes use of the QIOChannelSocket zero_copy implementation in
nocomp multifd migration.

Results:
In preliminary tests, the resource usage of __sys_sendmsg() dropped by a
factor of 15, and the overall migration took 13-22% less time, based on a
synthetic CPU workload.

In further tests, it was noted that, on multifd migration with 8 channels:
- On idle hosts, migration time was reduced by 10% to 21%.
- On hosts busy with heavy CPU stress (1 stress thread per CPU, but
  not CPU-pinned), migration time was reduced by ~25% by enabling zero-copy.
- On hosts with heavy CPU-pinned workloads (1 stress thread per CPU,
  CPU-pinned), migration time was reduced by ~66% by enabling zero-copy.
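
To put the percentages above in perspective, a time reduction of r percent
corresponds to a wall-clock speedup of 1 / (1 - r/100). The helper below is
a hypothetical convenience for that arithmetic only; it is not part of the
series and says nothing about how the measurements were taken.

```c
#include <assert.h>

/* Wall-clock speedup implied by a migration-time reduction in percent. */
double demo_speedup(double reduction_percent)
{
    assert(reduction_percent >= 0.0 && reduction_percent < 100.0);
    return 1.0 / (1.0 - reduction_percent / 100.0);
}
```

By this measure, a 10% reduction is roughly a 1.11x speedup, 25% is roughly
1.33x, and 66% is roughly 2.94x.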

Setup for the above tests:
- Sending and receiving hosts:
  - CPU: Intel(R) Xeon(R) Platinum 8276L CPU @ 2.20GHz (448 CPUs)
  - Network card: E810-C (100Gbps)
  - >1TB RAM
  - QEMU: Upstream master branch + this patchset
  - Linux: Upstream v5.15
- VM configuration:
  - 28 VCPUs
  - 512GB RAM


---
Changes since v5:
- flush_zero_copy now returns -1 on failure, 0 on success, and 1 when none
  of the processed writes were able to use zerocopy in the kernel.
- qio_channel_socket_poll() removed, using qio_channel_wait() instead
- ENOBUFS is now processed inside qio_channel_socket_writev_flags()
- Most zerocopy parameter validation moved to migrate_params_check(),
  leaving only the feature test in the socket_outgoing_migration() callback
- Naming went from *zerocopy to *zero_copy or *zero-copy, due to QAPI/QMP
  preferences
- Improved docs

Changes since v4:
- 3 patches were split into 6
- Flush is used for syncing after each iteration, instead of only at the end
- If zerocopy is not available, fail in connect instead of failing on write
- 'multifd-zerocopy' property renamed to 'zerocopy'
- Fail migrations that don't support zerocopy, if it's enabled.
- Instead of checking for zerocopy at each write, save the flags in
  MultiFDSendParams->write_flags and use them on write
- Reorganized flag usage in QIOChannelSocket 
- A lot of typos fixed
- More doc on buffer restrictions
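
The "save the flags once, use them on every write" item above can be
pictured with the sketch below. DemoSendParams and the flag constant are
hypothetical stand-ins, not the actual MultiFDSendParams layout or any QEMU
constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WRITE_FLAG_ZERO_COPY 0x1  /* hypothetical, not a QEMU constant */

typedef struct DemoSendParams {
    int write_flags;  /* decided once at setup, reused for every write */
} DemoSendParams;

/* Decide the write flags a single time, when the channel is set up. */
void demo_send_setup(DemoSendParams *p, bool zero_copy_enabled)
{
    assert(p != NULL);
    p->write_flags = zero_copy_enabled ? DEMO_WRITE_FLAG_ZERO_COPY : 0;
}

/* Per-write path: only reads the saved flags, no feature checks. */
bool demo_write_uses_zero_copy(const DemoSendParams *p)
{
    assert(p != NULL);
    return (p->write_flags & DEMO_WRITE_FLAG_ZERO_COPY) != 0;
}
```

Keeping the decision out of the per-write path means the hot loop does no
repeated parameter lookups.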

Changes since v3:
- QIOChannel interface names changed from io_async_{writev,flush} to
  io_{writev,flush}_zerocopy
- Instead of falling back in case zerocopy is not implemented, return
  error and abort operation.
- Flush now waits as long as needed, or returns an error in case anything
  goes wrong, aborting the operation.
- Zerocopy is now conditional in multifd, being set by parameter
  multifd-zerocopy
- Moved zerocopy_flush from multifd_save_cleanup() to
  multifd_send_sync_main(), so migration can abort if flush goes wrong.
- Several other small improvements

Changes since v2:
- Patch #1: One more fallback
- Patch #2: Fall back to sync if it fails to lock buffer memory in a
  MSG_ZEROCOPY send.

Changes since v1:
- Reimplemented the patchset using async_write + async_flush approach.
- Implemented a flush to be able to tell when all data has been written.


Leonardo Bras (6):
  QIOChannel: Add io_writev_zero_copy & io_flush_zero_copy callbacks
  QIOChannelSocket: Add flags parameter for writing
  QIOChannelSocket: Implement io_writev_zero_copy & io_flush_zero_copy
    for CONFIG_LINUX
  migration: Add zero-copy parameter for QMP/HMP for Linux
  migration: Add migrate_use_tls() helper
  multifd: Implement zero copy write in multifd migration
    (multifd-zero-copy)

 qapi/migration.json         |  24 ++++++
 include/io/channel-socket.h |   2 +
 include/io/channel.h        |  98 +++++++++++++++++++++---
 migration/migration.h       |   6 ++
 migration/multifd.h         |   4 +-
 io/channel-socket.c         | 145 +++++++++++++++++++++++++++++++++---
 io/channel.c                |  66 +++++++++++++---
 migration/channel.c         |   6 +-
 migration/migration.c       |  49 ++++++++++++
 migration/multifd.c         |  45 ++++++++---
 migration/ram.c             |  29 ++++++--
 migration/socket.c          |   6 ++
 monitor/hmp-cmds.c          |   6 ++
 13 files changed, 434 insertions(+), 52 deletions(-)

-- 
2.33.1




end of thread, other threads:[~2021-12-10 12:17 UTC | newest]

Thread overview: 8+ messages
2021-12-09  9:39 [PATCH v6 0/6] MSG_ZEROCOPY + multifd Leonardo Bras
2021-12-09  9:39 ` [PATCH v6 1/6] QIOChannel: Add io_writev_zero_copy & io_flush_zero_copy callbacks Leonardo Bras
2021-12-10 12:15   ` Daniel P. Berrangé
2021-12-09  9:39 ` [PATCH v6 2/6] QIOChannelSocket: Add flags parameter for writing Leonardo Bras
2021-12-09  9:39 ` [PATCH v6 3/6] QIOChannelSocket: Implement io_writev_zero_copy & io_flush_zero_copy for CONFIG_LINUX Leonardo Bras
2021-12-09  9:39 ` [PATCH v6 4/6] migration: Add zero-copy parameter for QMP/HMP for Linux Leonardo Bras
2021-12-09  9:39 ` [PATCH v6 5/6] migration: Add migrate_use_tls() helper Leonardo Bras
2021-12-09  9:39 ` [PATCH v6 6/6] multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Leonardo Bras
