* [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram
@ 2024-02-28 15:21 Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main Fabiano Rosas
` (22 more replies)
0 siblings, 23 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Based-on: 74aa0fb297 (migration: options incompatible with cpr) # peterx/migration-next
Hi,
In this v5:
- Rebased on migration-next;
- Renamed the feature to mapped-ram;
- Reworked recv_sync logic to only sync at RAM_SAVE_FLAG_MEM_SIZE and
ignore/avoid all other RAM_FLAGS;
- Fixed and documented barriers at multifd_recv/multifd_recv_thread;
- Duplicated fds passed to multifd to avoid cross-channel effects;
- Dropped the direct-io and fdset patches. Will send them in a
separate series.
The rest are minor changes; I have noted them in the patches
themselves.
CI run: https://gitlab.com/farosas/qemu/-/pipelines/1194172845
Series structure
================
This series enables mapped-ram in steps:
0) Cleanups [1]
1) QIOChannel interfaces [2-6]
2) Mapped-ram format for precopy [7-11]
3) Multifd adaptation without packets [12-15]
4) Mapped-ram format for multifd [16-23]
* below will be sent separately *
5) Direct-io generic support [TODO]
6) Direct-io for mapped-ram multifd with file: URI [TODO]
7) Fdset interface for mapped-ram multifd [TODO]
About mapped-ram
================
Mapped-ram is a new stream format for the RAM section designed to
supplement the existing ``file:`` migration and make it compatible
with ``multifd``. This enables parallel migration of a guest's RAM to
a file.
The core of the feature is to map RAM pages to migration file
offsets. This enables the ``multifd`` threads to write exclusively to
those offsets even if the guest is constantly dirtying pages
(i.e. live migration).
Another benefit is that the resulting file will have a bounded size,
since pages which are dirtied multiple times will always go to a fixed
location in the file, rather than constantly being added to a
sequential stream.
Having the pages at fixed offsets also allows the use of O_DIRECT for
save/restore of the migration stream, as the pages are guaranteed to
be written at offsets that respect the O_DIRECT alignment restrictions.
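To make the offset mapping concrete, here is a minimal sketch (not
part of the series; the helper name and the fixed 4 KiB page size are
assumptions made for the example) of how a page's slot in the file
could be computed:

  #include <stdint.h>

  #define PAGE_SHIFT 12   /* 4 KiB pages, assumed for this example */

  /*
   * Each guest page has exactly one slot in the file, so a page that
   * is dirtied and saved again simply overwrites its previous copy.
   */
  static uint64_t page_file_offset(uint64_t pages_offset,
                                   uint64_t page_index)
  {
      return pages_offset + (page_index << PAGE_SHIFT);
  }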
Latest numbers (unchanged from v4)
==================================
=> guest: 128 GB RAM - 120 GB dirty - 1 vcpu in tight loop dirtying memory
=> host: 128 CPU AMD EPYC 7543 - 2 NVMe disks in RAID0 (8586 MiB/s) - xfs
=> pinned vcpus w/ NUMA shortest distances - average of 3 runs - results
from query-migrate
non-live           | time (ms)   pages/s    mb/s   MB/s
-------------------+-----------------------------------
 file              |    110512    256258    9549   1193
 + bg-snapshot     |    245660    119581    4303    537
-------------------+-----------------------------------
 mapped-ram        |    157975    216877    6672    834
 + multifd 8 ch.   |     95922    292178   10982   1372
 + direct-io       |     23268   1936897   45330   5666
-------------------------------------------------------

live               | time (ms)   pages/s    mb/s   MB/s
-------------------+-----------------------------------
 file              |         -         -       -      -  (file grew 4x the VM size)
 + bg-snapshot     |    357635    141747    2974    371
-------------------+-----------------------------------
 mapped-ram        |         -         -       -      -  (no convergence in 5 min)
 + multifd 8 ch.   |    230812    497551   14900   1862
 + direct-io       |     27475   1788025   46736   5842
-------------------------------------------------------
v4:
https://lore.kernel.org/r/20240220224138.24759-1-farosas@suse.de
v3:
https://lore.kernel.org/r/20231127202612.23012-1-farosas@suse.de
v2:
https://lore.kernel.org/r/20231023203608.26370-1-farosas@suse.de
v1:
https://lore.kernel.org/r/20230330180336.2791-1-farosas@suse.de
Fabiano Rosas (20):
migration/multifd: Cleanup multifd_recv_sync_main
io: fsync before closing a file channel
migration/qemu-file: add utility methods for working with seekable
channels
migration/ram: Introduce 'mapped-ram' migration capability
migration: Add mapped-ram URI compatibility check
migration/ram: Add outgoing 'mapped-ram' migration
migration/ram: Add incoming 'mapped-ram' migration
tests/qtest/migration: Add tests for mapped-ram file-based migration
migration/multifd: Rename MultiFDSend|RecvParams::data to
compress_data
migration/multifd: Decouple recv method from pages
migration/multifd: Allow multifd without packets
migration/multifd: Allow receiving pages without packets
migration/multifd: Add a wrapper for channels_created
migration/multifd: Add outgoing QIOChannelFile support
migration/multifd: Add incoming QIOChannelFile support
migration/multifd: Prepare multifd sync for mapped-ram migration
migration/multifd: Support outgoing mapped-ram stream format
migration/multifd: Support incoming mapped-ram stream format
migration/multifd: Add mapped-ram support to fd: URI
tests/qtest/migration: Add a multifd + mapped-ram migration test
Nikolay Borisov (3):
io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
io: Add generic pwritev/preadv interface
io: implement io_pwritev/preadv for QIOChannelFile
docs/devel/migration/features.rst | 1 +
docs/devel/migration/mapped-ram.rst | 138 ++++++++++
include/exec/ramblock.h | 13 +
include/io/channel.h | 83 ++++++
include/migration/qemu-file-types.h | 2 +
include/qemu/bitops.h | 13 +
io/channel-file.c | 69 +++++
io/channel.c | 58 ++++
migration/fd.c | 44 +++
migration/fd.h | 2 +
migration/file.c | 153 ++++++++++-
migration/file.h | 8 +
migration/migration.c | 56 +++-
migration/multifd-zlib.c | 26 +-
migration/multifd-zstd.c | 26 +-
migration/multifd.c | 405 ++++++++++++++++++++++------
migration/multifd.h | 27 +-
migration/options.c | 35 +++
migration/options.h | 1 +
migration/qemu-file.c | 106 ++++++++
migration/qemu-file.h | 6 +
migration/ram.c | 333 +++++++++++++++++++++--
migration/ram.h | 1 +
migration/savevm.c | 1 +
migration/trace-events | 2 +-
qapi/migration.json | 6 +-
tests/qtest/migration-test.c | 127 +++++++++
27 files changed, 1600 insertions(+), 142 deletions(-)
create mode 100644 docs/devel/migration/mapped-ram.rst
--
2.35.3
* [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 1:26 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 02/23] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
` (21 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Some minor cleanups and documentation for multifd_recv_sync_main.
Use thread_count as done in other parts of the code. Remove p->id from
the multifd_recv_state sync, since that is global and not tied to a
channel. Add documentation for the sync steps.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/multifd.c | 17 +++++++++++++----
migration/trace-events | 2 +-
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 6c07f19af1..c7389bf833 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1182,18 +1182,27 @@ void multifd_recv_cleanup(void)
void multifd_recv_sync_main(void)
{
+ int thread_count = migrate_multifd_channels();
int i;
if (!migrate_multifd()) {
return;
}
- for (i = 0; i < migrate_multifd_channels(); i++) {
- MultiFDRecvParams *p = &multifd_recv_state->params[i];
- trace_multifd_recv_sync_main_wait(p->id);
+ /*
+ * Initiate the synchronization by waiting for all channels.
+ * For socket-based migration this means each channel has received
+ * the SYNC packet on the stream.
+ */
+ for (i = 0; i < thread_count; i++) {
+ trace_multifd_recv_sync_main_wait(i);
qemu_sem_wait(&multifd_recv_state->sem_sync);
}
- for (i = 0; i < migrate_multifd_channels(); i++) {
+
+ /*
+ * Sync done. Release the channels for the next iteration.
+ */
+ for (i = 0; i < thread_count; i++) {
MultiFDRecvParams *p = &multifd_recv_state->params[i];
WITH_QEMU_LOCK_GUARD(&p->mutex) {
diff --git a/migration/trace-events b/migration/trace-events
index 298ad2b0dd..bf1a069632 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -132,7 +132,7 @@ multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, uin
multifd_recv_new_channel(uint8_t id) "channel %u"
multifd_recv_sync_main(long packet_num) "packet num %ld"
multifd_recv_sync_main_signal(uint8_t id) "channel %u"
-multifd_recv_sync_main_wait(uint8_t id) "channel %u"
+multifd_recv_sync_main_wait(uint8_t id) "iter %u"
multifd_recv_terminate_threads(bool error) "error %d"
multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel %u packets %" PRIu64 " pages %" PRIu64
multifd_recv_thread_start(uint8_t id) "%u"
--
2.35.3
* [PATCH v5 02/23] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 03/23] io: Add generic pwritev/preadv interface Fabiano Rosas
` (20 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana, Nikolay Borisov
From: Nikolay Borisov <nborisov@suse.com>
Add a generic QIOChannel feature SEEKABLE, to be used by the
qemu_file* APIs. For the time being this is only implemented for file
channels.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
include/io/channel.h | 1 +
io/channel-file.c | 8 ++++++++
2 files changed, 9 insertions(+)
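For reference, the probe added below relies on lseek(fd, 0, SEEK_CUR)
failing (with ESPIPE) on pipes, sockets and FIFOs, so stream-like
descriptors do not get the new flag. A standalone sketch of the same
check (not part of the patch; the helper name is made up):

  #include <stdbool.h>
  #include <sys/types.h>
  #include <unistd.h>

  /*
   * True for regular files and block devices, false for pipes and
   * sockets. Does not move the file position, since the offset
   * delta is zero.
   */
  static bool fd_is_seekable(int fd)
  {
      return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
  }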
diff --git a/include/io/channel.h b/include/io/channel.h
index 5f9dbaab65..fcb19fd672 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -44,6 +44,7 @@ enum QIOChannelFeature {
QIO_CHANNEL_FEATURE_LISTEN,
QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
+ QIO_CHANNEL_FEATURE_SEEKABLE,
};
diff --git a/io/channel-file.c b/io/channel-file.c
index 4a12c61886..f91bf6db1c 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -36,6 +36,10 @@ qio_channel_file_new_fd(int fd)
ioc->fd = fd;
+ if (lseek(fd, 0, SEEK_CUR) != (off_t)-1) {
+ qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
+ }
+
trace_qio_channel_file_new_fd(ioc, fd);
return ioc;
@@ -60,6 +64,10 @@ qio_channel_file_new_path(const char *path,
return NULL;
}
+ if (lseek(ioc->fd, 0, SEEK_CUR) != (off_t)-1) {
+ qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
+ }
+
trace_qio_channel_file_new_path(ioc, path, flags, mode, ioc->fd);
return ioc;
--
2.35.3
* [PATCH v5 03/23] io: Add generic pwritev/preadv interface
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 02/23] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 04/23] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
` (19 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana, Nikolay Borisov
From: Nikolay Borisov <nborisov@suse.com>
Introduce basic pwritev/preadv support in the generic channel
layer. A specific implementation for the file channel will follow, as
this is required to support migration streams with a fixed location
for each RAM page.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
include/io/channel.h | 82 ++++++++++++++++++++++++++++++++++++++++++++
io/channel.c | 58 +++++++++++++++++++++++++++++++
2 files changed, 140 insertions(+)
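A sketch of how a caller might use the new interface (not part of the
patch; write_page_at() is a made-up name for illustration):

  static ssize_t write_page_at(QIOChannel *ioc, char *page, size_t len,
                               off_t file_offset, Error **errp)
  {
      /* Positioned I/O only makes sense on seekable channels */
      if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
          error_setg(errp, "channel is not seekable");
          return -1;
      }

      /*
       * Write at a fixed offset; the channel's current position is
       * left untouched, unlike with qio_channel_write().
       */
      return qio_channel_pwrite(ioc, page, len, file_offset, errp);
  }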
diff --git a/include/io/channel.h b/include/io/channel.h
index fcb19fd672..7986c49c71 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -131,6 +131,16 @@ struct QIOChannelClass {
Error **errp);
/* Optional callbacks */
+ ssize_t (*io_pwritev)(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp);
+ ssize_t (*io_preadv)(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp);
int (*io_shutdown)(QIOChannel *ioc,
QIOChannelShutdown how,
Error **errp);
@@ -529,6 +539,78 @@ void qio_channel_set_follow_coroutine_ctx(QIOChannel *ioc, bool enabled);
int qio_channel_close(QIOChannel *ioc,
Error **errp);
+/**
+ * qio_channel_pwritev
+ * @ioc: the channel object
+ * @iov: the array of memory regions to write data from
+ * @niov: the length of the @iov array
+ * @offset: offset in the channel where writes should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ * Behaves as qio_channel_writev_full, apart from not supporting
+ * sending of file handles as well as beginning the write at the
+ * passed @offset
+ *
+ */
+ssize_t qio_channel_pwritev(QIOChannel *ioc, const struct iovec *iov,
+ size_t niov, off_t offset, Error **errp);
+
+/**
+ * qio_channel_pwrite
+ * @ioc: the channel object
+ * @buf: the memory region to write data from
+ * @buflen: the number of bytes in @buf
+ * @offset: offset in the channel where writes should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ */
+ssize_t qio_channel_pwrite(QIOChannel *ioc, char *buf, size_t buflen,
+ off_t offset, Error **errp);
+
+/**
+ * qio_channel_preadv
+ * @ioc: the channel object
+ * @iov: the array of memory regions to read data into
+ * @niov: the length of the @iov array
+ * @offset: offset in the channel where reads should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ * Behaves as qio_channel_readv_full, apart from not supporting
+ * receiving of file handles as well as beginning the read at the
+ * passed @offset
+ *
+ */
+ssize_t qio_channel_preadv(QIOChannel *ioc, const struct iovec *iov,
+ size_t niov, off_t offset, Error **errp);
+
+/**
+ * qio_channel_pread
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes in @buf
+ * @offset: offset in the channel where reads should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ */
+ssize_t qio_channel_pread(QIOChannel *ioc, char *buf, size_t buflen,
+ off_t offset, Error **errp);
+
/**
* qio_channel_shutdown:
* @ioc: the channel object
diff --git a/io/channel.c b/io/channel.c
index 86c5834510..a1f12f8e90 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -454,6 +454,64 @@ GSource *qio_channel_add_watch_source(QIOChannel *ioc,
}
+ssize_t qio_channel_pwritev(QIOChannel *ioc, const struct iovec *iov,
+ size_t niov, off_t offset, Error **errp)
+{
+ QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+ if (!klass->io_pwritev) {
+ error_setg(errp, "Channel does not support pwritev");
+ return -1;
+ }
+
+ if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
+ error_setg_errno(errp, EINVAL, "Requested channel is not seekable");
+ return -1;
+ }
+
+ return klass->io_pwritev(ioc, iov, niov, offset, errp);
+}
+
+ssize_t qio_channel_pwrite(QIOChannel *ioc, char *buf, size_t buflen,
+ off_t offset, Error **errp)
+{
+ struct iovec iov = {
+ .iov_base = buf,
+ .iov_len = buflen
+ };
+
+ return qio_channel_pwritev(ioc, &iov, 1, offset, errp);
+}
+
+ssize_t qio_channel_preadv(QIOChannel *ioc, const struct iovec *iov,
+ size_t niov, off_t offset, Error **errp)
+{
+ QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+ if (!klass->io_preadv) {
+ error_setg(errp, "Channel does not support preadv");
+ return -1;
+ }
+
+ if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
+ error_setg_errno(errp, EINVAL, "Requested channel is not seekable");
+ return -1;
+ }
+
+ return klass->io_preadv(ioc, iov, niov, offset, errp);
+}
+
+ssize_t qio_channel_pread(QIOChannel *ioc, char *buf, size_t buflen,
+ off_t offset, Error **errp)
+{
+ struct iovec iov = {
+ .iov_base = buf,
+ .iov_len = buflen
+ };
+
+ return qio_channel_preadv(ioc, &iov, 1, offset, errp);
+}
+
int qio_channel_shutdown(QIOChannel *ioc,
QIOChannelShutdown how,
Error **errp)
--
2.35.3
* [PATCH v5 04/23] io: implement io_pwritev/preadv for QIOChannelFile
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (2 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 03/23] io: Add generic pwritev/preadv interface Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 05/23] io: fsync before closing a file channel Fabiano Rosas
` (18 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana, Nikolay Borisov
From: Nikolay Borisov <nborisov@suse.com>
The upcoming 'mapped-ram' feature will require qemu to write data to
(and restore from) specific offsets of the migration file.
Add a minimal implementation of pwritev/preadv and expose them via the
io_pwritev and io_preadv interfaces.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
io/channel-file.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 56 insertions(+)
diff --git a/io/channel-file.c b/io/channel-file.c
index f91bf6db1c..a6ad7770c6 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -146,6 +146,58 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc,
return ret;
}
+#ifdef CONFIG_PREADV
+static ssize_t qio_channel_file_preadv(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp)
+{
+ QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
+ ssize_t ret;
+
+ retry:
+ ret = preadv(fioc->fd, iov, niov, offset);
+ if (ret < 0) {
+ if (errno == EAGAIN) {
+ return QIO_CHANNEL_ERR_BLOCK;
+ }
+ if (errno == EINTR) {
+ goto retry;
+ }
+
+ error_setg_errno(errp, errno, "Unable to read from file");
+ return -1;
+ }
+
+ return ret;
+}
+
+static ssize_t qio_channel_file_pwritev(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp)
+{
+ QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
+ ssize_t ret;
+
+ retry:
+ ret = pwritev(fioc->fd, iov, niov, offset);
+ if (ret <= 0) {
+ if (errno == EAGAIN) {
+ return QIO_CHANNEL_ERR_BLOCK;
+ }
+ if (errno == EINTR) {
+ goto retry;
+ }
+ error_setg_errno(errp, errno, "Unable to write to file");
+ return -1;
+ }
+ return ret;
+}
+#endif /* CONFIG_PREADV */
+
static int qio_channel_file_set_blocking(QIOChannel *ioc,
bool enabled,
Error **errp)
@@ -231,6 +283,10 @@ static void qio_channel_file_class_init(ObjectClass *klass,
ioc_klass->io_writev = qio_channel_file_writev;
ioc_klass->io_readv = qio_channel_file_readv;
ioc_klass->io_set_blocking = qio_channel_file_set_blocking;
+#ifdef CONFIG_PREADV
+ ioc_klass->io_pwritev = qio_channel_file_pwritev;
+ ioc_klass->io_preadv = qio_channel_file_preadv;
+#endif
ioc_klass->io_seek = qio_channel_file_seek;
ioc_klass->io_close = qio_channel_file_close;
ioc_klass->io_create_watch = qio_channel_file_create_watch;
--
2.35.3
* [PATCH v5 05/23] io: fsync before closing a file channel
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (3 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 04/23] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 06/23] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
` (17 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Make sure the data is flushed to disk before closing file channels,
so it is not lost in the event of a host crash.
The immediate motivation is migration to a file, but all
QIOChannelFile users benefit from the change.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
io/channel-file.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/io/channel-file.c b/io/channel-file.c
index a6ad7770c6..d4706fa592 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -242,6 +242,11 @@ static int qio_channel_file_close(QIOChannel *ioc,
{
QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
+ if (qemu_fdatasync(fioc->fd) < 0) {
+ error_setg_errno(errp, errno,
+ "Unable to synchronize file data with storage device");
+ return -1;
+ }
if (qemu_close(fioc->fd) < 0) {
error_setg_errno(errp, errno,
"Unable to close file");
--
2.35.3
* [PATCH v5 06/23] migration/qemu-file: add utility methods for working with seekable channels
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (4 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 05/23] io: fsync before closing a file channel Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability Fabiano Rosas
` (16 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Add utility methods that will be needed when implementing the
'mapped-ram' migration capability.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
include/migration/qemu-file-types.h | 2 +
migration/qemu-file.c | 106 ++++++++++++++++++++++++++++
migration/qemu-file.h | 6 ++
3 files changed, 114 insertions(+)
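A sketch of the usage pattern these helpers enable (not part of the
patch; reserve_and_fill() is a made-up name): record the current
position, skip the stream past a reserved region, and fill that
region later with positioned writes:

  static void reserve_and_fill(QEMUFile *f, const uint8_t *buf, size_t len)
  {
      off_t region_start;

      if (!qemu_file_is_seekable(f)) {
          return; /* e.g. a socket: only the sequential API applies */
      }

      region_start = qemu_get_offset(f);

      /* Move the stream position past the reserved region... */
      qemu_set_offset(f, region_start + len, SEEK_SET);

      /* ...and fill it in later without disturbing that position. */
      qemu_put_buffer_at(f, buf, len, region_start);
  }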
diff --git a/include/migration/qemu-file-types.h b/include/migration/qemu-file-types.h
index 9ba163f333..adec5abc07 100644
--- a/include/migration/qemu-file-types.h
+++ b/include/migration/qemu-file-types.h
@@ -50,6 +50,8 @@ unsigned int qemu_get_be16(QEMUFile *f);
unsigned int qemu_get_be32(QEMUFile *f);
uint64_t qemu_get_be64(QEMUFile *f);
+bool qemu_file_is_seekable(QEMUFile *f);
+
static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
{
qemu_put_be64(f, *pv);
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 94231ff295..b10c882629 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -33,6 +33,7 @@
#include "options.h"
#include "qapi/error.h"
#include "rdma.h"
+#include "io/channel-file.h"
#define IO_BUF_SIZE 32768
#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
@@ -255,6 +256,10 @@ static void qemu_iovec_release_ram(QEMUFile *f)
memset(f->may_free, 0, sizeof(f->may_free));
}
+bool qemu_file_is_seekable(QEMUFile *f)
+{
+ return qio_channel_has_feature(f->ioc, QIO_CHANNEL_FEATURE_SEEKABLE);
+}
/**
* Flushes QEMUFile buffer
@@ -447,6 +452,107 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
}
}
+void qemu_put_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen,
+ off_t pos)
+{
+ Error *err = NULL;
+ size_t ret;
+
+ if (f->last_error) {
+ return;
+ }
+
+ qemu_fflush(f);
+ ret = qio_channel_pwrite(f->ioc, (char *)buf, buflen, pos, &err);
+
+ if (err) {
+ qemu_file_set_error_obj(f, -EIO, err);
+ return;
+ }
+
+ if ((ssize_t)ret == QIO_CHANNEL_ERR_BLOCK) {
+ qemu_file_set_error_obj(f, -EAGAIN, NULL);
+ return;
+ }
+
+ if (ret != buflen) {
+ error_setg(&err, "Partial write of size %zu, expected %zu", ret,
+ buflen);
+ qemu_file_set_error_obj(f, -EIO, err);
+ return;
+ }
+
+ stat64_add(&mig_stats.qemu_file_transferred, buflen);
+
+ return;
+}
+
+
+size_t qemu_get_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen,
+ off_t pos)
+{
+ Error *err = NULL;
+ size_t ret;
+
+ if (f->last_error) {
+ return 0;
+ }
+
+ ret = qio_channel_pread(f->ioc, (char *)buf, buflen, pos, &err);
+
+ if ((ssize_t)ret == -1 || err) {
+ qemu_file_set_error_obj(f, -EIO, err);
+ return 0;
+ }
+
+ if ((ssize_t)ret == QIO_CHANNEL_ERR_BLOCK) {
+ qemu_file_set_error_obj(f, -EAGAIN, NULL);
+ return 0;
+ }
+
+ if (ret != buflen) {
+ error_setg(&err, "Partial read of size %zu, expected %zu", ret, buflen);
+ qemu_file_set_error_obj(f, -EIO, err);
+ return 0;
+ }
+
+ return ret;
+}
+
+void qemu_set_offset(QEMUFile *f, off_t off, int whence)
+{
+ Error *err = NULL;
+ off_t ret;
+
+ if (qemu_file_is_writable(f)) {
+ qemu_fflush(f);
+ } else {
+ /* Drop all cached buffers if existed; will trigger a re-fill later */
+ f->buf_index = 0;
+ f->buf_size = 0;
+ }
+
+ ret = qio_channel_io_seek(f->ioc, off, whence, &err);
+ if (ret == (off_t)-1) {
+ qemu_file_set_error_obj(f, -EIO, err);
+ }
+}
+
+off_t qemu_get_offset(QEMUFile *f)
+{
+ Error *err = NULL;
+ off_t ret;
+
+ qemu_fflush(f);
+
+ ret = qio_channel_io_seek(f->ioc, 0, SEEK_CUR, &err);
+ if (ret == (off_t)-1) {
+ qemu_file_set_error_obj(f, -EIO, err);
+ }
+ return ret;
+}
+
+
void qemu_put_byte(QEMUFile *f, int v)
{
if (f->last_error) {
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 8aec9fabf7..32fd4a34fd 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -75,6 +75,12 @@ QEMUFile *qemu_file_get_return_path(QEMUFile *f);
int qemu_fflush(QEMUFile *f);
void qemu_file_set_blocking(QEMUFile *f, bool block);
int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
+void qemu_set_offset(QEMUFile *f, off_t off, int whence);
+off_t qemu_get_offset(QEMUFile *f);
+void qemu_put_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen,
+ off_t pos);
+size_t qemu_get_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen,
+ off_t pos);
QIOChannel *qemu_file_get_ioc(QEMUFile *file);
--
2.35.3
* [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (5 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 06/23] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:10 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 08/23] migration: Add mapped-ram URI compatibility check Fabiano Rosas
` (15 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana, Eric Blake
Add a new migration capability 'mapped-ram'.
The core of the feature is to ensure that RAM pages are mapped
directly to offsets in the resulting migration file instead of being
streamed at arbitrary points.
The reasons why we'd want such behavior are:
- The resulting file will have a bounded size, since pages which are
dirtied multiple times will always go to a fixed location in the
file, rather than constantly being added to a sequential
stream. This eliminates cases where a VM with, say, 1G of RAM can
result in a migration file that's 10s of GBs, provided that the
workload constantly redirties memory.
- It paves the way to implement O_DIRECT-enabled save/restore of the
migration stream as the pages are ensured to be written at aligned
offsets.
- It allows the usage of multifd so we can write RAM pages to the
migration file in parallel.
For now, enabling the capability has no effect. The next couple of
patches implement the core functionality.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- s/fixed-ram/mapped-ram/
- mentioned VRAM in docs
---
docs/devel/migration/features.rst | 1 +
docs/devel/migration/mapped-ram.rst | 138 ++++++++++++++++++++++++++++
migration/migration.c | 7 ++
migration/options.c | 34 +++++++
migration/options.h | 1 +
migration/savevm.c | 1 +
qapi/migration.json | 6 +-
7 files changed, 187 insertions(+), 1 deletion(-)
create mode 100644 docs/devel/migration/mapped-ram.rst
diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index a9acaf618e..9d1abd2587 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -10,3 +10,4 @@ Migration has plenty of features to support different use cases.
dirty-limit
vfio
virtio
+ mapped-ram
diff --git a/docs/devel/migration/mapped-ram.rst b/docs/devel/migration/mapped-ram.rst
new file mode 100644
index 0000000000..fa4cefd9fc
--- /dev/null
+++ b/docs/devel/migration/mapped-ram.rst
@@ -0,0 +1,138 @@
+Mapped-ram
+==========
+
+Mapped-ram is a new stream format for the RAM section designed to
+supplement the existing ``file:`` migration and make it compatible
+with ``multifd``. This enables parallel migration of a guest's RAM to
+a file.
+
+The core of the feature is to ensure that RAM pages are mapped
+directly to offsets in the resulting migration file. This enables the
+``multifd`` threads to write exclusively to those offsets even if the
+guest is constantly dirtying pages (i.e. live migration). Another
+benefit is that the resulting file will have a bounded size, since
+pages which are dirtied multiple times will always go to a fixed
+location in the file, rather than constantly being added to a
+sequential stream. Having the pages at fixed offsets also allows the
+usage of O_DIRECT for save/restore of the migration stream as the
+pages are ensured to be written respecting O_DIRECT alignment
+restrictions (direct-io support not yet implemented).
+
+Usage
+-----
+
+On both source and destination, enable the ``multifd`` and
+``mapped-ram`` capabilities:
+
+ ``migrate_set_capability multifd on``
+
+ ``migrate_set_capability mapped-ram on``
+
+Use a ``file:`` URL for migration:
+
+ ``migrate file:/path/to/migration/file``
+
+Mapped-ram migration is best done non-live, i.e. by stopping the VM on
+the source side before migrating.
+
+Use-cases
+---------
+
+The mapped-ram feature was designed for use cases where the migration
+stream will be directed to a file in the filesystem and not
+immediately restored on the destination VM [#]_. These could be
+thought of as snapshots. We can further categorize them into live and
+non-live.
+
+- Non-live snapshot
+
+If the use case requires a VM to be stopped before taking a snapshot,
+that's the ideal scenario for mapped-ram migration. Not having to
+track dirty pages, the migration will write the RAM pages to the disk
+as fast as it can.
+
+Note: if a snapshot is taken of a running VM, but the VM will be
+stopped after the snapshot by the admin, then consider stopping it
+right before the snapshot to take benefit of the performance gains
+mentioned above.
+
+- Live snapshot
+
+If the use case requires that the VM keeps running during and after
+the snapshot operation, then mapped-ram migration can still be used,
+but will be less performant. Other strategies such as
+background-snapshot should be evaluated as well. One benefit of
+mapped-ram in this scenario is portability since background-snapshot
+depends on async dirty tracking (KVM_GET_DIRTY_LOG) which is not
+supported outside of Linux.
+
+.. [#] While this same effect could be obtained with the usage of
+ snapshots or the ``file:`` migration alone, mapped-ram provides
+ a performance increase for VMs with larger RAM sizes (10s to
+ 100s of GiBs), specially if the VM has been stopped beforehand.
+
+RAM section format
+------------------
+
+Instead of having a sequential stream of pages that follow the
+RAMBlock headers, the dirty pages for a RAMBlock follow its header
+instead. This ensures that each RAM page has a fixed offset in the
+resulting migration file.
+
+A bitmap is introduced to track which pages have been written in the
+migration file. Pages are written at a fixed location for every
+ramblock. Zero pages are ignored as they'd be zero in the destination
+migration as well.
+
+::
+
+ Without mapped-ram: With mapped-ram:
+
+ --------------------- --------------------------------
+ | ramblock 1 header | | ramblock 1 header |
+ --------------------- --------------------------------
+ | ramblock 2 header | | ramblock 1 mapped-ram header |
+ --------------------- --------------------------------
+ | ... | | padding to next 1MB boundary |
+ --------------------- | ... |
+ | ramblock n header | --------------------------------
+ --------------------- | ramblock 1 pages |
+ | RAM_SAVE_FLAG_EOS | | ... |
+ --------------------- --------------------------------
+ | stream of pages | | ramblock 2 header |
+ | (iter 1) | --------------------------------
+ | ... | | ramblock 2 mapped-ram header |
+ --------------------- --------------------------------
+ | RAM_SAVE_FLAG_EOS | | padding to next 1MB boundary |
+ --------------------- | ... |
+ | stream of pages | --------------------------------
+ | (iter 2) | | ramblock 2 pages |
+ | ... | | ... |
+ --------------------- --------------------------------
+ | ... | | ... |
+ --------------------- --------------------------------
+ | RAM_SAVE_FLAG_EOS |
+ --------------------------------
+ | ... |
+ --------------------------------
+
+where:
+ - ramblock header: the generic information for a ramblock, such as
+ idstr, used_len, etc.
+
+ - ramblock mapped-ram header: the information added by this feature:
+ bitmap of pages written, bitmap size and offset of pages in the
+ migration file.
+
+Restrictions
+------------
+
+Since pages are written to their relative offsets and out of order
+(due to the memory dirtying patterns), streaming channels such as
+sockets are not supported. A seekable channel such as a file is
+required. This can be verified in the QIOChannel by the presence of
+the QIO_CHANNEL_FEATURE_SEEKABLE.
+
+The improvements brought by this feature apply only to guest physical
+RAM. Other types of memory such as VRAM are migrated as part of device
+states.
diff --git a/migration/migration.c b/migration/migration.c
index 7652fd4d14..25f01d7818 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1953,6 +1953,13 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
return false;
}
+ if (migrate_mapped_ram()) {
+ if (migrate_tls()) {
+ error_setg(errp, "Cannot use TLS with mapped-ram");
+ return false;
+ }
+ }
+
if (migrate_mode_is_cpr(s)) {
const char *conflict = NULL;
diff --git a/migration/options.c b/migration/options.c
index 3e3e0b93b4..c6edbe4f3e 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -204,6 +204,7 @@ Property migration_properties[] = {
DEFINE_PROP_MIG_CAP("x-switchover-ack",
MIGRATION_CAPABILITY_SWITCHOVER_ACK),
DEFINE_PROP_MIG_CAP("x-dirty-limit", MIGRATION_CAPABILITY_DIRTY_LIMIT),
+ DEFINE_PROP_MIG_CAP("mapped-ram", MIGRATION_CAPABILITY_MAPPED_RAM),
DEFINE_PROP_END_OF_LIST(),
};
@@ -263,6 +264,13 @@ bool migrate_events(void)
return s->capabilities[MIGRATION_CAPABILITY_EVENTS];
}
+bool migrate_mapped_ram(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ return s->capabilities[MIGRATION_CAPABILITY_MAPPED_RAM];
+}
+
bool migrate_ignore_shared(void)
{
MigrationState *s = migrate_get_current();
@@ -645,6 +653,32 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
}
}
+ if (new_caps[MIGRATION_CAPABILITY_MAPPED_RAM]) {
+ if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
+ error_setg(errp,
+ "Mapped-ram migration is incompatible with multifd");
+ return false;
+ }
+
+ if (new_caps[MIGRATION_CAPABILITY_XBZRLE]) {
+ error_setg(errp,
+ "Mapped-ram migration is incompatible with xbzrle");
+ return false;
+ }
+
+ if (new_caps[MIGRATION_CAPABILITY_COMPRESS]) {
+ error_setg(errp,
+ "Mapped-ram migration is incompatible with compression");
+ return false;
+ }
+
+ if (new_caps[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+ error_setg(errp,
+ "Mapped-ram migration is incompatible with postcopy");
+ return false;
+ }
+ }
+
return true;
}
diff --git a/migration/options.h b/migration/options.h
index 246c160aee..6ddd8dad9b 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -31,6 +31,7 @@ bool migrate_compress(void);
bool migrate_dirty_bitmaps(void);
bool migrate_dirty_limit(void);
bool migrate_events(void);
+bool migrate_mapped_ram(void);
bool migrate_ignore_shared(void);
bool migrate_late_block_activate(void);
bool migrate_multifd(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index d612c8a902..dc1fb9c0d3 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -245,6 +245,7 @@ static bool should_validate_capability(int capability)
/* Validate only new capabilities to keep compatibility. */
switch (capability) {
case MIGRATION_CAPABILITY_X_IGNORE_SHARED:
+ case MIGRATION_CAPABILITY_MAPPED_RAM:
return true;
default:
return false;
diff --git a/qapi/migration.json b/qapi/migration.json
index c6bfe2e8c2..df9bcc0b17 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -531,6 +531,10 @@
# and can result in more stable read performance. Requires KVM
# with accelerator property "dirty-ring-size" set. (Since 8.1)
#
+# @mapped-ram: Migrate using fixed offsets in the migration file for
+# each RAM page. Requires a migration URI that supports seeking,
+# such as a file. (since 9.0)
+#
# Features:
#
# @deprecated: Member @block is deprecated. Use blockdev-mirror with
@@ -555,7 +559,7 @@
{ 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
'validate-uuid', 'background-snapshot',
'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
- 'dirty-limit'] }
+ 'dirty-limit', 'mapped-ram'] }
##
# @MigrationCapabilityStatus:
--
2.35.3
* [PATCH v5 08/23] migration: Add mapped-ram URI compatibility check
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (6 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 09/23] migration/ram: Add outgoing 'mapped-ram' migration Fabiano Rosas
` (14 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
The mapped-ram migration format needs a channel that supports seeking
to be able to write each page to an arbitrary offset in the migration
stream.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 25f01d7818..c1cc003b99 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -147,10 +147,39 @@ static bool transport_supports_multi_channels(MigrationAddress *addr)
return false;
}
+static bool migration_needs_seekable_channel(void)
+{
+ return migrate_mapped_ram();
+}
+
+static bool transport_supports_seeking(MigrationAddress *addr)
+{
+ if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) {
+ return true;
+ }
+
+ /*
+ * At this point, the user might not yet have passed the file
+ * descriptor to QEMU, so we cannot know for sure whether it
+ * refers to a plain file or a socket. Let it through anyway.
+ */
+ if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+ return addr->u.socket.type == SOCKET_ADDRESS_TYPE_FD;
+ }
+
+ return false;
+}
+
static bool
migration_channels_and_transport_compatible(MigrationAddress *addr,
Error **errp)
{
+ if (migration_needs_seekable_channel() &&
+ !transport_supports_seeking(addr)) {
+ error_setg(errp, "Migration requires seekable transport (e.g. file)");
+ return false;
+ }
+
if (migration_needs_multiple_sockets() &&
!transport_supports_multi_channels(addr)) {
error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)");
--
2.35.3
* [PATCH v5 09/23] migration/ram: Add outgoing 'mapped-ram' migration
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (7 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 08/23] migration: Add mapped-ram URI compatibility check Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 10/23] migration/ram: Add incoming " Fabiano Rosas
` (13 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: berrange, armbru, Peter Xu, Claudio Fontana, Nikolay Borisov,
Paolo Bonzini, David Hildenbrand, Philippe Mathieu-Daudé
Implement the outgoing migration side for the 'mapped-ram' capability.
A bitmap is introduced to track which pages have been written to the
migration file. Pages are written at a fixed location for every
ramblock. Zero pages are ignored, as they'd already be zero on the
destination.
The migration stream is altered to put the dirty pages for a ramblock
after its header instead of having a sequential stream of pages that
follow the ramblock headers.
Without mapped-ram (current):  With mapped-ram (new):

---------------------          --------------------------------
| ramblock 1 header |          | ramblock 1 header            |
---------------------          --------------------------------
| ramblock 2 header |          | ramblock 1 mapped-ram header |
---------------------          --------------------------------
| ...               |          | padding to next 1MB boundary |
---------------------          | ...                          |
| ramblock n header |          --------------------------------
---------------------          | ramblock 1 pages             |
| RAM_SAVE_FLAG_EOS |          | ...                          |
---------------------          --------------------------------
| stream of pages   |          | ramblock 2 header            |
| (iter 1)          |          --------------------------------
| ...               |          | ramblock 2 mapped-ram header |
---------------------          --------------------------------
| RAM_SAVE_FLAG_EOS |          | padding to next 1MB boundary |
---------------------          | ...                          |
| stream of pages   |          --------------------------------
| (iter 2)          |          | ramblock 2 pages             |
| ...               |          | ...                          |
---------------------          --------------------------------
| ...               |          | ...                          |
---------------------          --------------------------------
                               | RAM_SAVE_FLAG_EOS            |
                               --------------------------------
                               | ...                          |
                               --------------------------------
where:
- ramblock header: the generic information for a ramblock, such as
idstr, used_len, etc.
- ramblock mapped-ram header: the new information added by this
feature: bitmap of pages written, bitmap size and offset of pages
in the migration file.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
include/exec/ramblock.h | 13 ++++
migration/ram.c | 131 +++++++++++++++++++++++++++++++++++++---
2 files changed, 135 insertions(+), 9 deletions(-)
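In short (a condensed sketch of the save path below, not a separate
function added by the patch): a page at ramblock offset 'offset'
always lands at the same file position, and the corresponding bit in
the block's file bitmap records that it was written:

  static void mapped_ram_save_page(QEMUFile *f, RAMBlock *block,
                                   ram_addr_t offset, const uint8_t *host)
  {
      /* pages_offset is the aligned start of this block's pages area */
      qemu_put_buffer_at(f, host, TARGET_PAGE_SIZE,
                         block->pages_offset + offset);
      set_bit(offset >> TARGET_PAGE_BITS, block->file_bmap);
  }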
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 3eb79723c6..848915ea5b 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -44,6 +44,19 @@ struct RAMBlock {
size_t page_size;
/* dirty bitmap used during migration */
unsigned long *bmap;
+
+ /*
+ * Below fields are only used by mapped-ram migration
+ */
+ /* bitmap of pages present in the migration file */
+ unsigned long *file_bmap;
+ /*
+ * offset in the file pages belonging to this ramblock are saved,
+ * used only during migration to a file.
+ */
+ off_t bitmap_offset;
+ uint64_t pages_offset;
+
/* bitmap of already received pages in postcopy */
unsigned long *receivedmap;
diff --git a/migration/ram.c b/migration/ram.c
index 45a00b45ed..f807824d49 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -94,6 +94,18 @@
#define RAM_SAVE_FLAG_MULTIFD_FLUSH 0x200
/* We can't use any flag that is bigger than 0x200 */
+/*
+ * mapped-ram migration supports O_DIRECT, so we need to make sure the
+ * userspace buffer, the IO operation size and the file offset are
+ * aligned according to the underlying device's block size. The first
+ * two are already aligned to page size, but we need to add padding to
+ * the file to align the offset. We cannot read the block size
+ * dynamically because the migration file can be moved between
+ * different systems, so use 1M to cover most block sizes and to keep
+ * the file offset aligned at page size as well.
+ */
+#define MAPPED_RAM_FILE_OFFSET_ALIGNMENT 0x100000
+
XBZRLECacheStats xbzrle_counters;
/* used by the search for pages to send */
@@ -1126,12 +1138,18 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss,
return 0;
}
+ stat64_add(&mig_stats.zero_pages, 1);
+
+ if (migrate_mapped_ram()) {
+ /* zero pages are not transferred with mapped-ram */
+ clear_bit(offset >> TARGET_PAGE_BITS, pss->block->file_bmap);
+ return 1;
+ }
+
len += save_page_header(pss, file, pss->block, offset | RAM_SAVE_FLAG_ZERO);
qemu_put_byte(file, 0);
len += 1;
ram_release_page(pss->block->idstr, offset);
-
- stat64_add(&mig_stats.zero_pages, 1);
ram_transferred_add(len);
/*
@@ -1189,14 +1207,20 @@ static int save_normal_page(PageSearchStatus *pss, RAMBlock *block,
{
QEMUFile *file = pss->pss_channel;
- ram_transferred_add(save_page_header(pss, pss->pss_channel, block,
- offset | RAM_SAVE_FLAG_PAGE));
- if (async) {
- qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
- migrate_release_ram() &&
- migration_in_postcopy());
+ if (migrate_mapped_ram()) {
+ qemu_put_buffer_at(file, buf, TARGET_PAGE_SIZE,
+ block->pages_offset + offset);
+ set_bit(offset >> TARGET_PAGE_BITS, block->file_bmap);
} else {
- qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+ ram_transferred_add(save_page_header(pss, pss->pss_channel, block,
+ offset | RAM_SAVE_FLAG_PAGE));
+ if (async) {
+ qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
+ migrate_release_ram() &&
+ migration_in_postcopy());
+ } else {
+ qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+ }
}
ram_transferred_add(TARGET_PAGE_SIZE);
stat64_add(&mig_stats.normal_pages, 1);
@@ -2411,6 +2435,8 @@ static void ram_save_cleanup(void *opaque)
block->clear_bmap = NULL;
g_free(block->bmap);
block->bmap = NULL;
+ g_free(block->file_bmap);
+ block->file_bmap = NULL;
}
xbzrle_cleanup();
@@ -2778,6 +2804,9 @@ static void ram_list_init_bitmaps(void)
*/
block->bmap = bitmap_new(pages);
bitmap_set(block->bmap, 0, pages);
+ if (migrate_mapped_ram()) {
+ block->file_bmap = bitmap_new(pages);
+ }
block->clear_bmap_shift = shift;
block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
}
@@ -2915,6 +2944,60 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
}
}
+#define MAPPED_RAM_HDR_VERSION 1
+struct MappedRamHeader {
+ uint32_t version;
+ /*
+ * The target's page size, so we know how many pages are in the
+ * bitmap.
+ */
+ uint64_t page_size;
+ /*
+ * The offset in the migration file where the pages bitmap is
+ * stored.
+ */
+ uint64_t bitmap_offset;
+ /*
+ * The offset in the migration file where the actual pages (data)
+ * are stored.
+ */
+ uint64_t pages_offset;
+} QEMU_PACKED;
+typedef struct MappedRamHeader MappedRamHeader;
+
+static void mapped_ram_setup_ramblock(QEMUFile *file, RAMBlock *block)
+{
+ g_autofree MappedRamHeader *header = NULL;
+ size_t header_size, bitmap_size;
+ long num_pages;
+
+ header = g_new0(MappedRamHeader, 1);
+ header_size = sizeof(MappedRamHeader);
+
+ num_pages = block->used_length >> TARGET_PAGE_BITS;
+ bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+
+ /*
+ * Save the file offsets of where the bitmap and the pages should
+ * go as they are written at the end of migration and during the
+ * iterative phase, respectively.
+ */
+ block->bitmap_offset = qemu_get_offset(file) + header_size;
+ block->pages_offset = ROUND_UP(block->bitmap_offset +
+ bitmap_size,
+ MAPPED_RAM_FILE_OFFSET_ALIGNMENT);
+
+ header->version = cpu_to_be32(MAPPED_RAM_HDR_VERSION);
+ header->page_size = cpu_to_be64(TARGET_PAGE_SIZE);
+ header->bitmap_offset = cpu_to_be64(block->bitmap_offset);
+ header->pages_offset = cpu_to_be64(block->pages_offset);
+
+ qemu_put_buffer(file, (uint8_t *) header, header_size);
+
+ /* prepare offset for next ramblock */
+ qemu_set_offset(file, block->pages_offset + block->used_length, SEEK_SET);
+}
+
/*
* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
* long-running RCU critical section. When rcu-reclaims in the code
@@ -2964,6 +3047,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
if (migrate_ignore_shared()) {
qemu_put_be64(f, block->mr->addr);
}
+
+ if (migrate_mapped_ram()) {
+ mapped_ram_setup_ramblock(f, block);
+ }
}
}
@@ -2997,6 +3084,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
return qemu_fflush(f);
}
+static void ram_save_file_bmap(QEMUFile *f)
+{
+ RAMBlock *block;
+
+ RAMBLOCK_FOREACH_MIGRATABLE(block) {
+ long num_pages = block->used_length >> TARGET_PAGE_BITS;
+ long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+
+ qemu_put_buffer_at(f, (uint8_t *)block->file_bmap, bitmap_size,
+ block->bitmap_offset);
+ ram_transferred_add(bitmap_size);
+ }
+}
+
/**
* ram_save_iterate: iterative stage for migration
*
@@ -3186,6 +3287,18 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
return ret;
}
+ if (migrate_mapped_ram()) {
+ ram_save_file_bmap(f);
+
+ if (qemu_file_get_error(f)) {
+ Error *local_err = NULL;
+ int err = qemu_file_get_error_obj(f, &local_err);
+
+ error_reportf_err(local_err, "Failed to write bitmap to file: ");
+ return -err;
+ }
+ }
+
if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
}
--
2.35.3
* [PATCH v5 10/23] migration/ram: Add incoming 'mapped-ram' migration
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (8 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 09/23] migration/ram: Add outgoing 'mapped-ram' migration Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 11/23] tests/qtest/migration: Add tests for mapped-ram file-based migration Fabiano Rosas
` (12 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana, Nikolay Borisov
Add the necessary code to parse the format changes for the
'mapped-ram' capability.
One of the more notable changes in behavior is that in the
'mapped-ram' case RAM pages are restored in one go rather than by
constantly looping through the migration stream.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- s/mismatch/not supported/
- moved Error declaration to the top
---
migration/ram.c | 143 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 141 insertions(+), 2 deletions(-)
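A condensed view of the load loop below (not a separate function in
the patch; the real code additionally caps each read at 1MB and checks
for I/O errors): contiguous runs of set bits in the file bitmap become
single positioned reads straight into guest memory:

  static void load_bitmap_runs(QEMUFile *f, RAMBlock *block,
                               unsigned long *bitmap, long num_pages)
  {
      unsigned long start = find_first_bit(bitmap, num_pages);

      while (start < num_pages) {
          unsigned long end = find_next_zero_bit(bitmap, num_pages,
                                                 start + 1);
          ram_addr_t offset = start << TARGET_PAGE_BITS;
          size_t len = (end - start) * TARGET_PAGE_SIZE;

          /* one positioned read per run of consecutive pages */
          qemu_get_buffer_at(f, host_from_ram_block_offset(block, offset),
                             len, block->pages_offset + offset);

          start = find_next_bit(bitmap, num_pages, end + 1);
      }
  }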
diff --git a/migration/ram.c b/migration/ram.c
index f807824d49..18620784c6 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -106,6 +106,12 @@
*/
#define MAPPED_RAM_FILE_OFFSET_ALIGNMENT 0x100000
+/*
+ * When doing mapped-ram migration, this is the amount we read from
+ * the pages region in the migration file at a time.
+ */
+#define MAPPED_RAM_LOAD_BUF_SIZE 0x100000
+
XBZRLECacheStats xbzrle_counters;
/* used by the search for pages to send */
@@ -2998,6 +3004,35 @@ static void mapped_ram_setup_ramblock(QEMUFile *file, RAMBlock *block)
qemu_set_offset(file, block->pages_offset + block->used_length, SEEK_SET);
}
+static bool mapped_ram_read_header(QEMUFile *file, MappedRamHeader *header,
+ Error **errp)
+{
+ size_t ret, header_size = sizeof(MappedRamHeader);
+
+ ret = qemu_get_buffer(file, (uint8_t *)header, header_size);
+ if (ret != header_size) {
+ error_setg(errp, "Could not read whole mapped-ram migration header "
+ "(expected %zd, got %zd bytes)", header_size, ret);
+ return false;
+ }
+
+ /* migration stream is big-endian */
+ header->version = be32_to_cpu(header->version);
+
+ if (header->version > MAPPED_RAM_HDR_VERSION) {
+ error_setg(errp, "Migration mapped-ram capability version not "
+ "supported (expected <= %d, got %d)", MAPPED_RAM_HDR_VERSION,
+ header->version);
+ return false;
+ }
+
+ header->page_size = be64_to_cpu(header->page_size);
+ header->bitmap_offset = be64_to_cpu(header->bitmap_offset);
+ header->pages_offset = be64_to_cpu(header->pages_offset);
+
+ return true;
+}
+
/*
* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
* long-running RCU critical section. When rcu-reclaims in the code
@@ -3899,22 +3934,126 @@ void colo_flush_ram_cache(void)
trace_colo_flush_ram_cache_end();
}
+static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
+ long num_pages, unsigned long *bitmap,
+ Error **errp)
+{
+ ERRP_GUARD();
+ unsigned long set_bit_idx, clear_bit_idx;
+ ram_addr_t offset;
+ void *host;
+ size_t read, unread, size;
+
+ for (set_bit_idx = find_first_bit(bitmap, num_pages);
+ set_bit_idx < num_pages;
+ set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
+
+ clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
+
+ unread = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
+ offset = set_bit_idx << TARGET_PAGE_BITS;
+
+ while (unread > 0) {
+ host = host_from_ram_block_offset(block, offset);
+ if (!host) {
+ error_setg(errp, "page outside of ramblock %s range",
+ block->idstr);
+ return false;
+ }
+
+ size = MIN(unread, MAPPED_RAM_LOAD_BUF_SIZE);
+
+ read = qemu_get_buffer_at(f, host, size,
+ block->pages_offset + offset);
+ if (!read) {
+ goto err;
+ }
+ offset += read;
+ unread -= read;
+ }
+ }
+
+ return true;
+
+err:
+ qemu_file_get_error_obj(f, errp);
+ error_prepend(errp, "(%s) failed to read page " RAM_ADDR_FMT
+ "from file offset %" PRIx64 ": ", block->idstr, offset,
+ block->pages_offset + offset);
+ return false;
+}
+
+static void parse_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
+ ram_addr_t length, Error **errp)
+{
+ g_autofree unsigned long *bitmap = NULL;
+ MappedRamHeader header;
+ size_t bitmap_size;
+ long num_pages;
+
+ if (!mapped_ram_read_header(f, &header, errp)) {
+ return;
+ }
+
+ block->pages_offset = header.pages_offset;
+
+ /*
+ * Check the alignment of the file region that contains pages. We
+ * don't enforce MAPPED_RAM_FILE_OFFSET_ALIGNMENT to allow that
+ * value to change in the future. Do only a sanity check with page
+ * size alignment.
+ */
+ if (!QEMU_IS_ALIGNED(block->pages_offset, TARGET_PAGE_SIZE)) {
+ error_setg(errp,
+ "Error reading ramblock %s pages, region has bad alignment",
+ block->idstr);
+ return;
+ }
+
+ num_pages = length / header.page_size;
+ bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+
+ bitmap = g_malloc0(bitmap_size);
+ if (qemu_get_buffer_at(f, (uint8_t *)bitmap, bitmap_size,
+ header.bitmap_offset) != bitmap_size) {
+ error_setg(errp, "Error reading dirty bitmap");
+ return;
+ }
+
+ if (!read_ramblock_mapped_ram(f, block, num_pages, bitmap, errp)) {
+ return;
+ }
+
+ /* Skip pages array */
+ qemu_set_offset(f, block->pages_offset + length, SEEK_SET);
+
+ return;
+}
+
static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length)
{
int ret = 0;
/* ADVISE is earlier, it shows the source has the postcopy capability on */
bool postcopy_advised = migration_incoming_postcopy_advised();
+ Error *local_err = NULL;
assert(block);
+ if (migrate_mapped_ram()) {
+ parse_ramblock_mapped_ram(f, block, length, &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return -EINVAL;
+ }
+ return 0;
+ }
+
if (!qemu_ram_is_migratable(block)) {
error_report("block %s should not be migrated !", block->idstr);
return -EINVAL;
}
if (length != block->used_length) {
- Error *local_err = NULL;
-
ret = qemu_ram_resize(block, length, &local_err);
if (local_err) {
error_report_err(local_err);
--
2.35.3
* [PATCH v5 11/23] tests/qtest/migration: Add tests for mapped-ram file-based migration
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (9 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 10/23] migration/ram: Add incoming " Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 12/23] migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data Fabiano Rosas
` (11 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: berrange, armbru, Peter Xu, Claudio Fontana, Thomas Huth,
Laurent Vivier, Paolo Bonzini
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
tests/qtest/migration-test.c | 59 ++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 83512bce85..64a26009e9 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2200,6 +2200,14 @@ static void *test_mode_reboot_start(QTestState *from, QTestState *to)
return NULL;
}
+static void *migrate_mapped_ram_start(QTestState *from, QTestState *to)
+{
+ migrate_set_capability(from, "mapped-ram", true);
+ migrate_set_capability(to, "mapped-ram", true);
+
+ return NULL;
+}
+
static void test_mode_reboot(void)
{
g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
@@ -2214,6 +2222,32 @@ static void test_mode_reboot(void)
test_file_common(&args, true);
}
+static void test_precopy_file_mapped_ram_live(void)
+{
+ g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+ MigrateCommon args = {
+ .connect_uri = uri,
+ .listen_uri = "defer",
+ .start_hook = migrate_mapped_ram_start,
+ };
+
+ test_file_common(&args, false);
+}
+
+static void test_precopy_file_mapped_ram(void)
+{
+ g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+ MigrateCommon args = {
+ .connect_uri = uri,
+ .listen_uri = "defer",
+ .start_hook = migrate_mapped_ram_start,
+ };
+
+ test_file_common(&args, true);
+}
+
static void test_precopy_tcp_plain(void)
{
MigrateCommon args = {
@@ -2462,6 +2496,13 @@ static void *migrate_precopy_fd_file_start(QTestState *from, QTestState *to)
return NULL;
}
+static void *migrate_fd_file_mapped_ram_start(QTestState *from, QTestState *to)
+{
+ migrate_mapped_ram_start(from, to);
+
+ return migrate_precopy_fd_file_start(from, to);
+}
+
static void test_migrate_precopy_fd_file(void)
{
MigrateCommon args = {
@@ -2472,6 +2513,17 @@ static void test_migrate_precopy_fd_file(void)
};
test_file_common(&args, true);
}
+
+static void test_migrate_precopy_fd_file_mapped_ram(void)
+{
+ MigrateCommon args = {
+ .listen_uri = "defer",
+ .connect_uri = "fd:fd-mig",
+ .start_hook = migrate_fd_file_mapped_ram_start,
+ .finish_hook = test_migrate_fd_finish_hook
+ };
+ test_file_common(&args, true);
+}
#endif /* _WIN32 */
static void do_test_validate_uuid(MigrateStart *args, bool should_fail)
@@ -3509,6 +3561,11 @@ int main(int argc, char **argv)
migration_test_add("/migration/mode/reboot", test_mode_reboot);
}
+ migration_test_add("/migration/precopy/file/mapped-ram",
+ test_precopy_file_mapped_ram);
+ migration_test_add("/migration/precopy/file/mapped-ram/live",
+ test_precopy_file_mapped_ram_live);
+
#ifdef CONFIG_GNUTLS
migration_test_add("/migration/precopy/unix/tls/psk",
test_precopy_unix_tls_psk);
@@ -3570,6 +3627,8 @@ int main(int argc, char **argv)
test_migrate_precopy_fd_socket);
migration_test_add("/migration/precopy/fd/file",
test_migrate_precopy_fd_file);
+ migration_test_add("/migration/precopy/fd/file/mapped-ram",
+ test_migrate_precopy_fd_file_mapped_ram);
#endif
migration_test_add("/migration/validate_uuid", test_validate_uuid);
migration_test_add("/migration/validate_uuid_error",
--
2.35.3
* [PATCH v5 12/23] migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (10 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 11/23] tests/qtest/migration: Add tests for mapped-ram file-based migration Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 13/23] migration/multifd: Decouple recv method from pages Fabiano Rosas
` (10 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Use a more specific name for the compression data so the generic name
can be used by the multifd core code.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/multifd-zlib.c | 20 ++++++++++----------
migration/multifd-zstd.c | 20 ++++++++++----------
migration/multifd.h | 4 ++--
3 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 012e3bdea1..2a8f5fc9a6 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -69,7 +69,7 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
err_msg = "out of memory for buf";
goto err_free_zbuff;
}
- p->data = z;
+ p->compress_data = z;
return 0;
err_free_zbuff:
@@ -92,15 +92,15 @@ err_free_z:
*/
static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
{
- struct zlib_data *z = p->data;
+ struct zlib_data *z = p->compress_data;
deflateEnd(&z->zs);
g_free(z->zbuff);
z->zbuff = NULL;
g_free(z->buf);
z->buf = NULL;
- g_free(p->data);
- p->data = NULL;
+ g_free(p->compress_data);
+ p->compress_data = NULL;
}
/**
@@ -117,7 +117,7 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
{
MultiFDPages_t *pages = p->pages;
- struct zlib_data *z = p->data;
+ struct zlib_data *z = p->compress_data;
z_stream *zs = &z->zs;
uint32_t out_size = 0;
int ret;
@@ -194,7 +194,7 @@ static int zlib_recv_setup(MultiFDRecvParams *p, Error **errp)
struct zlib_data *z = g_new0(struct zlib_data, 1);
z_stream *zs = &z->zs;
- p->data = z;
+ p->compress_data = z;
zs->zalloc = Z_NULL;
zs->zfree = Z_NULL;
zs->opaque = Z_NULL;
@@ -224,13 +224,13 @@ static int zlib_recv_setup(MultiFDRecvParams *p, Error **errp)
*/
static void zlib_recv_cleanup(MultiFDRecvParams *p)
{
- struct zlib_data *z = p->data;
+ struct zlib_data *z = p->compress_data;
inflateEnd(&z->zs);
g_free(z->zbuff);
z->zbuff = NULL;
- g_free(p->data);
- p->data = NULL;
+ g_free(p->compress_data);
+ p->compress_data = NULL;
}
/**
@@ -246,7 +246,7 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
*/
static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
{
- struct zlib_data *z = p->data;
+ struct zlib_data *z = p->compress_data;
z_stream *zs = &z->zs;
uint32_t in_size = p->next_packet_size;
/* we measure the change of total_out */
diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
index dc8fe43e94..593cf290ad 100644
--- a/migration/multifd-zstd.c
+++ b/migration/multifd-zstd.c
@@ -52,7 +52,7 @@ static int zstd_send_setup(MultiFDSendParams *p, Error **errp)
struct zstd_data *z = g_new0(struct zstd_data, 1);
int res;
- p->data = z;
+ p->compress_data = z;
z->zcs = ZSTD_createCStream();
if (!z->zcs) {
g_free(z);
@@ -90,14 +90,14 @@ static int zstd_send_setup(MultiFDSendParams *p, Error **errp)
*/
static void zstd_send_cleanup(MultiFDSendParams *p, Error **errp)
{
- struct zstd_data *z = p->data;
+ struct zstd_data *z = p->compress_data;
ZSTD_freeCStream(z->zcs);
z->zcs = NULL;
g_free(z->zbuff);
z->zbuff = NULL;
- g_free(p->data);
- p->data = NULL;
+ g_free(p->compress_data);
+ p->compress_data = NULL;
}
/**
@@ -114,7 +114,7 @@ static void zstd_send_cleanup(MultiFDSendParams *p, Error **errp)
static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
{
MultiFDPages_t *pages = p->pages;
- struct zstd_data *z = p->data;
+ struct zstd_data *z = p->compress_data;
int ret;
uint32_t i;
@@ -183,7 +183,7 @@ static int zstd_recv_setup(MultiFDRecvParams *p, Error **errp)
struct zstd_data *z = g_new0(struct zstd_data, 1);
int ret;
- p->data = z;
+ p->compress_data = z;
z->zds = ZSTD_createDStream();
if (!z->zds) {
g_free(z);
@@ -221,14 +221,14 @@ static int zstd_recv_setup(MultiFDRecvParams *p, Error **errp)
*/
static void zstd_recv_cleanup(MultiFDRecvParams *p)
{
- struct zstd_data *z = p->data;
+ struct zstd_data *z = p->compress_data;
ZSTD_freeDStream(z->zds);
z->zds = NULL;
g_free(z->zbuff);
z->zbuff = NULL;
- g_free(p->data);
- p->data = NULL;
+ g_free(p->compress_data);
+ p->compress_data = NULL;
}
/**
@@ -248,7 +248,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
uint32_t out_size = 0;
uint32_t expected_size = p->normal_num * p->page_size;
uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
- struct zstd_data *z = p->data;
+ struct zstd_data *z = p->compress_data;
int ret;
int i;
diff --git a/migration/multifd.h b/migration/multifd.h
index b3fe27ae93..adccd3532f 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -127,7 +127,7 @@ typedef struct {
/* number of iovs used */
uint32_t iovs_num;
/* used for compression methods */
- void *data;
+ void *compress_data;
} MultiFDSendParams;
typedef struct {
@@ -183,7 +183,7 @@ typedef struct {
/* num of non zero pages */
uint32_t normal_num;
/* used for de-compression methods */
- void *data;
+ void *compress_data;
} MultiFDRecvParams;
typedef struct {
--
2.35.3
* [PATCH v5 13/23] migration/multifd: Decouple recv method from pages
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (11 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 12/23] migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 14/23] migration/multifd: Allow multifd without packets Fabiano Rosas
` (9 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
The next patches will abstract the type of data being received by the
channels, so do some cleanup now to remove references to pages and the
dependency on 'normal_num'.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/multifd-zlib.c | 6 +++---
migration/multifd-zstd.c | 6 +++---
migration/multifd.c | 13 ++++++++-----
migration/multifd.h | 4 ++--
4 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 2a8f5fc9a6..6120faad65 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -234,7 +234,7 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
}
/**
- * zlib_recv_pages: read the data from the channel into actual pages
+ * zlib_recv: read the data from the channel into actual pages
*
* Read the compressed buffer, and uncompress it into the actual
* pages.
@@ -244,7 +244,7 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
* @p: Params for the channel that we are using
* @errp: pointer to an error
*/
-static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
+static int zlib_recv(MultiFDRecvParams *p, Error **errp)
{
struct zlib_data *z = p->compress_data;
z_stream *zs = &z->zs;
@@ -319,7 +319,7 @@ static MultiFDMethods multifd_zlib_ops = {
.send_prepare = zlib_send_prepare,
.recv_setup = zlib_recv_setup,
.recv_cleanup = zlib_recv_cleanup,
- .recv_pages = zlib_recv_pages
+ .recv = zlib_recv
};
static void multifd_zlib_register(void)
diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
index 593cf290ad..cac236833d 100644
--- a/migration/multifd-zstd.c
+++ b/migration/multifd-zstd.c
@@ -232,7 +232,7 @@ static void zstd_recv_cleanup(MultiFDRecvParams *p)
}
/**
- * zstd_recv_pages: read the data from the channel into actual pages
+ * zstd_recv: read the data from the channel into actual pages
*
* Read the compressed buffer, and uncompress it into the actual
* pages.
@@ -242,7 +242,7 @@ static void zstd_recv_cleanup(MultiFDRecvParams *p)
* @p: Params for the channel that we are using
* @errp: pointer to an error
*/
-static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
+static int zstd_recv(MultiFDRecvParams *p, Error **errp)
{
uint32_t in_size = p->next_packet_size;
uint32_t out_size = 0;
@@ -310,7 +310,7 @@ static MultiFDMethods multifd_zstd_ops = {
.send_prepare = zstd_send_prepare,
.recv_setup = zstd_recv_setup,
.recv_cleanup = zstd_recv_cleanup,
- .recv_pages = zstd_recv_pages
+ .recv = zstd_recv
};
static void multifd_zstd_register(void)
diff --git a/migration/multifd.c b/migration/multifd.c
index c7389bf833..3a8520097b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -197,7 +197,7 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
}
/**
- * nocomp_recv_pages: read the data from the channel into actual pages
+ * nocomp_recv: read the data from the channel
*
* For no compression we just need to read things into the correct place.
*
@@ -206,7 +206,7 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
* @p: Params for the channel that we are using
* @errp: pointer to an error
*/
-static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
+static int nocomp_recv(MultiFDRecvParams *p, Error **errp)
{
uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
@@ -228,7 +228,7 @@ static MultiFDMethods multifd_nocomp_ops = {
.send_prepare = nocomp_send_prepare,
.recv_setup = nocomp_recv_setup,
.recv_cleanup = nocomp_recv_cleanup,
- .recv_pages = nocomp_recv_pages
+ .recv = nocomp_recv
};
static MultiFDMethods *multifd_ops[MULTIFD_COMPRESSION__MAX] = {
@@ -1227,6 +1227,8 @@ static void *multifd_recv_thread(void *opaque)
while (true) {
uint32_t flags;
+ bool has_data = false;
+ p->normal_num = 0;
if (multifd_recv_should_exit()) {
break;
@@ -1248,10 +1250,11 @@ static void *multifd_recv_thread(void *opaque)
flags = p->flags;
/* recv methods don't know how to handle the SYNC flag */
p->flags &= ~MULTIFD_FLAG_SYNC;
+ has_data = !!p->normal_num;
qemu_mutex_unlock(&p->mutex);
- if (p->normal_num) {
- ret = multifd_recv_state->ops->recv_pages(p, &local_err);
+ if (has_data) {
+ ret = multifd_recv_state->ops->recv(p, &local_err);
if (ret != 0) {
break;
}
diff --git a/migration/multifd.h b/migration/multifd.h
index adccd3532f..6a54377cc1 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -197,8 +197,8 @@ typedef struct {
int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
/* Cleanup for receiving side */
void (*recv_cleanup)(MultiFDRecvParams *p);
- /* Read all pages */
- int (*recv_pages)(MultiFDRecvParams *p, Error **errp);
+ /* Read all data */
+ int (*recv)(MultiFDRecvParams *p, Error **errp);
} MultiFDMethods;
void multifd_register_ops(int method, MultiFDMethods *ops);
--
2.35.3
* [PATCH v5 14/23] migration/multifd: Allow multifd without packets
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (12 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 13/23] migration/multifd: Decouple recv method from pages Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:20 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 15/23] migration/multifd: Allow receiving pages " Fabiano Rosas
` (8 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
The upcoming 'mapped-ram' migration stream format cannot use multifd
packets because each write into the ramblock section of the migration
file is expected to contain only the guest pages. They are written at
their respective offsets relative to the ramblock section header.
There is no space for the packet information, and the expected gains
from the new approach come partly from being able to write the pages
sequentially without extraneous data in between.
The new format also simply doesn't need the packets, and all necessary
information can be taken from the standard migration headers with some
(future) changes to the multifd code.
Use the presence of the mapped-ram capability to decide whether to
send packets.
This only moves code under multifd_use_packets(); it has no effect for
now, as mapped-ram cannot yet be enabled with multifd.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- added multifd_send_prepare_iovs
- posted channels_created at file.c as well
---
migration/multifd.c | 175 +++++++++++++++++++++++++++++---------------
1 file changed, 114 insertions(+), 61 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 3a8520097b..8c43424c81 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -92,6 +92,11 @@ struct {
MultiFDMethods *ops;
} *multifd_recv_state;
+static bool multifd_use_packets(void)
+{
+ return !migrate_mapped_ram();
+}
+
/* Multifd without compression */
/**
@@ -122,6 +127,19 @@ static void nocomp_send_cleanup(MultiFDSendParams *p, Error **errp)
return;
}
+static void multifd_send_prepare_iovs(MultiFDSendParams *p)
+{
+ MultiFDPages_t *pages = p->pages;
+
+ for (int i = 0; i < pages->num; i++) {
+ p->iov[p->iovs_num].iov_base = pages->block->host + pages->offset[i];
+ p->iov[p->iovs_num].iov_len = p->page_size;
+ p->iovs_num++;
+ }
+
+ p->next_packet_size = pages->num * p->page_size;
+}
+
/**
* nocomp_send_prepare: prepare date to be able to send
*
@@ -136,9 +154,13 @@ static void nocomp_send_cleanup(MultiFDSendParams *p, Error **errp)
static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
{
bool use_zero_copy_send = migrate_zero_copy_send();
- MultiFDPages_t *pages = p->pages;
int ret;
+ if (!multifd_use_packets()) {
+ multifd_send_prepare_iovs(p);
+ return 0;
+ }
+
if (!use_zero_copy_send) {
/*
* Only !zerocopy needs the header in IOV; zerocopy will
@@ -147,13 +169,7 @@ static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
multifd_send_prepare_header(p);
}
- for (int i = 0; i < pages->num; i++) {
- p->iov[p->iovs_num].iov_base = pages->block->host + pages->offset[i];
- p->iov[p->iovs_num].iov_len = p->page_size;
- p->iovs_num++;
- }
-
- p->next_packet_size = pages->num * p->page_size;
+ multifd_send_prepare_iovs(p);
p->flags |= MULTIFD_FLAG_NOCOMP;
multifd_send_fill_packet(p);
@@ -208,7 +224,13 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
*/
static int nocomp_recv(MultiFDRecvParams *p, Error **errp)
{
- uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
+ uint32_t flags;
+
+ if (!multifd_use_packets()) {
+ return 0;
+ }
+
+ flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
if (flags != MULTIFD_FLAG_NOCOMP) {
error_setg(errp, "multifd %u: flags received %x flags expected %x",
@@ -795,15 +817,18 @@ static void *multifd_send_thread(void *opaque)
MigrationThread *thread = NULL;
Error *local_err = NULL;
int ret = 0;
+ bool use_packets = multifd_use_packets();
thread = migration_threads_add(p->name, qemu_get_thread_id());
trace_multifd_send_thread_start(p->id);
rcu_register_thread();
- if (multifd_send_initial_packet(p, &local_err) < 0) {
- ret = -1;
- goto out;
+ if (use_packets) {
+ if (multifd_send_initial_packet(p, &local_err) < 0) {
+ ret = -1;
+ goto out;
+ }
}
while (true) {
@@ -854,16 +879,20 @@ static void *multifd_send_thread(void *opaque)
* it doesn't require explicit memory barriers.
*/
assert(qatomic_read(&p->pending_sync));
- p->flags = MULTIFD_FLAG_SYNC;
- multifd_send_fill_packet(p);
- ret = qio_channel_write_all(p->c, (void *)p->packet,
- p->packet_len, &local_err);
- if (ret != 0) {
- break;
+
+ if (use_packets) {
+ p->flags = MULTIFD_FLAG_SYNC;
+ multifd_send_fill_packet(p);
+ ret = qio_channel_write_all(p->c, (void *)p->packet,
+ p->packet_len, &local_err);
+ if (ret != 0) {
+ break;
+ }
+ /* p->next_packet_size will always be zero for a SYNC packet */
+ stat64_add(&mig_stats.multifd_bytes, p->packet_len);
+ p->flags = 0;
}
- /* p->next_packet_size will always be zero for a SYNC packet */
- stat64_add(&mig_stats.multifd_bytes, p->packet_len);
- p->flags = 0;
+
qatomic_set(&p->pending_sync, false);
qemu_sem_post(&p->sem_sync);
}
@@ -1018,6 +1047,7 @@ bool multifd_send_setup(void)
Error *local_err = NULL;
int thread_count, ret = 0;
uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+ bool use_packets = multifd_use_packets();
uint8_t i;
if (!migrate_multifd()) {
@@ -1040,14 +1070,20 @@ bool multifd_send_setup(void)
qemu_sem_init(&p->sem_sync, 0);
p->id = i;
p->pages = multifd_pages_init(page_count);
- p->packet_len = sizeof(MultiFDPacket_t)
- + sizeof(uint64_t) * page_count;
- p->packet = g_malloc0(p->packet_len);
- p->packet->magic = cpu_to_be32(MULTIFD_MAGIC);
- p->packet->version = cpu_to_be32(MULTIFD_VERSION);
+
+ if (use_packets) {
+ p->packet_len = sizeof(MultiFDPacket_t)
+ + sizeof(uint64_t) * page_count;
+ p->packet = g_malloc0(p->packet_len);
+ p->packet->magic = cpu_to_be32(MULTIFD_MAGIC);
+ p->packet->version = cpu_to_be32(MULTIFD_VERSION);
+
+ /* We need one extra place for the packet header */
+ p->iov = g_new0(struct iovec, page_count + 1);
+ } else {
+ p->iov = g_new0(struct iovec, page_count);
+ }
p->name = g_strdup_printf("multifdsend_%d", i);
- /* We need one extra place for the packet header */
- p->iov = g_new0(struct iovec, page_count + 1);
p->page_size = qemu_target_page_size();
p->page_count = page_count;
p->write_flags = 0;
@@ -1110,7 +1146,9 @@ static void multifd_recv_terminate_threads(Error *err)
* multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
* however try to wakeup it without harm in cleanup phase.
*/
- qemu_sem_post(&p->sem_sync);
+ if (multifd_use_packets()) {
+ qemu_sem_post(&p->sem_sync);
+ }
/*
* We could arrive here for two reasons:
@@ -1185,7 +1223,7 @@ void multifd_recv_sync_main(void)
int thread_count = migrate_multifd_channels();
int i;
- if (!migrate_multifd()) {
+ if (!migrate_multifd() || !multifd_use_packets()) {
return;
}
@@ -1220,13 +1258,14 @@ static void *multifd_recv_thread(void *opaque)
{
MultiFDRecvParams *p = opaque;
Error *local_err = NULL;
+ bool use_packets = multifd_use_packets();
int ret;
trace_multifd_recv_thread_start(p->id);
rcu_register_thread();
while (true) {
- uint32_t flags;
+ uint32_t flags = 0;
bool has_data = false;
p->normal_num = 0;
@@ -1234,25 +1273,27 @@ static void *multifd_recv_thread(void *opaque)
break;
}
- ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
- p->packet_len, &local_err);
- if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
- break;
- }
+ if (use_packets) {
+ ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
+ p->packet_len, &local_err);
+ if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
+ break;
+ }
- qemu_mutex_lock(&p->mutex);
- ret = multifd_recv_unfill_packet(p, &local_err);
- if (ret) {
+ qemu_mutex_lock(&p->mutex);
+ ret = multifd_recv_unfill_packet(p, &local_err);
+ if (ret) {
+ qemu_mutex_unlock(&p->mutex);
+ break;
+ }
+
+ flags = p->flags;
+ /* recv methods don't know how to handle the SYNC flag */
+ p->flags &= ~MULTIFD_FLAG_SYNC;
+ has_data = !!p->normal_num;
qemu_mutex_unlock(&p->mutex);
- break;
}
- flags = p->flags;
- /* recv methods don't know how to handle the SYNC flag */
- p->flags &= ~MULTIFD_FLAG_SYNC;
- has_data = !!p->normal_num;
- qemu_mutex_unlock(&p->mutex);
-
if (has_data) {
ret = multifd_recv_state->ops->recv(p, &local_err);
if (ret != 0) {
@@ -1260,9 +1301,11 @@ static void *multifd_recv_thread(void *opaque)
}
}
- if (flags & MULTIFD_FLAG_SYNC) {
- qemu_sem_post(&multifd_recv_state->sem_sync);
- qemu_sem_wait(&p->sem_sync);
+ if (use_packets) {
+ if (flags & MULTIFD_FLAG_SYNC) {
+ qemu_sem_post(&multifd_recv_state->sem_sync);
+ qemu_sem_wait(&p->sem_sync);
+ }
}
}
@@ -1281,6 +1324,7 @@ int multifd_recv_setup(Error **errp)
{
int thread_count;
uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+ bool use_packets = multifd_use_packets();
uint8_t i;
/*
@@ -1305,9 +1349,12 @@ int multifd_recv_setup(Error **errp)
qemu_mutex_init(&p->mutex);
qemu_sem_init(&p->sem_sync, 0);
p->id = i;
- p->packet_len = sizeof(MultiFDPacket_t)
- + sizeof(uint64_t) * page_count;
- p->packet = g_malloc0(p->packet_len);
+
+ if (use_packets) {
+ p->packet_len = sizeof(MultiFDPacket_t)
+ + sizeof(uint64_t) * page_count;
+ p->packet = g_malloc0(p->packet_len);
+ }
p->name = g_strdup_printf("multifdrecv_%d", i);
p->iov = g_new0(struct iovec, page_count);
p->normal = g_new0(ram_addr_t, page_count);
@@ -1351,18 +1398,24 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
{
MultiFDRecvParams *p;
Error *local_err = NULL;
+ bool use_packets = multifd_use_packets();
int id;
- id = multifd_recv_initial_packet(ioc, &local_err);
- if (id < 0) {
- multifd_recv_terminate_threads(local_err);
- error_propagate_prepend(errp, local_err,
- "failed to receive packet"
- " via multifd channel %d: ",
- qatomic_read(&multifd_recv_state->count));
- return;
+ if (use_packets) {
+ id = multifd_recv_initial_packet(ioc, &local_err);
+ if (id < 0) {
+ multifd_recv_terminate_threads(local_err);
+ error_propagate_prepend(errp, local_err,
+ "failed to receive packet"
+ " via multifd channel %d: ",
+ qatomic_read(&multifd_recv_state->count));
+ return;
+ }
+ trace_multifd_recv_new_channel(id);
+ } else {
+ /* next patch gives this a meaningful value */
+ id = 0;
}
- trace_multifd_recv_new_channel(id);
p = &multifd_recv_state->params[id];
if (p->c != NULL) {
--
2.35.3
* [PATCH v5 15/23] migration/multifd: Allow receiving pages without packets
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (13 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 14/23] migration/multifd: Allow multifd without packets Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:28 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created Fabiano Rosas
` (7 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Currently multifd does not need to have knowledge of pages on the
receiving side because all the information needed is within the
packets that come in the stream.
We're about to add support for mapped-ram migration, which cannot use
packets because it expects the ramblock section in the migration file
to contain only the guest pages' data.
Add a data structure to transfer pages between the ram migration code
and the multifd receiving threads.
We don't want to reuse MultiFDPages_t for two reasons:
a) multifd threads don't really need to know about the data they're
receiving.
b) the receiving side has to be stopped to load the pages, which means
we can experiment with larger granularities than page size when
transferring data.
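As an illustration of how these pieces fit together, below is a rough
sketch of a caller handing one read request to a channel. The helper
name and the call site are hypothetical; the real user of this
interface arrives later in the series.
static bool queue_mapped_ram_read(void *host, size_t size, off_t file_offset)
{
    /* the currently idle MultiFDRecvData slot kept by the recv state */
    MultiFDRecvData *data = multifd_get_recv_data();
    data->opaque = host;             /* destination buffer in guest RAM */
    data->size = size;               /* number of bytes to read */
    data->file_offset = file_offset; /* where the pages sit in the file */
    /* hand 'data' over to an idle channel and wake it up */
    return multifd_recv();
}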
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- added barriers similar to send side
- changed recv_sync to allow reentrancy
---
migration/file.c | 1 +
migration/multifd.c | 129 +++++++++++++++++++++++++++++++++++++++++---
migration/multifd.h | 15 ++++++
3 files changed, 138 insertions(+), 7 deletions(-)
diff --git a/migration/file.c b/migration/file.c
index 5d4975f43e..22d052a71f 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -6,6 +6,7 @@
*/
#include "qemu/osdep.h"
+#include "exec/ramblock.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "channel.h"
diff --git a/migration/multifd.c b/migration/multifd.c
index 8c43424c81..d470af73ba 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -81,9 +81,13 @@ struct {
struct {
MultiFDRecvParams *params;
+ MultiFDRecvData *data;
/* number of created threads */
int count;
- /* syncs main thread and channels */
+ /*
+ * This is always posted by the recv threads, the migration thread
+ * uses it to wait for recv threads to finish assigned tasks.
+ */
QemuSemaphore sem_sync;
/* global number of generated multifd packets */
uint64_t packet_num;
@@ -1119,6 +1123,57 @@ bool multifd_send_setup(void)
return true;
}
+bool multifd_recv(void)
+{
+ int i;
+ static int next_recv_channel;
+ MultiFDRecvParams *p = NULL;
+ MultiFDRecvData *data = multifd_recv_state->data;
+
+ /*
+ * next_channel can remain from a previous migration that was
+ * using more channels, so ensure it doesn't overflow if the
+ * limit is lower now.
+ */
+ next_recv_channel %= migrate_multifd_channels();
+ for (i = next_recv_channel;; i = (i + 1) % migrate_multifd_channels()) {
+ if (multifd_recv_should_exit()) {
+ return false;
+ }
+
+ p = &multifd_recv_state->params[i];
+
+ if (qatomic_read(&p->pending_job) == false) {
+ next_recv_channel = (i + 1) % migrate_multifd_channels();
+ break;
+ }
+ }
+
+ /*
+ * Order pending_job read before manipulating p->data below. Pairs
+ * with qatomic_store_release() at multifd_recv_thread().
+ */
+ smp_mb_acquire();
+
+ assert(!p->data->size);
+ multifd_recv_state->data = p->data;
+ p->data = data;
+
+ /*
+ * Order p->data update before setting pending_job. Pairs with
+ * qatomic_load_acquire() at multifd_recv_thread().
+ */
+ qatomic_store_release(&p->pending_job, true);
+ qemu_sem_post(&p->sem);
+
+ return true;
+}
+
+MultiFDRecvData *multifd_get_recv_data(void)
+{
+ return multifd_recv_state->data;
+}
+
static void multifd_recv_terminate_threads(Error *err)
{
int i;
@@ -1143,11 +1198,26 @@ static void multifd_recv_terminate_threads(Error *err)
MultiFDRecvParams *p = &multifd_recv_state->params[i];
/*
- * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
- * however try to wakeup it without harm in cleanup phase.
+ * The migration thread and channels interact differently
+ * depending on the presence of packets.
*/
if (multifd_use_packets()) {
+ /*
+ * The channel receives as long as there are packets. When
+ * packets end (i.e. MULTIFD_FLAG_SYNC is reached), the
+ * channel waits for the migration thread to sync. If the
+ * sync never happens, do it here.
+ */
qemu_sem_post(&p->sem_sync);
+ } else {
+ /*
+ * The channel waits for the migration thread to give it
+ * work. When the migration thread runs out of work, it
+ * releases the channel and waits for any pending work to
+ * finish. If we reach here (e.g. due to error) before the
+ * work runs out, release the channel.
+ */
+ qemu_sem_post(&p->sem);
}
/*
@@ -1176,6 +1246,7 @@ static void multifd_recv_cleanup_channel(MultiFDRecvParams *p)
p->c = NULL;
qemu_mutex_destroy(&p->mutex);
qemu_sem_destroy(&p->sem_sync);
+ qemu_sem_destroy(&p->sem);
g_free(p->name);
p->name = NULL;
p->packet_len = 0;
@@ -1193,6 +1264,8 @@ static void multifd_recv_cleanup_state(void)
qemu_sem_destroy(&multifd_recv_state->sem_sync);
g_free(multifd_recv_state->params);
multifd_recv_state->params = NULL;
+ g_free(multifd_recv_state->data);
+ multifd_recv_state->data = NULL;
g_free(multifd_recv_state);
multifd_recv_state = NULL;
}
@@ -1269,11 +1342,11 @@ static void *multifd_recv_thread(void *opaque)
bool has_data = false;
p->normal_num = 0;
- if (multifd_recv_should_exit()) {
- break;
- }
-
if (use_packets) {
+ if (multifd_recv_should_exit()) {
+ break;
+ }
+
ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
p->packet_len, &local_err);
if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
@@ -1292,6 +1365,30 @@ static void *multifd_recv_thread(void *opaque)
p->flags &= ~MULTIFD_FLAG_SYNC;
has_data = !!p->normal_num;
qemu_mutex_unlock(&p->mutex);
+ } else {
+ /*
+ * No packets, so we need to wait for the vmstate code to
+ * give us work.
+ */
+ qemu_sem_wait(&p->sem);
+
+ if (multifd_recv_should_exit()) {
+ break;
+ }
+
+ /* pairs with qatomic_store_release() at multifd_recv() */
+ if (!qatomic_load_acquire(&p->pending_job)) {
+ /*
+ * Migration thread did not send work, this is
+ * equivalent to pending_sync on the sending
+ * side. Post sem_sync to notify we reached this
+ * point.
+ */
+ qemu_sem_post(&multifd_recv_state->sem_sync);
+ continue;
+ }
+
+ has_data = !!p->data->size;
}
if (has_data) {
@@ -1306,6 +1403,15 @@ static void *multifd_recv_thread(void *opaque)
qemu_sem_post(&multifd_recv_state->sem_sync);
qemu_sem_wait(&p->sem_sync);
}
+ } else {
+ p->total_normal_pages += p->data->size / qemu_target_page_size();
+ p->data->size = 0;
+ /*
+ * Order data->size update before clearing
+ * pending_job. Pairs with smp_mb_acquire() at
+ * multifd_recv().
+ */
+ qatomic_store_release(&p->pending_job, false);
}
}
@@ -1338,6 +1444,10 @@ int multifd_recv_setup(Error **errp)
thread_count = migrate_multifd_channels();
multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
+
+ multifd_recv_state->data = g_new0(MultiFDRecvData, 1);
+ multifd_recv_state->data->size = 0;
+
qatomic_set(&multifd_recv_state->count, 0);
qatomic_set(&multifd_recv_state->exiting, 0);
qemu_sem_init(&multifd_recv_state->sem_sync, 0);
@@ -1348,8 +1458,13 @@ int multifd_recv_setup(Error **errp)
qemu_mutex_init(&p->mutex);
qemu_sem_init(&p->sem_sync, 0);
+ qemu_sem_init(&p->sem, 0);
+ p->pending_job = false;
p->id = i;
+ p->data = g_new0(MultiFDRecvData, 1);
+ p->data->size = 0;
+
if (use_packets) {
p->packet_len = sizeof(MultiFDPacket_t)
+ sizeof(uint64_t) * page_count;
diff --git a/migration/multifd.h b/migration/multifd.h
index 6a54377cc1..1be985978e 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -13,6 +13,8 @@
#ifndef QEMU_MIGRATION_MULTIFD_H
#define QEMU_MIGRATION_MULTIFD_H
+typedef struct MultiFDRecvData MultiFDRecvData;
+
bool multifd_send_setup(void);
void multifd_send_shutdown(void);
int multifd_recv_setup(Error **errp);
@@ -23,6 +25,8 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
void multifd_recv_sync_main(void);
int multifd_send_sync_main(void);
bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
+bool multifd_recv(void);
+MultiFDRecvData *multifd_get_recv_data(void);
/* Multifd Compression flags */
#define MULTIFD_FLAG_SYNC (1 << 0)
@@ -63,6 +67,13 @@ typedef struct {
RAMBlock *block;
} MultiFDPages_t;
+struct MultiFDRecvData {
+ void *opaque;
+ size_t size;
+ /* for preadv */
+ off_t file_offset;
+};
+
typedef struct {
/* Fields are only written at creating/deletion time */
/* No lock required for them, they are read only */
@@ -152,6 +163,8 @@ typedef struct {
/* syncs main thread and channels */
QemuSemaphore sem_sync;
+ /* sem where to wait for more work */
+ QemuSemaphore sem;
/* this mutex protects the following parameters */
QemuMutex mutex;
@@ -161,6 +174,8 @@ typedef struct {
uint32_t flags;
/* global number of generated multifd packets */
uint64_t packet_num;
+ int pending_job;
+ MultiFDRecvData *data;
/* thread local variables. No locking required */
--
2.35.3
* [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (14 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 15/23] migration/multifd: Allow receiving pages " Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:29 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
` (6 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
We'll need to access multifd_send_state->channels_created from outside
multifd.c, so introduce a helper for that.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/multifd.c | 7 ++++++-
migration/multifd.h | 1 +
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index d470af73ba..3574fd3953 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -101,6 +101,11 @@ static bool multifd_use_packets(void)
return !migrate_mapped_ram();
}
+void multifd_send_channel_created(void)
+{
+ qemu_sem_post(&multifd_send_state->channels_created);
+}
+
/* Multifd without compression */
/**
@@ -1023,7 +1028,7 @@ out:
* Here we're not interested whether creation succeeded, only that
* it happened at all.
*/
- qemu_sem_post(&multifd_send_state->channels_created);
+ multifd_send_channel_created();
if (ret) {
return;
diff --git a/migration/multifd.h b/migration/multifd.h
index 1be985978e..1d8bbaf96b 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -17,6 +17,7 @@ typedef struct MultiFDRecvData MultiFDRecvData;
bool multifd_send_setup(void);
void multifd_send_shutdown(void);
+void multifd_send_channel_created(void);
int multifd_recv_setup(Error **errp);
void multifd_recv_cleanup(void);
void multifd_recv_shutdown(void);
--
2.35.3
* [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (15 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:44 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 18/23] migration/multifd: Add incoming " Fabiano Rosas
` (5 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
Allow multifd to open file-backed channels. This will be used when
enabling the mapped-ram migration stream format which expects a
seekable transport.
The QIOChannel read and write methods will use the preadv/pwritev
versions, which don't update the file offset at each call, so we can
reuse the fd without re-opening it for every channel.
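For reference, a minimal standalone C sketch of the semantics being
relied on here (not QEMU code; the helper name is made up):
#include <sys/uio.h>
/*
 * pwritev() writes at an explicit offset and does not advance the
 * fd's file position, so several threads can issue writes through
 * the same open file without racing on a shared offset.
 */
static ssize_t write_buf_at(int fd, void *buf, size_t len, off_t offset)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    return pwritev(fd, &iov, 1, offset);
}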
Unlike the socket migration, the file migration doesn't need an
asynchronous channel creation process, so expose
multifd_channel_connect() and call it directly.
Note that this is just setup code and multifd cannot yet make use of
the file channels.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- moved flags change to another patch
- removed channels_created assert
---
migration/file.c | 41 +++++++++++++++++++++++++++++++++++++++--
migration/file.h | 4 ++++
migration/multifd.c | 18 +++++++++++++++---
migration/multifd.h | 1 +
4 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/migration/file.c b/migration/file.c
index 22d052a71f..83328a7a1b 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -12,12 +12,17 @@
#include "channel.h"
#include "file.h"
#include "migration.h"
+#include "multifd.h"
#include "io/channel-file.h"
#include "io/channel-util.h"
#include "trace.h"
#define OFFSET_OPTION ",offset="
+static struct FileOutgoingArgs {
+ char *fname;
+} outgoing_args;
+
/* Remove the offset option from @filespec and return it in @offsetp. */
int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
@@ -37,6 +42,36 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
return 0;
}
+void file_cleanup_outgoing_migration(void)
+{
+ g_free(outgoing_args.fname);
+ outgoing_args.fname = NULL;
+}
+
+bool file_send_channel_create(gpointer opaque, Error **errp)
+{
+ QIOChannelFile *ioc;
+ int flags = O_WRONLY;
+ bool ret = true;
+
+ ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
+ if (!ioc) {
+ ret = false;
+ goto out;
+ }
+
+ multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
+
+out:
+ /*
+ * File channel creation is synchronous. However posting this
+ * semaphore here is simpler than adding a special case.
+ */
+ multifd_send_channel_created();
+
+ return ret;
+}
+
void file_start_outgoing_migration(MigrationState *s,
FileMigrationArgs *file_args, Error **errp)
{
@@ -47,12 +82,14 @@ void file_start_outgoing_migration(MigrationState *s,
trace_migration_file_outgoing(filename);
- fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
- 0600, errp);
+ fioc = qio_channel_file_new_path(filename, O_CREAT | O_TRUNC | O_WRONLY,
+ 0660, errp);
if (!fioc) {
return;
}
+ outgoing_args.fname = g_strdup(filename);
+
ioc = QIO_CHANNEL(fioc);
if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
return;
diff --git a/migration/file.h b/migration/file.h
index 37d6a08bfc..4577f9efdd 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -9,10 +9,14 @@
#define QEMU_MIGRATION_FILE_H
#include "qapi/qapi-types-migration.h"
+#include "io/task.h"
+#include "channel.h"
void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp);
void file_start_outgoing_migration(MigrationState *s,
FileMigrationArgs *file_args, Error **errp);
int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
+void file_cleanup_outgoing_migration(void);
+bool file_send_channel_create(gpointer opaque, Error **errp);
#endif
diff --git a/migration/multifd.c b/migration/multifd.c
index 3574fd3953..f155223303 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -17,6 +17,7 @@
#include "exec/ramblock.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
+#include "file.h"
#include "ram.h"
#include "migration.h"
#include "migration-stats.h"
@@ -28,6 +29,7 @@
#include "threadinfo.h"
#include "options.h"
#include "qemu/yank.h"
+#include "io/channel-file.h"
#include "io/channel-socket.h"
#include "yank_functions.h"
@@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
{
if (p->c) {
migration_ioc_unregister_yank(p->c);
+ qio_channel_close(p->c, NULL);
object_unref(OBJECT(p->c));
p->c = NULL;
}
@@ -715,6 +718,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
static void multifd_send_cleanup_state(void)
{
+ file_cleanup_outgoing_migration();
socket_cleanup_outgoing_migration();
qemu_sem_destroy(&multifd_send_state->channels_created);
qemu_sem_destroy(&multifd_send_state->channels_ready);
@@ -977,7 +981,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
return true;
}
-static void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
+void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
{
qio_channel_set_delay(ioc, false);
@@ -1045,9 +1049,14 @@ out:
error_free(local_err);
}
-static void multifd_new_send_channel_create(gpointer opaque)
+static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
{
+ if (!multifd_use_packets()) {
+ return file_send_channel_create(opaque, errp);
+ }
+
socket_send_channel_create(multifd_new_send_channel_async, opaque);
+ return true;
}
bool multifd_send_setup(void)
@@ -1096,7 +1105,10 @@ bool multifd_send_setup(void)
p->page_size = qemu_target_page_size();
p->page_count = page_count;
p->write_flags = 0;
- multifd_new_send_channel_create(p);
+
+ if (!multifd_new_send_channel_create(p, &local_err)) {
+ return false;
+ }
}
/*
diff --git a/migration/multifd.h b/migration/multifd.h
index 1d8bbaf96b..db8887f088 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -227,5 +227,6 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p)
p->iovs_num++;
}
+void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc);
#endif
--
2.35.3
* [PATCH v5 18/23] migration/multifd: Add incoming QIOChannelFile support
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (16 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 2:53 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration Fabiano Rosas
` (4 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
On the receiving side we don't need to differentiate between the main
channel and the threads, so whichever channel is defined first gets to
be the main one. And since there are no packets, use the atomic channel
count to index into the params array.
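A minimal sketch of the per-channel fd handling used below
(illustrative only, the helper name is made up): each channel wraps its
own dup() of the descriptor, so closing one channel's fd does not
invalidate the others.
static QIOChannelFile *incoming_channel_for_fd(int shared_fd)
{
    int fd = dup(shared_fd);
    if (fd < 0) {
        return NULL;
    }
    /* the QIOChannelFile takes ownership of the duplicated fd */
    return qio_channel_file_new_fd(fd);
}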
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- used dup()
---
migration/file.c | 35 +++++++++++++++++++++++++++--------
migration/migration.c | 3 ++-
migration/multifd.c | 3 +--
3 files changed, 30 insertions(+), 11 deletions(-)
diff --git a/migration/file.c b/migration/file.c
index 83328a7a1b..5e1348fec0 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -8,6 +8,7 @@
#include "qemu/osdep.h"
#include "exec/ramblock.h"
#include "qemu/cutils.h"
+#include "qemu/error-report.h"
#include "qapi/error.h"
#include "channel.h"
#include "file.h"
@@ -15,6 +16,7 @@
#include "multifd.h"
#include "io/channel-file.h"
#include "io/channel-util.h"
+#include "options.h"
#include "trace.h"
#define OFFSET_OPTION ",offset="
@@ -112,7 +114,8 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
g_autofree char *filename = g_strdup(file_args->filename);
QIOChannelFile *fioc = NULL;
uint64_t offset = file_args->offset;
- QIOChannel *ioc;
+ int channels = 1;
+ int i = 0;
trace_migration_file_incoming(filename);
@@ -121,13 +124,29 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
return;
}
- ioc = QIO_CHANNEL(fioc);
- if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
+ if (offset &&
+ qio_channel_io_seek(QIO_CHANNEL(fioc), offset, SEEK_SET, errp) < 0) {
return;
}
- qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-incoming");
- qio_channel_add_watch_full(ioc, G_IO_IN,
- file_accept_incoming_migration,
- NULL, NULL,
- g_main_context_get_thread_default());
+
+ if (migrate_multifd()) {
+ channels += migrate_multifd_channels();
+ }
+
+ do {
+ QIOChannel *ioc = QIO_CHANNEL(fioc);
+
+ qio_channel_set_name(ioc, "migration-file-incoming");
+ qio_channel_add_watch_full(ioc, G_IO_IN,
+ file_accept_incoming_migration,
+ NULL, NULL,
+ g_main_context_get_thread_default());
+
+ fioc = qio_channel_file_new_fd(dup(fioc->fd));
+
+ if (!fioc || fioc->fd == -1) {
+ error_setg(errp, "Error creating migration incoming channel");
+ break;
+ }
+ } while (++i < channels);
}
diff --git a/migration/migration.c b/migration/migration.c
index c1cc003b99..ff3872468f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -909,7 +909,8 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
uint32_t channel_magic = 0;
int ret = 0;
- if (migrate_multifd() && !migrate_postcopy_ram() &&
+ if (migrate_multifd() && !migrate_mapped_ram() &&
+ !migrate_postcopy_ram() &&
qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
/*
* With multiple channels, it is possible that we receive channels
diff --git a/migration/multifd.c b/migration/multifd.c
index f155223303..7c3994b3ba 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1545,8 +1545,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
}
trace_multifd_recv_new_channel(id);
} else {
- /* next patch gives this a meaningful value */
- id = 0;
+ id = qatomic_read(&multifd_recv_state->count);
}
p = &multifd_recv_state->params[id];
--
2.35.3
* [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (17 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 18/23] migration/multifd: Add incoming " Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 3:16 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 20/23] migration/multifd: Support outgoing mapped-ram stream format Fabiano Rosas
` (3 subsequent siblings)
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
The mapped-ram migration can be performed live or non-live, but it is
always asynchronous, i.e. the source machine and the destination
machine are not migrating at the same time. We only need some pieces
of the multifd sync operations.
multifd_send_sync_main()
------------------------
Issued by the ram migration code on the migration thread, causes the
multifd send channels to synchronize with the migration thread and
makes the sending side emit a packet with the MULTIFD_FLUSH flag.
With mapped-ram we want to maintain the sync on the sending side
because that provides ordering between the rounds of dirty pages when
migrating live.
MULTIFD_FLUSH
-------------
On the receiving side, the presence of the MULTIFD_FLUSH flag on a
packet causes the receiving channels to start synchronizing with the
main thread.
We're not using packets with mapped-ram, so there's no MULTIFD_FLUSH
flag and therefore no channel sync on the receiving side.
multifd_recv_sync_main()
------------------------
Issued by the migration thread when the ram migration flag
RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
on the receiving side to start synchronizing with the recv
channels. For backward compatibility, this is also issued when
RAM_SAVE_FLAG_EOS is received.
For mapped-ram we only need to synchronize the channels at the end of
migration to avoid doing cleanup before the channels have finished
their IO.
Make sure the multifd syncs are only issued at the appropriate times.
Note that due to pre-existing backward compatibility issues, we have
the multifd_flush_after_each_section property that can cause a sync to
happen at EOS. Since the EOS flag is needed on the stream, allow
mapped-ram to just ignore it.
Also emit an error if any other unexpected flags are found on the
stream.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- skipped all FLUSH flags
- added invalid flags
- skipped EOS
---
migration/ram.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 18620784c6..250dcd110c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1368,8 +1368,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
if (ret < 0) {
return ret;
}
- qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
- qemu_fflush(f);
+
+ if (!migrate_mapped_ram()) {
+ qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+ qemu_fflush(f);
+ }
}
/*
* If memory migration starts over, we will meet a dirtied page
@@ -3111,7 +3114,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
return ret;
}
- if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
+ if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
+ && !migrate_mapped_ram()) {
qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
}
@@ -3334,7 +3338,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
}
}
- if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
+ if (migrate_multifd() && !migrate_multifd_flush_after_each_section() &&
+ !migrate_mapped_ram()) {
qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
}
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
@@ -4137,6 +4142,12 @@ static int ram_load_precopy(QEMUFile *f)
invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
}
+ if (migrate_mapped_ram()) {
+ invalid_flags |= (RAM_SAVE_FLAG_EOS | RAM_SAVE_FLAG_HOOK |
+ RAM_SAVE_FLAG_MULTIFD_FLUSH | RAM_SAVE_FLAG_PAGE |
+ RAM_SAVE_FLAG_XBZRLE | RAM_SAVE_FLAG_ZERO);
+ }
+
while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
ram_addr_t addr;
void *host = NULL, *host_bak = NULL;
@@ -4158,6 +4169,13 @@ static int ram_load_precopy(QEMUFile *f)
addr &= TARGET_PAGE_MASK;
if (flags & invalid_flags) {
+ if (invalid_flags & RAM_SAVE_FLAG_EOS) {
+ /* EOS is always present, just ignore it */
+ continue;
+ }
+
+ error_report("Unexpected RAM flags: %d", flags & invalid_flags);
+
if (flags & invalid_flags & RAM_SAVE_FLAG_COMPRESS_PAGE) {
error_report("Received an unexpected compressed page");
}
--
2.35.3
* [PATCH v5 20/23] migration/multifd: Support outgoing mapped-ram stream format
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (18 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 21/23] migration/multifd: Support incoming " Fabiano Rosas
` (2 subsequent siblings)
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
The new mapped-ram stream format uses a file transport and puts ram
pages in the migration file at their respective offsets. This can be
done in parallel by using the pwritev system call, which takes iovecs
and an offset.
Add support for enabling the new format along with multifd to make use
of the threading and page handling already in place.
This requires multifd to stop sending headers and to leave the stream
format to the mapped-ram code. When it comes time to write the data, we
need to call a version of qio_channel_write that can take an offset.
Usage on HMP is:
(qemu) stop
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate file:migfile
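For reference, the corresponding restore would look roughly like this,
assuming the destination was started with '-incoming defer' (not part
of this patch, and the exact invocation may differ):
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate_incoming file:migfile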
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- added used_length sanity check
- allowed returning negative value
---
include/qemu/bitops.h | 13 +++++++++++
migration/file.c | 54 +++++++++++++++++++++++++++++++++++++++++++
migration/file.h | 2 ++
migration/migration.c | 17 ++++++++++----
migration/multifd.c | 24 +++++++++++++++++--
migration/options.c | 13 ++++++-----
migration/ram.c | 17 +++++++++++---
migration/ram.h | 1 +
8 files changed, 125 insertions(+), 16 deletions(-)
diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index cb3526d1f4..2c0a2fe751 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -67,6 +67,19 @@ static inline void clear_bit(long nr, unsigned long *addr)
*p &= ~mask;
}
+/**
+ * clear_bit_atomic - Clears a bit in memory atomically
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ */
+static inline void clear_bit_atomic(long nr, unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ unsigned long *p = addr + BIT_WORD(nr);
+
+ return qatomic_and(p, ~mask);
+}
+
/**
* change_bit - Toggle a bit in memory
* @nr: Bit to change
diff --git a/migration/file.c b/migration/file.c
index 5e1348fec0..2188774a9d 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -150,3 +150,57 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
}
} while (++i < channels);
}
+
+int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
+ int niov, RAMBlock *block, Error **errp)
+{
+ ssize_t ret = -1;
+ int i, slice_idx, slice_num;
+ uintptr_t base, next, offset;
+ size_t len;
+
+ slice_idx = 0;
+ slice_num = 1;
+
+ /*
+ * If the iov array doesn't have contiguous elements, we need to
+ * split it in slices because we only have one file offset for the
+ * whole iov. Do this here so callers don't need to break the iov
+ * array themselves.
+ */
+ for (i = 0; i < niov; i++, slice_num++) {
+ base = (uintptr_t) iov[i].iov_base;
+
+ if (i != niov - 1) {
+ len = iov[i].iov_len;
+ next = (uintptr_t) iov[i + 1].iov_base;
+
+ if (base + len == next) {
+ continue;
+ }
+ }
+
+ /*
+ * Use the offset of the first element of the segment that
+ * we're sending.
+ */
+ offset = (uintptr_t) iov[slice_idx].iov_base - (uintptr_t) block->host;
+ if (offset >= block->used_length) {
+ error_setg(errp, "offset " RAM_ADDR_FMT
+ "outside of ramblock %s range", offset, block->idstr);
+ ret = -1;
+ break;
+ }
+
+ ret = qio_channel_pwritev(ioc, &iov[slice_idx], slice_num,
+ block->pages_offset + offset, errp);
+ if (ret < 0) {
+ break;
+ }
+
+ slice_idx += slice_num;
+ slice_num = 0;
+ }
+
+ return (ret < 0) ? ret : 0;
+}
diff --git a/migration/file.h b/migration/file.h
index 4577f9efdd..01a338cac7 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -19,4 +19,6 @@ void file_start_outgoing_migration(MigrationState *s,
int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
void file_cleanup_outgoing_migration(void);
bool file_send_channel_create(gpointer opaque, Error **errp);
+int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
+ int niov, RAMBlock *block, Error **errp);
#endif
diff --git a/migration/migration.c b/migration/migration.c
index ff3872468f..957d2890b7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -139,12 +139,14 @@ static bool transport_supports_multi_channels(MigrationAddress *addr)
if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
SocketAddress *saddr = &addr->u.socket;
- return saddr->type == SOCKET_ADDRESS_TYPE_INET ||
- saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
- saddr->type == SOCKET_ADDRESS_TYPE_VSOCK;
+ return (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+ saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+ saddr->type == SOCKET_ADDRESS_TYPE_VSOCK);
+ } else if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) {
+ return migrate_mapped_ram();
+ } else {
+ return false;
}
-
- return false;
}
static bool migration_needs_seekable_channel(void)
@@ -1988,6 +1990,11 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
error_setg(errp, "Cannot use TLS with mapped-ram");
return false;
}
+
+ if (migrate_multifd_compression()) {
+ error_setg(errp, "Cannot use compression with mapped-ram");
+ return false;
+ }
}
if (migrate_mode_is_cpr(s)) {
diff --git a/migration/multifd.c b/migration/multifd.c
index 7c3994b3ba..e31d2934cc 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -108,6 +108,17 @@ void multifd_send_channel_created(void)
qemu_sem_post(&multifd_send_state->channels_created);
}
+static void multifd_set_file_bitmap(MultiFDSendParams *p)
+{
+ MultiFDPages_t *pages = p->pages;
+
+ assert(pages->block);
+
+ for (int i = 0; i < p->pages->num; i++) {
+ ramblock_set_file_bmap_atomic(pages->block, pages->offset[i]);
+ }
+}
+
/* Multifd without compression */
/**
@@ -169,6 +180,8 @@ static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
if (!multifd_use_packets()) {
multifd_send_prepare_iovs(p);
+ multifd_set_file_bitmap(p);
+
return 0;
}
@@ -867,8 +880,15 @@ static void *multifd_send_thread(void *opaque)
break;
}
- ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
- 0, p->write_flags, &local_err);
+ if (migrate_mapped_ram()) {
+ ret = file_write_ramblock_iov(p->c, p->iov, p->iovs_num,
+ p->pages->block, &local_err);
+ } else {
+ ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num,
+ NULL, 0, p->write_flags,
+ &local_err);
+ }
+
if (ret != 0) {
break;
}
diff --git a/migration/options.c b/migration/options.c
index c6edbe4f3e..b6f39c57d8 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -654,12 +654,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
}
if (new_caps[MIGRATION_CAPABILITY_MAPPED_RAM]) {
- if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
- error_setg(errp,
- "Mapped-ram migration is incompatible with multifd");
- return false;
- }
-
if (new_caps[MIGRATION_CAPABILITY_XBZRLE]) {
error_setg(errp,
"Mapped-ram migration is incompatible with xbzrle");
@@ -1252,6 +1246,13 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
}
#endif
+ if (migrate_mapped_ram() &&
+ (migrate_multifd_compression() || migrate_tls())) {
+ error_setg(errp,
+ "Mapped-ram only available for non-compressed non-TLS multifd migration");
+ return false;
+ }
+
if (params->has_x_vcpu_dirty_limit_period &&
(params->x_vcpu_dirty_limit_period < 1 ||
params->x_vcpu_dirty_limit_period > 1000)) {
diff --git a/migration/ram.c b/migration/ram.c
index 250dcd110c..53643d2046 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1148,7 +1148,7 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss,
if (migrate_mapped_ram()) {
/* zero pages are not transferred with mapped-ram */
- clear_bit(offset >> TARGET_PAGE_BITS, pss->block->file_bmap);
+ clear_bit_atomic(offset >> TARGET_PAGE_BITS, pss->block->file_bmap);
return 1;
}
@@ -2444,8 +2444,6 @@ static void ram_save_cleanup(void *opaque)
block->clear_bmap = NULL;
g_free(block->bmap);
block->bmap = NULL;
- g_free(block->file_bmap);
- block->file_bmap = NULL;
}
xbzrle_cleanup();
@@ -3134,9 +3132,22 @@ static void ram_save_file_bmap(QEMUFile *f)
qemu_put_buffer_at(f, (uint8_t *)block->file_bmap, bitmap_size,
block->bitmap_offset);
ram_transferred_add(bitmap_size);
+
+ /*
+ * Free the bitmap here to catch any synchronization issues
+ * with multifd channels. No channels should be sending pages
+ * after we've written the bitmap to file.
+ */
+ g_free(block->file_bmap);
+ block->file_bmap = NULL;
}
}
+void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset)
+{
+ set_bit_atomic(offset >> TARGET_PAGE_BITS, block->file_bmap);
+}
+
/**
* ram_save_iterate: iterative stage for migration
*
diff --git a/migration/ram.h b/migration/ram.h
index 9b937a446b..b9ac0da587 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -75,6 +75,7 @@ bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
void postcopy_preempt_shutdown_file(MigrationState *s);
void *postcopy_preempt_thread(void *opaque);
+void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset);
/* ram cache */
int colo_init_ram_cache(void);
--
2.35.3
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v5 21/23] migration/multifd: Support incoming mapped-ram stream format
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (19 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 20/23] migration/multifd: Support outgoing mapped-ram stream format Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 3:23 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 23/23] tests/qtest/migration: Add a multifd + mapped-ram migration test Fabiano Rosas
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
For the incoming mapped-ram migration we need to read the ramblock
headers, get the pages bitmap and send the host address of each
non-zero page to the multifd channel thread for writing.
Usage on HMP is:
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_incoming file:migfile
(the ram.h include needs to move because we've been previously relying
on it being included from migration.c. Now file.h will start including
multifd.h before migration.o is processed)
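As a rough sketch of the receive-side flow described above (walk the
pages bitmap from the ramblock header and read every present page from
its fixed offset in the file), here is an illustrative fragment in plain
C. It issues a direct pread(2) where the actual patch hands the work to
a multifd channel thread; all names in it are made up for the example.

    #include <unistd.h>     /* pread */
    #include <stdbool.h>
    #include <stddef.h>

    static bool page_bit_set(const unsigned long *bmap, long nr)
    {
        size_t bits = 8 * sizeof(unsigned long);

        return bmap[nr / bits] & (1UL << (nr % bits));
    }

    /*
     * Read every page marked in the bitmap from its fixed offset in the
     * migration file into the matching location in guest RAM. The real
     * code batches runs of pages and dispatches each read to a multifd
     * channel thread instead of issuing it inline.
     */
    static int load_mapped_pages(int fd, void *ram_base, off_t pages_offset,
                                 const unsigned long *bmap, long num_pages,
                                 size_t page_size)
    {
        for (long i = 0; i < num_pages; i++) {
            if (!page_bit_set(bmap, i)) {
                continue;   /* zero page, not present in the file */
            }
            void *host = (char *)ram_base + (size_t)i * page_size;
            off_t off = pages_offset + (off_t)i * page_size;

            if (pread(fd, host, page_size, off) != (ssize_t)page_size) {
                return -1;  /* error or short read */
            }
        }
        return 0;
    }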
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- dropped NOCOMP flag handling
- dropped comment and newline
- dropped duplicate LOAD_BUF_SIZE
---
migration/file.c | 18 +++++++++++++++++-
migration/file.h | 2 ++
migration/multifd.c | 31 ++++++++++++++++++++++++++++---
migration/multifd.h | 2 ++
migration/ram.c | 26 ++++++++++++++++++++++++--
5 files changed, 73 insertions(+), 6 deletions(-)
diff --git a/migration/file.c b/migration/file.c
index 2188774a9d..44b73bf9e5 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -13,7 +13,6 @@
#include "channel.h"
#include "file.h"
#include "migration.h"
-#include "multifd.h"
#include "io/channel-file.h"
#include "io/channel-util.h"
#include "options.h"
@@ -204,3 +203,20 @@ int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
return (ret < 0) ? ret : 0;
}
+
+int multifd_file_recv_data(MultiFDRecvParams *p, Error **errp)
+{
+ MultiFDRecvData *data = p->data;
+ size_t ret;
+
+ ret = qio_channel_pread(p->c, (char *) data->opaque,
+ data->size, data->file_offset, errp);
+ if (ret != data->size) {
+ error_prepend(errp,
+ "multifd recv (%u): read 0x%zx, expected 0x%zx",
+ p->id, ret, data->size);
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/migration/file.h b/migration/file.h
index 01a338cac7..9f71e87f74 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -11,6 +11,7 @@
#include "qapi/qapi-types-migration.h"
#include "io/task.h"
#include "channel.h"
+#include "multifd.h"
void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp);
@@ -21,4 +22,5 @@ void file_cleanup_outgoing_migration(void);
bool file_send_channel_create(gpointer opaque, Error **errp);
int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
int niov, RAMBlock *block, Error **errp);
+int multifd_file_recv_data(MultiFDRecvParams *p, Error **errp);
#endif
diff --git a/migration/multifd.c b/migration/multifd.c
index e31d2934cc..7a3977fc34 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -18,7 +18,6 @@
#include "qemu/error-report.h"
#include "qapi/error.h"
#include "file.h"
-#include "ram.h"
#include "migration.h"
#include "migration-stats.h"
#include "socket.h"
@@ -251,7 +250,7 @@ static int nocomp_recv(MultiFDRecvParams *p, Error **errp)
uint32_t flags;
if (!multifd_use_packets()) {
- return 0;
+ return multifd_file_recv_data(p, errp);
}
flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
@@ -1331,22 +1330,48 @@ void multifd_recv_cleanup(void)
void multifd_recv_sync_main(void)
{
int thread_count = migrate_multifd_channels();
+ bool file_based = !multifd_use_packets();
int i;
- if (!migrate_multifd() || !multifd_use_packets()) {
+ if (!migrate_multifd()) {
return;
}
+ /*
+ * File-based channels don't use packets and therefore need to
+ * wait for more work. Release them to start the sync.
+ */
+ if (file_based) {
+ for (i = 0; i < thread_count; i++) {
+ MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+ trace_multifd_recv_sync_main_signal(p->id);
+ qemu_sem_post(&p->sem);
+ }
+ }
+
/*
* Initiate the synchronization by waiting for all channels.
+ *
* For socket-based migration this means each channel has received
* the SYNC packet on the stream.
+ *
+ * For file-based migration this means each channel is done with
+ * the work (pending_job=false).
*/
for (i = 0; i < thread_count; i++) {
trace_multifd_recv_sync_main_wait(i);
qemu_sem_wait(&multifd_recv_state->sem_sync);
}
+ if (file_based) {
+ /*
+ * For file-based migration, loading is done in one iteration.
+ * We're done.
+ */
+ return;
+ }
+
/*
* Sync done. Release the channels for the next iteration.
*/
diff --git a/migration/multifd.h b/migration/multifd.h
index db8887f088..7447c2bea3 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -13,6 +13,8 @@
#ifndef QEMU_MIGRATION_MULTIFD_H
#define QEMU_MIGRATION_MULTIFD_H
+#include "ram.h"
+
typedef struct MultiFDRecvData MultiFDRecvData;
bool multifd_send_setup(void);
diff --git a/migration/ram.c b/migration/ram.c
index 53643d2046..c8e2372f06 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3950,6 +3950,22 @@ void colo_flush_ram_cache(void)
trace_colo_flush_ram_cache_end();
}
+static size_t ram_load_multifd_pages(void *host_addr, size_t size,
+ uint64_t offset)
+{
+ MultiFDRecvData *data = multifd_get_recv_data();
+
+ data->opaque = host_addr;
+ data->file_offset = offset;
+ data->size = size;
+
+ if (!multifd_recv()) {
+ return 0;
+ }
+
+ return size;
+}
+
static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
long num_pages, unsigned long *bitmap,
Error **errp)
@@ -3979,8 +3995,14 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
size = MIN(unread, MAPPED_RAM_LOAD_BUF_SIZE);
- read = qemu_get_buffer_at(f, host, size,
- block->pages_offset + offset);
+ if (migrate_multifd()) {
+ read = ram_load_multifd_pages(host, size,
+ block->pages_offset + offset);
+ } else {
+ read = qemu_get_buffer_at(f, host, size,
+ block->pages_offset + offset);
+ }
+
if (!read) {
goto err;
}
--
2.35.3
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (20 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 21/23] migration/multifd: Support incoming " Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
2024-02-29 3:31 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 23/23] tests/qtest/migration: Add a multifd + mapped-ram migration test Fabiano Rosas
22 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel; +Cc: berrange, armbru, Peter Xu, Claudio Fontana
If we receive a file descriptor that points to a regular file, there's
nothing stopping us from doing multifd migration with mapped-ram to
that file.
Enable the fd: URI to work with multifd + mapped-ram.
Note that the fds passed into multifd are duplicated because we want
to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The
original fd doesn't need to be duplicated because monitor_get_fd()
transfers ownership to the caller.
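A tiny illustration of the hazard being avoided, using plain POSIX
dup()/close() semantics and a hypothetical helper: every channel gets
its own duplicate of the descriptor, so one thread closing its copy
during cleanup cannot invalidate the fd another thread is still using.

    #include <unistd.h>

    /*
     * Hand each multifd channel a private dup() of the shared fd. All
     * copies refer to the same open file description, but closing one
     * only drops that copy's reference.
     */
    static int setup_channel_fds(int fd, int *channel_fds, int nchannels)
    {
        for (int i = 0; i < nchannels; i++) {
            channel_fds[i] = dup(fd);
            if (channel_fds[i] < 0) {
                while (i--) {
                    close(channel_fds[i]);
                }
                return -1;
            }
        }
        return 0;
    }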
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
- dup() the fds that are passed to multifd.
- update the commit message
---
migration/fd.c | 44 +++++++++++++++++++++++++++++++++++++++++++
migration/fd.h | 2 ++
migration/file.c | 16 +++++++++++-----
migration/migration.c | 4 ++++
migration/multifd.c | 2 ++
5 files changed, 63 insertions(+), 5 deletions(-)
diff --git a/migration/fd.c b/migration/fd.c
index 0eb677dcae..d4ae72d132 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -15,18 +15,41 @@
*/
#include "qemu/osdep.h"
+#include "qapi/error.h"
#include "channel.h"
#include "fd.h"
#include "migration.h"
#include "monitor/monitor.h"
+#include "io/channel-file.h"
#include "io/channel-util.h"
+#include "options.h"
#include "trace.h"
+static struct FdOutgoingArgs {
+ int fd;
+} outgoing_args;
+
+int fd_args_get_fd(void)
+{
+ return outgoing_args.fd;
+}
+
+void fd_cleanup_outgoing_migration(void)
+{
+ if (outgoing_args.fd > 0) {
+ close(outgoing_args.fd);
+ outgoing_args.fd = -1;
+ }
+}
+
void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp)
{
QIOChannel *ioc;
int fd = monitor_get_fd(monitor_cur(), fdname, errp);
+
+ outgoing_args.fd = -1;
+
if (fd == -1) {
return;
}
@@ -38,6 +61,8 @@ void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **
return;
}
+ outgoing_args.fd = fd;
+
qio_channel_set_name(ioc, "migration-fd-outgoing");
migration_channel_connect(s, ioc, NULL, NULL);
object_unref(OBJECT(ioc));
@@ -73,4 +98,23 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
fd_accept_incoming_migration,
NULL, NULL,
g_main_context_get_thread_default());
+
+ if (migrate_multifd()) {
+ int channels = migrate_multifd_channels();
+
+ while (channels--) {
+ ioc = QIO_CHANNEL(qio_channel_file_new_fd(dup(fd)));
+
+ if (QIO_CHANNEL_FILE(ioc)->fd == -1) {
+ error_setg(errp, "Failed to duplicate fd %d", fd);
+ return;
+ }
+
+ qio_channel_set_name(ioc, "migration-fd-incoming");
+ qio_channel_add_watch_full(ioc, G_IO_IN,
+ fd_accept_incoming_migration,
+ NULL, NULL,
+ g_main_context_get_thread_default());
+ }
+ }
}
diff --git a/migration/fd.h b/migration/fd.h
index b901bc014e..0c0a18d9e7 100644
--- a/migration/fd.h
+++ b/migration/fd.h
@@ -20,4 +20,6 @@ void fd_start_incoming_migration(const char *fdname, Error **errp);
void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
Error **errp);
+void fd_cleanup_outgoing_migration(void);
+int fd_args_get_fd(void);
#endif
diff --git a/migration/file.c b/migration/file.c
index 44b73bf9e5..9a8cba2c3d 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -11,6 +11,7 @@
#include "qemu/error-report.h"
#include "qapi/error.h"
#include "channel.h"
+#include "fd.h"
#include "file.h"
#include "migration.h"
#include "io/channel-file.h"
@@ -53,15 +54,20 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
{
QIOChannelFile *ioc;
int flags = O_WRONLY;
- bool ret = true;
+ bool ret = false;
+ int fd = fd_args_get_fd();
- ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
- if (!ioc) {
- ret = false;
- goto out;
+ if (fd && fd != -1) {
+ ioc = qio_channel_file_new_fd(dup(fd));
+ } else {
+ ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
+ if (!ioc) {
+ goto out;
+ }
}
multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
+ ret = true;
out:
/*
diff --git a/migration/migration.c b/migration/migration.c
index 957d2890b7..0f1c044707 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -139,6 +139,10 @@ static bool transport_supports_multi_channels(MigrationAddress *addr)
if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
SocketAddress *saddr = &addr->u.socket;
+ if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
+ return migrate_mapped_ram();
+ }
+
return (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
saddr->type == SOCKET_ADDRESS_TYPE_VSOCK);
diff --git a/migration/multifd.c b/migration/multifd.c
index 7a3977fc34..e46e1f2bf3 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -17,6 +17,7 @@
#include "exec/ramblock.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
+#include "fd.h"
#include "file.h"
#include "migration.h"
#include "migration-stats.h"
@@ -731,6 +732,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
static void multifd_send_cleanup_state(void)
{
file_cleanup_outgoing_migration();
+ fd_cleanup_outgoing_migration();
socket_cleanup_outgoing_migration();
qemu_sem_destroy(&multifd_send_state->channels_created);
qemu_sem_destroy(&multifd_send_state->channels_ready);
--
2.35.3
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH v5 23/23] tests/qtest/migration: Add a multifd + mapped-ram migration test
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
` (21 preceding siblings ...)
2024-02-28 15:21 ` [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI Fabiano Rosas
@ 2024-02-28 15:21 ` Fabiano Rosas
22 siblings, 0 replies; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-28 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: berrange, armbru, Peter Xu, Claudio Fontana, Thomas Huth,
Laurent Vivier, Paolo Bonzini
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
tests/qtest/migration-test.c | 68 ++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 64a26009e9..a71504b262 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2248,6 +2248,46 @@ static void test_precopy_file_mapped_ram(void)
test_file_common(&args, true);
}
+static void *migrate_multifd_mapped_ram_start(QTestState *from, QTestState *to)
+{
+ migrate_mapped_ram_start(from, to);
+
+ migrate_set_parameter_int(from, "multifd-channels", 4);
+ migrate_set_parameter_int(to, "multifd-channels", 4);
+
+ migrate_set_capability(from, "multifd", true);
+ migrate_set_capability(to, "multifd", true);
+
+ return NULL;
+}
+
+static void test_multifd_file_mapped_ram_live(void)
+{
+ g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+ MigrateCommon args = {
+ .connect_uri = uri,
+ .listen_uri = "defer",
+ .start_hook = migrate_multifd_mapped_ram_start,
+ };
+
+ test_file_common(&args, false);
+}
+
+static void test_multifd_file_mapped_ram(void)
+{
+ g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+ MigrateCommon args = {
+ .connect_uri = uri,
+ .listen_uri = "defer",
+ .start_hook = migrate_multifd_mapped_ram_start,
+ };
+
+ test_file_common(&args, true);
+}
+
+
static void test_precopy_tcp_plain(void)
{
MigrateCommon args = {
@@ -2524,6 +2564,25 @@ static void test_migrate_precopy_fd_file_mapped_ram(void)
};
test_file_common(&args, true);
}
+
+static void *migrate_multifd_fd_mapped_ram_start(QTestState *from,
+ QTestState *to)
+{
+ migrate_multifd_mapped_ram_start(from, to);
+ return migrate_precopy_fd_file_start(from, to);
+}
+
+static void test_multifd_fd_mapped_ram(void)
+{
+ MigrateCommon args = {
+ .connect_uri = "fd:fd-mig",
+ .listen_uri = "defer",
+ .start_hook = migrate_multifd_fd_mapped_ram_start,
+ .finish_hook = test_migrate_fd_finish_hook
+ };
+
+ test_file_common(&args, true);
+}
#endif /* _WIN32 */
static void do_test_validate_uuid(MigrateStart *args, bool should_fail)
@@ -3566,6 +3625,15 @@ int main(int argc, char **argv)
migration_test_add("/migration/precopy/file/mapped-ram/live",
test_precopy_file_mapped_ram_live);
+ migration_test_add("/migration/multifd/file/mapped-ram",
+ test_multifd_file_mapped_ram);
+ migration_test_add("/migration/multifd/file/mapped-ram/live",
+ test_multifd_file_mapped_ram_live);
+#ifndef _WIN32
+ migration_test_add("/migration/multifd/fd/mapped-ram",
+ test_multifd_fd_mapped_ram);
+#endif
+
#ifdef CONFIG_GNUTLS
migration_test_add("/migration/precopy/unix/tls/psk",
test_precopy_unix_tls_psk);
--
2.35.3
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main
2024-02-28 15:21 ` [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main Fabiano Rosas
@ 2024-02-29 1:26 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 1:26 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:05PM -0300, Fabiano Rosas wrote:
> Some minor cleanups and documentation for multifd_recv_sync_main.
>
> Use thread_count as done in other parts of the code. Remove p->id from
> the multifd_recv_state sync, since that is global and not tied to a
> channel. Add documentation for the sync steps.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability
2024-02-28 15:21 ` [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability Fabiano Rosas
@ 2024-02-29 2:10 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:10 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana, Eric Blake
On Wed, Feb 28, 2024 at 12:21:11PM -0300, Fabiano Rosas wrote:
> Add a new migration capability 'mapped-ram'.
>
> The core of the feature is to ensure that RAM pages are mapped
> directly to offsets in the resulting migration file instead of being
> streamed at arbitrary points.
>
> The reasons why we'd want such behavior are:
>
> - The resulting file will have a bounded size, since pages which are
> dirtied multiple times will always go to a fixed location in the
> file, rather than constantly being added to a sequential
> stream. This eliminates cases where a VM with, say, 1G of RAM can
> result in a migration file that's 10s of GBs, provided that the
> workload constantly redirties memory.
>
> - It paves the way to implement O_DIRECT-enabled save/restore of the
> migration stream as the pages are ensured to be written at aligned
> offsets.
>
> - It allows the usage of multifd so we can write RAM pages to the
> migration file in parallel.
>
> For now, enabling the capability has no effect. The next couple of
> patches implement the core functionality.
>
> Acked-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 14/23] migration/multifd: Allow multifd without packets
2024-02-28 15:21 ` [PATCH v5 14/23] migration/multifd: Allow multifd without packets Fabiano Rosas
@ 2024-02-29 2:20 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:20 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:18PM -0300, Fabiano Rosas wrote:
> For the upcoming support to the new 'mapped-ram' migration stream
> format, we cannot use multifd packets because each write into the
> ramblock section in the migration file is expected to contain only the
> guest pages. They are written at their respective offsets relative to
> the ramblock section header.
>
> There is no space for the packet information and the expected gains
> from the new approach come partly from being able to write the pages
> sequentially without extraneous data in between.
>
> The new format also simply doesn't need the packets and all necessary
> information can be taken from the standard migration headers with some
> (future) changes to multifd code.
>
> Use the presence of the mapped-ram capability to decide whether to
> send packets.
>
> This only moves code under multifd_use_packets(), it has no effect for
> now as mapped-ram cannot yet be enabled with multifd.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> - added multifd_send_prepare_iovs
I saw that you also moved p->next_packet_size setup into it. IMHO it
doesn't need to be there; it'll also be a tiny bit confusing to me to
set up next_packet_size when !use_packet.
But I think I get your point on putting that together with IOV setups.
Not a big deal.
> - posted channels_created at file.c as well
This is done in the other patch ("migration/multifd: Add outgoing
QIOChannelFile support"). It won't appear when it's merged anyway, so
that's fine.
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 15/23] migration/multifd: Allow receiving pages without packets
2024-02-28 15:21 ` [PATCH v5 15/23] migration/multifd: Allow receiving pages " Fabiano Rosas
@ 2024-02-29 2:28 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:28 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:19PM -0300, Fabiano Rosas wrote:
> Currently multifd does not need to have knowledge of pages on the
> receiving side because all the information needed is within the
> packets that come in the stream.
>
> We're about to add support to mapped-ram migration, which cannot use
> packets because it expects the ramblock section in the migration file
> to contain only the guest pages data.
>
> Add a data structure to transfer pages between the ram migration code
> and the multifd receiving threads.
>
> We don't want to reuse MultiFDPages_t for two reasons:
>
> a) multifd threads don't really need to know about the data they're
> receiving.
>
> b) the receiving side has to be stopped to load the pages, which means
> we can experiment with larger granularities than page size when
> transferring data.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created
2024-02-28 15:21 ` [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created Fabiano Rosas
@ 2024-02-29 2:29 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:29 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:20PM -0300, Fabiano Rosas wrote:
> We'll need to access multifd_send_state->channels_created from outside
> multifd.c, so introduce a helper for that.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-28 15:21 ` [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
@ 2024-02-29 2:44 ` Peter Xu
2024-02-29 3:33 ` Peter Xu
0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:44 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:21PM -0300, Fabiano Rosas wrote:
> Allow multifd to open file-backed channels. This will be used when
> enabling the mapped-ram migration stream format which expects a
> seekable transport.
>
> The QIOChannel read and write methods will use the preadv/pwritev
> versions which don't update the file offset at each call so we can
> reuse the fd without re-opening for every channel.
>
> Contrary to the socket migration, the file migration doesn't need an
> asynchronous channel creation process, so expose
> multifd_channel_connect() and call it directly.
>
> Note that this is just setup code and multifd cannot yet make use of
> the file channels.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> - moved flags change to another patch
> - removed channels_created assert
> ---
> migration/file.c | 41 +++++++++++++++++++++++++++++++++++++++--
> migration/file.h | 4 ++++
> migration/multifd.c | 18 +++++++++++++++---
> migration/multifd.h | 1 +
> 4 files changed, 59 insertions(+), 5 deletions(-)
>
> diff --git a/migration/file.c b/migration/file.c
> index 22d052a71f..83328a7a1b 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -12,12 +12,17 @@
> #include "channel.h"
> #include "file.h"
> #include "migration.h"
> +#include "multifd.h"
> #include "io/channel-file.h"
> #include "io/channel-util.h"
> #include "trace.h"
>
> #define OFFSET_OPTION ",offset="
>
> +static struct FileOutgoingArgs {
> + char *fname;
> +} outgoing_args;
> +
> /* Remove the offset option from @filespec and return it in @offsetp. */
>
> int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> @@ -37,6 +42,36 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> return 0;
> }
>
> +void file_cleanup_outgoing_migration(void)
> +{
> + g_free(outgoing_args.fname);
> + outgoing_args.fname = NULL;
> +}
> +
> +bool file_send_channel_create(gpointer opaque, Error **errp)
> +{
> + QIOChannelFile *ioc;
> + int flags = O_WRONLY;
> + bool ret = true;
> +
> + ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> + if (!ioc) {
> + ret = false;
> + goto out;
> + }
> +
> + multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
> +
> +out:
> + /*
> + * File channel creation is synchronous. However posting this
> + * semaphore here is simpler than adding a special case.
> + */
> + multifd_send_channel_created();
> +
> + return ret;
> +}
> +
> void file_start_outgoing_migration(MigrationState *s,
> FileMigrationArgs *file_args, Error **errp)
> {
> @@ -47,12 +82,14 @@ void file_start_outgoing_migration(MigrationState *s,
>
> trace_migration_file_outgoing(filename);
>
> - fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
> - 0600, errp);
> + fioc = qio_channel_file_new_path(filename, O_CREAT | O_TRUNC | O_WRONLY,
> + 0660, errp);
It seems this is still leftover?
> if (!fioc) {
> return;
> }
>
> + outgoing_args.fname = g_strdup(filename);
> +
> ioc = QIO_CHANNEL(fioc);
> if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
> return;
> diff --git a/migration/file.h b/migration/file.h
> index 37d6a08bfc..4577f9efdd 100644
> --- a/migration/file.h
> +++ b/migration/file.h
> @@ -9,10 +9,14 @@
> #define QEMU_MIGRATION_FILE_H
>
> #include "qapi/qapi-types-migration.h"
> +#include "io/task.h"
> +#include "channel.h"
>
> void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp);
>
> void file_start_outgoing_migration(MigrationState *s,
> FileMigrationArgs *file_args, Error **errp);
> int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
> +void file_cleanup_outgoing_migration(void);
> +bool file_send_channel_create(gpointer opaque, Error **errp);
> #endif
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 3574fd3953..f155223303 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -17,6 +17,7 @@
> #include "exec/ramblock.h"
> #include "qemu/error-report.h"
> #include "qapi/error.h"
> +#include "file.h"
> #include "ram.h"
> #include "migration.h"
> #include "migration-stats.h"
> @@ -28,6 +29,7 @@
> #include "threadinfo.h"
> #include "options.h"
> #include "qemu/yank.h"
> +#include "io/channel-file.h"
> #include "io/channel-socket.h"
> #include "yank_functions.h"
>
> @@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
> {
> if (p->c) {
> migration_ioc_unregister_yank(p->c);
> + qio_channel_close(p->c, NULL);
s/NULL/&error_abort/?
> object_unref(OBJECT(p->c));
> p->c = NULL;
> }
> @@ -715,6 +718,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
>
> static void multifd_send_cleanup_state(void)
> {
> + file_cleanup_outgoing_migration();
> socket_cleanup_outgoing_migration();
> qemu_sem_destroy(&multifd_send_state->channels_created);
> qemu_sem_destroy(&multifd_send_state->channels_ready);
> @@ -977,7 +981,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
> return true;
> }
>
> -static void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
> +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
> {
> qio_channel_set_delay(ioc, false);
>
> @@ -1045,9 +1049,14 @@ out:
> error_free(local_err);
> }
>
> -static void multifd_new_send_channel_create(gpointer opaque)
> +static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
> {
> + if (!multifd_use_packets()) {
> + return file_send_channel_create(opaque, errp);
> + }
> +
> socket_send_channel_create(multifd_new_send_channel_async, opaque);
> + return true;
> }
>
> bool multifd_send_setup(void)
> @@ -1096,7 +1105,10 @@ bool multifd_send_setup(void)
> p->page_size = qemu_target_page_size();
> p->page_count = page_count;
> p->write_flags = 0;
> - multifd_new_send_channel_create(p);
> +
> + if (!multifd_new_send_channel_create(p, &local_err)) {
> + return -1;
"-1" is unfortunately a "true"!..
> + }
> }
>
> /*
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 1d8bbaf96b..db8887f088 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -227,5 +227,6 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p)
> p->iovs_num++;
> }
>
> +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc);
>
> #endif
> --
> 2.35.3
>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 18/23] migration/multifd: Add incoming QIOChannelFile support
2024-02-28 15:21 ` [PATCH v5 18/23] migration/multifd: Add incoming " Fabiano Rosas
@ 2024-02-29 2:53 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 2:53 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:22PM -0300, Fabiano Rosas wrote:
> On the receiving side we don't need to differentiate between main
> channel and threads, so whichever channel is defined first gets to be
> the main one. And since there are no packets, use the atomic channel
> count to index into the params array.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration
2024-02-28 15:21 ` [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration Fabiano Rosas
@ 2024-02-29 3:16 ` Peter Xu
2024-02-29 13:19 ` Fabiano Rosas
0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2024-02-29 3:16 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:23PM -0300, Fabiano Rosas wrote:
> The mapped-ram migration can be performed live or non-live, but it is
> always asynchronous, i.e. the source machine and the destination
> machine are not migrating at the same time. We only need some pieces
> of the multifd sync operations.
>
> multifd_send_sync_main()
> ------------------------
> Issued by the ram migration code on the migration thread, causes the
> multifd send channels to synchronize with the migration thread and
> makes the sending side emit a packet with the MULTIFD_FLUSH flag.
>
> With mapped-ram we want to maintain the sync on the sending side
> because that provides ordering between the rounds of dirty pages when
> migrating live.
IIUC as I used to comment, we should probably only need that sync after
each full iteration, which is find_dirty_block().
I think keeping the setup/complete sync is fine, and that can be discussed
separately. However IMHO we should still avoid the sync in
ram_save_iterate() always, or on new qemu + old machine types (where
flush_after_each_section=true) fixed-ram could suffer perf issues, IIUC.
So I assume at a minimum below would still be preferred?
@@ -3257,7 +3257,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
out:
if (ret >= 0
&& migration_is_setup_or_active(migrate_get_current()->state)) {
- if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
+ if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
+ !migrate_mapped_ram()) {
ret = multifd_send_sync_main();
if (ret < 0) {
return ret;
>
> MULTIFD_FLUSH
> -------------
> On the receiving side, the presence of the MULTIFD_FLUSH flag on a
> packet causes the receiving channels to start synchronizing with the
> main thread.
>
> We're not using packets with mapped-ram, so there's no MULTIFD_FLUSH
> flag and therefore no channel sync on the receiving side.
>
> multifd_recv_sync_main()
> ------------------------
> Issued by the migration thread when the ram migration flag
> RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
> on the receiving side to start synchronizing with the recv
> channels. Due to compatibility, this is also issued when
> RAM_SAVE_FLAG_EOS is received.
>
> For mapped-ram we only need to synchronize the channels at the end of
> migration to avoid doing cleanup before the channels have finished
> their IO.
Did you forget to add the sync at parse_ramblocks() for mapped-ram?
>
> Make sure the multifd syncs are only issued at the appropriate times.
>
> Note that due to pre-existing backward compatibility issues, we have
> the multifd_flush_after_each_section property that can cause a sync to
> happen at EOS. Since the EOS flag is needed on the stream, allow
> mapped-ram to just ignore it.
Skipping EOS makes sense, but I suggest doing that without invalid_flags. See
below.
>
> Also emit an error if any other unexpected flags are found on the
> stream.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> - skipped all FLUSH flags
> - added invalid flags
> - skipped EOS
> ---
> migration/ram.c | 26 ++++++++++++++++++++++----
> 1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 18620784c6..250dcd110c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1368,8 +1368,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> if (ret < 0) {
> return ret;
> }
> - qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> - qemu_fflush(f);
> +
> + if (!migrate_mapped_ram()) {
> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> + qemu_fflush(f);
> + }
> }
> /*
> * If memory migration starts over, we will meet a dirtied page
> @@ -3111,7 +3114,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> return ret;
> }
>
> - if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> + if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
> + && !migrate_mapped_ram()) {
> qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> }
>
> @@ -3334,7 +3338,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> }
> }
>
> - if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> + if (migrate_multifd() && !migrate_multifd_flush_after_each_section() &&
> + !migrate_mapped_ram()) {
> qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> }
> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> @@ -4137,6 +4142,12 @@ static int ram_load_precopy(QEMUFile *f)
> invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
> }
>
> + if (migrate_mapped_ram()) {
> + invalid_flags |= (RAM_SAVE_FLAG_EOS | RAM_SAVE_FLAG_HOOK |
> + RAM_SAVE_FLAG_MULTIFD_FLUSH | RAM_SAVE_FLAG_PAGE |
> + RAM_SAVE_FLAG_XBZRLE | RAM_SAVE_FLAG_ZERO);
IMHO EOS cannot be accounted as "invalid" here because it always exists.
Rather than this trick (then explicitly ignore it below... which is even
hackier, IMHO), we can avoid setting EOS in invalid_flags, but explicitly
ignore EOS in below code to bypass it for mapped-ram:
@@ -4301,7 +4302,12 @@ static int ram_load_precopy(QEMUFile *f)
case RAM_SAVE_FLAG_EOS:
/* normal exit */
if (migrate_multifd() &&
- migrate_multifd_flush_after_each_section()) {
+ migrate_multifd_flush_after_each_section() &&
+ /*
+ * Mapped-ram migration flushes once and for all after
+ * parsing ramblocks. Always ignore EOS for it.
+ */
+ !migrate_mapped_ram()) {
multifd_recv_sync_main();
}
break;
> + }
> +
> while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
> ram_addr_t addr;
> void *host = NULL, *host_bak = NULL;
> @@ -4158,6 +4169,13 @@ static int ram_load_precopy(QEMUFile *f)
> addr &= TARGET_PAGE_MASK;
>
> if (flags & invalid_flags) {
> + if (invalid_flags & RAM_SAVE_FLAG_EOS) {
> + /* EOS is always present, just ignore it */
> + continue;
> + }
> +
> + error_report("Unexpected RAM flags: %d", flags & invalid_flags);
> +
> if (flags & invalid_flags & RAM_SAVE_FLAG_COMPRESS_PAGE) {
> error_report("Received an unexpected compressed page");
> }
> --
> 2.35.3
>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 21/23] migration/multifd: Support incoming mapped-ram stream format
2024-02-28 15:21 ` [PATCH v5 21/23] migration/multifd: Support incoming " Fabiano Rosas
@ 2024-02-29 3:23 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 3:23 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:25PM -0300, Fabiano Rosas wrote:
> For the incoming mapped-ram migration we need to read the ramblock
> headers, get the pages bitmap and send the host address of each
> non-zero page to the multifd channel thread for writing.
>
> Usage on HMP is:
>
> (qemu) migrate_set_capability multifd on
> (qemu) migrate_set_capability mapped-ram on
> (qemu) migrate_incoming file:migfile
>
> (the ram.h include needs to move because we've been previously relying
> on it being included from migration.c. Now file.h will start including
> multifd.h before migration.o is processed)
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI
2024-02-28 15:21 ` [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI Fabiano Rosas
@ 2024-02-29 3:31 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-02-29 3:31 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Wed, Feb 28, 2024 at 12:21:26PM -0300, Fabiano Rosas wrote:
> If we receive a file descriptor that points to a regular file, there's
> nothing stopping us from doing multifd migration with mapped-ram to
> that file.
>
> Enable the fd: URI to work with multifd + mapped-ram.
>
> Note that the fds passed into multifd are duplicated because we want
> to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The
> original fd doesn't need to be duplicated because monitor_get_fd()
> transfers ownership to the caller.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-29 2:44 ` Peter Xu
@ 2024-02-29 3:33 ` Peter Xu
2024-02-29 14:27 ` Fabiano Rosas
0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2024-02-29 3:33 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Thu, Feb 29, 2024 at 10:44:21AM +0800, Peter Xu wrote:
> On Wed, Feb 28, 2024 at 12:21:21PM -0300, Fabiano Rosas wrote:
> > Allow multifd to open file-backed channels. This will be used when
> > enabling the mapped-ram migration stream format which expects a
> > seekable transport.
> >
> > The QIOChannel read and write methods will use the preadv/pwritev
> > versions which don't update the file offset at each call so we can
> > reuse the fd without re-opening for every channel.
> >
> > Contrary to the socket migration, the file migration doesn't need an
> > asynchronous channel creation process, so expose
> > multifd_channel_connect() and call it directly.
> >
> > Note that this is just setup code and multifd cannot yet make use of
> > the file channels.
> >
> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
> > ---
> > - moved flags change to another patch
> > - removed channels_created assert
> > ---
> > migration/file.c | 41 +++++++++++++++++++++++++++++++++++++++--
> > migration/file.h | 4 ++++
> > migration/multifd.c | 18 +++++++++++++++---
> > migration/multifd.h | 1 +
> > 4 files changed, 59 insertions(+), 5 deletions(-)
> >
> > diff --git a/migration/file.c b/migration/file.c
> > index 22d052a71f..83328a7a1b 100644
> > --- a/migration/file.c
> > +++ b/migration/file.c
> > @@ -12,12 +12,17 @@
> > #include "channel.h"
> > #include "file.h"
> > #include "migration.h"
> > +#include "multifd.h"
> > #include "io/channel-file.h"
> > #include "io/channel-util.h"
> > #include "trace.h"
> >
> > #define OFFSET_OPTION ",offset="
> >
> > +static struct FileOutgoingArgs {
> > + char *fname;
> > +} outgoing_args;
> > +
> > /* Remove the offset option from @filespec and return it in @offsetp. */
> >
> > int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> > @@ -37,6 +42,36 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> > return 0;
> > }
> >
> > +void file_cleanup_outgoing_migration(void)
> > +{
> > + g_free(outgoing_args.fname);
> > + outgoing_args.fname = NULL;
> > +}
> > +
> > +bool file_send_channel_create(gpointer opaque, Error **errp)
> > +{
> > + QIOChannelFile *ioc;
> > + int flags = O_WRONLY;
> > + bool ret = true;
> > +
> > + ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> > + if (!ioc) {
> > + ret = false;
> > + goto out;
> > + }
> > +
> > + multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
> > +
> > +out:
> > + /*
> > + * File channel creation is synchronous. However posting this
> > + * semaphore here is simpler than adding a special case.
> > + */
> > + multifd_send_channel_created();
> > +
> > + return ret;
> > +}
> > +
> > void file_start_outgoing_migration(MigrationState *s,
> > FileMigrationArgs *file_args, Error **errp)
> > {
> > @@ -47,12 +82,14 @@ void file_start_outgoing_migration(MigrationState *s,
> >
> > trace_migration_file_outgoing(filename);
> >
> > - fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
> > - 0600, errp);
> > + fioc = qio_channel_file_new_path(filename, O_CREAT | O_TRUNC | O_WRONLY,
> > + 0660, errp);
>
> It seems this is still leftover?
>
> > if (!fioc) {
> > return;
> > }
> >
> > + outgoing_args.fname = g_strdup(filename);
> > +
> > ioc = QIO_CHANNEL(fioc);
> > if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
> > return;
> > diff --git a/migration/file.h b/migration/file.h
> > index 37d6a08bfc..4577f9efdd 100644
> > --- a/migration/file.h
> > +++ b/migration/file.h
> > @@ -9,10 +9,14 @@
> > #define QEMU_MIGRATION_FILE_H
> >
> > #include "qapi/qapi-types-migration.h"
> > +#include "io/task.h"
> > +#include "channel.h"
> >
> > void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp);
> >
> > void file_start_outgoing_migration(MigrationState *s,
> > FileMigrationArgs *file_args, Error **errp);
> > int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
> > +void file_cleanup_outgoing_migration(void);
> > +bool file_send_channel_create(gpointer opaque, Error **errp);
> > #endif
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 3574fd3953..f155223303 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -17,6 +17,7 @@
> > #include "exec/ramblock.h"
> > #include "qemu/error-report.h"
> > #include "qapi/error.h"
> > +#include "file.h"
> > #include "ram.h"
> > #include "migration.h"
> > #include "migration-stats.h"
> > @@ -28,6 +29,7 @@
> > #include "threadinfo.h"
> > #include "options.h"
> > #include "qemu/yank.h"
> > +#include "io/channel-file.h"
> > #include "io/channel-socket.h"
> > #include "yank_functions.h"
> >
> > @@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
> > {
> > if (p->c) {
> > migration_ioc_unregister_yank(p->c);
> > + qio_channel_close(p->c, NULL);
>
> s/NULL/&error_abort/?
Or we can drop this line? IIUC iochannel finalize() will always close it,
or it could be a separate bug.
>
> > object_unref(OBJECT(p->c));
> > p->c = NULL;
> > }
> > @@ -715,6 +718,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
> >
> > static void multifd_send_cleanup_state(void)
> > {
> > + file_cleanup_outgoing_migration();
> > socket_cleanup_outgoing_migration();
> > qemu_sem_destroy(&multifd_send_state->channels_created);
> > qemu_sem_destroy(&multifd_send_state->channels_ready);
> > @@ -977,7 +981,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
> > return true;
> > }
> >
> > -static void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
> > +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
> > {
> > qio_channel_set_delay(ioc, false);
> >
> > @@ -1045,9 +1049,14 @@ out:
> > error_free(local_err);
> > }
> >
> > -static void multifd_new_send_channel_create(gpointer opaque)
> > +static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
> > {
> > + if (!multifd_use_packets()) {
> > + return file_send_channel_create(opaque, errp);
> > + }
> > +
> > socket_send_channel_create(multifd_new_send_channel_async, opaque);
> > + return true;
> > }
> >
> > bool multifd_send_setup(void)
> > @@ -1096,7 +1105,10 @@ bool multifd_send_setup(void)
> > p->page_size = qemu_target_page_size();
> > p->page_count = page_count;
> > p->write_flags = 0;
> > - multifd_new_send_channel_create(p);
> > +
> > + if (!multifd_new_send_channel_create(p, &local_err)) {
> > + return -1;
>
> "-1" is unfortunately a "true"!..
>
> > + }
> > }
> >
> > /*
> > diff --git a/migration/multifd.h b/migration/multifd.h
> > index 1d8bbaf96b..db8887f088 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -227,5 +227,6 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p)
> > p->iovs_num++;
> > }
> >
> > +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc);
> >
> > #endif
> > --
> > 2.35.3
> >
>
> --
> Peter Xu
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration
2024-02-29 3:16 ` Peter Xu
@ 2024-02-29 13:19 ` Fabiano Rosas
2024-03-01 0:15 ` Peter Xu
0 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-29 13:19 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
Peter Xu <peterx@redhat.com> writes:
> On Wed, Feb 28, 2024 at 12:21:23PM -0300, Fabiano Rosas wrote:
>> The mapped-ram migration can be performed live or non-live, but it is
>> always asynchronous, i.e. the source machine and the destination
>> machine are not migrating at the same time. We only need some pieces
>> of the multifd sync operations.
>>
>> multifd_send_sync_main()
>> ------------------------
>> Issued by the ram migration code on the migration thread, causes the
>> multifd send channels to synchronize with the migration thread and
>> makes the sending side emit a packet with the MULTIFD_FLUSH flag.
>>
>> With mapped-ram we want to maintain the sync on the sending side
>> because that provides ordering between the rounds of dirty pages when
>> migrating live.
>
> IIUC as I used to comment, we should probably only need that sync after
> each full iteration, which is find_dirty_block().
>
> I think keeping the setup/complete sync is fine, and that can be discussed
> separately. However IMHO we should still avoid the sync in
> ram_save_iterate() always, or on new qemu + old machine types (where
> flush_after_each_section=true) fixed-ram could suffer perf issues, IIUC.
>
> So I assume at a minimum below would still be preferred?
>
> @@ -3257,7 +3257,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> out:
> if (ret >= 0
> && migration_is_setup_or_active(migrate_get_current()->state)) {
> - if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
> + if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
> + !migrate_mapped_ram()) {
> ret = multifd_send_sync_main();
> if (ret < 0) {
> return ret;
>
I think I forgot this. I'll amend it.
>>
>> MULTIFD_FLUSH
>> -------------
>> On the receiving side, the presence of the MULTIFD_FLUSH flag on a
>> packet causes the receiving channels to start synchronizing with the
>> main thread.
>>
>> We're not using packets with mapped-ram, so there's no MULTIFD_FLUSH
>> flag and therefore no channel sync on the receiving side.
>>
>> multifd_recv_sync_main()
>> ------------------------
>> Issued by the migration thread when the ram migration flag
>> RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
>> on the receiving side to start synchronizing with the recv
>> channels. Due to compatibility, this is also issued when
>> RAM_SAVE_FLAG_EOS is received.
>>
>> For mapped-ram we only need to synchronize the channels at the end of
>> migration to avoid doing cleanup before the channels have finished
>> their IO.
>
> Did you forget to add the sync at parse_ramblocks() for mapped-ram?
>
Ugh, I messed it up. I'll fix it.
>>
>> Make sure the multifd syncs are only issued at the appropriate times.
>>
>> Note that due to pre-existing backward compatibility issues, we have
>> the multifd_flush_after_each_section property that can cause a sync to
>> happen at EOS. Since the EOS flag is needed on the stream, allow
>> mapped-ram to just ignore it.
>
> Skipping EOS makes sense, but I suggest doing that without invalid_flags. See
> below.
>
>>
>> Also emit an error if any other unexpected flags are found on the
>> stream.
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> - skipped all FLUSH flags
>> - added invalid flags
>> - skipped EOS
>> ---
>> migration/ram.c | 26 ++++++++++++++++++++++----
>> 1 file changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 18620784c6..250dcd110c 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1368,8 +1368,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>> if (ret < 0) {
>> return ret;
>> }
>> - qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> - qemu_fflush(f);
>> +
>> + if (!migrate_mapped_ram()) {
>> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> + qemu_fflush(f);
>> + }
>> }
>> /*
>> * If memory migration starts over, we will meet a dirtied page
>> @@ -3111,7 +3114,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> return ret;
>> }
>>
>> - if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>> + if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
>> + && !migrate_mapped_ram()) {
>> qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> }
>>
>> @@ -3334,7 +3338,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>> }
>> }
>>
>> - if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>> + if (migrate_multifd() && !migrate_multifd_flush_after_each_section() &&
>> + !migrate_mapped_ram()) {
>> qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> }
>> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>> @@ -4137,6 +4142,12 @@ static int ram_load_precopy(QEMUFile *f)
>> invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
>> }
>>
>> + if (migrate_mapped_ram()) {
>> + invalid_flags |= (RAM_SAVE_FLAG_EOS | RAM_SAVE_FLAG_HOOK |
>> + RAM_SAVE_FLAG_MULTIFD_FLUSH | RAM_SAVE_FLAG_PAGE |
>> + RAM_SAVE_FLAG_XBZRLE | RAM_SAVE_FLAG_ZERO);
>
> IMHO EOS cannot be accounted as "invalid" here because it always exists.
> Rather than this trick (then explicitly ignore it below... which is even
> hackier, IMHO), we can avoid setting EOS in invalid_flags, but explicitly
> ignore EOS in the code below to bypass it for mapped-ram:
>
> @@ -4301,7 +4302,12 @@ static int ram_load_precopy(QEMUFile *f)
> case RAM_SAVE_FLAG_EOS:
> /* normal exit */
> if (migrate_multifd() &&
> - migrate_multifd_flush_after_each_section()) {
> + migrate_multifd_flush_after_each_section() &&
> + /*
> + * Mapped-ram migration flushes once and for all after
> + * parsing ramblocks. Always ignore EOS for it.
> + */
> + !migrate_mapped_ram()) {
> multifd_recv_sync_main();
> }
> break;
I thought we were already spraying too many migrate_mapped_ram() checks
all over the code. But what you said makes sense, I'll change it.
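So the setup hunk would then keep EOS out of invalid_flags, i.e. something
like (just a sketch):

    if (migrate_mapped_ram()) {
        invalid_flags |= (RAM_SAVE_FLAG_HOOK | RAM_SAVE_FLAG_MULTIFD_FLUSH |
                          RAM_SAVE_FLAG_PAGE | RAM_SAVE_FLAG_XBZRLE |
                          RAM_SAVE_FLAG_ZERO);
    }

and the RAM_SAVE_FLAG_EOS case skips the sync for mapped-ram as in your
snippet.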
>> + }
>> +
>> while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
>> ram_addr_t addr;
>> void *host = NULL, *host_bak = NULL;
>> @@ -4158,6 +4169,13 @@ static int ram_load_precopy(QEMUFile *f)
>> addr &= TARGET_PAGE_MASK;
>>
>> if (flags & invalid_flags) {
>> + if (invalid_flags & RAM_SAVE_FLAG_EOS) {
>> + /* EOS is always present, just ignore it */
>> + continue;
>> + }
>> +
>> + error_report("Unexpected RAM flags: %d", flags & invalid_flags);
>> +
>> if (flags & invalid_flags & RAM_SAVE_FLAG_COMPRESS_PAGE) {
>> error_report("Received an unexpected compressed page");
>> }
>> --
>> 2.35.3
>>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-29 3:33 ` Peter Xu
@ 2024-02-29 14:27 ` Fabiano Rosas
2024-02-29 14:34 ` Daniel P. Berrangé
0 siblings, 1 reply; 40+ messages in thread
From: Fabiano Rosas @ 2024-02-29 14:27 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
Peter Xu <peterx@redhat.com> writes:
> On Thu, Feb 29, 2024 at 10:44:21AM +0800, Peter Xu wrote:
>> On Wed, Feb 28, 2024 at 12:21:21PM -0300, Fabiano Rosas wrote:
>> > Allow multifd to open file-backed channels. This will be used when
>> > enabling the mapped-ram migration stream format which expects a
>> > seekable transport.
>> >
>> > The QIOChannel read and write methods will use the preadv/pwritev
>> > versions which don't update the file offset at each call so we can
>> > reuse the fd without re-opening for every channel.
>> >
>> > Contrary to the socket migration, the file migration doesn't need an
>> > asynchronous channel creation process, so expose
>> > multifd_channel_connect() and call it directly.
>> >
>> > Note that this is just setup code and multifd cannot yet make use of
>> > the file channels.
>> >
>> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> > ---
>> > - moved flags change to another patch
>> > - removed channels_created assert
>> > ---
>> > migration/file.c | 41 +++++++++++++++++++++++++++++++++++++++--
>> > migration/file.h | 4 ++++
>> > migration/multifd.c | 18 +++++++++++++++---
>> > migration/multifd.h | 1 +
>> > 4 files changed, 59 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/migration/file.c b/migration/file.c
>> > index 22d052a71f..83328a7a1b 100644
>> > --- a/migration/file.c
>> > +++ b/migration/file.c
>> > @@ -12,12 +12,17 @@
>> > #include "channel.h"
>> > #include "file.h"
>> > #include "migration.h"
>> > +#include "multifd.h"
>> > #include "io/channel-file.h"
>> > #include "io/channel-util.h"
>> > #include "trace.h"
>> >
>> > #define OFFSET_OPTION ",offset="
>> >
>> > +static struct FileOutgoingArgs {
>> > + char *fname;
>> > +} outgoing_args;
>> > +
>> > /* Remove the offset option from @filespec and return it in @offsetp. */
>> >
>> > int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
>> > @@ -37,6 +42,36 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
>> > return 0;
>> > }
>> >
>> > +void file_cleanup_outgoing_migration(void)
>> > +{
>> > + g_free(outgoing_args.fname);
>> > + outgoing_args.fname = NULL;
>> > +}
>> > +
>> > +bool file_send_channel_create(gpointer opaque, Error **errp)
>> > +{
>> > + QIOChannelFile *ioc;
>> > + int flags = O_WRONLY;
>> > + bool ret = true;
>> > +
>> > + ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
>> > + if (!ioc) {
>> > + ret = false;
>> > + goto out;
>> > + }
>> > +
>> > + multifd_channel_connect(opaque, QIO_CHANNEL(ioc));
>> > +
>> > +out:
>> > + /*
>> > + * File channel creation is synchronous. However posting this
>> > + * semaphore here is simpler than adding a special case.
>> > + */
>> > + multifd_send_channel_created();
>> > +
>> > + return ret;
>> > +}
>> > +
>> > void file_start_outgoing_migration(MigrationState *s,
>> > FileMigrationArgs *file_args, Error **errp)
>> > {
>> > @@ -47,12 +82,14 @@ void file_start_outgoing_migration(MigrationState *s,
>> >
>> > trace_migration_file_outgoing(filename);
>> >
>> > - fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
>> > - 0600, errp);
>> > + fioc = qio_channel_file_new_path(filename, O_CREAT | O_TRUNC | O_WRONLY,
>> > + 0660, errp);
>>
>> It seems this is still leftover?
>>
>> > if (!fioc) {
>> > return;
>> > }
>> >
>> > + outgoing_args.fname = g_strdup(filename);
>> > +
>> > ioc = QIO_CHANNEL(fioc);
>> > if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
>> > return;
>> > diff --git a/migration/file.h b/migration/file.h
>> > index 37d6a08bfc..4577f9efdd 100644
>> > --- a/migration/file.h
>> > +++ b/migration/file.h
>> > @@ -9,10 +9,14 @@
>> > #define QEMU_MIGRATION_FILE_H
>> >
>> > #include "qapi/qapi-types-migration.h"
>> > +#include "io/task.h"
>> > +#include "channel.h"
>> >
>> > void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp);
>> >
>> > void file_start_outgoing_migration(MigrationState *s,
>> > FileMigrationArgs *file_args, Error **errp);
>> > int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
>> > +void file_cleanup_outgoing_migration(void);
>> > +bool file_send_channel_create(gpointer opaque, Error **errp);
>> > #endif
>> > diff --git a/migration/multifd.c b/migration/multifd.c
>> > index 3574fd3953..f155223303 100644
>> > --- a/migration/multifd.c
>> > +++ b/migration/multifd.c
>> > @@ -17,6 +17,7 @@
>> > #include "exec/ramblock.h"
>> > #include "qemu/error-report.h"
>> > #include "qapi/error.h"
>> > +#include "file.h"
>> > #include "ram.h"
>> > #include "migration.h"
>> > #include "migration-stats.h"
>> > @@ -28,6 +29,7 @@
>> > #include "threadinfo.h"
>> > #include "options.h"
>> > #include "qemu/yank.h"
>> > +#include "io/channel-file.h"
>> > #include "io/channel-socket.h"
>> > #include "yank_functions.h"
>> >
>> > @@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
>> > {
>> > if (p->c) {
>> > migration_ioc_unregister_yank(p->c);
>> > + qio_channel_close(p->c, NULL);
>>
>> s/NULL/&error_abort/?
>
> Or we can drop this line? IIUC iochannel finalize() will always close it,
> or it could be a separate bug.
>
We need it so the fsync happens. The finalize() will be a noop because
the qio_channel_file_close() will clear the fd. Not the cleanest, but it
works.
>>
>> > object_unref(OBJECT(p->c));
>> > p->c = NULL;
>> > }
>> > @@ -715,6 +718,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
>> >
>> > static void multifd_send_cleanup_state(void)
>> > {
>> > + file_cleanup_outgoing_migration();
>> > socket_cleanup_outgoing_migration();
>> > qemu_sem_destroy(&multifd_send_state->channels_created);
>> > qemu_sem_destroy(&multifd_send_state->channels_ready);
>> > @@ -977,7 +981,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
>> > return true;
>> > }
>> >
>> > -static void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
>> > +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
>> > {
>> > qio_channel_set_delay(ioc, false);
>> >
>> > @@ -1045,9 +1049,14 @@ out:
>> > error_free(local_err);
>> > }
>> >
>> > -static void multifd_new_send_channel_create(gpointer opaque)
>> > +static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
>> > {
>> > + if (!multifd_use_packets()) {
>> > + return file_send_channel_create(opaque, errp);
>> > + }
>> > +
>> > socket_send_channel_create(multifd_new_send_channel_async, opaque);
>> > + return true;
>> > }
>> >
>> > bool multifd_send_setup(void)
>> > @@ -1096,7 +1105,10 @@ bool multifd_send_setup(void)
>> > p->page_size = qemu_target_page_size();
>> > p->page_count = page_count;
>> > p->write_flags = 0;
>> > - multifd_new_send_channel_create(p);
>> > +
>> > + if (!multifd_new_send_channel_create(p, &local_err)) {
>> > + return -1;
>>
>> "-1" is unfortunately a "true"!..
>>
>> > + }
>> > }
>> >
>> > /*
>> > diff --git a/migration/multifd.h b/migration/multifd.h
>> > index 1d8bbaf96b..db8887f088 100644
>> > --- a/migration/multifd.h
>> > +++ b/migration/multifd.h
>> > @@ -227,5 +227,6 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p)
>> > p->iovs_num++;
>> > }
>> >
>> > +void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc);
>> >
>> > #endif
>> > --
>> > 2.35.3
>> >
>>
>> --
>> Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-29 14:27 ` Fabiano Rosas
@ 2024-02-29 14:34 ` Daniel P. Berrangé
2024-03-01 0:08 ` Peter Xu
0 siblings, 1 reply; 40+ messages in thread
From: Daniel P. Berrangé @ 2024-02-29 14:34 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Peter Xu, qemu-devel, armbru, Claudio Fontana
On Thu, Feb 29, 2024 at 11:27:44AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Thu, Feb 29, 2024 at 10:44:21AM +0800, Peter Xu wrote:
> >> On Wed, Feb 28, 2024 at 12:21:21PM -0300, Fabiano Rosas wrote:
[snip]
> >> > @@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
> >> > {
> >> > if (p->c) {
> >> > migration_ioc_unregister_yank(p->c);
> >> > + qio_channel_close(p->c, NULL);
> >>
> >> s/NULL/&error_abort/?
> >
> > Or we can drop this line? IIUC iochannel finalize() will always close it,
> > or it could be a separate bug.
> >
>
> We need it so the fsync happens. The finalize() will be a noop because
> the qio_channel_file_close() will clear the fd. Not the cleanest, but it
> works.
It is always wise to explicitly call 'close'.
If something still has a GSource watch registered against
the QIOChannel, that GSource will be holding a reference
on the QIOChannel and will thus prevent finalize() ever
running.
By calling close() you guarantee the channel is closed,
even if you've mistakenly leaked a GSource somewhere.
Finalize still won't run in that case, but at least the
FD is gone, and the HUP might cause the GSource callback
to trigger correct cleanup.
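For instance (illustrative only, 'some_cb' and 'opaque' are placeholders):

    /* the watch's GSource takes its own reference on the channel */
    guint tag = qio_channel_add_watch(QIO_CHANNEL(ioc), G_IO_IN,
                                      some_cb, opaque, NULL);

    /*
     * If 'tag' is leaked, the unref below is not the last reference and
     * finalize() never runs; an explicit qio_channel_close() still
     * releases the fd.
     */
    object_unref(OBJECT(ioc));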
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support
2024-02-29 14:34 ` Daniel P. Berrangé
@ 2024-03-01 0:08 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-03-01 0:08 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Fabiano Rosas, qemu-devel, armbru, Claudio Fontana
On Thu, Feb 29, 2024 at 02:34:13PM +0000, Daniel P. Berrangé wrote:
> On Thu, Feb 29, 2024 at 11:27:44AM -0300, Fabiano Rosas wrote:
> > Peter Xu <peterx@redhat.com> writes:
> >
> > > On Thu, Feb 29, 2024 at 10:44:21AM +0800, Peter Xu wrote:
> > >> On Wed, Feb 28, 2024 at 12:21:21PM -0300, Fabiano Rosas wrote:
[snip]
> > >> > @@ -694,6 +696,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
> > >> > {
> > >> > if (p->c) {
> > >> > migration_ioc_unregister_yank(p->c);
> > >> > + qio_channel_close(p->c, NULL);
> > >>
> > >> s/NULL/&error_abort/?
> > >
> > > Or we can drop this line? IIUC iochannel finalize() will always close it,
> > > or it could be a separate bug.
> > >
> >
> > We need it so the fsync happens. The finalize() will be a noop because
> > the qio_channel_file_close() will clear the fd. Not the cleanest, but it
> > works.
>
> It is always wise to explicitly call 'close'.
>
> If something still has a GSource watch registered against
> the QIOChannel, that GSource will be holding a reference
> on the QIOChannel and will thus prevent finalize() ever
> running.
>
> By calling close() you guarantee the channel is closed,
> even if you've mistakenly leaked a GSource somewhere.
> Finalize still won't run in that case, but at least the
> FD is gone, and the HUP might cause the GSource callback
> to trigger correct cleanup.
I see. Let's add a comment to explain why we do this explicit close(),
then? It wasn't that clear, and we also don't do that on the recv side. It
also seems only useful for "file:"; we can mention that in the comment if
so.
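Maybe something along these lines (comment wording is only a suggestion):

    if (p->c) {
        migration_ioc_unregister_yank(p->c);
        /*
         * An explicit close is needed here: the unref below may not
         * drop the last reference (e.g. a leaked GSource), and for
         * file channels we also rely on the fsync done at close time
         * to make sure the data is on disk before cleanup completes.
         */
        qio_channel_close(p->c, NULL);
        object_unref(OBJECT(p->c));
        p->c = NULL;
    }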
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration
2024-02-29 13:19 ` Fabiano Rosas
@ 2024-03-01 0:15 ` Peter Xu
0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2024-03-01 0:15 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, berrange, armbru, Claudio Fontana
On Thu, Feb 29, 2024 at 10:19:12AM -0300, Fabiano Rosas wrote:
> > IMHO EOS cannot be accounted as "invalid" here because it always exists.
> > Rather than this trick (then explicitly ignore it below... which is even
> > hackier, IMHO), we can avoid setting EOS in invalid_flags, but explicitly
> > ignore EOS in the code below to bypass it for mapped-ram:
> >
> > @@ -4301,7 +4302,12 @@ static int ram_load_precopy(QEMUFile *f)
> > case RAM_SAVE_FLAG_EOS:
> > /* normal exit */
> > if (migrate_multifd() &&
> > - migrate_multifd_flush_after_each_section()) {
> > + migrate_multifd_flush_after_each_section() &&
> > + /*
> > + * Mapped-ram migration flushes once and for all after
> > + * parsing ramblocks. Always ignore EOS for it.
> > + */
> > + !migrate_mapped_ram()) {
> > multifd_recv_sync_main();
> > }
> > break;
>
> I thought we were already spraying too many migrate_mapped_ram() checks
> all over the code. But what you said makes sense, I'll change it.
Yep, that's not good, but I can't think of anything better and simpler
yet. E.g. we could have some flag so that ram_save_iterate()/etc. generate
nothing on the stream and only write the pages at their mapped offsets;
then we wouldn't need this at all and EOS could legitimately be accounted
as invalid. But that would involve more changes and wouldn't help this
series converge.
And it's also the long condition that worries me even more. In the long
run I think we should clean up most of these "multifd &&
after_flush_each_section && !mapped_ram" checks at some point.
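E.g. a small helper (name made up, only to show the idea) would at least
keep the combination in one place:

    /* hypothetical helper, name and home TBD */
    static bool multifd_ram_sync_per_section(void)
    {
        return migrate_multifd() &&
               migrate_multifd_flush_after_each_section() &&
               !migrate_mapped_ram();
    }

Then ram_save_iterate()/ram_save_complete()/ram_load_precopy() would only
need to check that.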
--
Peter Xu
^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads: [~2024-03-01 0:17 UTC | newest]
Thread overview: 40+ messages
2024-02-28 15:21 [PATCH v5 00/23] migration: File based migration with multifd and mapped-ram Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 01/23] migration/multifd: Cleanup multifd_recv_sync_main Fabiano Rosas
2024-02-29 1:26 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 02/23] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 03/23] io: Add generic pwritev/preadv interface Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 04/23] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 05/23] io: fsync before closing a file channel Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 06/23] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 07/23] migration/ram: Introduce 'mapped-ram' migration capability Fabiano Rosas
2024-02-29 2:10 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 08/23] migration: Add mapped-ram URI compatibility check Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 09/23] migration/ram: Add outgoing 'mapped-ram' migration Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 10/23] migration/ram: Add incoming " Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 11/23] tests/qtest/migration: Add tests for mapped-ram file-based migration Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 12/23] migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 13/23] migration/multifd: Decouple recv method from pages Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 14/23] migration/multifd: Allow multifd without packets Fabiano Rosas
2024-02-29 2:20 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 15/23] migration/multifd: Allow receiving pages " Fabiano Rosas
2024-02-29 2:28 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 16/23] migration/multifd: Add a wrapper for channels_created Fabiano Rosas
2024-02-29 2:29 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 17/23] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
2024-02-29 2:44 ` Peter Xu
2024-02-29 3:33 ` Peter Xu
2024-02-29 14:27 ` Fabiano Rosas
2024-02-29 14:34 ` Daniel P. Berrangé
2024-03-01 0:08 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 18/23] migration/multifd: Add incoming " Fabiano Rosas
2024-02-29 2:53 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 19/23] migration/multifd: Prepare multifd sync for mapped-ram migration Fabiano Rosas
2024-02-29 3:16 ` Peter Xu
2024-02-29 13:19 ` Fabiano Rosas
2024-03-01 0:15 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 20/23] migration/multifd: Support outgoing mapped-ram stream format Fabiano Rosas
2024-02-28 15:21 ` [PATCH v5 21/23] migration/multifd: Support incoming " Fabiano Rosas
2024-02-29 3:23 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 22/23] migration/multifd: Add mapped-ram support to fd: URI Fabiano Rosas
2024-02-29 3:31 ` Peter Xu
2024-02-28 15:21 ` [PATCH v5 23/23] tests/qtest/migration: Add a multifd + mapped-ram migration test Fabiano Rosas