* [PULL 0/8] Next patches
@ 2022-11-21 12:58 Juan Quintela
2022-11-21 12:59 ` [PULL 1/8] migration/channel-block: fix return value for qio_channel_block_{readv, writev} Juan Quintela
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:58 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
The following changes since commit a082fab9d259473a9d5d53307cf83b1223301181:
Merge tag 'pull-ppc-20221117' of https://gitlab.com/danielhb/qemu into staging (2022-11-17 12:39:38 -0500)
are available in the Git repository at:
https://gitlab.com/juan.quintela/qemu.git tags/next-pull-request
for you to fetch changes up to b5280437a7f49cf617cdd99bbbe2c7bd1652408b:
migration: Block migration comment or code is wrong (2022-11-21 11:58:10 +0100)
----------------------------------------------------------------
Migration PULL request (take 3)
Hi
Drop everything that is not a bug fix:
- Fixes by Peter
- Fix comment on block creation (me)
- Fix return values from qio_channel_block()
Please, apply.
(take 1)
It includes:
- Leonardo fix for zero_copy flush
- Fiona fix for return value of readv/writev
- Peter Xu cleanups
- Peter Xu preempt patches
- Patches ready from zero page (me)
- AVX2 support (Ling)
- Fix for slow networking and reordering of first packets (Manish)
Please, apply.
----------------------------------------------------------------
Fiona Ebner (1):
migration/channel-block: fix return value for
qio_channel_block_{readv,writev}
Juan Quintela (1):
migration: Block migration comment or code is wrong
Leonardo Bras (1):
migration/multifd/zero-copy: Create helper function for flushing
Peter Xu (5):
migration: Fix possible infinite loop of ram save process
migration: Fix race on qemu_file_shutdown()
migration: Disallow postcopy preempt to be used with compress
migration: Use non-atomic ops for clear log bitmap
migration: Disable multifd explicitly with compression
 include/exec/ram_addr.h   | 11 +++++-----
 include/exec/ramblock.h   |  3 +++
 include/qemu/bitmap.h     |  1 +
 migration/block.c         |  4 ++--
 migration/channel-block.c |  6 ++++--
 migration/migration.c     | 18 ++++++++++++++++
 migration/multifd.c       | 30 ++++++++++++++++----------
 migration/qemu-file.c     | 27 ++++++++++++++++++++---
 migration/ram.c           | 27 ++++++++++++++---------
 util/bitmap.c             | 45 +++++++++++++++++++++++++++++++++++++++
 10 files changed, 139 insertions(+), 33 deletions(-)
--
2.38.1
* [PULL 1/8] migration/channel-block: fix return value for qio_channel_block_{readv, writev}
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 2/8] migration/multifd/zero-copy: Create helper function for flushing Juan Quintela
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini, Fiona Ebner
From: Fiona Ebner <f.ebner@proxmox.com>
Return -1 in the error case. The documentation in include/io/channel.h
states that -1 or QIO_CHANNEL_ERR_BLOCK should be returned upon error.
Simply passing along the return value from the bdrv functions has the
potential to confuse the callers. Non-blocking mode is not implemented
currently, so -1 it is.
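To illustrate the contract this fix restores, here is a minimal sketch of
how a generic QIOChannel caller is expected to interpret the result per
include/io/channel.h; the call site is hypothetical, not the actual
migration code. Note that a raw -errno passed straight through could even
collide with the special values, e.g. -ENOENT (-2) would look exactly
like QIO_CHANNEL_ERR_BLOCK.

/* Hypothetical caller, shown only to illustrate the documented contract:
 * -1 means hard error (errp filled in), QIO_CHANNEL_ERR_BLOCK means
 * "would block, retry later". */
Error *local_err = NULL;
ssize_t ret = qio_channel_readv(ioc, iov, niov, &local_err);

if (ret == QIO_CHANNEL_ERR_BLOCK) {
    /* non-blocking channel has nothing to transfer yet: retry later */
} else if (ret < 0) {
    /* hard error: local_err has been filled in, hand it upwards */
    error_propagate(errp, local_err);
}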
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/channel-block.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/migration/channel-block.c b/migration/channel-block.c
index c55c8c93ce..f4ab53acdb 100644
--- a/migration/channel-block.c
+++ b/migration/channel-block.c
@@ -62,7 +62,8 @@ qio_channel_block_readv(QIOChannel *ioc,
qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
ret = bdrv_readv_vmstate(bioc->bs, &qiov, bioc->offset);
if (ret < 0) {
- return ret;
+ error_setg_errno(errp, -ret, "bdrv_readv_vmstate failed");
+ return -1;
}
bioc->offset += qiov.size;
@@ -86,7 +87,8 @@ qio_channel_block_writev(QIOChannel *ioc,
qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
ret = bdrv_writev_vmstate(bioc->bs, &qiov, bioc->offset);
if (ret < 0) {
- return ret;
+ error_setg_errno(errp, -ret, "bdrv_writev_vmstate failed");
+ return -1;
}
bioc->offset += qiov.size;
--
2.38.1
* [PULL 2/8] migration/multifd/zero-copy: Create helper function for flushing
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
2022-11-21 12:59 ` [PULL 1/8] migration/channel-block: fix return value for qio_channel_block_{readv, writev} Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 3/8] migration: Fix possible infinite loop of ram save process Juan Quintela
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini, Leonardo Bras
From: Leonardo Bras <leobras@redhat.com>
Move the flushing code out of multifd_send_sync_main() into a new helper,
and call that helper from multifd_send_sync_main().
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/multifd.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 586ddc9d65..509bbbe3bf 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -566,6 +566,23 @@ void multifd_save_cleanup(void)
multifd_send_state = NULL;
}
+static int multifd_zero_copy_flush(QIOChannel *c)
+{
+ int ret;
+ Error *err = NULL;
+
+ ret = qio_channel_flush(c, &err);
+ if (ret < 0) {
+ error_report_err(err);
+ return -1;
+ }
+ if (ret == 1) {
+ dirty_sync_missed_zero_copy();
+ }
+
+ return ret;
+}
+
int multifd_send_sync_main(QEMUFile *f)
{
int i;
@@ -616,17 +633,8 @@ int multifd_send_sync_main(QEMUFile *f)
qemu_mutex_unlock(&p->mutex);
qemu_sem_post(&p->sem);
- if (flush_zero_copy && p->c) {
- int ret;
- Error *err = NULL;
-
- ret = qio_channel_flush(p->c, &err);
- if (ret < 0) {
- error_report_err(err);
- return -1;
- } else if (ret == 1) {
- dirty_sync_missed_zero_copy();
- }
+ if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c) < 0)) {
+ return -1;
}
}
for (i = 0; i < migrate_multifd_channels(); i++) {
--
2.38.1
* [PULL 3/8] migration: Fix possible infinite loop of ram save process
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
2022-11-21 12:59 ` [PULL 1/8] migration/channel-block: fix return value for qio_channel_block_{readv, writev} Juan Quintela
2022-11-21 12:59 ` [PULL 2/8] migration/multifd/zero-copy: Create helper function for flushing Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 4/8] migration: Fix race on qemu_file_shutdown() Juan Quintela
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
From: Peter Xu <peterx@redhat.com>
When starting the ram saving procedure (especially at the completion
phase), always set last_seen_block to non-NULL to make sure we can
correctly detect the case where "we've migrated all the dirty pages".
This guarantees that both last_seen_block and pss.block are valid
before the loop starts.
See the comment in the code for some details.
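For readers unfamiliar with this loop, here is a minimal sketch of the
completion check the commit is talking about. It is purely illustrative;
the real logic lives in find_dirty_block() and is more involved.

/*
 * Illustrative sketch only, not the actual QEMU code: the walk over
 * ramblocks reports "we searched everything" once it wraps around to
 * the block it started from.  If last_seen_block is NULL, the current
 * block can never compare equal to it, so a completed round is never
 * detected and the search can spin forever.
 */
static bool search_complete(RAMBlock *current, RAMBlock *last_seen,
                            bool complete_round)
{
    return complete_round && current == last_seen;
}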
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/ram.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index dc1de9ddbc..1d42414ecc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2546,14 +2546,22 @@ static int ram_find_and_save_block(RAMState *rs)
return pages;
}
+ /*
+ * Always keep last_seen_block/last_page valid during this procedure,
+ * because find_dirty_block() relies on these values (e.g., we compare
+ * last_seen_block with pss.block to see whether we searched all the
+ * ramblocks) to detect the completion of migration. Having NULL value
+ * of last_seen_block can conditionally cause below loop to run forever.
+ */
+ if (!rs->last_seen_block) {
+ rs->last_seen_block = QLIST_FIRST_RCU(&ram_list.blocks);
+ rs->last_page = 0;
+ }
+
pss.block = rs->last_seen_block;
pss.page = rs->last_page;
pss.complete_round = false;
- if (!pss.block) {
- pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
- }
-
do {
again = true;
found = get_queued_page(rs, &pss);
--
2.38.1
* [PULL 4/8] migration: Fix race on qemu_file_shutdown()
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (2 preceding siblings ...)
2022-11-21 12:59 ` [PULL 3/8] migration: Fix possible infinite loop of ram save process Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 5/8] migration: Disallow postcopy preempt to be used with compress Juan Quintela
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini, Daniel P . Berrange
From: Peter Xu <peterx@redhat.com>
In qemu_file_shutdown(), there's a possible race with the current order
of operations. There are two major things to do:
(1) Do the real shutdown() (e.g. shutdown() syscall on the socket)
(2) Update the qemufile's last_error
We must do (2) before (1), otherwise there can be a race condition like:
    page receiver                     other thread
    -------------                     ------------
    qemu_get_buffer()
                                      do shutdown()
    returns 0 (buffer all zero)
    (meanwhile we didn't check this retcode)
    try to detect IO error
      last_error==NULL, IO okay
    install ALL-ZERO page
                                      set last_error
    --> guest crash!
To fix this, we could also check the return value of qemu_get_buffer(),
but not all APIs can be properly checked and ultimately we still need to
go back to qemu_file_get_error(). E.g. qemu_get_byte() doesn't return an
error.
Maybe some day a rework of the qemufile API is really needed, but for now
keep using qemu_file_get_error() and fix it by not allowing that race
condition to happen. Here shutdown() is special because the last_error is
emulated. For real -EIO errors it will always be set when e.g. a sendmsg()
error triggers, so we won't miss those; only shutdown() is a bit tricky
here.
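To make the window concrete, here is a hedged sketch of the receiver-side
pattern described above; the exact call site is an assumption, the point
being that only qemu_file_get_error() is consulted rather than the read's
return value.

/* Assumed receiver pattern, not the literal QEMU call site. */
qemu_get_buffer(f, buf, size);   /* may return an all-zero buffer right
                                    after the other side's shutdown() */
if (qemu_file_get_error(f)) {    /* with this fix, already -EIO by the
                                    time shutdown() has taken effect */
    return -EIO;                 /* bail out instead of installing junk */
}
/* only when no error is pending is it safe to install the page data */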
Cc: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/qemu-file.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 4f400c2e52..2d5f74ffc2 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -79,6 +79,30 @@ int qemu_file_shutdown(QEMUFile *f)
int ret = 0;
f->shutdown = true;
+
+ /*
+ * We must set the qemufile error before the real shutdown(), otherwise
+ * there can be a race window where we thought IO all went through
+ * (because last_error==NULL) but actually IO has already stopped.
+ *
+ * Without the correct ordering, the race can happen like this:
+ *
+ *    page receiver                     other thread
+ *    -------------                     ------------
+ *    qemu_get_buffer()
+ *                                      do shutdown()
+ *    returns 0 (buffer all zero)
+ *    (we didn't check this retcode)
+ *    try to detect IO error
+ *      last_error==NULL, IO okay
+ *    install ALL-ZERO page
+ *                                      set last_error
+ *    --> guest crash!
+ */
+ if (!f->last_error) {
+ qemu_file_set_error(f, -EIO);
+ }
+
if (!qio_channel_has_feature(f->ioc,
QIO_CHANNEL_FEATURE_SHUTDOWN)) {
return -ENOSYS;
@@ -88,9 +112,6 @@ int qemu_file_shutdown(QEMUFile *f)
ret = -EIO;
}
- if (!f->last_error) {
- qemu_file_set_error(f, -EIO);
- }
return ret;
}
--
2.38.1
* [PULL 5/8] migration: Disallow postcopy preempt to be used with compress
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (3 preceding siblings ...)
2022-11-21 12:59 ` [PULL 4/8] migration: Fix race on qemu_file_shutdown() Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 6/8] migration: Use non-atomic ops for clear log bitmap Juan Quintela
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
From: Peter Xu <peterx@redhat.com>
The preempt mode requires the capability to assign a channel to each
page, while the compression logic currently assigns pages to different
compress threads/local channels, so the two are potentially incompatible.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/migration.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 739bb683f3..f3ed77a7d0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1337,6 +1337,17 @@ static bool migrate_caps_check(bool *cap_list,
error_setg(errp, "Postcopy preempt requires postcopy-ram");
return false;
}
+
+ /*
+ * Preempt mode requires urgent pages to be sent in separate
+ * channel, OTOH compression logic will disorder all pages into
+ * different compression channels, which is not compatible with the
+ * preempt assumptions on channel assignments.
+ */
+ if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+ error_setg(errp, "Postcopy preempt not compatible with compress");
+ return false;
+ }
}
return true;
--
2.38.1
* [PULL 6/8] migration: Use non-atomic ops for clear log bitmap
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (4 preceding siblings ...)
2022-11-21 12:59 ` [PULL 5/8] migration: Disallow postcopy preempt to be used with compress Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 7/8] migration: Disable multifd explicitly with compression Juan Quintela
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
From: Peter Xu <peterx@redhat.com>
Since we already have bitmap_mutex to protect both the dirty bitmap and
the clear log bitmap, we don't need atomic operations to set/clear/test
bits of the clear log bitmap. Switch all ops from atomic to non-atomic
versions, and meanwhile touch up the comments to show which lock is in
charge.
Introduce a non-atomic version of bitmap_test_and_clear_atomic(); it is
mostly the same as the atomic version but simplified in a few places,
e.g. the "old_bits" variable and the explicit memory barriers are dropped.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
include/exec/ram_addr.h | 11 +++++-----
include/exec/ramblock.h | 3 +++
include/qemu/bitmap.h | 1 +
util/bitmap.c | 45 +++++++++++++++++++++++++++++++++++++++++
4 files changed, 55 insertions(+), 5 deletions(-)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 1500680458..f4fb6a2111 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -42,7 +42,8 @@ static inline long clear_bmap_size(uint64_t pages, uint8_t shift)
}
/**
- * clear_bmap_set: set clear bitmap for the page range
+ * clear_bmap_set: set clear bitmap for the page range. Must be with
+ * bitmap_mutex held.
*
* @rb: the ramblock to operate on
* @start: the start page number
@@ -55,12 +56,12 @@ static inline void clear_bmap_set(RAMBlock *rb, uint64_t start,
{
uint8_t shift = rb->clear_bmap_shift;
- bitmap_set_atomic(rb->clear_bmap, start >> shift,
- clear_bmap_size(npages, shift));
+ bitmap_set(rb->clear_bmap, start >> shift, clear_bmap_size(npages, shift));
}
/**
- * clear_bmap_test_and_clear: test clear bitmap for the page, clear if set
+ * clear_bmap_test_and_clear: test clear bitmap for the page, clear if set.
+ * Must be with bitmap_mutex held.
*
* @rb: the ramblock to operate on
* @page: the page number to check
@@ -71,7 +72,7 @@ static inline bool clear_bmap_test_and_clear(RAMBlock *rb, uint64_t page)
{
uint8_t shift = rb->clear_bmap_shift;
- return bitmap_test_and_clear_atomic(rb->clear_bmap, page >> shift, 1);
+ return bitmap_test_and_clear(rb->clear_bmap, page >> shift, 1);
}
static inline bool offset_in_ramblock(RAMBlock *b, ram_addr_t offset)
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 6cbedf9e0c..adc03df59c 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -53,6 +53,9 @@ struct RAMBlock {
* and split clearing of dirty bitmap on the remote node (e.g.,
* KVM). The bitmap will be set only when doing global sync.
*
+ * It is only used during src side of ram migration, and it is
+ * protected by the global ram_state.bitmap_mutex.
+ *
* NOTE: this bitmap is different comparing to the other bitmaps
* in that one bit can represent multiple guest pages (which is
* decided by the `clear_bmap_shift' variable below). On
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 82a1d2f41f..3ccb00865f 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -253,6 +253,7 @@ void bitmap_set(unsigned long *map, long i, long len);
void bitmap_set_atomic(unsigned long *map, long i, long len);
void bitmap_clear(unsigned long *map, long start, long nr);
bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr);
+bool bitmap_test_and_clear(unsigned long *map, long start, long nr);
void bitmap_copy_and_clear_atomic(unsigned long *dst, unsigned long *src,
long nr);
unsigned long bitmap_find_next_zero_area(unsigned long *map,
diff --git a/util/bitmap.c b/util/bitmap.c
index f81d8057a7..8d12e90a5a 100644
--- a/util/bitmap.c
+++ b/util/bitmap.c
@@ -240,6 +240,51 @@ void bitmap_clear(unsigned long *map, long start, long nr)
}
}
+bool bitmap_test_and_clear(unsigned long *map, long start, long nr)
+{
+ unsigned long *p = map + BIT_WORD(start);
+ const long size = start + nr;
+ int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+ unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+ bool dirty = false;
+
+ assert(start >= 0 && nr >= 0);
+
+ /* First word */
+ if (nr - bits_to_clear > 0) {
+ if ((*p) & mask_to_clear) {
+ dirty = true;
+ }
+ *p &= ~mask_to_clear;
+ nr -= bits_to_clear;
+ bits_to_clear = BITS_PER_LONG;
+ p++;
+ }
+
+ /* Full words */
+ if (bits_to_clear == BITS_PER_LONG) {
+ while (nr >= BITS_PER_LONG) {
+ if (*p) {
+ dirty = true;
+ *p = 0;
+ }
+ nr -= BITS_PER_LONG;
+ p++;
+ }
+ }
+
+ /* Last word */
+ if (nr) {
+ mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+ if ((*p) & mask_to_clear) {
+ dirty = true;
+ }
+ *p &= ~mask_to_clear;
+ }
+
+ return dirty;
+}
+
bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr)
{
unsigned long *p = map + BIT_WORD(start);
--
2.38.1
* [PULL 7/8] migration: Disable multifd explicitly with compression
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (5 preceding siblings ...)
2022-11-21 12:59 ` [PULL 6/8] migration: Use non-atomic ops for clear log bitmap Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 12:59 ` [PULL 8/8] migration: Block migration comment or code is wrong Juan Quintela
2022-11-21 15:54 ` [PULL 0/8] Next patches Stefan Hajnoczi
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
From: Peter Xu <peterx@redhat.com>
The multifd thread model does not work for compression, so explicitly
disable the combination. Note that previously, even though both could be
enabled at the same time, nothing would go wrong, because the compression
code has higher priority and the multifd feature was simply ignored. Now
we fail earlier, at config time, so the user is better aware of the
consequence.
Note that there is a slight chance of breaking existing users, but let's
assume they are not the majority and not serious users, or they would
already have noticed that multifd was not working.
With that, we can safely drop the check for multifd in
ram_save_target_page(), because when multifd=on then compression=off, so
the removed check on save_page_use_compression() would always return
false anyway.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/migration.c | 7 +++++++
migration/ram.c | 11 +++++------
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index f3ed77a7d0..f485eea5fb 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1350,6 +1350,13 @@ static bool migrate_caps_check(bool *cap_list,
}
}
+ if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+ if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+ error_setg(errp, "Multifd is not compatible with compress");
+ return false;
+ }
+ }
+
return true;
}
diff --git a/migration/ram.c b/migration/ram.c
index 1d42414ecc..1338e47665 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2305,13 +2305,12 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
}
/*
- * Do not use multifd for:
- * 1. Compression as the first page in the new block should be posted out
- * before sending the compressed page
- * 2. In postcopy as one whole host page should be placed
+ * Do not use multifd in postcopy as one whole host page should be
+ * placed. Meanwhile postcopy requires atomic update of pages, so even
+ * if host page size == guest page size the dest guest during run may
+ * still see partially copied pages which is data corruption.
*/
- if (!save_page_use_compression(rs) && migrate_use_multifd()
- && !migration_in_postcopy()) {
+ if (migrate_use_multifd() && !migration_in_postcopy()) {
return ram_save_multifd_page(rs, block, offset);
}
--
2.38.1
* [PULL 8/8] migration: Block migration comment or code is wrong
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (6 preceding siblings ...)
2022-11-21 12:59 ` [PULL 7/8] migration: Disable multifd explicitly with compression Juan Quintela
@ 2022-11-21 12:59 ` Juan Quintela
2022-11-21 15:54 ` [PULL 0/8] Next patches Stefan Hajnoczi
8 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2022-11-21 12:59 UTC (permalink / raw)
To: qemu-devel
Cc: Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
And it appears that what is wrong is the code. During the bulk stage we
need to make sure that at least one block is reported as dirty, but there
is no need to play games with max_size at all.
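For instance, with max_size = 100 MiB and nothing reported dirty yet
(pending = 0) during the bulk phase, the old code returned 100 MiB plus
one block's worth, while the new code reports exactly one block's worth
(BLK_MIG_BLOCK_SIZE): just enough to signal that the bulk phase is not
finished, without inflating the estimate with max_size.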
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
migration/block.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/migration/block.c b/migration/block.c
index 3577c815a9..4347da1526 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -880,8 +880,8 @@ static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
blk_mig_unlock();
/* Report at least one block pending during bulk phase */
- if (pending <= max_size && !block_mig_state.bulk_completed) {
- pending = max_size + BLK_MIG_BLOCK_SIZE;
+ if (!pending && !block_mig_state.bulk_completed) {
+ pending = BLK_MIG_BLOCK_SIZE;
}
trace_migration_block_save_pending(pending);
--
2.38.1
* Re: [PULL 0/8] Next patches
2022-11-21 12:58 [PULL 0/8] Next patches Juan Quintela
` (7 preceding siblings ...)
2022-11-21 12:59 ` [PULL 8/8] migration: Block migration comment or code is wrong Juan Quintela
@ 2022-11-21 15:54 ` Stefan Hajnoczi
8 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2022-11-21 15:54 UTC (permalink / raw)
To: Juan Quintela
Cc: qemu-devel, Stefan Hajnoczi, Dr. David Alan Gilbert,
Philippe Mathieu-Daudé, Fam Zheng, Juan Quintela, qemu-block,
David Hildenbrand, Peter Xu, Paolo Bonzini
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes.