* [PATCH 0/5] Allow to enable multifd and postcopy migration together @ 2024-10-29 15:09 Prasad Pandit 2024-10-29 15:09 ` [PATCH 1/5] migration/multifd: move macros to multifd header Prasad Pandit ` (4 more replies) 0 siblings, 5 replies; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Hello, * Currently Multifd and Postcopy migration cannot be used together. QEMU shows the "Postcopy is not yet compatible with multifd" message. When migrating guests with large (100's GB) RAM, Multifd threads help to accelerate migration, but the inability to use them with Postcopy mode delays guest start-up on the destination side. * This patch series allows enabling both Multifd and Postcopy migration together. In this scheme, Precopy and Multifd threads work during the initial guest (RAM) transfer. When migration moves to the Postcopy phase, Precopy and Multifd threads on the source side stop sending data on their channels. Instead, Postcopy threads on the destination start to request pages from the source side. * This series introduces a 4-byte magic value to be sent on the Postcopy channel. It helps to differentiate channels and properly set up incoming connections on the destination side. Thank you. --- Prasad Pandit (5): migration/multifd: move macros to multifd header migration/postcopy: magic value for postcopy channel migration: remove multifd check with postcopy migration: refactor ram_save_target_page functions migration: enable multifd and postcopy together migration/migration.c | 73 ++++++++++++++++++++++++---------------- migration/multifd.c | 4 --- migration/multifd.h | 5 +++ migration/options.c | 8 ++--- migration/postcopy-ram.c | 7 ++++ migration/postcopy-ram.h | 3 ++ migration/ram.c | 54 ++++++++++++----------------- 7 files changed, 83 insertions(+), 71 deletions(-) -- 2.47.0 ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 1/5] migration/multifd: move macros to multifd header 2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit @ 2024-10-29 15:09 ` Prasad Pandit 2024-10-29 15:09 ` [PATCH 2/5] migration/postcopy: magic value for postcopy channel Prasad Pandit ` (3 subsequent siblings) 4 siblings, 0 replies; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Move MULTIFD_ macros to the header file so that they are accessible from other files. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/multifd.c | 4 ---- migration/multifd.h | 5 +++++ 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 9b200f4ad9..97f6b6242a 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -33,10 +33,6 @@ #include "io/channel-socket.h" #include "yank_functions.h" -/* Multiple fd's */ - -#define MULTIFD_MAGIC 0x11223344U -#define MULTIFD_VERSION 1 typedef struct { uint32_t magic; diff --git a/migration/multifd.h b/migration/multifd.h index 50d58c0c9c..519dffeb9e 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -33,6 +33,11 @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); bool multifd_recv(void); MultiFDRecvData *multifd_get_recv_data(void); +/* Multiple fd's */ + +#define MULTIFD_MAGIC 0x11223344U +#define MULTIFD_VERSION 1 + /* Multifd Compression flags */ #define MULTIFD_FLAG_SYNC (1 << 0) -- 2.47.0 ^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit 2024-10-29 15:09 ` [PATCH 1/5] migration/multifd: move macros to multifd header Prasad Pandit @ 2024-10-29 15:09 ` Prasad Pandit [not found] ` <ZyTnBwpOwXcHGGPJ@x1n> 2024-10-29 15:09 ` [PATCH 3/5] migration: remove multifd check with postcopy Prasad Pandit ` (2 subsequent siblings) 4 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> During migration, the precopy and the multifd channels send magic value (4-bytes) to the destination side, for it to identify the channel and properly establish connection. But Postcopy channel did not send such value. Introduce a magic value to be sent on the postcopy channel. It helps to identify channels when both multifd and postcopy migration are enabled together. An explicitly defined magic value makes code easier to follow, because it is consistent with the other channels. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/postcopy-ram.c | 7 +++++++ migration/postcopy-ram.h | 3 +++ 2 files changed, 10 insertions(+) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 83f6160a36..5bc19b541c 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1580,6 +1580,9 @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd) void postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file) { + if (mis->postcopy_qemufile_dst) { + return; + } /* * The new loading channel has its own threads, so it needs to be * blocked too. It's by default true, just be explicit. @@ -1612,6 +1615,10 @@ postcopy_preempt_send_channel_done(MigrationState *s, * postcopy_qemufile_src to know whether it failed or not. 
*/ qemu_sem_post(&s->postcopy_qemufile_src_sem); + + /* Send magic value to identify postcopy channel on the destination */ + uint32_t magic = cpu_to_be32(POSTCOPY_MAGIC); + qio_channel_write_all(ioc, (char *)&magic, sizeof(magic), NULL); } static void diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index a6df1b2811..49e2982558 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -15,6 +15,9 @@ #include "qapi/qapi-types-migration.h" +/* Magic value to identify postcopy channel on the destination */ +#define POSTCOPY_MAGIC 0x55667788U + /* Return true if the host supports everything we need to do postcopy-ram */ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis, Error **errp); -- 2.47.0 ^ permalink raw reply related [flat|nested] 33+ messages in thread
[parent not found: <ZyTnBwpOwXcHGGPJ@x1n>]
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel [not found] ` <ZyTnBwpOwXcHGGPJ@x1n> @ 2024-11-04 12:32 ` Prasad Pandit 2024-11-04 17:18 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-04 12:32 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Fri, 1 Nov 2024 at 20:04, Peter Xu <peterx@redhat.com> wrote: > As we discussed internally, we can't do this unconditionally. We at least > need some compat properties. * ie. we define a new compat property to check if postcopy sends a magic value or not? > Or we need to see whether Fabiano's handshake can > simplify this, because the handshake will also re-design the channel > establishment protocol. * May I know more about this handshake change? Is there a repository? OR a page/document that describes what is being planned? Is it just a channel establishment change, or is there more to it? Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-04 12:32 ` Prasad Pandit @ 2024-11-04 17:18 ` Peter Xu 2024-11-05 11:19 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-04 17:18 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, Nov 04, 2024 at 06:02:33PM +0530, Prasad Pandit wrote: > On Fri, 1 Nov 2024 at 20:04, Peter Xu <peterx@redhat.com> wrote: > > As we discussed internally, we can't do this unconditionally. We at least > > some compat properties. > > * ie. we define a new compat property to check if postcopy sends a > magic value or not? > > > Or we need to see whether Fabiano's handshake can > > simplify this, because the handshake will also re-design the channel > > establishment protocol. > > * May I know more about this handshake change? Is there a repository? > OR a page/document that describes what is being planned? Is it just > channel establishment change or there's more to it? After a 2nd thought, maybe we don't need a new compat property, and this should work even before handshake ready. Firstly, we'll need a way to tell mgmt that the new qemu binary supports enablement of both multifd + postcopy feature. That can be done with a "features": [ "postcopy-with-multifd-precopy" ] Flag attached to the QMP "migrate" command. Then, I think we don't need a magic for preempt channel, because new qemu binaries (after 7.2) have no issue on out-of-order connections between main / preempt channel. Preempt channel is always connected later than main. It means in the test logic of "which channel is which", it should be: - If it's a multifd channel (check multifd header match), it's a multifd channel, otherwise, - if main channel is not present yet, it must be the main channel (and we can double check the main channel magic), otherwise, - it's the preempt channel With that, I think we can drop the new magic sent here. 
Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-04 17:18 ` Peter Xu @ 2024-11-05 11:19 ` Prasad Pandit 2024-11-05 13:00 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-05 11:19 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, 4 Nov 2024 at 22:48, Peter Xu <peterx@redhat.com> wrote: > Firstly, we'll need a way to tell mgmt that the new qemu binary supports > enablement of both multifd + postcopy feature. That can be done with a > > "features": [ "postcopy-with-multifd-precopy" ] > > Flag attached to the QMP "migrate" command. * IIUC, currently virsh(1)/libvirtd(8) sends the migration command to the destination to inform it of the migration features to use, whether to use multifd or postcopy or none etc. Based on that the destination QEMU prepares to accept incoming VM. Not sure how/what above flag shall benefit. > Then, I think we don't need a magic for preempt channel, because new qemu > binaries (after 7.2) have no issue on out-of-order connections between main > / preempt channel. Preempt channel is always connected later than main. > > It means in the test logic of "which channel is which", it should be: > > - If it's a multifd channel (check multifd header match), it's a multifd > channel, otherwise, > > - if main channel is not present yet, it must be the main channel (and > we can double check the main channel magic), otherwise, > > - it's the preempt channel > > With that, I think we can drop the new magic sent here. * Sending magic value on the postcopy channel only makes it consistent with other channels. There's no harm in sending it. Explicitly defining/sending the magic value is better than leaving it for the code/reader to figure it out. Is there a compelling reason to drop it? IMO, we should either define/send the magic value for all channels or none. Both ways are consistent. Defining it for few and not for others does not seem right. 
Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-05 11:19 ` Prasad Pandit @ 2024-11-05 13:00 ` Peter Xu 2024-11-06 12:19 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-05 13:00 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Tue, Nov 05, 2024 at 04:49:23PM +0530, Prasad Pandit wrote: > On Mon, 4 Nov 2024 at 22:48, Peter Xu <peterx@redhat.com> wrote: > > Firstly, we'll need a way to tell mgmt that the new qemu binary supports > > enablement of both multifd + postcopy feature. That can be done with a > > > > "features": [ "postcopy-with-multifd-precopy" ] > > > > Flag attached to the QMP "migrate" command. > > * IIUC, currently virsh(1)/libvirtd(8) sends the migration command to > the destination to inform it of the migration features to use, whether > to use multifd or postcopy or none etc. Based on that the destination > QEMU prepares to accept incoming VM. Not sure how/what above flag > shall benefit. See: https://www.qemu.org/docs/master/devel/qapi-code-gen.html Sometimes, the behaviour of QEMU changes compatibly, but without a change in the QMP syntax (usually by allowing values or operations that previously resulted in an error). QMP clients may still need to know whether the extension is available. For this purpose, a list of features can be specified for definitions, enumeration values, and struct members. Each feature list member can either be { 'name': STRING, '*if': COND }, or STRING, which is shorthand for { 'name': STRING }. > > > Then, I think we don't need a magic for preempt channel, because new qemu > > binaries (after 7.2) have no issue on out-of-order connections between main > > / preempt channel. Preempt channel is always connected later than main. 
> > > > It means in the test logic of "which channel is which", it should be: > > > > - If it's a multifd channel (check multifd header match), it's a multifd > > channel, otherwise, > > > > - if main channel is not present yet, it must be the main channel (and > > we can double check the main channel magic), otherwise, > > > > - it's the preempt channel > > > > With that, I think we can drop the new magic sent here. > > * Sending magic value on the postcopy channel only makes it consistent > with other channels. There's no harm in sending it. Explicitly > defining/sending the magic value is better than leaving it for the > code/reader to figure it out. Is there a compelling reason to drop it? > IMO, we should either define/send the magic value for all channels or > none. Both ways are consistent. Defining it for few and not for others > does not seem right. It's a legacy issue as not all features are developed together, and that was planned to be fixed together with handshake. I think the handshake could introduce one header on top to pair channels. IMHO it is an overkill to add a feature now if it works even if tricky, because it's not the 1st day it was tricky. And we're going to have another header very soon.. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-05 13:00 ` Peter Xu @ 2024-11-06 12:19 ` Prasad Pandit 2024-11-06 13:11 ` Fabiano Rosas 2024-11-06 16:00 ` Peter Xu 1 sibling, 2 replies; 33+ messages in thread From: Prasad Pandit @ 2024-11-06 12:19 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Tue, 5 Nov 2024 at 18:30, Peter Xu <peterx@redhat.com> wrote: > https://www.qemu.org/docs/master/devel/qapi-code-gen.html > > Sometimes, the behaviour of QEMU changes compatibly, but without a > change in the QMP syntax (usually by allowing values or operations > that previously resulted in an error). QMP clients may still need > to know whether the extension is available. > > For this purpose, a list of features can be specified for > definitions, enumeration values, and struct members. Each feature > list member can either be { 'name': STRING, '*if': COND }, or > STRING, which is shorthand for { 'name': STRING }. * I see, okay. > It's a legacy issue as not all features are developed together, and that > was planned to be fixed together with handshake. I think the handshake > could introduce one header on top to pair channels. > > IMHO it is an overkill to add a feature now if it works even if tricky, > because it's not the 1st day it was tricky. And we're going to have another > header very soon.. * See, the current (this series) 'if' conditional in the migration_ioc_process_incoming() function is as simple as: if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { peek_magic_bytes() ... } If we don't send a magic value for the postcopy channel, then we must avoid peeking into magic bytes when postcopy is enabled, because otherwise the thread will block peeking into the magic bytes, so the 'if' conditional becomes: if (migrate_multifd() && !migrate_postcopy() && qio_channel_has_feature (...) 
} else { When migrate_postcopy() is true It'll reach here not only for the 'Postcopy' channel, but even for the 'default' and 'multifd' channels which send the magic bytes. Then here again we'll need to identify different channels, right? } * Let's not make it so complex. Let's send the magic value for the postcopy channel and simplify it. If 'handshake' feature is going to redo it, so be it, what's the difference? OR maybe we can align it with the 'handshake' feature or as part of it or something like that. @Fabiano Rosas : may I know more about the 'handshake' feature? What it'll do and not do? Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-06 12:19 ` Prasad Pandit @ 2024-11-06 13:11 ` Fabiano Rosas 2024-11-07 12:05 ` Prasad Pandit 2024-11-06 16:00 ` Peter Xu 1 sibling, 1 reply; 33+ messages in thread From: Fabiano Rosas @ 2024-11-06 13:11 UTC (permalink / raw) To: Prasad Pandit, Peter Xu; +Cc: qemu-devel, Prasad Pandit Prasad Pandit <ppandit@redhat.com> writes: > On Tue, 5 Nov 2024 at 18:30, Peter Xu <peterx@redhat.com> wrote: >> https://www.qemu.org/docs/master/devel/qapi-code-gen.html >> >> Sometimes, the behaviour of QEMU changes compatibly, but without a >> change in the QMP syntax (usually by allowing values or operations >> that previously resulted in an error). QMP clients may still need >> to know whether the extension is available. >> >> For this purpose, a list of features can be specified for >> definitions, enumeration values, and struct members. Each feature >> list member can either be { 'name': STRING, '*if': COND }, or >> STRING, which is shorthand for { 'name': STRING }. > > * I see, okay. > >> It's a legacy issue as not all features are developed together, and that >> was planned to be fixed together with handshake. I think the handshake >> could introduce one header on top to pair channels. >> >> IMHO it is an overkill to add a feature now if it works even if tricky, >> because it's not the 1st day it was tricky. And we're going to have another >> header very soon.. > > * See, current (this series) 'if' conditional in the > migration_ioc_process_incoming() function is simple as: > > if (qio_channel_has_feature(ioc, > QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { peek_magic_bytes() ... } > > If we don't send magic value for the postcopy channel, then we avoid > peeking into magic bytes when postcopy is enabled, because otherwise > thread will block peeking into the magic bytes, so the 'if' > conditional becomes: > > if (migrate_multifd() && !migrate_postcopy() && > qio_channel_has_feature (...) 
) { > peek_magic_bytes() > ... > } else { > When migrate_postcopy() is true > It'll reach here not only for the 'Postcopy' channel, but even > for the 'default' and 'multifd' channels which send the magic bytes. > Then here again we'll need to identify different channels, right? > } > > * Let's not make it so complex. Let's send the magic value for the > postcopy channel and simplify it. If 'handshake' feature is going to > redo it, so be it, what's the difference? OR maybe we can align it > with the 'handshake' feature or as part of it or something like that. > > @Fabiano Rosas : may I know more about the 'handshake' feature? What > it'll do and not do? What we're thinking is having an initial exchange of information between src & dst as soon as migration starts and that would sync the capabilities and parameters between both sides. Which would then be followed by a channel establishment phase that would open each necessary channel (according to caps) in order, removing the current ambiguity. > > Thank you. > --- > - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-06 13:11 ` Fabiano Rosas @ 2024-11-07 12:05 ` Prasad Pandit 2024-11-07 12:11 ` Fabiano Rosas 2024-11-07 12:33 ` Daniel P. Berrangé 0 siblings, 2 replies; 33+ messages in thread From: Prasad Pandit @ 2024-11-07 12:05 UTC (permalink / raw) To: Fabiano Rosas; +Cc: Peter Xu, qemu-devel, Prasad Pandit On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <farosas@suse.de> wrote: > What we're thinking is having an initial exchange of information between > src & dst as soon as migration starts and that would sync the > capabilities and parameters between both sides. Which would then be > followed by a channel establishment phase that would open each necessary > channel (according to caps) in order, removing the current ambiguity. > * Isn't that how it works? IIUC, libvirtd(8) sends migration command options to the destination and based on that the destination prepares for the multifd and/or postcopy migration. In case of 'Postcopy' the source sends 'postcopy advise' to the destination to indicate that postcopy might follow at the end of precopy. Also, in the discussion above Peter mentioned that libvirtd(8) may exchange list of features between source and destination to facilitate QMP clients. * What is the handshake doing differently? (just trying to understand) Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 12:05 ` Prasad Pandit @ 2024-11-07 12:11 ` Fabiano Rosas 2024-11-07 12:33 ` Daniel P. Berrangé 1 sibling, 0 replies; 33+ messages in thread From: Fabiano Rosas @ 2024-11-07 12:11 UTC (permalink / raw) To: Prasad Pandit; +Cc: Peter Xu, qemu-devel, Prasad Pandit Prasad Pandit <ppandit@redhat.com> writes: > On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <farosas@suse.de> wrote: >> What we're thinking is having an initial exchange of information between >> src & dst as soon as migration starts and that would sync the >> capabilities and parameters between both sides. Which would then be >> followed by a channel establishment phase that would open each necessary >> channel (according to caps) in order, removing the current ambiguity. >> > > * Isn't that how it works? IIUC, libvirtd(8) sends migration command > options to the destination and based on that the destination prepares > for the multifd and/or postcopy migration. In case of 'Postcopy' the > source sends 'postcopy advise' to the destination to indicate that > postcopy might follow at the end of precopy. Also, in the discussion > above Peter mentioned that libvirtd(8) may exchange list of features > between source and destination to facilitate QMP clients. > > * What is the handshake doing differently? (just trying to understand) The handshake will be a QEMU-only feature. Libvirt will then only start the migration on src and QEMU will do the capabilities handling. > > Thank you. > --- > - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 12:05 ` Prasad Pandit 2024-11-07 12:11 ` Fabiano Rosas @ 2024-11-07 12:33 ` Daniel P. Berrangé 2024-11-07 16:17 ` Peter Xu 1 sibling, 1 reply; 33+ messages in thread From: Daniel P. Berrangé @ 2024-11-07 12:33 UTC (permalink / raw) To: Prasad Pandit; +Cc: Fabiano Rosas, Peter Xu, qemu-devel, Prasad Pandit On Thu, Nov 07, 2024 at 05:35:06PM +0530, Prasad Pandit wrote: > On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <farosas@suse.de> wrote: > > What we're thinking is having an initial exchange of information between > > src & dst as soon as migration starts and that would sync the > > capabilities and parameters between both sides. Which would then be > > followed by a channel establishment phase that would open each necessary > > channel (according to caps) in order, removing the current ambiguity. > > > > * Isn't that how it works? IIUC, libvirtd(8) sends migration command > options to the destination and based on that the destination prepares > for the multifd and/or postcopy migration. In case of 'Postcopy' the > source sends 'postcopy advise' to the destination to indicate that > postcopy might follow at the end of precopy. Also, in the discussion > above Peter mentioned that libvirtd(8) may exchange list of features > between source and destination to facilitate QMP clients. > > * What is the handshake doing differently? (just trying to understand) Libvirt does what it does because it has had no other choice, not because it was good or desirable. This kind of handshake really does not belong in libvirt. A number of exposed migration protocol feature knobs should be considered private to QEMU only. It has the very negative consequence that every time QEMU wants to provide a new feature in migration, it needs to be plumbed up through libvirt, and often applications above, and those 3rd party projects need to be told when & where to use the new features. 
The 3rd party developers have their own project dev priorities so may not get around to enabling the new migration features for years, if ever, undermining the work of QEMU's migration maintainers. As examples... If we had QEMU self-negotiation of features 10 years ago, everywhere would already be using multifd out of the box. QEMU would have been able to self-negotiate use of the new "multifd" protocol, and QEMU would be well on its way to being able to delete the old single-threaded migration code. Similarly post-copy would have been way easier for apps, QEMU would auto-negotiate a channel for the post-copy async page fetching. All migrations would be running with the post-copy feature available. All that libvirt & apps would have needed was an API to initiate the switch to post-copy mode. Or the hacks QEMU has put in place where we peek at incoming data on some channels to identify the channel type would not exist. TL;DR: once QEMU can self-negotiate features for migration itself, the implementation burden for libvirt & applications is greatly reduced. QEMU migration maintainers will control their own destiny, able to deliver improvements to users much more quickly, be able to delete obsolete features more quickly, and be able to make migration *automatically* enable new features & pick the optimal defaults on their own expert knowledge, not waiting for 3rd parties to pay attention years later. Some things will still need work & decisions in libvirt & apps, but this burden should be reduced over the long term. Ultimately everyone will win. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 12:33 ` Daniel P. Berrangé @ 2024-11-07 16:17 ` Peter Xu 2024-11-07 16:57 ` Daniel P. Berrangé 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-07 16:17 UTC (permalink / raw) To: Daniel P. Berrangé Cc: Prasad Pandit, Fabiano Rosas, qemu-devel, Prasad Pandit On Thu, Nov 07, 2024 at 12:33:17PM +0000, Daniel P. Berrangé wrote: > On Thu, Nov 07, 2024 at 05:35:06PM +0530, Prasad Pandit wrote: > > On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <farosas@suse.de> wrote: > > > What we're thinking is having an initial exchange of information between > > > src & dst as soon as migration starts and that would sync the > > > capabilities and parameters between both sides. Which would then be > > > followed by a channel establishment phase that would open each necessary > > > channel (according to caps) in order, removing the current ambiguity. > > > > > > > * Isn't that how it works? IIUC, libvirtd(8) sends migration command > > options to the destination and based on that the destination prepares > > for the multifd and/or postcopy migration. In case of 'Postcopy' the > > source sends 'postcopy advise' to the destination to indicate that > > postcopy might follow at the end of precopy. Also, in the discussion > > above Peter mentioned that libvirtd(8) may exchange list of features > > between source and destination to facilitate QMP clients. > > > > * What is the handshake doing differently? (just trying to understand) > > Libvirt does what it does because it has had no other choice, > not because it was good or desirable. > > This kind of handshake really does not belong in libvirt. A number > of exposed migration protocol feature knobs should be considered > private to QEMU only. 
> > It has the very negative consequence that every time QEMU wants to > provide a new feature in migration, it needs to be plumbed up through > libvirt, and often applications above, and those 3rd party projects > need to be told when & where to use the new features. The 3rd party > developers have their own project dev priorities so may not get > around to enable the new migration features for years, if ever, > undermining the work of QEMU's migration maintainers. > > As examples... > > If we had QEMU self-negotiation of features 10 years ago, everywhere > would already be using multifd out of the box. QEMU would have been > able to self-negotiate use of the new "multifd" protocol, and QEMU > would be well on its way to being able to delete the old single- > threaded migration code. > > Similarly post-copy would have been way easier for apps, QEMU would > auto-negotiate a channel for the post-copy async page fetching. All > migrations would be running with the post-copy feature available. > All that libvirt & apps would have needed was a API to initiate the > switch to post-copy mode. > > Or the hacks QEMU has put in place where we peek at incoming data > on some channels to identify the channel type would not exist. > > > TL;DR: once QEMU can self-negotiate features for migration itself, > the implementation burden for libvirt & applications is greatly > reduced. QEMU migration maintainers will control their own destiny, > able to deliver improvements to users much more quickly, be able > to delete obsolete features more quickly, and be able to make > migration *automatically* enable new features & pick the optimal > defaults on their own expert knowledge, not waitnig for 3rd parties > to pay attention years later. > > Some things will still need work & decisions in libvirt & apps, > but this burden should be reduced compared over the long term. > Ultimately everyone will win. 
I'll comment on a few examples above, which I think some of them, even if handshake is ready, may still need mgmt layers to involve.. Multifd and postcopy are the two major features, and they, IMHO, all better need user involvements.. Multifd needs it because it relies on the channel being able to provide >1 channels. It means "| nc XXX > file" can stop working if we turn it on by default silently. We also see generic use case in containers now that when there're dedicated cores for vcpus and "no dedicate" / "one" core setup for qemu hypervisor threads, it means anything like multifd or even block layer iothreads can be pure overheads comparing to one thread / one channel does the job.. in those cases they're better disabled. Postcopy, when enabled manually like right now, has one benefit I can think of: when the user will 100% use postcopy, the failure can happen earlier if the dest host doesn't even support it! So it'll generate an error upfront saying "postcopy not supported" before migration starts, comparing to failing later at migrate_start_postcopy, then failing that command, but maybe if so the user may not even start the migration at all, knowing the workload is too busy to converge. It's pretty common that postcopy is not supported on dest due to the fact that people are over-cautious on what userfault could do (which I kind of agree.. it's a tricky but powerful feature), so the unprivileged_userfaultfd=0 knob alone can disable it for many hosts, afaiu.. simply because kvm will be the default accel, and qemu needs kernel fault trapping capability to make postcopy work.. which requires unprivileged userfaults that the qemu process may not have. Maybe the postcopy-preempt mode is the one closest to what we can enable by default, so if it's not a capability we could auto-enable it with/without handshake being there. 
However there's still the tricky part where it starts to require >1 channels, so if it's the default it could break anyone who has a proxy after the socket, for example, iff the proxy only supports one channel.. similar to multifd, but not completely the same. Mapped-ram, definitely needs user involvements, because it will change the image layout. So.. just to say, maybe we'll still need some mgmt help to explicitly enable many of the features after the handshake work, as the mgmt knows better on what is the context of the VM, rather than risking qemu default configs to fail on some existing but working setups. Though at least libvirt will only need to set caps/params only once for each migration. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 16:17 ` Peter Xu @ 2024-11-07 16:57 ` Daniel P. Berrangé 2024-11-07 17:45 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Daniel P. Berrangé @ 2024-11-07 16:57 UTC (permalink / raw) To: Peter Xu; +Cc: Prasad Pandit, Fabiano Rosas, qemu-devel, Prasad Pandit On Thu, Nov 07, 2024 at 11:17:30AM -0500, Peter Xu wrote: > On Thu, Nov 07, 2024 at 12:33:17PM +0000, Daniel P. Berrangé wrote: > I'll comment on a few examples above, which I think some of them, even if > handshake is ready, may still need mgmt layers to involve.. > > Multifd and postcopy are the two major features, and they, IMHO, all better > need user involvements.. > > Multifd needs it because it relies on the channel being able to provide >1 > channels. It means "| nc XXX > file" can stop working if we turn it on by > default silently. NB, my point was referring to a hypothetical alternative history, where we had the handshake at the QEMU level from day 1. That would necessarily imply a bi-directional channel, so the 'nc' use case would already have been out of scope. That said, QEMU could identify whether the channel it was told to use was bi-directional or not, and thus not try to do multifd over a non-socket transport. So the general point still holds - if QEMU had this protocol negotiation phase built-in, there would be more flexibility in introducing new features without the layers above needing changes for every single one, just a subset. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 16:57 ` Daniel P. Berrangé @ 2024-11-07 17:45 ` Peter Xu 2024-11-08 12:37 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-07 17:45 UTC (permalink / raw) To: Daniel P. Berrangé Cc: Prasad Pandit, Fabiano Rosas, qemu-devel, Prasad Pandit On Thu, Nov 07, 2024 at 04:57:46PM +0000, Daniel P. Berrangé wrote: > On Thu, Nov 07, 2024 at 11:17:30AM -0500, Peter Xu wrote: > > On Thu, Nov 07, 2024 at 12:33:17PM +0000, Daniel P. Berrangé wrote: > > I'll comment on a few examples above, which I think some of them, even if > > handshake is ready, may still need mgmt layers to involve.. > > > > Multifd and postcopy are the two major features, and they, IMHO, all better > > need user involvements.. > > > > Multifd needs it because it relies on the channel being able to provide >1 > > channels. It means "| nc XXX > file" can stop working if we turn it on by > > default silently. > > NB, my point was referring to a hypothetical alternative history, > where we had the handshake at the QEMU level from day 1. That > would necessarily imply a bi-directional channel, so the 'nc' > use case would already have been out of scope. That said, QEMU > could identify whether the channel it was told to use was > bi-directional or not, and thus not try to do multifd over > a non-socket transport. Ah, that's true. > > So the general point still holds - if QEMU had this protocol > negotiation phase built-in, there would be more flexibility in > introducing new features without layers above needing changes, > for every single one, just a subset. Yes. Just to mention, we can already do that now without handshake, as long as the feature has zero side effect, and as long as we don't expose it as a migration capability (which are by default all off). In that case, we can make the property default to "on", and add "off" in machine compat properties. 
IOW, the machine compat property can play part of the role of a handshake, relying on the fact that the machine type must be the same when initiating QEMU on both hosts. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
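The machine-compat-property trick Peter describes can be mimicked in miniature. This is a hedged sketch: QEMU's real mechanism attaches GlobalProperty arrays to machine classes, while the table, the "x-new-feature" property name, and the resolver below are illustrative stand-ins, not QEMU code.

```c
#include <stdbool.h>
#include <string.h>

/* Shape of QEMU's GlobalProperty compat entries: driver, property, value. */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Hypothetical: a migration feature whose property default is "on", with
 * an older machine type pinning it back to "off" via its compat list. */
static const CompatProp compat_old_machine[] = {
    { "migration", "x-new-feature", "off" },
};

/* Resolve the effective value: a compat entry overrides the built-in
 * default, so an old machine type keeps the old behaviour on both hosts. */
static bool feature_enabled(const CompatProp *compat, int n,
                            const char *driver, const char *prop)
{
    for (int i = 0; i < n; i++) {
        if (!strcmp(compat[i].driver, driver) &&
            !strcmp(compat[i].property, prop)) {
            return !strcmp(compat[i].value, "on");
        }
    }
    return true; /* built-in default: on */
}
```

The point of the design is that both hosts derive the same effective default purely from the (identical) machine type, with no negotiation traffic on the wire.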
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 17:45 ` Peter Xu @ 2024-11-08 12:37 ` Prasad Pandit 2024-11-08 13:25 ` Fabiano Rosas 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-08 12:37 UTC (permalink / raw) To: Peter Xu Cc: Daniel P. Berrangé, Fabiano Rosas, qemu-devel, Prasad Pandit Hello Peter, Dan, Fabiano, Thank you so much for the detailed comments, I appreciate it. On Thu, 7 Nov 2024 at 17:41, Fabiano Rosas <farosas@suse.de> wrote: > The handshake will be a QEMU-only feature. Libvirt will then only start > the migration on src and QEMU will do the capabilities handling. > On Thu, 7 Nov 2024 at 18:03, Daniel P. Berrangé <berrange@redhat.com> wrote: > Libvirt does what it does because it has had no other choice, > not because it was good or desirable. > > This kind of handshake really does not belong in libvirt. A number > of exposed migration protocol feature knobs should be considered > private to QEMU only. * Right, okay. * So then IIUC, libvirtd(8) would invoke migration by sending (without first checking with the destination libvirtd(8)) the migration command to the source QEMU and in qmp_migrate() before setting up the required connections, QEMU will add the feature negotiation (or handshake) step, right? > It has the very negative consequence that every time QEMU wants to > provide a new feature in migration, it needs to be plumbed up through > libvirt, and often applications above, and those 3rd party projects > need to be told when & where to use the new features. The 3rd party > developers have their own project dev priorities so may not get > around to enable the new migration features for years, if ever, > undermining the work of QEMU's migration maintainers. * True. I've seen the mismatch/disconnect between QEMU features and how libvirtd(8)/virsh(1) offers them to the end users. ex. What QEMU calls Multifd, virsh(1) calls --parallel-connections. 
Features like 'postcopy-preempt-channel' are implemented in QEMU, but there is no way for an end user to see/enable/disable them from the virsh(1) side. * TBH, Multifd is such a misnomer, it could have been a plain simple --threads <N> option. ex: virsh migrate --threads <N>: precopy migration with <N> threads, default <N> = 1. Irrespective of the number of threads, the underlying migration mechanism/protocol remains the same. Threads only help to accelerate the rate of data transfer through multiple connections. We don't have to differentiate channels by sending magic values then. * Since we are thinking about correcting past wrongs, does having a unified migration protocol make sense? OR is that too ambitious a thing to think about? (just checking) * Meanwhile, @Fabiano Rosas If you have not started with the handshake or feature negotiation in QEMU, I'd try it on my side and we can discuss how the handshake should work. If you've already started with it, I'd be happy to help in any way possible. * Are we thinking about dynamically adjusting migration features based on their availability on both sides? Ex. say the source says open 10 parallel connections, but the destination can do only 5; then the source either reports an error and terminates migration, or continues with 5 threads/connections. OR say the user does not mention parallel connections, but we still automatically start multiple threads because both ends support it? Just checking about dynamic adjustments, because earlier while discussing with Peter, he mentioned that we cannot enable/disable user-supplied options. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-08 12:37 ` Prasad Pandit @ 2024-11-08 13:25 ` Fabiano Rosas 0 siblings, 0 replies; 33+ messages in thread From: Fabiano Rosas @ 2024-11-08 13:25 UTC (permalink / raw) To: Prasad Pandit, Peter Xu Cc: Daniel P. Berrangé, qemu-devel, Prasad Pandit Prasad Pandit <ppandit@redhat.com> writes: > Hello Peter, Dan, Fabiano, > > Thank you so much for the detailed comments, I appreciate it. > > On Thu, 7 Nov 2024 at 17:41, Fabiano Rosas <farosas@suse.de> wrote: >> The handshake will be a QEMU-only feature. Libvirt will then only start >> the migration on src and QEMU will do the capabilities handling. >> > On Thu, 7 Nov 2024 at 18:03, Daniel P. Berrangé <berrange@redhat.com> wrote: >> Libvirt does what it does because it has had no other choice, >> not because it was good or desirable. >> >> This kind of handshake really does not belong in libvirt. A number >> of exposed migration protocol feature knobs should be considered >> private to QEMU only. > > * Right, okay. > > * So then IIUC, libvirtd(8) would invoke migration by sending (without > first checking with the destination libvirtd(8)) the migration command > to the source QEMU and in qmp_migrate() before setting up the required > connections, QEMU will add the feature negotiation (or handshake) > step, right? Yes, but there are still some points to figure out, as Peter mentioned, such as how to handle capabilities for which there is a high chance that libvirt does actually want to control, e.g. multifd. One approach is to just continue to allow migrate-set-caps before qmp-migrate and take those into account during handshake as well. > >> It has the very negative consequence that every time QEMU wants to >> provide a new feature in migration, it needs to be plumbed up through >> libvirt, and often applications above, and those 3rd party projects >> need to be told when & where to use the new features. 
The 3rd party >> developers have their own project dev priorities so may not get >> around to enable the new migration features for years, if ever, >> undermining the work of QEMU's migration maintainers. > > * True. I've seen the mismatch/disconnect between QEMU features and > how libvirtd(8)/virsh(1) offers them to the end users. ex. What QEMU > calls Multifd, virsh(1) calls --parallel-connections. Features like > 'postcopy-preempt-channel' are implemented in QEMU, but no way for an > end user to see/enable/disable it from virsh(1) side. > > * TBH, Multifd is such a misnomer, it could have been a plain simple > --threads <N> option. > ex: virsh migrate --threads <N>: precopy migration with <N> > threads, default <N> = 1. > > Irrespective of the number of threads, the underlying migration > mechanism/protocol remains the same. Threads only help to accelerate > the rate of data transfer through multiple connections. We don't have > to differentiate channels by sending magic values then. > > * Since we are thinking about correcting past wrongs, does having a > unified migration protocol make sense? OR that is too ambitious a > thing to think about? (just checking) > > * Meanwhile, @Fabiano Rosas If you have not started with the handshake > or feature negotiation in QEMU, I'd try it on my side and we can > discuss how the handshake should work. If you've already started with > it, I'd be happy to help in some way as possible. > I'm putting together a prototype that we can iterate on. I'll let you know as soon as I have something I can share. > * Are we thinking about dynamically adjusting migration features based > on their availability on both sides? Ex. say source says open 10 > parallel connections, but destination can do only 5, then source > reports an error and terminates migration or continues with 5 > threads/connections OR say user does not mention parallel connections, > but still we automatically start multiple threads because both ends > support it? 
Just checking about dynamic adjustments, because earlier > while discussing with Peter, he mentioned that we can not > enable/disable user supplied options. > > Thank you. > --- > - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-06 12:19 ` Prasad Pandit 2024-11-06 13:11 ` Fabiano Rosas @ 2024-11-06 16:00 ` Peter Xu 2024-11-07 11:52 ` Prasad Pandit 1 sibling, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-06 16:00 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Wed, Nov 06, 2024 at 05:49:23PM +0530, Prasad Pandit wrote: > On Tue, 5 Nov 2024 at 18:30, Peter Xu <peterx@redhat.com> wrote: > > https://www.qemu.org/docs/master/devel/qapi-code-gen.html > > > > Sometimes, the behaviour of QEMU changes compatibly, but without a > > change in the QMP syntax (usually by allowing values or operations > > that previously resulted in an error). QMP clients may still need > > to know whether the extension is available. > > > > For this purpose, a list of features can be specified for > > definitions, enumeration values, and struct members. Each feature > > list member can either be { 'name': STRING, '*if': COND }, or > > STRING, which is shorthand for { 'name': STRING }. > > * I see, okay. > > > It's a legacy issue as not all features are developed together, and that > > was planned to be fixed together with handshake. I think the handshake > > could introduce one header on top to pair channels. > > > > IMHO it is an overkill to add a feature now if it works even if tricky, > > because it's not the 1st day it was tricky. And we're going to have another > > header very soon.. > > * See, current (this series) 'if' conditional in the > migration_ioc_process_incoming() function is simple as: > > if (qio_channel_has_feature(ioc, > QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { peek_magic_bytes() ... } IIUC we can't simply change it to this. We can do this with a compat property and we can start sending a magic in the preempt channel, but then the code still needs to keep working with old binaries of QEMU, so in all cases we'll need to keep the old complexity for quite a while. 
IOW, I think it may break migrations from old QEMUs when postcopy preempt is enabled, because then the new QEMU (with your patch applied) will always peek the bytes assuming the magic is there, but old binaries don't send it. Handshake, in my mind, will use a totally separate path, then the hope is we'll move to that with more machine types and finally obsolete / remove this path. > > If we don't send magic value for the postcopy channel, then we avoid > peeking into magic bytes when postcopy is enabled, because otherwise > thread will block peeking into the magic bytes, so the 'if' > conditional becomes: > > if (migrate_multifd() && !migrate_postcopy() && > qio_channel_has_feature (...) ) { > peek_magic_bytes() > ... > } else { > When migrate_postcopy() is true > It'll reach here not only for the 'Postcopy' channel, but even > for the 'default' and 'multifd' channels which send the magic bytes. > Then here again we'll need to identify different channels, right? > } > > * Let's not make it so complex. Let's send the magic value for the Firstly, the complexity is there on migration, requiring it to work with old qemu binaries, bi-directionally by default. In your case you're changing the receiving side, so it's even more important, because it's common for old QEMU to migrate to new ones. What I want to avoid is introducing two flags doing the same thing in a short period. If we want, we can merge the effort, I'll leave that to you and Fabiano to decide, so that maybe you can work out the channel establishment part of things. But note again that I still think your goal of enabling multifd + postcopy may not require that new flag yet, simply because after 7.2 qemu will connect preempt channel later than the main channel. I think logically QEMU can identify which channel is which: the preempt channel must be established in this case after both main thread and multifd threads. 
Meanwhile, as mentioned above, we still need to make pre-7.2 work in general for migration in most cases (when there's network instability it won't work.. so that's unavoidable).. it's indeed slightly complicated, but hopefully it could still work. In all cases, we may want to test postcopy preempt from a 7.1 qemu to the new qemu to keep it working when the patch is ready (and you can already try that with the current patch). > postcopy channel and simplify it. If 'handshake' feature is going to > redo it, so be it, what's the difference? OR maybe we can align it > with the 'handshake' feature or as part of it or something like that. > > @Fabiano Rosas : may I know more about the 'handshake' feature? What > it'll do and not do? > > Thank you. > --- > - Prasad > -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
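The peek-and-classify step being debated in this subthread can be shown in isolation. A hedged sketch, not QEMU's migration_ioc_process_incoming(): MULTIFD_MAGIC (0x11223344) is the constant moved into multifd.h by patch 1, and QEMU_VM_FILE_MAGIC ("QEVM") marks the main stream, but the postcopy magic value and the helper itself are hypothetical stand-ins for what patch 2 proposes.

```c
#include <stdint.h>

/* Real values from QEMU sources: multifd.h (patch 1) and the vmstate file
 * format ("QEVM"). */
#define MULTIFD_MAGIC      0x11223344U
#define QEMU_VM_FILE_MAGIC 0x5145564DU
/* Hypothetical value this series would send on the postcopy channel; the
 * exact constant is an assumption for the example. */
#define POSTCOPY_MAGIC     0x55667788U

typedef enum {
    CH_MAIN,      /* main migration stream */
    CH_MULTIFD,   /* one of the multifd data channels */
    CH_POSTCOPY,  /* postcopy (preempt) channel */
    CH_UNKNOWN,
} ChannelType;

/* Classify an incoming connection from the first four bytes that were
 * peeked off the socket (network byte order handling omitted here). */
static ChannelType channel_type_for_magic(uint32_t magic)
{
    switch (magic) {
    case QEMU_VM_FILE_MAGIC: return CH_MAIN;
    case MULTIFD_MAGIC:      return CH_MULTIFD;
    case POSTCOPY_MAGIC:     return CH_POSTCOPY;
    default:                 return CH_UNKNOWN;
    }
}
```

Peeking (MSG_PEEK) rather than reading is what makes this safe: the magic stays queued for the real channel setup code. It is also why a channel that never sends a magic, like the pre-patch postcopy channel, would block the peek, which is the crux of the discussion above.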
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-06 16:00 ` Peter Xu @ 2024-11-07 11:52 ` Prasad Pandit 2024-11-07 15:56 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-07 11:52 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Wed, 6 Nov 2024 at 21:30, Peter Xu <peterx@redhat.com> wrote: > IIUC we can't simply change it to this. We can do this with a compat > property and we can start sending a magic in the preempt channel, but then > the code still needs to keep working with old binaries of QEMU, so in all > cases we'll need to keep the old complexity for quite a while. * I see...sigh. > Handshake, in my mind, will use a totally separate path, then the hope is > we'll move to that with more machine types and finally obsolete / remove > this path. * I need to check how 'separate path' works. > But note again that I still think your goal of enabling multifd + postcopy > may not require that new flag yet, simply because after 7.2 qemu will > connect preempt channel later than the main channel. I think logically > QEMU can identify which channel is which: the preempt channel must be > established in this case after both main thread and multifd threads. * You mean, instead of relying on magic bytes, we check if both 'main channel' and 'multifd channels' already exist, then the incoming connection is assumed to be for the 'postcopy preempt' channel? Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel 2024-11-07 11:52 ` Prasad Pandit @ 2024-11-07 15:56 ` Peter Xu 0 siblings, 0 replies; 33+ messages in thread From: Peter Xu @ 2024-11-07 15:56 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Thu, Nov 07, 2024 at 05:22:05PM +0530, Prasad Pandit wrote: > On Wed, 6 Nov 2024 at 21:30, Peter Xu <peterx@redhat.com> wrote: > > IIUC we can't simply change it to this. We can do this with a compat > > property and we can start sending a magic in the preempt channel, but then > > the code still needs to keep working with old binaries of QEMU, so in all > > cases we'll need to keep the old complexity for quite a while. > > * I see...sigh. > > > Handshake, in my mind, will use a totally separate path, then the hope is > > we'll move to that with more machine types and finally obsolete / remove > > this path. > > * I need to check how 'separate path' works. The plan is that handshake will only happen on the main channel, and that includes waiting for all the channels to be established. There, each channel may need to have its own header; it could be a new handshake header that every migration channel starts with when establishing, so the channels are named and the dest qemu handshake logic can "understand" which channel is which, and properly assign those channels in the handshake.c logic, for example. On src, now we kick off migration by migration_should_start_incoming() returning true, relying only on "whether qemu has all the channels ready". The handshake code can do more, so it checks more; then after the handshake is ready it can directly invoke migration_incoming_process() in the separate path, by which stage it also needs to guarantee channel establishment. We'll need to keep this path though for "if (!migrate_handshake())". 
> > But note again that I still think your goal of enabling multifd + postcopy > > may not require that new flag yet, simply because after 7.2 qemu will > > connect preempt channel later than the main channel. I think logically > > QEMU can identify which channel is which: the preempt channel must be > > established in this case after both main thread and multifd threads. > > * You mean, instead of relying on magic bytes, we check if both 'main > channel' and 'multifd channels' already exist, then the incoming > connection is assumed to be for the 'postcopy preempt' channel? Exactly. So we keep the fact that we should only peek() if multifd is enabled at least. Then in your case, even when postcopy is also enabled, it won't connect the preempt channel until very late (entry of postcopy_start()), and it'll only connect if it receives a PONG first (migration_wait_main_channel()). That means the dest has all multifd+main channels ready, otherwise no PONG will be generated. This makes sure we'll never try to peek() on the preempt channel (which may hang) as long as it's always the last one. I think 7.1 will keep working even if it behaves differently (it connects the preempt channel earlier, see preempt_pre_7_2), because the 1st requirement in the new code (and also in the old code) is that we only peek() if multifd is enabled, while multifd could not previously be enabled together with postcopy/preempt, so we can imagine old users only enable either multifd or postcopy. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
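Peter's ordering argument can be condensed into a small decision routine. A hedged sketch with made-up names: it assumes the post-7.2 behaviour where the preempt channel is connected only after the main channel and all multifd channels are up (gated by the PONG on the main channel), and it glosses over the fact that real QEMU still peeks magic bytes to tell main and multifd connections apart when they race.

```c
#include <stdbool.h>

/* Simplified view of the destination's connection state. */
typedef struct {
    bool main_ready;          /* main channel established */
    int  multifd_ready;       /* multifd channels established so far */
    int  multifd_expected;    /* negotiated multifd channel count */
    bool postcopy_preempt_on; /* preempt capability enabled */
} IncomingState;

typedef enum { CH_MAIN, CH_MULTIFD, CH_PREEMPT, CH_REJECT } ChannelKind;

/* Decide what a newly accepted connection must be, using arrival order
 * alone: main first, then the multifd channels, and only once everything
 * else is up can the connection be the preempt channel. */
static ChannelKind classify_by_order(const IncomingState *s)
{
    if (!s->main_ready) {
        return CH_MAIN;
    }
    if (s->multifd_ready < s->multifd_expected) {
        return CH_MULTIFD;
    }
    if (s->postcopy_preempt_on) {
        return CH_PREEMPT;
    }
    return CH_REJECT; /* unexpected extra connection */
}
```

The design choice here is that ordering replaces a wire-format change: no magic needs to be sent on the preempt channel as long as it is guaranteed to connect last.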
* [PATCH 3/5] migration: remove multifd check with postcopy 2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit 2024-10-29 15:09 ` [PATCH 1/5] migration/multifd: move macros to multifd header Prasad Pandit 2024-10-29 15:09 ` [PATCH 2/5] migration/postcopy: magic value for postcopy channel Prasad Pandit @ 2024-10-29 15:09 ` Prasad Pandit [not found] ` <ZyTnWYyHlrJUYQRB@x1n> 2024-10-29 15:09 ` [PATCH 4/5] migration: refactor ram_save_target_page functions Prasad Pandit 2024-10-29 15:09 ` [PATCH 5/5] migration: enable multifd and postcopy together Prasad Pandit 4 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Remove multifd capability check with Postcopy mode. This helps to enable both multifd and postcopy together. Update migrate_multifd() to return false when migration reaches Postcopy phase. In Postcopy phase, source guest is paused, so the migration threads on the source stop sending/pushing data on the channels. The destination guest starts running and Postcopy threads there begin to request/pull data from the source side. 
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/options.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/migration/options.c b/migration/options.c index ad8d6989a8..47c5137d5f 100644 --- a/migration/options.c +++ b/migration/options.c @@ -266,7 +266,8 @@ bool migrate_multifd(void) { MigrationState *s = migrate_get_current(); - return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; + return s->capabilities[MIGRATION_CAPABILITY_MULTIFD] + && !migration_in_postcopy(); } bool migrate_pause_before_switchover(void) @@ -479,11 +480,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not compatible with ignore-shared"); return false; } - - if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { - error_setg(errp, "Postcopy is not yet compatible with multifd"); - return false; - } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { -- 2.47.0 ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 3/5] migration: remove multifd check with postcopy [not found] ` <ZyTnWYyHlrJUYQRB@x1n> @ 2024-11-04 12:23 ` Prasad Pandit 2024-11-04 16:52 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-04 12:23 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Fri, 1 Nov 2024 at 20:06, Peter Xu <peterx@redhat.com> wrote: > > - return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; > > + return s->capabilities[MIGRATION_CAPABILITY_MULTIFD] > > + && !migration_in_postcopy(); > > } > > We need to keep this as-is.. I'm afraid. > You can always do proper check with multifd & !postcopy in your use cases. * Above change simplifies it a lot. Separate checks as migrate_multifd() && !migration_in_postcopy() make it more complicated to follow, because migrate_multifd() is often combined with other checks like migrate_multifd_flush, or migrate_mapped_ram etc. I was hoping to avoid adding one more check to those conditionals. Also, with the above change we don't have to explicitly check where to add !migration_in_postcopy() check. * Will try to separate them. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/5] migration: remove multifd check with postcopy 2024-11-04 12:23 ` Prasad Pandit @ 2024-11-04 16:52 ` Peter Xu 2024-11-05 9:50 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-04 16:52 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, Nov 04, 2024 at 05:53:22PM +0530, Prasad Pandit wrote: > On Fri, 1 Nov 2024 at 20:06, Peter Xu <peterx@redhat.com> wrote: > > > - return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; > > > + return s->capabilities[MIGRATION_CAPABILITY_MULTIFD] > > > + && !migration_in_postcopy(); > > > } > > > > We need to keep this as-is.. I'm afraid. > > You can always do proper check with multifd & !postcopy in your use cases. > > * Above change simplifies it a lot. Separate checks as > migrate_multifd() && !migration_in_postcopy() make it more complicated > to follow, because migrate_multifd() is often combined with other > checks like migrate_multifd_flush, or migrate_mapped_ram etc. I was > hoping to avoid adding one more check to those conditionals. Also, > with the above change we don't have to explicitly check where to add > !migration_in_postcopy() check. We definitely need a helper like this to simply detect what the user chose on the feature. You can still introduce a new helper, e.g. migrate_multifd_precopy(), if that simplifies the code. Thanks, > > * Will try to separate them. > > Thank you. > --- > - Prasad > -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/5] migration: remove multifd check with postcopy 2024-11-04 16:52 ` Peter Xu @ 2024-11-05 9:50 ` Prasad Pandit 0 siblings, 0 replies; 33+ messages in thread From: Prasad Pandit @ 2024-11-05 9:50 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, 4 Nov 2024 at 22:23, Peter Xu <peterx@redhat.com> wrote: > We definitely need a helper like this to simply detect what the user chose > on the feature. > > You can still introduce a new helper, e.g. migrate_multifd_precopy(), if > that simplifies the code. Okay. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
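The helper Peter suggests in this subthread could look roughly as follows. A hedged sketch: the state is mocked with plain booleans instead of MigrationState, and migrate_multifd_precopy() is only the name floated in the review, not merged code.

```c
#include <stdbool.h>

/* Mocked migration state: in QEMU these live in MigrationState and are
 * read through migrate_get_current(); plain globals keep the sketch small. */
static bool cap_multifd;  /* MIGRATION_CAPABILITY_MULTIFD */
static bool in_postcopy;  /* migration_in_postcopy() */

/* Reports only what the user enabled; unchanged by the postcopy phase,
 * which is the behaviour Peter asks to keep. */
static bool migrate_multifd(void)
{
    return cap_multifd;
}

/* Helper suggested in the review: true only while multifd may actually
 * send data, i.e. before the switchover to the postcopy phase. */
static bool migrate_multifd_precopy(void)
{
    return migrate_multifd() && !in_postcopy;
}
```

Call sites that pair migrate_multifd() with other checks (multifd_flush, mapped-ram) can then use the precopy variant without each one repeating the postcopy test, while capability validation keeps seeing the raw user choice.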
* [PATCH 4/5] migration: refactor ram_save_target_page functions 2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit ` (2 preceding siblings ...) 2024-10-29 15:09 ` [PATCH 3/5] migration: remove multifd check with postcopy Prasad Pandit @ 2024-10-29 15:09 ` Prasad Pandit [not found] ` <ZyToBbvfWkIZ_40W@x1n> 2024-10-29 15:09 ` [PATCH 5/5] migration: enable multifd and postcopy together Prasad Pandit 4 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it avoids reinitialization of the 'migration_ops' object, when migration moves from multifd to postcopy phase. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/ram.c | 54 ++++++++++++++++++++----------------------------- 1 file changed, 22 insertions(+), 32 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 326ce7eb79..f9a6395d00 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1985,18 +1985,36 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len, } /** - * ram_save_target_page_legacy: save one target page + * ram_save_target_page_common: + * send one target page to multifd workers OR save one target page. * - * Returns the number of pages written + * Multifd mode: returns 1 if the page was queued, -1 otherwise. 
+ * + * Non-multifd mode: returns the number of pages written * * @rs: current RAM state * @pss: data about the page we want to send */ -static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) +static int ram_save_target_page_common(RAMState *rs, PageSearchStatus *pss) { ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; int res; + if (migrate_multifd()) { + RAMBlock *block = pss->block; + /* + * While using multifd live migration, we still need to handle zero + * page checking on the migration main thread. + */ + if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { + if (save_zero_page(rs, pss, offset)) { + return 1; + } + } + + return ram_save_multifd_page(block, offset); + } + if (control_save_page(pss, offset, &res)) { return res; } @@ -2008,32 +2026,6 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) return ram_save_page(rs, pss); } -/** - * ram_save_target_page_multifd: send one target page to multifd workers - * - * Returns 1 if the page was queued, -1 otherwise. - * - * @rs: current RAM state - * @pss: data about the page we want to send - */ -static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) -{ - RAMBlock *block = pss->block; - ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; - - /* - * While using multifd live migration, we still need to handle zero - * page checking on the migration main thread. 
- */ - if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { - if (save_zero_page(rs, pss, offset)) { - return 1; - } - } - - return ram_save_multifd_page(block, offset); -} - /* Should be called before sending a host page */ static void pss_host_page_prepare(PageSearchStatus *pss) { @@ -3055,12 +3047,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp) } migration_ops = g_malloc0(sizeof(MigrationOps)); + migration_ops->ram_save_target_page = ram_save_target_page_common; if (migrate_multifd()) { multifd_ram_save_setup(); - migration_ops->ram_save_target_page = ram_save_target_page_multifd; - } else { - migration_ops->ram_save_target_page = ram_save_target_page_legacy; } bql_unlock(); -- 2.47.0 ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 4/5] migration: refactor ram_save_target_page functions [not found] ` <ZyToBbvfWkIZ_40W@x1n> @ 2024-11-04 11:56 ` Prasad Pandit 2024-11-04 17:00 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-04 11:56 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Fri, 1 Nov 2024 at 20:09, Peter Xu <peterx@redhat.com> wrote: > > + if (migrate_multifd()) { > > + RAMBlock *block = pss->block; > > + /* > > + * While using multifd live migration, we still need to handle zero > > + * page checking on the migration main thread. > > + */ > > + if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { > > + if (save_zero_page(rs, pss, offset)) { > > + return 1; > > + } > } > There's one more save_zero_page() below. Please consider properly merging them. if (save_zero_page(rs, pss, offset)) { return 1; } * First is called in migrate_multifd() mode, the second (above) is called in non-multifd mode. Will check how/if we can conflate them. > > migration_ops = g_malloc0(sizeof(MigrationOps)); > > + migration_ops->ram_save_target_page = ram_save_target_page_common; > > If we want to merge the hooks, we should drop the hook in one shot, then > call the new function directly. > * Ie. drop the 'migration_ops' object altogether? And call ram_save_target_page() as it used to be before multifd mode? Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 4/5] migration: refactor ram_save_target_page functions 2024-11-04 11:56 ` Prasad Pandit @ 2024-11-04 17:00 ` Peter Xu 2024-11-05 10:01 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-04 17:00 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, Nov 04, 2024 at 05:26:45PM +0530, Prasad Pandit wrote: > On Fri, 1 Nov 2024 at 20:09, Peter Xu <peterx@redhat.com> wrote: > > > + if (migrate_multifd()) { > > > + RAMBlock *block = pss->block; > > > + /* > > > + * While using multifd live migration, we still need to handle zero > > > + * page checking on the migration main thread. > > > + */ > > > + if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { > > > + if (save_zero_page(rs, pss, offset)) { > > > + return 1; > > > + } > > } > > There's one more save_zero_page() below. Please consider properly merging them. > > if (save_zero_page(rs, pss, offset)) { > return 1; > } > > * First is called in migrate_multifd() mode, the second (above) is > called in non-multifd mode. Will check how/if we can conflate them. Yes, IMHO it's better when merged. One more note here, that even with ZERO_PAGE_DETECTION_MULTIFD, qemu will fallback to use LEGACY in reality when !multifd before. We need to keep that behavior. > > > > migration_ops = g_malloc0(sizeof(MigrationOps)); > > > + migration_ops->ram_save_target_page = ram_save_target_page_common; > > > > If we want to merge the hooks, we should drop the hook in one shot, then > > call the new function directly. > > > > * Ie. drop the 'migration_ops' object altogether? And call > ram_save_target_page() as it used to be before multifd mode? Yes. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 4/5] migration: refactor ram_save_target_page functions 2024-11-04 17:00 ` Peter Xu @ 2024-11-05 10:01 ` Prasad Pandit 2024-11-05 13:01 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-05 10:01 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, 4 Nov 2024 at 22:30, Peter Xu <peterx@redhat.com> wrote: > Yes, IMHO it's better when merged. > > One more note here, that even with ZERO_PAGE_DETECTION_MULTIFD, qemu will > fallback to use LEGACY in reality when !multifd before. We need to keep > that behavior. * Where does this fallback happen? in ram_save_target_page()? > > * Ie. drop the 'migration_ops' object altogether? And call > > ram_save_target_page() as it used to be before multifd mode? > > Yes. Okay, thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 4/5] migration: refactor ram_save_target_page functions
2024-11-05 10:01 ` Prasad Pandit
@ 2024-11-05 13:01 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2024-11-05 13:01 UTC (permalink / raw)
To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit

On Tue, Nov 05, 2024 at 03:31:19PM +0530, Prasad Pandit wrote:
> On Mon, 4 Nov 2024 at 22:30, Peter Xu <peterx@redhat.com> wrote:
> > Yes, IMHO it's better when merged.
> >
> > One more note here, that even with ZERO_PAGE_DETECTION_MULTIFD, qemu will
> > fallback to use LEGACY in reality when !multifd before. We need to keep
> > that behavior.
>
> * Where does this fallback happen? in ram_save_target_page()?

When ZERO_PAGE_DETECTION_MULTIFD is set but the multifd capability is off, it'll use legacy detection even though MULTIFD is selected. We don't change the parameter's value; the fallback will still happen.

--
Peter Xu

^ permalink raw reply [flat|nested] 33+ messages in thread
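The fallback Peter describes can be modeled as a tiny resolver: the configured detection mode is left untouched, and only the effective behavior degrades to LEGACY while the multifd capability is off. The function name here is illustrative, not QEMU's actual options.c code:

```c
#include <stdbool.h>

enum zero_page_detection { ZERO_PAGE_DETECTION_LEGACY, ZERO_PAGE_DETECTION_MULTIFD };

/*
 * The user-visible parameter keeps its configured value; only the behavior
 * falls back, so enabling multifd later restores MULTIFD detection without
 * the user having to re-set the parameter.
 */
static enum zero_page_detection
effective_zero_page_detection(enum zero_page_detection configured, bool multifd_cap)
{
    if (configured == ZERO_PAGE_DETECTION_MULTIFD && !multifd_cap) {
        return ZERO_PAGE_DETECTION_LEGACY;   /* silent fallback */
    }
    return configured;
}
```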
* [PATCH 5/5] migration: enable multifd and postcopy together 2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit ` (3 preceding siblings ...) 2024-10-29 15:09 ` [PATCH 4/5] migration: refactor ram_save_target_page functions Prasad Pandit @ 2024-10-29 15:09 ` Prasad Pandit 2024-11-04 17:48 ` Peter Xu 4 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-10-29 15:09 UTC (permalink / raw) To: qemu-devel; +Cc: Peter Xu, Fabiano Rosas, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks magic value sent on each channel and helps to properly setup multifd and postcopy channels. Idea is to take advantage of the multifd threads to accelerate transfer of large guest RAM to the destination and switch to postcopy mode sooner. The Precopy and Multifd threads work during the initial guest RAM transfer. When migration moves to the Postcopy phase, the source guest is paused, so the Precopy and Multifd threads stop sending data on their channels. Postcopy threads on the destination request/pull data from the source side. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/migration.c | 73 ++++++++++++++++++++++++++----------------- 1 file changed, 44 insertions(+), 29 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 021faee2f3..11fcc1e012 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -92,6 +92,9 @@ enum mig_rp_message_type { MIG_RP_MSG_MAX }; +/* Migration channel types */ +enum { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY }; + /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add dynamic creation of migration */ @@ -919,16 +922,15 @@ void migration_fd_process_incoming(QEMUFile *f) * Returns true when we want to start a new incoming migration process, * false otherwise. 
*/ -static bool migration_should_start_incoming(bool main_channel) +static bool migration_should_start_incoming(uint8_t channel) { + if (channel == CH_POSTCOPY) { + return false; + } + /* Multifd doesn't start unless all channels are established */ if (migrate_multifd()) { - return migration_has_all_channels(); - } - - /* Preempt channel only starts when the main channel is created */ - if (migrate_postcopy_preempt()) { - return main_channel; + return multifd_recv_all_channels_created(); } /* @@ -936,7 +938,7 @@ static bool migration_should_start_incoming(bool main_channel) * it's the main channel that's being created, and we should always * proceed with this channel. */ - assert(main_channel); + assert(channel == CH_DEFAULT); return true; } @@ -945,13 +947,11 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; QEMUFile *f; - bool default_channel = true; uint32_t channel_magic = 0; + uint8_t channel = CH_DEFAULT; int ret = 0; - if (migrate_multifd() && !migrate_mapped_ram() && - !migrate_postcopy_ram() && - qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { + if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { /* * With multiple channels, it is possible that we receive channels * out of order on destination side, causing incorrect mapping of @@ -969,35 +969,49 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) return; } - default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)); - } else { - default_channel = !mis->from_src_file; + if (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)) { + channel = CH_DEFAULT; + } else if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) { + channel = CH_MULTIFD; + } else if (channel_magic == cpu_to_be32(POSTCOPY_MAGIC)) { + if (qio_channel_read_all(ioc, (char *)&channel_magic, + sizeof(channel_magic), &local_err)) { + error_report_err(local_err); + return; + } + 
channel = CH_POSTCOPY; + } else { + error_report("%s: could not identify channel, unknown magic: %u", + __func__, channel_magic); + return; + } } if (multifd_recv_setup(errp) != 0) { return; } - if (default_channel) { + if (channel == CH_DEFAULT) { f = qemu_file_new_input(ioc); migration_incoming_setup(f); - } else { + } else if (channel == CH_MULTIFD) { /* Multiple connections */ - assert(migration_needs_multiple_sockets()); if (migrate_multifd()) { multifd_recv_new_channel(ioc, &local_err); - } else { + } + if (local_err) { + error_propagate(errp, local_err); + return; + } + } else if (channel == CH_POSTCOPY) { + if (migrate_postcopy()) { assert(migrate_postcopy_preempt()); f = qemu_file_new_input(ioc); postcopy_preempt_new_channel(mis, f); } - if (local_err) { - error_propagate(errp, local_err); - return; - } } - if (migration_should_start_incoming(default_channel)) { + if (migration_should_start_incoming(channel)) { /* If it's a recovery, we're done */ if (postcopy_try_recover()) { return; @@ -1014,21 +1028,22 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) */ bool migration_has_all_channels(void) { + bool ret = false; MigrationIncomingState *mis = migration_incoming_get_current(); if (!mis->from_src_file) { - return false; + return ret; } if (migrate_multifd()) { - return multifd_recv_all_channels_created(); + ret = multifd_recv_all_channels_created(); } - if (migrate_postcopy_preempt()) { - return mis->postcopy_qemufile_dst != NULL; + if (ret && migrate_postcopy_preempt()) { + ret = mis->postcopy_qemufile_dst != NULL; } - return true; + return ret; } int migrate_send_rp_switchover_ack(MigrationIncomingState *mis) -- 2.47.0 ^ permalink raw reply related [flat|nested] 33+ messages in thread
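The channel identification the patch adds can be sketched as a standalone classifier over the peeked 4-byte magic. MULTIFD_MAGIC matches the value moved to the header in patch 1/5, and QEMU_VM_FILE_MAGIC is QEMU's "QEVM" stream magic; the postcopy magic is introduced by patch 2/5 of this series and its value below is an assumption for the sketch:

```c
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl() */

enum mig_channel { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY, CH_UNKNOWN };

#define QEMU_VM_FILE_MAGIC 0x5145564dU   /* "QEVM", main migration stream    */
#define MULTIFD_MAGIC      0x11223344U   /* from migration/multifd.h         */
#define POSTCOPY_MAGIC     0x11223355U   /* hypothetical value for the sketch */

/* The magics travel big-endian; peek the first 4 bytes and classify. */
static enum mig_channel classify_channel(uint32_t wire_magic)
{
    uint32_t magic = ntohl(wire_magic);

    if (magic == QEMU_VM_FILE_MAGIC) {
        return CH_DEFAULT;
    } else if (magic == MULTIFD_MAGIC) {
        return CH_MULTIFD;
    } else if (magic == POSTCOPY_MAGIC) {
        return CH_POSTCOPY;   /* the real code also consumes the magic here */
    }
    return CH_UNKNOWN;        /* caller reports the unknown-magic error */
}
```

A classifier like this is what lets connections arrive out of order on the destination and still be routed to the right setup path.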
* Re: [PATCH 5/5] migration: enable multifd and postcopy together 2024-10-29 15:09 ` [PATCH 5/5] migration: enable multifd and postcopy together Prasad Pandit @ 2024-11-04 17:48 ` Peter Xu 2024-11-05 11:54 ` Prasad Pandit 0 siblings, 1 reply; 33+ messages in thread From: Peter Xu @ 2024-11-04 17:48 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Tue, Oct 29, 2024 at 08:39:08PM +0530, Prasad Pandit wrote: > From: Prasad Pandit <pjp@fedoraproject.org> > > Enable Multifd and Postcopy migration together. > The migration_ioc_process_incoming() routine > checks magic value sent on each channel and > helps to properly setup multifd and postcopy > channels. > > Idea is to take advantage of the multifd threads > to accelerate transfer of large guest RAM to the > destination and switch to postcopy mode sooner. > > The Precopy and Multifd threads work during the > initial guest RAM transfer. When migration moves > to the Postcopy phase, the source guest is > paused, so the Precopy and Multifd threads stop > sending data on their channels. Postcopy threads > on the destination request/pull data from the > source side. Hmm I think this is not the truth.. Precopy keeps sending data even during postcopy, that's the background stream (with/without preempt feature enabled). May need some amendment when repost here. > > Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> > --- > migration/migration.c | 73 ++++++++++++++++++++++++++----------------- > 1 file changed, 44 insertions(+), 29 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 021faee2f3..11fcc1e012 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -92,6 +92,9 @@ enum mig_rp_message_type { > MIG_RP_MSG_MAX > }; > > +/* Migration channel types */ > +enum { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY }; > + > /* When we add fault tolerance, we could have several > migrations at once. 
For now we don't need to add > dynamic creation of migration */ > @@ -919,16 +922,15 @@ void migration_fd_process_incoming(QEMUFile *f) > * Returns true when we want to start a new incoming migration process, > * false otherwise. > */ > -static bool migration_should_start_incoming(bool main_channel) > +static bool migration_should_start_incoming(uint8_t channel) > { > + if (channel == CH_POSTCOPY) { > + return false; > + } Please see my other reply, I think here it should never be POSTCOPY channel, because postcopy (aka, preempt) channel is only created after the main channel.. So I wonder whether this "if" will hit at all. > + > /* Multifd doesn't start unless all channels are established */ > if (migrate_multifd()) { > - return migration_has_all_channels(); > - } > - > - /* Preempt channel only starts when the main channel is created */ > - if (migrate_postcopy_preempt()) { > - return main_channel; > + return multifd_recv_all_channels_created(); I think this is incorrect.. We should also need to check main channel is established before start incoming. The old code uses migration_has_all_channels() which checks that. > } > > /* > @@ -936,7 +938,7 @@ static bool migration_should_start_incoming(bool main_channel) > * it's the main channel that's being created, and we should always > * proceed with this channel. 
> */ > - assert(main_channel); > + assert(channel == CH_DEFAULT); > return true; > } > > @@ -945,13 +947,11 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > MigrationIncomingState *mis = migration_incoming_get_current(); > Error *local_err = NULL; > QEMUFile *f; > - bool default_channel = true; > uint32_t channel_magic = 0; > + uint8_t channel = CH_DEFAULT; > int ret = 0; > > - if (migrate_multifd() && !migrate_mapped_ram() && > - !migrate_postcopy_ram() && > - qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { > + if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { > /* > * With multiple channels, it is possible that we receive channels > * out of order on destination side, causing incorrect mapping of > @@ -969,35 +969,49 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > return; > } > > - default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)); > - } else { > - default_channel = !mis->from_src_file; > + if (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)) { > + channel = CH_DEFAULT; > + } else if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) { > + channel = CH_MULTIFD; > + } else if (channel_magic == cpu_to_be32(POSTCOPY_MAGIC)) { > + if (qio_channel_read_all(ioc, (char *)&channel_magic, > + sizeof(channel_magic), &local_err)) { > + error_report_err(local_err); > + return; > + } > + channel = CH_POSTCOPY; > + } else { > + error_report("%s: could not identify channel, unknown magic: %u", > + __func__, channel_magic); > + return; > + } > } > > if (multifd_recv_setup(errp) != 0) { > return; > } > > - if (default_channel) { > + if (channel == CH_DEFAULT) { > f = qemu_file_new_input(ioc); > migration_incoming_setup(f); > - } else { > + } else if (channel == CH_MULTIFD) { > /* Multiple connections */ > - assert(migration_needs_multiple_sockets()); > if (migrate_multifd()) { > multifd_recv_new_channel(ioc, &local_err); > - } else { > + } > + if (local_err) { > + 
error_propagate(errp, local_err); > + return; > + } > + } else if (channel == CH_POSTCOPY) { > + if (migrate_postcopy()) { > assert(migrate_postcopy_preempt()); > f = qemu_file_new_input(ioc); > postcopy_preempt_new_channel(mis, f); > } > - if (local_err) { > - error_propagate(errp, local_err); > - return; > - } > } > > - if (migration_should_start_incoming(default_channel)) { > + if (migration_should_start_incoming(channel)) { > /* If it's a recovery, we're done */ > if (postcopy_try_recover()) { > return; > @@ -1014,21 +1028,22 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > */ > bool migration_has_all_channels(void) > { > + bool ret = false; > MigrationIncomingState *mis = migration_incoming_get_current(); > > if (!mis->from_src_file) { > - return false; > + return ret; > } > > if (migrate_multifd()) { > - return multifd_recv_all_channels_created(); > + ret = multifd_recv_all_channels_created(); > } > > - if (migrate_postcopy_preempt()) { > - return mis->postcopy_qemufile_dst != NULL; > + if (ret && migrate_postcopy_preempt()) { > + ret = mis->postcopy_qemufile_dst != NULL; > } IMHO it's clearer written as: if (migrate_multifd()) { if (!multifd_recv_all_channels_created()) { return false; } } if (migrate_preempt()) { if (mis->postcopy_qemufile_dst == NULL) { return false; } } return true; > > - return true; > + return ret; > } I don't yet see how you manage the multifd threads, etc, on both sides. Or any logic to make sure multifd will properly flush the pages before postcopy starts. IOW, any guarantee that all the pages will only be installed using UFFDIO_COPY as long as vcpu starts on dest. Any comments? The most complicated part of this work would be testing, to make sure it works in all previous cases, and that's majorly why we disabled it before: it was because it randomly crashes, but not always; fundamentally it's because when multifd was designed there wasn't enough consideration on working together with postcopy. 
We didn't clearly know what's missing at that time.

So we would definitely need to add test cases, just like whoever adds new features to migration, to make sure at least it works for existing multifd / postcopy test cases, but when both features enabled.

Some hints on what we can add (assuming we already enable both features):

- All possible multifd test cases can run one more time with postcopy enabled, but where precopy converges and finishes / cancels migration. E.g.:

  /x86_64/migration/multifd/file/mapped-ram/*

  These ones need to keep working like before; they should simply ignore postcopy being enabled.

  /x86_64/migration/multifd/tcp/uri/plain/none

  This one is the most generic multifd test; we'd want to make this run again with postcopy enabled, so it verifies it keeps working if it can converge before postcopy starts.

  /x86_64/migration/multifd/tcp/plain/cancel

  This one tests multifd cancellations, and we want to make sure this works too when postcopy is enabled.

- All possible postcopy test cases can also run one more time with multifd enabled. Examples:

  /x86_64/migration/postcopy/plain

  The most generic postcopy test; we want to run this with multifd enabled, so it covers the simplest use case of multifd precopy plus postcopy.

  /x86_64/migration/postcopy/recovery/*

  These tests cover the fundamental postcopy recovery paths (plain, failure in handshake, failure in reconnect, or when using TLS); we may want to make sure these work even if the multifd cap is enabled.

  /x86_64/migration/postcopy/preempt/*

  Similar to the above, but these enable preempt channels too.

It will add quite a few tests to run here, but I don't see a good way otherwise when we want to enable the two features, because it is indeed a challenge to enable the two major features together here.

You can also do some manual tests with real workloads when working on this series; that will definitely be very helpful.
I had a feeling that this series could already fail some of these when both features are enabled, but please give it a shot.

Thanks,

--
Peter Xu

^ permalink raw reply [flat|nested] 33+ messages in thread
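The early-return structure Peter suggests above for migration_has_all_channels() can be modeled as a standalone predicate over the destination's channel state; the field names are stand-ins for the real MigrationIncomingState checks, not QEMU's actual code:

```c
#include <stdbool.h>

/* Hypothetical snapshot of the destination's channel state. */
struct incoming_state {
    bool main_channel;   /* mis->from_src_file != NULL                */
    bool multifd_cap;    /* migrate_multifd()                         */
    bool multifd_done;   /* multifd_recv_all_channels_created()       */
    bool preempt_cap;    /* migrate_postcopy_preempt()                */
    bool preempt_done;   /* mis->postcopy_qemufile_dst != NULL        */
};

/*
 * Early-return style: each enabled capability's prerequisite is checked
 * independently, so with multifd AND preempt both on, BOTH channel sets
 * must be established before incoming migration starts.
 */
static bool migration_has_all_channels(const struct incoming_state *s)
{
    if (!s->main_channel) {
        return false;
    }
    if (s->multifd_cap && !s->multifd_done) {
        return false;
    }
    if (s->preempt_cap && !s->preempt_done) {
        return false;
    }
    return true;
}
```

Compared with chaining the result through a single `ret` variable, each new channel type becomes an independent two-line block, which avoids the accidental coupling Peter points out in the patch.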
* Re: [PATCH 5/5] migration: enable multifd and postcopy together 2024-11-04 17:48 ` Peter Xu @ 2024-11-05 11:54 ` Prasad Pandit 2024-11-05 16:55 ` Peter Xu 0 siblings, 1 reply; 33+ messages in thread From: Prasad Pandit @ 2024-11-05 11:54 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Mon, 4 Nov 2024 at 23:19, Peter Xu <peterx@redhat.com> wrote: > Precopy keeps sending data even during postcopy, that's the background > stream (with/without preempt feature enabled). May need some amendment > when repost here. * Okay. > > + if (channel == CH_POSTCOPY) { > > + return false; > > + } > > Please see my other reply, I think here it should never be POSTCOPY > channel, because postcopy (aka, preempt) channel is only created after the > main channel.. So I wonder whether this "if" will hit at all. * It does. migration_ioc_process_incoming() is called for each incoming connection. And every time it calls migration_should_start_incoming() to check if it should proceed with the migration or should wait for further connections, because multifd does not start until all its connections are established. * Similarly when the Postcopy channel is initiated towards the end of Precopy migration, migration_should_start_incoming() gets called, and returns 'false' because we don't want to start a new incoming migration at that time. Earlier it was receiving boolean value 'default_channel' as parameter, which was set to false while accepting 'postcopy' connection via => default_channel = !mis->from_src_file; > > + > > /* Multifd doesn't start unless all channels are established */ > > if (migrate_multifd()) { > > - return migration_has_all_channels(); > > - } > > - > > - /* Preempt channel only starts when the main channel is created */ > > - if (migrate_postcopy_preempt()) { > > - return main_channel; > > + return multifd_recv_all_channels_created(); > > I think this is incorrect.. 
We should also need to check main channel is > established before start incoming. The old code uses > migration_has_all_channels() which checks that. * Okay, will try to fix it better. But calling migration_has_all_channels() after checking migrate_multifd() does not seem/read right. > I don't yet see how you manage the multifd threads, etc, on both sides. Or > any logic to make sure multifd will properly flush the pages before > postcopy starts. IOW, any guarantee that all the pages will only be > installed using UFFDIO_COPY as long as vcpu starts on dest. Any comments? * There are no changes related to that. As this series only tries to enable the multifd and postcopy options together. > The most complicated part of this work would be testing, to make sure it > works in all previous cases, and that's majorly why we disabled it before: > it was because it randomly crashes, but not always; fundamentally it's > because when multifd was designed there wasn't enough consideration on > working together with postcopy. We didn't clearly know what's missing at > that time. * Right. I have tested this series with the following migration commands to confirm that migration works as expected and there were no crash(es) observed. 
=== [Precopy] # virsh migrate --verbose --live --auto-converge f39vm qemu+ssh://<destination-host-name>/system [Precopy + Multifd] # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 02 \ f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 04 \ f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 08 \ f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 16 \ f39vm qemu+ssh://<destination-host-name>/system [Postcopy] # virsh migrate --verbose --live --auto-converge \ --postcopy --postcopy-after-precopy f39vm qemu+ssh://<destination-host-name>/system [Postcopy + Multifd] # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 02 \ --postcopy --postcopy-after-precopy f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 04 \ --postcopy --postcopy-after-precopy f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 08 \ --postcopy --postcopy-after-precopy f39vm qemu+ssh://<destination-host-name>/system # virsh migrate --verbose --live --auto-converge --parallel --parallel-connections 16 \ --postcopy --postcopy-after-precopy f39vm qemu+ssh://<destination-host-name>/system === > So we would definitely need to add test cases, just like whoever adds new > features to migration, to make sure at least it works for existing multifd > / postcopy test cases, but when both features enabled. ... > It will add quite a few tests to run here, but I don't see a good way > otherwise when we want to enable the two features, because it is indeed a > challenge to enable the two major features together here. 
> * Thank you for the hints about the tests, will surely look into them and try to learn about how to add new test cases. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/5] migration: enable multifd and postcopy together 2024-11-05 11:54 ` Prasad Pandit @ 2024-11-05 16:55 ` Peter Xu 0 siblings, 0 replies; 33+ messages in thread From: Peter Xu @ 2024-11-05 16:55 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, Fabiano Rosas, Prasad Pandit On Tue, Nov 05, 2024 at 05:24:55PM +0530, Prasad Pandit wrote: > On Mon, 4 Nov 2024 at 23:19, Peter Xu <peterx@redhat.com> wrote: > > Precopy keeps sending data even during postcopy, that's the background > > stream (with/without preempt feature enabled). May need some amendment > > when repost here. > > * Okay. > > > > + if (channel == CH_POSTCOPY) { > > > + return false; > > > + } > > > > Please see my other reply, I think here it should never be POSTCOPY > > channel, because postcopy (aka, preempt) channel is only created after the > > main channel.. So I wonder whether this "if" will hit at all. > > * It does. migration_ioc_process_incoming() is called for each > incoming connection. And every time it calls > migration_should_start_incoming() to check if it should proceed with > the migration or should wait for further connections, because multifd > does not start until all its connections are established. > > * Similarly when the Postcopy channel is initiated towards the end of > Precopy migration, migration_should_start_incoming() gets called, and > returns 'false' because we don't want to start a new incoming > migration at that time. Earlier it was receiving boolean value > 'default_channel' as parameter, which was set to false while accepting > 'postcopy' connection via => default_channel = !mis->from_src_file; Hmm yes, it will happen, but should only happen after the main channel. 
> > > > + > > > /* Multifd doesn't start unless all channels are established */ > > > if (migrate_multifd()) { > > > - return migration_has_all_channels(); > > > - } > > > - > > > - /* Preempt channel only starts when the main channel is created */ > > > - if (migrate_postcopy_preempt()) { > > > - return main_channel; > > > + return multifd_recv_all_channels_created(); > > > > I think this is incorrect.. We should also need to check main channel is > > established before start incoming. The old code uses > > migration_has_all_channels() which checks that. > > * Okay, will try to fix it better. But calling > migration_has_all_channels() after checking migrate_multifd() does not > seem/read right. > > > I don't yet see how you manage the multifd threads, etc, on both sides. Or > > any logic to make sure multifd will properly flush the pages before > > postcopy starts. IOW, any guarantee that all the pages will only be > > installed using UFFDIO_COPY as long as vcpu starts on dest. Any comments? > > * There are no changes related to that. As this series only tries to > enable the multifd and postcopy options together. In short, qemu_savevm_state_complete_precopy_iterable() is skipped in postcopy. I don't yet see anywhere multifd flushes pages. We need to flush multifd pages before vcpu starts on dest. As we discussed off-list, we can add an assertion in multifd recv threads to make sure it won't touch any guest page during active postcopy. Maybe something like: diff --git a/migration/multifd.c b/migration/multifd.c index 4374e14a96..32425137bd 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -1182,6 +1182,13 @@ static void *multifd_recv_thread(void *opaque) } if (has_data) { + /* + * Now we're going to receive some guest pages into iov + * buffers. If it's during postcopy, it means vcpu can be + * running, so this can corrupt memory when modified + * concurrently by multifd threads! 
+ */ + assert(!migration_in_postcopy()); ret = multifd_recv_state->ops->recv(p, &local_err); if (ret != 0) { break; > > > The most complicated part of this work would be testing, to make sure it > > works in all previous cases, and that's majorly why we disabled it before: > > it was because it randomly crashes, but not always; fundamentally it's > > because when multifd was designed there wasn't enough consideration on > > working together with postcopy. We didn't clearly know what's missing at > > that time. > > * Right. I have tested this series with the following migration > commands to confirm that migration works as expected and there were no > crash(es) observed. > === > [Precopy] > # virsh migrate --verbose --live --auto-converge f39vm > qemu+ssh://<destination-host-name>/system > > [Precopy + Multifd] > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 02 \ > f39vm qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 04 \ > f39vm qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 08 \ > f39vm qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 16 \ > f39vm qemu+ssh://<destination-host-name>/system > > [Postcopy] > # virsh migrate --verbose --live --auto-converge \ > --postcopy --postcopy-after-precopy f39vm > qemu+ssh://<destination-host-name>/system I'd suggest we try interrupt postcopy with migrate-pause, then go with postcopy recover. I wonder how the current series work if the network failure will fail all the multifd iochannels, and how reconnect works. 
> > [Postcopy + Multifd] > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 02 \ > --postcopy --postcopy-after-precopy f39vm > qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 04 \ > --postcopy --postcopy-after-precopy f39vm > qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 08 \ > --postcopy --postcopy-after-precopy f39vm > qemu+ssh://<destination-host-name>/system > # virsh migrate --verbose --live --auto-converge --parallel > --parallel-connections 16 \ > --postcopy --postcopy-after-precopy f39vm > qemu+ssh://<destination-host-name>/system > === > > > So we would definitely need to add test cases, just like whoever adds new > > features to migration, to make sure at least it works for existing multifd > > / postcopy test cases, but when both features enabled. > ... > > It will add quite a few tests to run here, but I don't see a good way > > otherwise when we want to enable the two features, because it is indeed a > > challenge to enable the two major features together here. > > > > * Thank you for the hints about the tests, will surely look into them > and try to learn about how to add new test cases. Thanks, -- Peter Xu ^ permalink raw reply related [flat|nested] 33+ messages in thread
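The ordering requirement raised in this subthread — multifd pages must be fully flushed before the postcopy switchover, so that afterwards the destination installs pages only via UFFDIO_COPY — can be sketched as a small source-side state machine. The function names are illustrative, not QEMU's actual switchover path:

```c
#include <stdbool.h>
#include <assert.h>

enum phase { PHASE_PRECOPY, PHASE_POSTCOPY };

struct src_state {
    enum phase phase;
    int multifd_queued;   /* pages still sitting in multifd send queues */
};

/* Illustrative: drain the multifd send queues and sync all channels. */
static void multifd_flush_and_sync(struct src_state *s)
{
    s->multifd_queued = 0;
}

/*
 * Switching to postcopy is only legal once multifd has nothing in flight;
 * after this point the multifd recv threads must never touch guest memory,
 * which is what the assert(!migration_in_postcopy()) above enforces on the
 * destination side.
 */
static void postcopy_start(struct src_state *s)
{
    multifd_flush_and_sync(s);
    assert(s->multifd_queued == 0);
    s->phase = PHASE_POSTCOPY;
}
```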
end of thread, other threads:[~2024-11-08 13:26 UTC | newest]

Thread overview: 33+ messages
2024-10-29 15:09 [PATCH 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit
2024-10-29 15:09 ` [PATCH 1/5] migration/multifd: move macros to multifd header Prasad Pandit
2024-10-29 15:09 ` [PATCH 2/5] migration/postcopy: magic value for postcopy channel Prasad Pandit
[not found] ` <ZyTnBwpOwXcHGGPJ@x1n>
2024-11-04 12:32 ` Prasad Pandit
2024-11-04 17:18 ` Peter Xu
2024-11-05 11:19 ` Prasad Pandit
2024-11-05 13:00 ` Peter Xu
2024-11-06 12:19 ` Prasad Pandit
2024-11-06 13:11 ` Fabiano Rosas
2024-11-07 12:05 ` Prasad Pandit
2024-11-07 12:11 ` Fabiano Rosas
2024-11-07 12:33 ` Daniel P. Berrangé
2024-11-07 16:17 ` Peter Xu
2024-11-07 16:57 ` Daniel P. Berrangé
2024-11-07 17:45 ` Peter Xu
2024-11-08 12:37 ` Prasad Pandit
2024-11-08 13:25 ` Fabiano Rosas
2024-11-06 16:00 ` Peter Xu
2024-11-07 11:52 ` Prasad Pandit
2024-11-07 15:56 ` Peter Xu
2024-10-29 15:09 ` [PATCH 3/5] migration: remove multifd check with postcopy Prasad Pandit
[not found] ` <ZyTnWYyHlrJUYQRB@x1n>
2024-11-04 12:23 ` Prasad Pandit
2024-11-04 16:52 ` Peter Xu
2024-11-05  9:50 ` Prasad Pandit
2024-10-29 15:09 ` [PATCH 4/5] migration: refactor ram_save_target_page functions Prasad Pandit
[not found] ` <ZyToBbvfWkIZ_40W@x1n>
2024-11-04 11:56 ` Prasad Pandit
2024-11-04 17:00 ` Peter Xu
2024-11-05 10:01 ` Prasad Pandit
2024-11-05 13:01 ` Peter Xu
2024-10-29 15:09 ` [PATCH 5/5] migration: enable multifd and postcopy together Prasad Pandit
2024-11-04 17:48 ` Peter Xu
2024-11-05 11:54 ` Prasad Pandit
2024-11-05 16:55 ` Peter Xu