* [PATCH 0/2] migration/multifd: Some VFIO preparations @ 2024-12-05 18:53 Peter Xu 2024-12-05 18:53 ` [PATCH 1/2] migration/multifd: Further remove the SYNC on complete Peter Xu 2024-12-05 18:53 ` [PATCH 2/2] migration/multifd: Allow to sync with sender threads only Peter Xu 0 siblings, 2 replies; 12+ messages in thread From: Peter Xu @ 2024-12-05 18:53 UTC (permalink / raw) To: qemu-devel Cc: peterx, Fabiano Rosas, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater When reviewing the VFIO series I noticed something was off in multifd. Meanwhile I just noticed we can also provide some more options to sync the sender threads. In general, I assume these two small patches should help the VFIO series moving forward easier. More info in each of the patch. Thanks, Peter Xu (2): migration/multifd: Further remove the SYNC on complete migration/multifd: Allow to sync with sender threads only migration/multifd.h | 16 +++++++++++++--- migration/multifd-nocomp.c | 8 +++++++- migration/multifd.c | 14 ++++++++------ migration/ram.c | 13 ++++++++++--- 4 files changed, 38 insertions(+), 13 deletions(-) -- 2.47.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] migration/multifd: Further remove the SYNC on complete 2024-12-05 18:53 [PATCH 0/2] migration/multifd: Some VFIO preparations Peter Xu @ 2024-12-05 18:53 ` Peter Xu 2024-12-05 19:55 ` Fabiano Rosas 2024-12-05 18:53 ` [PATCH 2/2] migration/multifd: Allow to sync with sender threads only Peter Xu 1 sibling, 1 reply; 12+ messages in thread From: Peter Xu @ 2024-12-05 18:53 UTC (permalink / raw) To: qemu-devel Cc: peterx, Fabiano Rosas, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in complete()") removed the FLUSH operation on complete() which should avoid one global sync on destination side, because it's not needed. However that commit overlooked multifd_ram_flush_and_sync() part of things, as that's always used together with the FLUSH message on the main channel. For very old binaries (multifd_flush_after_each_section==true), that's still needed because each EOS received on destination will enforce all-channel sync once. For new binaries (multifd_flush_after_each_section==false), the flush and sync shouldn't be needed just like the FLUSH, with the same reasoning. Currently, on new binaries we'll still send SYNC messages on multifd channels, even if FLUSH is omitted at the end. It'll make all recv threads hang at SYNC message. Multifd is still all working fine because luckily recv side cleanup code (mostly multifd_recv_sync_main()) is smart enough to make sure even if recv threads are stuck at SYNC it'll get kicked out. And since this is the completion phase of migration, nothing else will be sent after the SYNCs. This may be needed for VFIO in the future because VFIO can have data to push after ram_save_complete(), hence we don't want the recv thread to be stuck in SYNC message. Remove this limitation will make src even slightly faster too to avoid some more code. Stable branches do not need this patch, as no real bug I can think of that will go wrong there.. so not attaching Fixes to be clear on the backport not needed. Signed-off-by: Peter Xu <peterx@redhat.com> --- migration/ram.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 05ff9eb328..7284c34bd8 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3283,9 +3283,16 @@ static int ram_save_complete(QEMUFile *f, void *opaque) } } - ret = multifd_ram_flush_and_sync(); - if (ret < 0) { - return ret; + if (migrate_multifd() && + migrate_multifd_flush_after_each_section()) { + /* + * Only the old dest QEMU will need this sync, because each EOS + * will require one SYNC message on each channel. + */ + ret = multifd_ram_flush_and_sync(); + if (ret < 0) { + return ret; + } } if (migrate_mapped_ram()) { -- 2.47.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] migration/multifd: Further remove the SYNC on complete 2024-12-05 18:53 ` [PATCH 1/2] migration/multifd: Further remove the SYNC on complete Peter Xu @ 2024-12-05 19:55 ` Fabiano Rosas 2024-12-05 20:23 ` Peter Xu 0 siblings, 1 reply; 12+ messages in thread From: Fabiano Rosas @ 2024-12-05 19:55 UTC (permalink / raw) To: Peter Xu, qemu-devel Cc: peterx, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Peter Xu <peterx@redhat.com> writes: > Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in > complete()") removed the FLUSH operation on complete() which should avoid > one global sync on destination side, because it's not needed. > > However that commit overlooked multifd_ram_flush_and_sync() part of things, > as that's always used together with the FLUSH message on the main channel. Let's please write the full name of the flags, functions, etc. As we've seen in the discussion with Prasad, we're stumbling over ambiguous terminology. This is RAM_SAVE_FLAG_MULTIFD_FLUSH. > > For very old binaries (multifd_flush_after_each_section==true), that's > still needed because each EOS received on destination will enforce > all-channel sync once. Ok, but why does this patch not reinstate the flag? > > For new binaries (multifd_flush_after_each_section==false), the flush and > sync shouldn't be needed just like the FLUSH, with the same reasoning. > > Currently, on new binaries we'll still send SYNC messages on multifd > channels, even if FLUSH is omitted at the end. It'll make all recv threads > hang at SYNC message. I don't get this part, is this a bug you're describing? There's not SYNC message on the recv side, I think you mean the MULTIFD_FLAG_SYNC and this code? if (flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); } That's not a hang, that's the sync. > > Multifd is still all working fine because luckily recv side cleanup > code (mostly multifd_recv_sync_main()) is smart enough to make sure even if > recv threads are stuck at SYNC it'll get kicked out. And since this is the > completion phase of migration, nothing else will be sent after the SYNCs. Hmm, a last sync on the recv side might indeed not be needed. > > This may be needed for VFIO in the future because VFIO can have data to > push after ram_save_complete(), hence we don't want the recv thread to be > stuck in SYNC message. Remove this limitation will make src even slightly > faster too to avoid some more code. > > Stable branches do not need this patch, as no real bug I can think of that > will go wrong there.. so not attaching Fixes to be clear on the backport > not needed. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > migration/ram.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/migration/ram.c b/migration/ram.c > index 05ff9eb328..7284c34bd8 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -3283,9 +3283,16 @@ static int ram_save_complete(QEMUFile *f, void *opaque) > } > } > > - ret = multifd_ram_flush_and_sync(); > - if (ret < 0) { > - return ret; > + if (migrate_multifd() && > + migrate_multifd_flush_after_each_section()) { > + /* > + * Only the old dest QEMU will need this sync, because each EOS > + * will require one SYNC message on each channel. > + */ > + ret = multifd_ram_flush_and_sync(); > + if (ret < 0) { > + return ret; > + } I don't think this works. We need one last flush to not lose the last few pages of ram. And we need the src side sync of multifd threads to make sure this function doesn't return before all IO has been put on the wire. This also doesn't address what the commit message says about the recv side never getting the RAM_SAVE_FLAG_MULTIFD_FLUSH message. > } > > if (migrate_mapped_ram()) { ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] migration/multifd: Further remove the SYNC on complete 2024-12-05 19:55 ` Fabiano Rosas @ 2024-12-05 20:23 ` Peter Xu 2024-12-05 21:16 ` Fabiano Rosas 0 siblings, 1 reply; 12+ messages in thread From: Peter Xu @ 2024-12-05 20:23 UTC (permalink / raw) To: Fabiano Rosas Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater On Thu, Dec 05, 2024 at 04:55:08PM -0300, Fabiano Rosas wrote: > Peter Xu <peterx@redhat.com> writes: > > > Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in > > complete()") removed the FLUSH operation on complete() which should avoid > > one global sync on destination side, because it's not needed. > > > > However that commit overlooked multifd_ram_flush_and_sync() part of things, > > as that's always used together with the FLUSH message on the main channel. > > Let's please write the full name of the flags, functions, etc. As we've > seen in the discussion with Prasad, we're stumbling over ambiguous > terminology. This is RAM_SAVE_FLAG_MULTIFD_FLUSH. Sure. > > > > > For very old binaries (multifd_flush_after_each_section==true), that's > > still needed because each EOS received on destination will enforce > > all-channel sync once. > > Ok, but why does this patch not reinstate the flag? RAM_SAVE_FLAG_MULTIFD_FLUSH? Because it's not needed? > > > > > For new binaries (multifd_flush_after_each_section==false), the flush and > > sync shouldn't be needed just like the FLUSH, with the same reasoning. > > > > Currently, on new binaries we'll still send SYNC messages on multifd > > channels, even if FLUSH is omitted at the end. It'll make all recv threads > > hang at SYNC message. > > I don't get this part, is this a bug you're describing? There's not SYNC > message on the recv side, I think you mean the MULTIFD_FLAG_SYNC and > this code? > > if (flags & MULTIFD_FLAG_SYNC) { > qemu_sem_post(&multifd_recv_state->sem_sync); > qemu_sem_wait(&p->sem_sync); > } Yes. > > That's not a hang, that's the sync. When recv side never collect that SYNC (by invoke multifd_recv_sync_main), then it is a hang. > > > > > Multifd is still all working fine because luckily recv side cleanup > > code (mostly multifd_recv_sync_main()) is smart enough to make sure even if > > recv threads are stuck at SYNC it'll get kicked out. And since this is the > > completion phase of migration, nothing else will be sent after the SYNCs. > > Hmm, a last sync on the recv side might indeed not be needed. > > > > > This may be needed for VFIO in the future because VFIO can have data to > > push after ram_save_complete(), hence we don't want the recv thread to be > > stuck in SYNC message. Remove this limitation will make src even slightly > > faster too to avoid some more code. > > > > Stable branches do not need this patch, as no real bug I can think of that > > will go wrong there.. so not attaching Fixes to be clear on the backport > > not needed. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > migration/ram.c | 13 ++++++++++--- > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/migration/ram.c b/migration/ram.c > > index 05ff9eb328..7284c34bd8 100644 > > --- a/migration/ram.c > > +++ b/migration/ram.c > > @@ -3283,9 +3283,16 @@ static int ram_save_complete(QEMUFile *f, void *opaque) > > } > > } > > > > - ret = multifd_ram_flush_and_sync(); > > - if (ret < 0) { > > - return ret; > > + if (migrate_multifd() && > > + migrate_multifd_flush_after_each_section()) { > > + /* > > + * Only the old dest QEMU will need this sync, because each EOS > > + * will require one SYNC message on each channel. > > + */ > > + ret = multifd_ram_flush_and_sync(); > > + if (ret < 0) { > > + return ret; > > + } > > I don't think this works. We need one last flush to not lose the last > few pages of ram. And we need the src side sync of multifd threads to > make sure this function doesn't return before all IO has been put on the > wire. This should be the question for commit 637280aeb2, I thought it got answered there.. It's almost what the commit message there in 637280aeb2 wanted to justify too. We don't need to flush the last pages here, because we flushed it already, in the last find_dirty_block() call where src QEMU finished scanning the last round. Then we set complete_round=true, scan one more round, and the iteration won't complete until the new round sees zero dirty page. So when reaching this line, multifd send buffer must be empty. We need to flush for each round of RAM scan, we can't avoid the flush there, but we can save this one in complete(). To explain that with code, I hacked a QEMU and assert that. It ran all fine here (this is definitely not for merge.. but to show what I meant): ===8<=== diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index f64c4c9abd..0bd42c7627 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -21,7 +21,7 @@ #include "qemu/error-report.h" #include "trace.h" -static MultiFDSendData *multifd_ram_send; +MultiFDSendData *multifd_ram_send; size_t multifd_ram_payload_size(void) { diff --git a/migration/ram.c b/migration/ram.c index 7284c34bd8..edeb9e28ff 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3228,6 +3228,8 @@ out: return done; } +extern MultiFDSendData *multifd_ram_send; + /** * ram_save_complete: function called to send the remaining amount of ram * @@ -3283,6 +3285,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque) } } + if (migrate_multifd()) { + assert(multifd_payload_empty(multifd_ram_send)); + } + if (migrate_multifd() && migrate_multifd_flush_after_each_section()) { /* ===8<=== > > This also doesn't address what the commit message says about the recv > side never getting the RAM_SAVE_FLAG_MULTIFD_FLUSH message. The new binaries now always not send RAM_SAVE_FLAG_MULTIFD_FLUSH when complete(), however it always sends the multifd SYNC messages on the wire. It means those SYNC messages will always arrive on dest multifd channels, and then all these channels will wait for the main thread to collect that. However since RAM_SAVE_FLAG_MULTIFD_FLUSH is not there, they'll hang until multifd recv cleanups. -- Peter Xu ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] migration/multifd: Further remove the SYNC on complete 2024-12-05 20:23 ` Peter Xu @ 2024-12-05 21:16 ` Fabiano Rosas 2024-12-05 22:00 ` Peter Xu 0 siblings, 1 reply; 12+ messages in thread From: Fabiano Rosas @ 2024-12-05 21:16 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Peter Xu <peterx@redhat.com> writes: > On Thu, Dec 05, 2024 at 04:55:08PM -0300, Fabiano Rosas wrote: >> Peter Xu <peterx@redhat.com> writes: >> >> > Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in >> > complete()") removed the FLUSH operation on complete() which should avoid >> > one global sync on destination side, because it's not needed. >> > >> > However that commit overlooked multifd_ram_flush_and_sync() part of things, >> > as that's always used together with the FLUSH message on the main channel. >> >> Let's please write the full name of the flags, functions, etc. As we've >> seen in the discussion with Prasad, we're stumbling over ambiguous >> terminology. This is RAM_SAVE_FLAG_MULTIFD_FLUSH. > > Sure. > >> >> > >> > For very old binaries (multifd_flush_after_each_section==true), that's >> > still needed because each EOS received on destination will enforce >> > all-channel sync once. >> >> Ok, but why does this patch not reinstate the flag? > > RAM_SAVE_FLAG_MULTIFD_FLUSH? Because it's not needed? > Ah, you're saying we need the source side to send the MULTIFD_FLAG_SYNC packet so that the old QEMU on the recv side gets out of waiting. I see. >> >> > >> > For new binaries (multifd_flush_after_each_section==false), the flush and >> > sync shouldn't be needed just like the FLUSH, with the same reasoning. >> > >> > Currently, on new binaries we'll still send SYNC messages on multifd >> > channels, even if FLUSH is omitted at the end. It'll make all recv threads >> > hang at SYNC message. >> >> I don't get this part, is this a bug you're describing? There's not SYNC >> message on the recv side, I think you mean the MULTIFD_FLAG_SYNC and >> this code? >> >> if (flags & MULTIFD_FLAG_SYNC) { >> qemu_sem_post(&multifd_recv_state->sem_sync); >> qemu_sem_wait(&p->sem_sync); >> } > > Yes. > >> >> That's not a hang, that's the sync. > > When recv side never collect that SYNC (by invoke multifd_recv_sync_main), > then it is a hang. > Right, I forget the sync on the recv is the other way around. It's the multifd_recv_state that does the sync between the threads. The sem_sync is there so that the channels don't exit. >> >> > >> > Multifd is still all working fine because luckily recv side cleanup >> > code (mostly multifd_recv_sync_main()) is smart enough to make sure even if >> > recv threads are stuck at SYNC it'll get kicked out. And since this is the >> > completion phase of migration, nothing else will be sent after the SYNCs. >> >> Hmm, a last sync on the recv side might indeed not be needed. >> >> > >> > This may be needed for VFIO in the future because VFIO can have data to >> > push after ram_save_complete(), hence we don't want the recv thread to be >> > stuck in SYNC message. Remove this limitation will make src even slightly >> > faster too to avoid some more code. >> > >> > Stable branches do not need this patch, as no real bug I can think of that >> > will go wrong there.. so not attaching Fixes to be clear on the backport >> > not needed. >> > >> > Signed-off-by: Peter Xu <peterx@redhat.com> >> > --- >> > migration/ram.c | 13 ++++++++++--- >> > 1 file changed, 10 insertions(+), 3 deletions(-) >> > >> > diff --git a/migration/ram.c b/migration/ram.c >> > index 05ff9eb328..7284c34bd8 100644 >> > --- a/migration/ram.c >> > +++ b/migration/ram.c >> > @@ -3283,9 +3283,16 @@ static int ram_save_complete(QEMUFile *f, void *opaque) >> > } >> > } >> > >> > - ret = multifd_ram_flush_and_sync(); >> > - if (ret < 0) { >> > - return ret; >> > + if (migrate_multifd() && >> > + migrate_multifd_flush_after_each_section()) { >> > + /* >> > + * Only the old dest QEMU will need this sync, because each EOS >> > + * will require one SYNC message on each channel. >> > + */ >> > + ret = multifd_ram_flush_and_sync(); >> > + if (ret < 0) { >> > + return ret; >> > + } >> >> I don't think this works. We need one last flush to not lose the last >> few pages of ram. And we need the src side sync of multifd threads to >> make sure this function doesn't return before all IO has been put on the >> wire. > > This should be the question for commit 637280aeb2, I thought it got > answered there.. It's almost what the commit message there in 637280aeb2 > wanted to justify too. Yeah, it missed the mark. > > We don't need to flush the last pages here, because we flushed it already, > in the last find_dirty_block() call where src QEMU finished scanning the > last round. Then we set complete_round=true, scan one more round, and the > iteration won't complete until the new round sees zero dirty page. Ok, let's put this in the commit message. As it stands it looks like this patch is fixing a bug with 637280aeb2 when instead it's doing further optimization, but so it happens that we still require the backward compatibility part. > > So when reaching this line, multifd send buffer must be empty. We need to > flush for each round of RAM scan, we can't avoid the flush there, but we > can save this one in complete(). > > To explain that with code, I hacked a QEMU and assert that. It ran all > fine here (this is definitely not for merge.. but to show what I meant): > > ===8<=== > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c > index f64c4c9abd..0bd42c7627 100644 > --- a/migration/multifd-nocomp.c > +++ b/migration/multifd-nocomp.c > @@ -21,7 +21,7 @@ > #include "qemu/error-report.h" > #include "trace.h" > > -static MultiFDSendData *multifd_ram_send; > +MultiFDSendData *multifd_ram_send; > > size_t multifd_ram_payload_size(void) > { > diff --git a/migration/ram.c b/migration/ram.c > index 7284c34bd8..edeb9e28ff 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -3228,6 +3228,8 @@ out: > return done; > } > > +extern MultiFDSendData *multifd_ram_send; > + > /** > * ram_save_complete: function called to send the remaining amount of ram > * > @@ -3283,6 +3285,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque) > } > } > > + if (migrate_multifd()) { > + assert(multifd_payload_empty(multifd_ram_send)); > + } > + > if (migrate_multifd() && > migrate_multifd_flush_after_each_section()) { > /* > ===8<=== > >> >> This also doesn't address what the commit message says about the recv >> side never getting the RAM_SAVE_FLAG_MULTIFD_FLUSH message. > > The new binaries now always not send RAM_SAVE_FLAG_MULTIFD_FLUSH when > complete(), however it always sends the multifd SYNC messages on the wire. > It means those SYNC messages will always arrive on dest multifd channels, > and then all these channels will wait for the main thread to collect that. > However since RAM_SAVE_FLAG_MULTIFD_FLUSH is not there, they'll hang until > multifd recv cleanups. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] migration/multifd: Further remove the SYNC on complete 2024-12-05 21:16 ` Fabiano Rosas @ 2024-12-05 22:00 ` Peter Xu 0 siblings, 0 replies; 12+ messages in thread From: Peter Xu @ 2024-12-05 22:00 UTC (permalink / raw) To: Fabiano Rosas Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater On Thu, Dec 05, 2024 at 06:16:05PM -0300, Fabiano Rosas wrote: > > We don't need to flush the last pages here, because we flushed it already, > > in the last find_dirty_block() call where src QEMU finished scanning the > > last round. Then we set complete_round=true, scan one more round, and the > > iteration won't complete until the new round sees zero dirty page. > > Ok, let's put this in the commit message. As it stands it looks like > this patch is fixing a bug with 637280aeb2 when instead it's doing > further optimization, but so it happens that we still require the > backward compatibility part. Yes, and as commit message said I didn't attach Fixes and plan not to backport to stable because there's no real bug that we can hit, as those SYNCs will always only present at the end of migration, so they cannot harm anything yet if RAM is the only multifd user. I can add some of above into commit message. Since I already started looking at the patch you posted on putting all sync conditions together, I decided to repost this series with that, and with the rename you suggested on local/remote. Though I can't name them LOCAL and REMOTE because the REMOTE will contain LOCAL too. So in reality I renamed them to LOCAL and ALL, comment will explain that ALL contains LOCAL and REMOTE flushes. -- Peter Xu ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 18:53 [PATCH 0/2] migration/multifd: Some VFIO preparations Peter Xu 2024-12-05 18:53 ` [PATCH 1/2] migration/multifd: Further remove the SYNC on complete Peter Xu @ 2024-12-05 18:53 ` Peter Xu 2024-12-05 20:16 ` Fabiano Rosas 1 sibling, 1 reply; 12+ messages in thread From: Peter Xu @ 2024-12-05 18:53 UTC (permalink / raw) To: qemu-devel Cc: peterx, Fabiano Rosas, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Teach multifd_send_sync_main() to sync with threads only. We already have such requests, which is when mapped-ram is enabled with multifd. In that case, no SYNC messages will be pushed to the stream when multifd syncs the sender threads because there's no destination threads waiting for that. The whole point of the sync is to make sure all threads flushed their jobs. So fundamentally we have a request to do the sync in different ways: - Either to sync the threads only, - Or to sync the threads but also with the destination side Mapped-ram did it already because of the use_packet check in the sync handler of the sender thread. It works. However it may stop working when e.g. VFIO may start to reuse multifd channels to push device states. In that case VFIO has similar request on "thread-only sync" however we can't check a flag because such sync request can still come from RAM which needs the on-wire notifications. Paving way for that by allowing the multifd_send_sync_main() to specify what kind of sync the caller needs. We can use it for mapped-ram already. No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> --- migration/multifd.h | 16 +++++++++++++--- migration/multifd-nocomp.c | 8 +++++++- migration/multifd.c | 14 ++++++++------ 3 files changed, 28 insertions(+), 10 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index 50d58c0c9c..6b2f60a917 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -19,6 +19,15 @@ typedef struct MultiFDRecvData MultiFDRecvData; typedef struct MultiFDSendData MultiFDSendData; +typedef enum { + /* No sync request */ + MULTIFD_SYNC_NONE = 0, + /* Sync on the sender threads without pushing messages */ + MULTIFD_SYNC_THREADS, + /* Sync on the sender threads, meanwhile push "SYNC" message to the wire */ + MULTIFD_SYNC_THREADS_AND_NOTIFY, +} MultiFDSyncReq; + bool multifd_send_setup(void); void multifd_send_shutdown(void); void multifd_send_channel_created(void); @@ -28,7 +37,7 @@ void multifd_recv_shutdown(void); bool multifd_recv_all_channels_created(void); void multifd_recv_new_channel(QIOChannel *ioc, Error **errp); void multifd_recv_sync_main(void); -int multifd_send_sync_main(void); +int multifd_send_sync_main(MultiFDSyncReq req); bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); bool multifd_recv(void); MultiFDRecvData *multifd_get_recv_data(void); @@ -143,7 +152,7 @@ typedef struct { /* multifd flags for each packet */ uint32_t flags; /* - * The sender thread has work to do if either of below boolean is set. + * The sender thread has work to do if either of below field is set. * * @pending_job: a job is pending * @pending_sync: a sync request is pending @@ -152,7 +161,8 @@ typedef struct { * cleared by the multifd sender threads. */ bool pending_job; - bool pending_sync; + MultiFDSyncReq pending_sync; + MultiFDSendData *data; /* thread local variables. No locking required */ diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 55191152f9..f64c4c9abd 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -345,6 +345,8 @@ retry: int multifd_ram_flush_and_sync(void) { + MultiFDSyncReq req; + if (!migrate_multifd()) { return 0; } @@ -356,7 +358,11 @@ int multifd_ram_flush_and_sync(void) } } - return multifd_send_sync_main(); + /* File migrations only need to sync with threads */ + req = migrate_mapped_ram() ? + MULTIFD_SYNC_THREADS : MULTIFD_SYNC_THREADS_AND_NOTIFY; + + return multifd_send_sync_main(req); } bool multifd_send_prepare_common(MultiFDSendParams *p) diff --git a/migration/multifd.c b/migration/multifd.c index 498e71fd10..77645e87a0 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c) return ret; } -int multifd_send_sync_main(void) +int multifd_send_sync_main(MultiFDSyncReq req) { int i; bool flush_zero_copy; @@ -543,8 +543,8 @@ int multifd_send_sync_main(void) * We should be the only user so far, so not possible to be set by * others concurrently. */ - assert(qatomic_read(&p->pending_sync) == false); - qatomic_set(&p->pending_sync, true); + assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE); + qatomic_set(&p->pending_sync, req); qemu_sem_post(&p->sem); } for (i = 0; i < migrate_multifd_channels(); i++) { @@ -635,14 +635,16 @@ static void *multifd_send_thread(void *opaque) */ qatomic_store_release(&p->pending_job, false); } else { + MultiFDSyncReq req = qatomic_read(&p->pending_sync); + /* * If not a normal job, must be a sync request. Note that * pending_sync is a standalone flag (unlike pending_job), so * it doesn't require explicit memory barriers. */ - assert(qatomic_read(&p->pending_sync)); + assert(req != MULTIFD_SYNC_NONE); - if (use_packets) { + if (req == MULTIFD_SYNC_THREADS_AND_NOTIFY) { p->flags = MULTIFD_FLAG_SYNC; multifd_send_fill_packet(p); ret = qio_channel_write_all(p->c, (void *)p->packet, @@ -654,7 +656,7 @@ static void *multifd_send_thread(void *opaque) stat64_add(&mig_stats.multifd_bytes, p->packet_len); } - qatomic_set(&p->pending_sync, false); + qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE); qemu_sem_post(&p->sem_sync); } } -- 2.47.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 18:53 ` [PATCH 2/2] migration/multifd: Allow to sync with sender threads only Peter Xu @ 2024-12-05 20:16 ` Fabiano Rosas 2024-12-05 20:35 ` Peter Xu 0 siblings, 1 reply; 12+ messages in thread From: Fabiano Rosas @ 2024-12-05 20:16 UTC (permalink / raw) To: Peter Xu, qemu-devel Cc: peterx, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Peter Xu <peterx@redhat.com> writes: > Teach multifd_send_sync_main() to sync with threads only. > > We already have such requests, which is when mapped-ram is enabled with > multifd. In that case, no SYNC messages will be pushed to the stream when > multifd syncs the sender threads because there's no destination threads > waiting for that. The whole point of the sync is to make sure all threads > flushed their jobs. > > So fundamentally we have a request to do the sync in different ways: > > - Either to sync the threads only, > - Or to sync the threads but also with the destination side > > Mapped-ram did it already because of the use_packet check in the sync > handler of the sender thread. It works. > > However it may stop working when e.g. VFIO may start to reuse multifd > channels to push device states. In that case VFIO has similar request on > "thread-only sync" however we can't check a flag because such sync request > can still come from RAM which needs the on-wire notifications. > > Paving way for that by allowing the multifd_send_sync_main() to specify > what kind of sync the caller needs. We can use it for mapped-ram already. > > No functional change intended. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > migration/multifd.h | 16 +++++++++++++--- > migration/multifd-nocomp.c | 8 +++++++- > migration/multifd.c | 14 ++++++++------ > 3 files changed, 28 insertions(+), 10 deletions(-) > > diff --git a/migration/multifd.h b/migration/multifd.h > index 50d58c0c9c..6b2f60a917 100644 > --- a/migration/multifd.h > +++ b/migration/multifd.h > @@ -19,6 +19,15 @@ > typedef struct MultiFDRecvData MultiFDRecvData; > typedef struct MultiFDSendData MultiFDSendData; > > +typedef enum { > + /* No sync request */ > + MULTIFD_SYNC_NONE = 0, > + /* Sync on the sender threads without pushing messages */ > + MULTIFD_SYNC_THREADS, > + /* Sync on the sender threads, meanwhile push "SYNC" message to the wire */ s/meanwhile// > + MULTIFD_SYNC_THREADS_AND_NOTIFY, > +} MultiFDSyncReq; I think I'd prefer the local vs. remote terminology I introduced in my proposal [1] for cleaning up the multifd_flush_after_each_section() code: LOCAL - sync the local threads between themselves REMOTE - put a message on the stream for the remote end to perform a sync on their threads. Down below you're passing the MULTIFD_SYNC_THREADS_AND_NOTIFY into the send thread, but the "sync threads" part of this is really done outside the thread, so that part doesn't have a meaning inside the thread. 1- https://lore.kernel.org/r/875xo8n4ue.fsf@suse.de Also, please provide your input there^, it would be nice to unify the terminology and reasoning about both changes. > + > bool multifd_send_setup(void); > void multifd_send_shutdown(void); > void multifd_send_channel_created(void); > @@ -28,7 +37,7 @@ void multifd_recv_shutdown(void); > bool multifd_recv_all_channels_created(void); > void multifd_recv_new_channel(QIOChannel *ioc, Error **errp); > void multifd_recv_sync_main(void); > -int multifd_send_sync_main(void); > +int multifd_send_sync_main(MultiFDSyncReq req); > bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); > bool multifd_recv(void); > MultiFDRecvData *multifd_get_recv_data(void); > @@ -143,7 +152,7 @@ typedef struct { > /* multifd flags for each packet */ > uint32_t flags; > /* > - * The sender thread has work to do if either of below boolean is set. > + * The sender thread has work to do if either of below field is set. > * > * @pending_job: a job is pending > * @pending_sync: a sync request is pending > @@ -152,7 +161,8 @@ typedef struct { > * cleared by the multifd sender threads. > */ > bool pending_job; > - bool pending_sync; > + MultiFDSyncReq pending_sync; > + > MultiFDSendData *data; > > /* thread local variables. No locking required */ > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c > index 55191152f9..f64c4c9abd 100644 > --- a/migration/multifd-nocomp.c > +++ b/migration/multifd-nocomp.c > @@ -345,6 +345,8 @@ retry: > > int multifd_ram_flush_and_sync(void) > { > + MultiFDSyncReq req; > + > if (!migrate_multifd()) { > return 0; > } > @@ -356,7 +358,11 @@ int multifd_ram_flush_and_sync(void) > } > } > > - return multifd_send_sync_main(); > + /* File migrations only need to sync with threads */ > + req = migrate_mapped_ram() ? > + MULTIFD_SYNC_THREADS : MULTIFD_SYNC_THREADS_AND_NOTIFY; > + > + return multifd_send_sync_main(req); > } > > bool multifd_send_prepare_common(MultiFDSendParams *p) > diff --git a/migration/multifd.c b/migration/multifd.c > index 498e71fd10..77645e87a0 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c) > return ret; > } > > -int multifd_send_sync_main(void) > +int multifd_send_sync_main(MultiFDSyncReq req) > { > int i; > bool flush_zero_copy; > @@ -543,8 +543,8 @@ int multifd_send_sync_main(void) > * We should be the only user so far, so not possible to be set by > * others concurrently. > */ > - assert(qatomic_read(&p->pending_sync) == false); > - qatomic_set(&p->pending_sync, true); > + assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE); > + qatomic_set(&p->pending_sync, req); Hmm, isn't it easier to skip the whole loop if req == MULTIFD_SYNC_THREADS? I don't remember why we kept this loop here for mapped-ram. > qemu_sem_post(&p->sem); > } > for (i = 0; i < migrate_multifd_channels(); i++) { > @@ -635,14 +635,16 @@ static void *multifd_send_thread(void *opaque) > */ > qatomic_store_release(&p->pending_job, false); > } else { > + MultiFDSyncReq req = qatomic_read(&p->pending_sync); > + > /* > * If not a normal job, must be a sync request. Note that > * pending_sync is a standalone flag (unlike pending_job), so > * it doesn't require explicit memory barriers. > */ > - assert(qatomic_read(&p->pending_sync)); > + assert(req != MULTIFD_SYNC_NONE); > > - if (use_packets) { > + if (req == MULTIFD_SYNC_THREADS_AND_NOTIFY) { Good, more explicit. > p->flags = MULTIFD_FLAG_SYNC; > multifd_send_fill_packet(p); > ret = qio_channel_write_all(p->c, (void *)p->packet, > @@ -654,7 +656,7 @@ static void *multifd_send_thread(void *opaque) > stat64_add(&mig_stats.multifd_bytes, p->packet_len); > } > > - qatomic_set(&p->pending_sync, false); > + qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE); It's a bit weird that MULTIFD_SYNC_THREADS will never have an use inside the thread. Makes me think it should never exist in the first place. But then we're back into pending_sync + use_packets... looks like it would be less convoluted to skip the loop up there and assert(!use_packets) in here. Unless I'm missing something... > qemu_sem_post(&p->sem_sync); > } > } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 20:16 ` Fabiano Rosas @ 2024-12-05 20:35 ` Peter Xu 2024-12-05 21:50 ` Fabiano Rosas 0 siblings, 1 reply; 12+ messages in thread From: Peter Xu @ 2024-12-05 20:35 UTC (permalink / raw) To: Fabiano Rosas Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater On Thu, Dec 05, 2024 at 05:16:05PM -0300, Fabiano Rosas wrote: > Peter Xu <peterx@redhat.com> writes: > > > Teach multifd_send_sync_main() to sync with threads only. > > > > We already have such requests, which is when mapped-ram is enabled with > > multifd. In that case, no SYNC messages will be pushed to the stream when > > multifd syncs the sender threads because there's no destination threads > > waiting for that. The whole point of the sync is to make sure all threads > > flushed their jobs. > > > > So fundamentally we have a request to do the sync in different ways: > > > > - Either to sync the threads only, > > - Or to sync the threads but also with the destination side > > > > Mapped-ram did it already because of the use_packet check in the sync > > handler of the sender thread. It works. > > > > However it may stop working when e.g. VFIO may start to reuse multifd > > channels to push device states. In that case VFIO has similar request on > > "thread-only sync" however we can't check a flag because such sync request > > can still come from RAM which needs the on-wire notifications. > > > > Paving way for that by allowing the multifd_send_sync_main() to specify > > what kind of sync the caller needs. We can use it for mapped-ram already. > > > > No functional change intended. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > migration/multifd.h | 16 +++++++++++++--- > > migration/multifd-nocomp.c | 8 +++++++- > > migration/multifd.c | 14 ++++++++------ > > 3 files changed, 28 insertions(+), 10 deletions(-) > > > > diff --git a/migration/multifd.h b/migration/multifd.h > > index 50d58c0c9c..6b2f60a917 100644 > > --- a/migration/multifd.h > > +++ b/migration/multifd.h > > @@ -19,6 +19,15 @@ > > typedef struct MultiFDRecvData MultiFDRecvData; > > typedef struct MultiFDSendData MultiFDSendData; > > > > +typedef enum { > > + /* No sync request */ > > + MULTIFD_SYNC_NONE = 0, > > + /* Sync on the sender threads without pushing messages */ > > + MULTIFD_SYNC_THREADS, > > + /* Sync on the sender threads, meanwhile push "SYNC" message to the wire */ > > s/meanwhile// > > > + MULTIFD_SYNC_THREADS_AND_NOTIFY, > > +} MultiFDSyncReq; > > I think I'd prefer the local vs. remote terminology I introduced in my > proposal [1] for cleaning up the multifd_flush_after_each_section() code: I'm ok with your naming, as long as the comment will explain. > > LOCAL - sync the local threads between themselves > REMOTE - put a message on the stream for the remote end to perform a > sync on their threads. > > Down below you're passing the > MULTIFD_SYNC_THREADS_AND_NOTIFY into the send thread, but the "sync > threads" part of this is really done outside the thread, so that part > doesn't have a meaning inside the thread. > > 1- https://lore.kernel.org/r/875xo8n4ue.fsf@suse.de > > Also, please provide your input there^, it would be nice to unify the > terminology and reasoning about both changes. Yes, I'm mostly flushing my inbox in time order unless prioritized, so I'm getting there today or tomorrow. > > > + > > bool multifd_send_setup(void); > > void multifd_send_shutdown(void); > > void multifd_send_channel_created(void); > > @@ -28,7 +37,7 @@ void multifd_recv_shutdown(void); > > bool multifd_recv_all_channels_created(void); > > void multifd_recv_new_channel(QIOChannel *ioc, Error **errp); > > void multifd_recv_sync_main(void); > > -int multifd_send_sync_main(void); > > +int multifd_send_sync_main(MultiFDSyncReq req); > > bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); > > bool multifd_recv(void); > > MultiFDRecvData *multifd_get_recv_data(void); > > @@ -143,7 +152,7 @@ typedef struct { > > /* multifd flags for each packet */ > > uint32_t flags; > > /* > > - * The sender thread has work to do if either of below boolean is set. > > + * The sender thread has work to do if either of below field is set. > > * > > * @pending_job: a job is pending > > * @pending_sync: a sync request is pending > > @@ -152,7 +161,8 @@ typedef struct { > > * cleared by the multifd sender threads. > > */ > > bool pending_job; > > - bool pending_sync; > > + MultiFDSyncReq pending_sync; > > + > > MultiFDSendData *data; > > > > /* thread local variables. No locking required */ > > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c > > index 55191152f9..f64c4c9abd 100644 > > --- a/migration/multifd-nocomp.c > > +++ b/migration/multifd-nocomp.c > > @@ -345,6 +345,8 @@ retry: > > > > int multifd_ram_flush_and_sync(void) > > { > > + MultiFDSyncReq req; > > + > > if (!migrate_multifd()) { > > return 0; > > } > > @@ -356,7 +358,11 @@ int multifd_ram_flush_and_sync(void) > > } > > } > > > > - return multifd_send_sync_main(); > > + /* File migrations only need to sync with threads */ > > + req = migrate_mapped_ram() ? > > + MULTIFD_SYNC_THREADS : MULTIFD_SYNC_THREADS_AND_NOTIFY; > > + > > + return multifd_send_sync_main(req); > > } > > > > bool multifd_send_prepare_common(MultiFDSendParams *p) > > diff --git a/migration/multifd.c b/migration/multifd.c > > index 498e71fd10..77645e87a0 100644 > > --- a/migration/multifd.c > > +++ b/migration/multifd.c > > @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c) > > return ret; > > } > > > > -int multifd_send_sync_main(void) > > +int multifd_send_sync_main(MultiFDSyncReq req) > > { > > int i; > > bool flush_zero_copy; > > @@ -543,8 +543,8 @@ int multifd_send_sync_main(void) > > * We should be the only user so far, so not possible to be set by > > * others concurrently. > > */ > > - assert(qatomic_read(&p->pending_sync) == false); > > - qatomic_set(&p->pending_sync, true); > > + assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE); > > + qatomic_set(&p->pending_sync, req); > > Hmm, isn't it easier to skip the whole loop if req == > MULTIFD_SYNC_THREADS? I don't remember why we kept this loop here for > mapped-ram. The "thread-only" version of request (or, in your preferred naming, "local" sync request) says: "please flush all the works enqueued in sender thread". Sync is still needed even for mapped-ram to make sure pwrite()s all land. Also needed for VFIO. > > > qemu_sem_post(&p->sem); > > } > > for (i = 0; i < migrate_multifd_channels(); i++) { > > @@ -635,14 +635,16 @@ static void *multifd_send_thread(void *opaque) > > */ > > qatomic_store_release(&p->pending_job, false); > > } else { > > + MultiFDSyncReq req = qatomic_read(&p->pending_sync); > > + > > /* > > * If not a normal job, must be a sync request. Note that > > * pending_sync is a standalone flag (unlike pending_job), so > > * it doesn't require explicit memory barriers. > > */ > > - assert(qatomic_read(&p->pending_sync)); > > + assert(req != MULTIFD_SYNC_NONE); > > > > - if (use_packets) { > > + if (req == MULTIFD_SYNC_THREADS_AND_NOTIFY) { > > Good, more explicit. > > > p->flags = MULTIFD_FLAG_SYNC; > > multifd_send_fill_packet(p); > > ret = qio_channel_write_all(p->c, (void *)p->packet, > > @@ -654,7 +656,7 @@ static void *multifd_send_thread(void *opaque) > > stat64_add(&mig_stats.multifd_bytes, p->packet_len); > > } > > > > - qatomic_set(&p->pending_sync, false); > > + qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE); > > It's a bit weird that MULTIFD_SYNC_THREADS will never have an use inside > the thread. It has; it guarantees that existing queued pending_job is completed. > Makes me think it should never exist in the first place. But > then we're back into pending_sync + use_packets... looks like it would > be less convoluted to skip the loop up there and assert(!use_packets) in > here. > > Unless I'm missing something... > > > qemu_sem_post(&p->sem_sync); > > } > > } > -- Peter Xu ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 20:35 ` Peter Xu @ 2024-12-05 21:50 ` Fabiano Rosas 2024-12-05 22:12 ` Peter Xu 0 siblings, 1 reply; 12+ messages in thread From: Fabiano Rosas @ 2024-12-05 21:50 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater Peter Xu <peterx@redhat.com> writes: > On Thu, Dec 05, 2024 at 05:16:05PM -0300, Fabiano Rosas wrote: >> Peter Xu <peterx@redhat.com> writes: >> >> > Teach multifd_send_sync_main() to sync with threads only. >> > >> > We already have such requests, which is when mapped-ram is enabled with >> > multifd. In that case, no SYNC messages will be pushed to the stream when >> > multifd syncs the sender threads because there's no destination threads >> > waiting for that. The whole point of the sync is to make sure all threads >> > flushed their jobs. >> > >> > So fundamentally we have a request to do the sync in different ways: >> > >> > - Either to sync the threads only, >> > - Or to sync the threads but also with the destination side >> > >> > Mapped-ram did it already because of the use_packet check in the sync >> > handler of the sender thread. It works. >> > >> > However it may stop working when e.g. VFIO may start to reuse multifd >> > channels to push device states. In that case VFIO has similar request on >> > "thread-only sync" however we can't check a flag because such sync request >> > can still come from RAM which needs the on-wire notifications. >> > >> > Paving way for that by allowing the multifd_send_sync_main() to specify >> > what kind of sync the caller needs. We can use it for mapped-ram already. >> > >> > No functional change intended. >> > >> > Signed-off-by: Peter Xu <peterx@redhat.com> >> > --- >> > migration/multifd.h | 16 +++++++++++++--- >> > migration/multifd-nocomp.c | 8 +++++++- >> > migration/multifd.c | 14 ++++++++------ >> > 3 files changed, 28 insertions(+), 10 deletions(-) >> > >> > diff --git a/migration/multifd.h b/migration/multifd.h >> > index 50d58c0c9c..6b2f60a917 100644 >> > --- a/migration/multifd.h >> > +++ b/migration/multifd.h >> > @@ -19,6 +19,15 @@ >> > typedef struct MultiFDRecvData MultiFDRecvData; >> > typedef struct MultiFDSendData MultiFDSendData; >> > >> > +typedef enum { >> > + /* No sync request */ >> > + MULTIFD_SYNC_NONE = 0, >> > + /* Sync on the sender threads without pushing messages */ >> > + MULTIFD_SYNC_THREADS, >> > + /* Sync on the sender threads, meanwhile push "SYNC" message to the wire */ >> >> s/meanwhile// >> >> > + MULTIFD_SYNC_THREADS_AND_NOTIFY, >> > +} MultiFDSyncReq; >> >> I think I'd prefer the local vs. remote terminology I introduced in my >> proposal [1] for cleaning up the multifd_flush_after_each_section() code: > > I'm ok with your naming, as long as the comment will explain. > >> >> LOCAL - sync the local threads between themselves >> REMOTE - put a message on the stream for the remote end to perform a >> sync on their threads. >> >> Down below you're passing the >> MULTIFD_SYNC_THREADS_AND_NOTIFY into the send thread, but the "sync >> threads" part of this is really done outside the thread, so that part >> doesn't have a meaning inside the thread. >> >> 1- https://lore.kernel.org/r/875xo8n4ue.fsf@suse.de >> >> Also, please provide your input there^, it would be nice to unify the >> terminology and reasoning about both changes. > > Yes, I'm mostly flushing my inbox in time order unless prioritized, so I'm > getting there today or tomorrow. > >> >> > + >> > bool multifd_send_setup(void); >> > void multifd_send_shutdown(void); >> > void multifd_send_channel_created(void); >> > @@ -28,7 +37,7 @@ void multifd_recv_shutdown(void); >> > bool multifd_recv_all_channels_created(void); >> > void multifd_recv_new_channel(QIOChannel *ioc, Error **errp); >> > void multifd_recv_sync_main(void); >> > -int multifd_send_sync_main(void); >> > +int multifd_send_sync_main(MultiFDSyncReq req); >> > bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); >> > bool multifd_recv(void); >> > MultiFDRecvData *multifd_get_recv_data(void); >> > @@ -143,7 +152,7 @@ typedef struct { >> > /* multifd flags for each packet */ >> > uint32_t flags; >> > /* >> > - * The sender thread has work to do if either of below boolean is set. >> > + * The sender thread has work to do if either of below field is set. >> > * >> > * @pending_job: a job is pending >> > * @pending_sync: a sync request is pending >> > @@ -152,7 +161,8 @@ typedef struct { >> > * cleared by the multifd sender threads. >> > */ >> > bool pending_job; >> > - bool pending_sync; >> > + MultiFDSyncReq pending_sync; >> > + >> > MultiFDSendData *data; >> > >> > /* thread local variables. No locking required */ >> > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c >> > index 55191152f9..f64c4c9abd 100644 >> > --- a/migration/multifd-nocomp.c >> > +++ b/migration/multifd-nocomp.c >> > @@ -345,6 +345,8 @@ retry: >> > >> > int multifd_ram_flush_and_sync(void) >> > { >> > + MultiFDSyncReq req; >> > + >> > if (!migrate_multifd()) { >> > return 0; >> > } >> > @@ -356,7 +358,11 @@ int multifd_ram_flush_and_sync(void) >> > } >> > } >> > >> > - return multifd_send_sync_main(); >> > + /* File migrations only need to sync with threads */ >> > + req = migrate_mapped_ram() ? >> > + MULTIFD_SYNC_THREADS : MULTIFD_SYNC_THREADS_AND_NOTIFY; >> > + >> > + return multifd_send_sync_main(req); >> > } >> > >> > bool multifd_send_prepare_common(MultiFDSendParams *p) >> > diff --git a/migration/multifd.c b/migration/multifd.c >> > index 498e71fd10..77645e87a0 100644 >> > --- a/migration/multifd.c >> > +++ b/migration/multifd.c >> > @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c) >> > return ret; >> > } >> > >> > -int multifd_send_sync_main(void) >> > +int multifd_send_sync_main(MultiFDSyncReq req) >> > { >> > int i; >> > bool flush_zero_copy; >> > @@ -543,8 +543,8 @@ int multifd_send_sync_main(void) >> > * We should be the only user so far, so not possible to be set by >> > * others concurrently. >> > */ >> > - assert(qatomic_read(&p->pending_sync) == false); >> > - qatomic_set(&p->pending_sync, true); >> > + assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE); >> > + qatomic_set(&p->pending_sync, req); >> >> Hmm, isn't it easier to skip the whole loop if req == >> MULTIFD_SYNC_THREADS? I don't remember why we kept this loop here for >> mapped-ram. > > The "thread-only" version of request (or, in your preferred naming, "local" > sync request) says: "please flush all the works enqueued in sender thread". > Sync is still needed even for mapped-ram to make sure pwrite()s all land. > Also needed for VFIO. I think I remember now, what's needed is to release p->sem and wait on p->sem_sync (one in each of these loops). We don't need to set the pending_sync flag if it's not going to be used: multifd_send_sync_main: for () { ... if (remote_sync) { assert(qatomic_read(&p->pending_sync) == false); qatomic_set(&p->pending_sync, true); } qemu_sem_post(&p->sem); } for () { ... qemu_sem_wait(&multifd_send_state->channels_ready); qemu_sem_wait(&p->sem_sync); } in multifd_send_thread: if (qatomic_load_acquire(&p->pending_job)) { ... qatomic_store_release(&p->pending_job, false); } else if (qatomic_read(&p->pending_sync)) { ... p->flags = MULTIFD_FLAG_SYNC; qatomic_set(&p->pending_sync, false); qemu_sem_post(&p->sem_sync); } else { qemu_sem_post(&p->sem_sync); } Is this clearer? Then we avoid the enum altogether, a boolean would suffice. > >> >> > qemu_sem_post(&p->sem); >> > } >> > for (i = 0; i < migrate_multifd_channels(); i++) { >> > @@ -635,14 +635,16 @@ static void *multifd_send_thread(void *opaque) >> > */ >> > qatomic_store_release(&p->pending_job, false); >> > } else { >> > + MultiFDSyncReq req = qatomic_read(&p->pending_sync); >> > + >> > /* >> > * If not a normal job, must be a sync request. Note that >> > * pending_sync is a standalone flag (unlike pending_job), so >> > * it doesn't require explicit memory barriers. >> > */ >> > - assert(qatomic_read(&p->pending_sync)); >> > + assert(req != MULTIFD_SYNC_NONE); >> > >> > - if (use_packets) { >> > + if (req == MULTIFD_SYNC_THREADS_AND_NOTIFY) { >> >> Good, more explicit. >> >> > p->flags = MULTIFD_FLAG_SYNC; >> > multifd_send_fill_packet(p); >> > ret = qio_channel_write_all(p->c, (void *)p->packet, >> > @@ -654,7 +656,7 @@ static void *multifd_send_thread(void *opaque) >> > stat64_add(&mig_stats.multifd_bytes, p->packet_len); >> > } >> > >> > - qatomic_set(&p->pending_sync, false); >> > + qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE); >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 21:50 ` Fabiano Rosas @ 2024-12-05 22:12 ` Peter Xu 2024-12-05 22:20 ` Peter Xu 0 siblings, 1 reply; 12+ messages in thread From: Peter Xu @ 2024-12-05 22:12 UTC (permalink / raw) To: Fabiano Rosas Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater On Thu, Dec 05, 2024 at 06:50:23PM -0300, Fabiano Rosas wrote: > I think I remember now, what's needed is to release p->sem and wait on > p->sem_sync (one in each of these loops). We don't need to set the > pending_sync flag if it's not going to be used: > > multifd_send_sync_main: > for () { > ... > if (remote_sync) { > assert(qatomic_read(&p->pending_sync) == false); > qatomic_set(&p->pending_sync, true); > } > qemu_sem_post(&p->sem); > } > for () { > ... > qemu_sem_wait(&multifd_send_state->channels_ready); > qemu_sem_wait(&p->sem_sync); > } > > in multifd_send_thread: > > if (qatomic_load_acquire(&p->pending_job)) { > ... > qatomic_store_release(&p->pending_job, false); > } else if (qatomic_read(&p->pending_sync)) { > ... > p->flags = MULTIFD_FLAG_SYNC; > qatomic_set(&p->pending_sync, false); > qemu_sem_post(&p->sem_sync); > } else { How do you trigger this "else" path at all, if without setting pending_sync first? > qemu_sem_post(&p->sem_sync); > } > > Is this clearer? Then we avoid the enum altogether, a boolean would > suffice. So one of us missed something. :) In case if I missed it, a runnable patch would work to clarify. -- Peter Xu ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] migration/multifd: Allow to sync with sender threads only 2024-12-05 22:12 ` Peter Xu @ 2024-12-05 22:20 ` Peter Xu 0 siblings, 0 replies; 12+ messages in thread From: Peter Xu @ 2024-12-05 22:20 UTC (permalink / raw) To: Fabiano Rosas Cc: qemu-devel, Maciej S . Szmigiero, Alex Williamson, Avihai Horon, Cédric Le Goater On Thu, Dec 05, 2024 at 05:12:47PM -0500, Peter Xu wrote: > In case if I missed it, a runnable patch would work to clarify. Ohhh no need now, I see what you meant. But then you'll really need to comment p->sem, with something like: - /* sem where to wait for more work */ + /* sem where to wait for more work. If there's no any work, it means + * a local sync. */ QemuSemaphore sem; Do you like it? I definitely don't.. because it's confusing why p->sem can imply a sync request if we already have pending_sync. IMHO it's cleaner when we have pending_sync, use it for all kinds of syncs. -- Peter Xu ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-12-05 22:22 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-12-05 18:53 [PATCH 0/2] migration/multifd: Some VFIO preparations Peter Xu 2024-12-05 18:53 ` [PATCH 1/2] migration/multifd: Further remove the SYNC on complete Peter Xu 2024-12-05 19:55 ` Fabiano Rosas 2024-12-05 20:23 ` Peter Xu 2024-12-05 21:16 ` Fabiano Rosas 2024-12-05 22:00 ` Peter Xu 2024-12-05 18:53 ` [PATCH 2/2] migration/multifd: Allow to sync with sender threads only Peter Xu 2024-12-05 20:16 ` Fabiano Rosas 2024-12-05 20:35 ` Peter Xu 2024-12-05 21:50 ` Fabiano Rosas 2024-12-05 22:12 ` Peter Xu 2024-12-05 22:20 ` Peter Xu
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.