From: Fabiano Rosas <farosas@suse.de>
To: peterx@redhat.com, qemu-devel@nongnu.org
Cc: peterx@redhat.com, Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Subject: Re: [PATCH] migration/postcopy: Fix high frequency sync
Date: Thu, 21 Mar 2024 09:32:41 -0300
Message-ID: <878r2bn4t2.fsf@suse.de>
In-Reply-To: <20240320214453.584374-1-peterx@redhat.com>

peterx@redhat.com writes:

> From: Peter Xu <peterx@redhat.com>
>
> On the current code base I can observe an extremely high sync count during
> precopy, as long as one enables postcopy-ram=on before switchover to
> postcopy.
>
> To provide some context of when we decide to do a full sync: we check
> must_precopy (which implies "data must be sent during precopy phase"), and
> as long as it is lower than the threshold size we calculated (out of
> bandwidth and expected downtime) we will kick off the slow sync.
>
> However, when postcopy is enabled (even if still during precopy phase), RAM
> reports all pages as can_postcopy and reports must_precopy==0.  Then
> "must_precopy <= threshold_size" almost always triggers and enforces a slow
> sync for every call to migration_iteration_run() when postcopy is enabled,
> even if not used.  That is insane.
>
> It turns out this was a regression introduced by the previous refactoring in
> QEMU 8.0 in late 2022. Fix this by checking the whole RAM size rather than
> must_precopy, like before.  Not copying stable yet as many things changed, and
> even if this should be a major performance regression, no functional change
> has been observed (and that's also probably why nobody found it).  I only
> noticed this when looking into another bug reported by Nina.
>
> While at it, clean up the surrounding lines a little.
>
> Cc: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> Fixes: c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

> ---
>
> Nina: I copied you only because this might still be relevant, as this issue
> also mysteriously points back to c8df4a7aef.  However I don't think it
> should be a fix for your problem; at most it can change the likelihood of
> reproducing it.
>
> This is not a regression for this release, but I still want to have it for
> 9.0.  Fabiano, any opinions / objections?

Go for it.

> ---
>  migration/migration.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 047b6b49cf..9fe8fd2afd 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3199,17 +3199,16 @@ typedef enum {
>   */
>  static MigIterateState migration_iteration_run(MigrationState *s)
>  {
> -    uint64_t must_precopy, can_postcopy;
> +    uint64_t must_precopy, can_postcopy, pending_size;
>      Error *local_err = NULL;
>      bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
>      bool can_switchover = migration_can_switchover(s);
>  
>      qemu_savevm_state_pending_estimate(&must_precopy, &can_postcopy);
> -    uint64_t pending_size = must_precopy + can_postcopy;
> -
> +    pending_size = must_precopy + can_postcopy;
>      trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);
>  
> -    if (must_precopy <= s->threshold_size) {
> +    if (pending_size < s->threshold_size) {
>          qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
>          pending_size = must_precopy + can_postcopy;
>          trace_migrate_pending_exact(pending_size, must_precopy, can_postcopy);
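
For readers following the hunk, the effect of the changed condition can be
sketched standalone.  This is a minimal illustration under stated
assumptions, not QEMU code: PendingEstimate, wants_exact_sync_old() and
wants_exact_sync_new() are hypothetical names, and any byte counts fed to
them are made up.  Only the two comparisons are taken from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two values filled in by
 * qemu_savevm_state_pending_estimate(); this is not a QEMU type.
 */
typedef struct {
    uint64_t must_precopy;   /* bytes that must be sent during precopy */
    uint64_t can_postcopy;   /* bytes that may be deferred to postcopy */
} PendingEstimate;

/* Pre-patch condition: looks at must_precopy only. */
bool wants_exact_sync_old(PendingEstimate e, uint64_t threshold_size)
{
    return e.must_precopy <= threshold_size;
}

/* Post-patch condition: looks at the whole pending size. */
bool wants_exact_sync_new(PendingEstimate e, uint64_t threshold_size)
{
    uint64_t pending_size = e.must_precopy + e.can_postcopy;

    return pending_size < threshold_size;
}
```

With postcopy-ram=on, RAM reports must_precopy == 0, so the old check is
true for any threshold and the exact (slow) sync is requested on every
iteration.  The new check stays false until the total pending size really
drops below the threshold.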

