Re: [PATCH QEMU v5 6/8] migration: Implement dirty-limit convergence algo

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Juan Quintela <quintela@redhat.com>
To: ~hyman <hyman@git.sr.ht>
Cc: qemu-devel <qemu-devel@nongnu.org>,
	~hyman <yong.huang@smartx.com>, "Peter Xu" <peterx@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PATCH QEMU v5 6/8] migration: Implement dirty-limit convergence algo
Date: Tue, 13 Jun 2023 19:50:24 +0200	[thread overview]
Message-ID: <87ttvbckv3.fsf@secure.mitica> (raw)
In-Reply-To: <168618975839.6361.17407633874747688653-6@git.sr.ht> (hyman@git.sr.ht's message of "Thu, 08 Jun 2023 00:12:40 +0800")

~hyman <hyman@git.sr.ht> wrote:
> From: Hyman Huang(黄勇) <yong.huang@smartx.com>

To speed thinkng up, 1-5 are included on next Migration PULL request.

> Implement dirty-limit convergence algo for live migration,
> which is kind of like auto-converge algo but using dirty-limit
> instead of cpu throttle to make migration convergent.
>
> Enable dirty page limit if dirty_rate_high_cnt greater than 2
> when dirty-limit capability enabled, Disable dirty-limit if
> migration be cancled.

Nit: canceled.

>
> Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit"
> commands are not allowed during dirty-limit live migration.
>
> Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>


> + * Enable dirty-limit to throttle down the guest
> + */
> +static void migration_dirty_limit_guest(void)
> +{
> +    static int64_t quota_dirtyrate;

quota_dirtyrate deserves at least a comment.

I guess it means the current quota_dirty_rate that is set, but no clue.

> +    MigrationState *s = migrate_get_current();
> +
> +    /*
> +     * If dirty limit already enabled and migration parameter
> +     * vcpu-dirty-limit untouched.
> +     */
> +    if (dirtylimit_in_service() &&
> +        quota_dirtyrate == s->parameters.vcpu_dirty_limit) {
> +        return;
> +    }
> +
> +    quota_dirtyrate = s->parameters.vcpu_dirty_limit;
> +
> +    /* Set or update quota dirty limit */
> +    qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL);

Care to explain why do we have to "reset" the quota?  Or why we can't
set it when the user issues the command, only when we throttle the guest?

> +    trace_migration_dirty_limit_guest(quota_dirtyrate);
> +}
> +

Split this patch in two:

a - the logic change
b - the introduction of dirty limit.


Old code:

    /* During block migration the auto-converge logic incorrectly detects
     * that ram migration makes no progress. Avoid this by disabling the
     * throttling logic during the bulk phase of block migration. */
    if (blk_mig_bulk_active()) {
        return;
    }

    if (migrate_auto_converge()) {
        /* The following detection logic can be refined later. For now:
           Check to see if the ratio between dirtied bytes and the approx.
           amount of bytes that just got transferred since the last time
           we were in this routine reaches the threshold. If that happens
           twice, start or increase throttling. */

        if ((bytes_dirty_period > bytes_dirty_threshold) &&
            (++rs->dirty_rate_high_cnt >= 2)) {
            trace_migration_throttle();
            rs->dirty_rate_high_cnt = 0;
            mig_throttle_guest_down(bytes_dirty_period,
                                    bytes_dirty_threshold);
        }
    }

New code:
    /*
     * The following detection logic can be refined later. For now:
     * Check to see if the ratio between dirtied bytes and the approx.
     * amount of bytes that just got transferred since the last time
     * we were in this routine reaches the threshold. If that happens
     * twice, start or increase throttling.
     */

    if ((bytes_dirty_period > bytes_dirty_threshold) &&
        (++rs->dirty_rate_high_cnt >= 2)) {
        rs->dirty_rate_high_cnt = 0;
        /*
         * During block migration the auto-converge logic incorrectly detects
         * that ram migration makes no progress. Avoid this by disabling the
         * throttling logic during the bulk phase of block migration
         */
        if (blk_mig_bulk_active()) {
            return;
        }

        if (migrate_auto_converge()) {
            trace_migration_throttle();
            mig_throttle_guest_down(bytes_dirty_period,
                                    bytes_dirty_threshold);
        } else if (migrate_dirty_limit()) {
            migration_dirty_limit_guest();
        }
    }

Questions:

- Why are we changing blk_mig_bulk_active() position?

  I think that the old code have it in the right place.  Additionally,
  you just changefd to this version a couple of patches agon.




>                                   int64_t cpu_index,
>                                   Error **errp)
>  {
> +    MigrationState *ms = migrate_get_current();
> +
>      if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>          return;
>      }
> @@ -453,6 +455,15 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
>          return;
>      }
>  
> +    if (migration_is_running(ms->state) &&
> +        (!qemu_thread_is_self(&ms->thread)) &&
> +        migrate_dirty_limit() &&
> +        dirtylimit_in_service()) {
> +        error_setg(errp, "can't cancel dirty page limit while"
> +                   " migration is running");

Error message is bad or wrong.
You can cancel the dirty page, you ust need to be on the main thread.

Or I am missing something?



> +        return;
> +    }
> +
>      dirtylimit_state_lock();
>  
>      if (has_cpu_index) {
> @@ -488,6 +499,8 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
>                                uint64_t dirty_rate,
>                                Error **errp)
>  {
> +    MigrationState *ms = migrate_get_current();
> +
>      if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>          error_setg(errp, "dirty page limit feature requires KVM with"
>                     " accelerator property 'dirty-ring-size' set'")
> @@ -504,6 +517,15 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
>          return;
>      }
>  
> +    if (migration_is_running(ms->state) &&
> +        (!qemu_thread_is_self(&ms->thread)) &&
> +        migrate_dirty_limit() &&
> +        dirtylimit_in_service()) {
> +        error_setg(errp, "can't cancel dirty page limit while"
> +                   " migration is running");
> +        return;
> +    }

If you use such a complex expression twice, I think that creating a
helper function is a good idea.

Later, Juan.

next prev parent reply	other threads:[~2023-06-13 17:51 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-08  2:02 [PATCH QEMU v5 0/8] migration: introduce dirtylimit capability ~hyman
2022-11-18  2:08 ` [PATCH QEMU v5 1/8] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" ~hyman
2023-06-13 16:20   ` Juan Quintela
2023-06-13 16:37     ` Juan Quintela
2023-06-07 13:32 ` [PATCH QEMU v5 2/8] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter ~hyman
2023-06-13 16:18   ` Juan Quintela
2023-06-07 14:58 ` [PATCH QEMU v5 3/8] qapi/migration: Introduce vcpu-dirty-limit parameters ~hyman
2023-06-13 16:30   ` Juan Quintela
2023-06-07 15:30 ` [PATCH QEMU v5 4/8] migration: Introduce dirty-limit capability ~hyman
2023-06-13 16:32   ` Juan Quintela
2023-06-07 15:32 ` [PATCH QEMU v5 5/8] migration: Refactor auto-converge capability logic ~hyman
2023-06-13 16:34   ` Juan Quintela
2023-06-07 16:12 ` [PATCH QEMU v5 6/8] migration: Implement dirty-limit convergence algo ~hyman
2023-06-13 17:50   ` Juan Quintela [this message]
2023-06-15 10:12     ` Yong Huang
2023-06-07 16:21 ` [PATCH QEMU v5 7/8] migration: Extend query-migrate to provide dirty page limit info ~hyman
2023-06-13 18:03   ` Juan Quintela
2023-06-13 18:07   ` Juan Quintela
2023-06-07 16:46 ` [PATCH QEMU v5 8/8] tests: Add migration dirty-limit capability test ~hyman
2023-06-13 18:09   ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttvbckv3.fsf@secure.mitica \
    --to=quintela@redhat.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=hyman@git.sr.ht \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=thuth@redhat.com \
    --cc=yong.huang@smartx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.