From: Yong Huang <yong.huang@smartx.com>
To: quintela@redhat.com
Cc: ~hyman <hyman@git.sr.ht>, qemu-devel <qemu-devel@nongnu.org>,
"Peter Xu" <peterx@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Laurent Vivier" <lvivier@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PATCH QEMU v5 6/8] migration: Implement dirty-limit convergence algo
Date: Thu, 15 Jun 2023 18:12:57 +0800 [thread overview]
Message-ID: <CAK9dgmZ4NhVOY5uxXbNVpgHT9HcW5kbOeLHsMe73uP_MeMxT-g@mail.gmail.com> (raw)
In-Reply-To: <87ttvbckv3.fsf@secure.mitica>
On Wed, Jun 14, 2023 at 1:50 AM Juan Quintela <quintela@redhat.com> wrote:
> ~hyman <hyman@git.sr.ht> wrote:
> > From: Hyman Huang(黄勇) <yong.huang@smartx.com>
>
> To speed things up, 1-5 are included in the next Migration PULL request.
>
OK, I'll post the next version containing only the last 3 commits.
>
> > Implement dirty-limit convergence algo for live migration,
> > which is kind of like auto-converge algo but using dirty-limit
> > instead of cpu throttle to make migration convergent.
> >
> > Enable dirty page limit if dirty_rate_high_cnt greater than 2
> > when dirty-limit capability enabled, Disable dirty-limit if
> > migration be cancled.
>
> Nit: canceled.
Got it.
> >
> > Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit"
> > commands are not allowed during dirty-limit live migration.
> >
> > Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
> > Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
>
> > + * Enable dirty-limit to throttle down the guest
> > + */
> > +static void migration_dirty_limit_guest(void)
> > +{
> > + static int64_t quota_dirtyrate;
>
> quota_dirtyrate deserves at least a comment.
>
> I guess it means the current quota_dirty_rate that is set, but no clue.
OK, I'll add a comment in the next version.
>
> > + MigrationState *s = migrate_get_current();
> > +
> > + /*
> > + * If dirty limit already enabled and migration parameter
> > + * vcpu-dirty-limit untouched.
> > + */
> > + if (dirtylimit_in_service() &&
> > + quota_dirtyrate == s->parameters.vcpu_dirty_limit) {
> > + return;
> > + }
> > +
> > + quota_dirtyrate = s->parameters.vcpu_dirty_limit;
> > +
> > + /* Set or update quota dirty limit */
> > + qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL);
>
> Care to explain why do we have to "reset" the quota? Or why we can't
> set it when the user issues the command, only when we throttle the guest?
Indeed, the -1 is misleading. The first parameter indicates whether a
specific vCPU index is given; when it is false, the quota dirty rate is
applied to all vCPUs and the second parameter (cpu_index) is ignored.
How about the following comment in the next version?
/*
 * Set a quota dirty rate for all vCPUs; note that the second parameter
 * is ignored when the limit is applied to the whole VM.
 */
> > + trace_migration_dirty_limit_guest(quota_dirtyrate);
> > +}
> > +
>
> Split this patch in two:
>
> a - the logic change
> b - the introduction of dirty limit.
>
Ok, got it.
>
> Old code:
>
> /* During block migration the auto-converge logic incorrectly detects
> * that ram migration makes no progress. Avoid this by disabling the
> * throttling logic during the bulk phase of block migration. */
> if (blk_mig_bulk_active()) {
> return;
> }
>
> if (migrate_auto_converge()) {
> /* The following detection logic can be refined later. For now:
> Check to see if the ratio between dirtied bytes and the approx.
> amount of bytes that just got transferred since the last time
> we were in this routine reaches the threshold. If that happens
> twice, start or increase throttling. */
>
> if ((bytes_dirty_period > bytes_dirty_threshold) &&
> (++rs->dirty_rate_high_cnt >= 2)) {
> trace_migration_throttle();
> rs->dirty_rate_high_cnt = 0;
> mig_throttle_guest_down(bytes_dirty_period,
> bytes_dirty_threshold);
> }
> }
>
> New code:
> /*
> * The following detection logic can be refined later. For now:
> * Check to see if the ratio between dirtied bytes and the approx.
> * amount of bytes that just got transferred since the last time
> * we were in this routine reaches the threshold. If that happens
> * twice, start or increase throttling.
> */
>
> if ((bytes_dirty_period > bytes_dirty_threshold) &&
> (++rs->dirty_rate_high_cnt >= 2)) {
> rs->dirty_rate_high_cnt = 0;
>             /*
>              * During block migration the auto-converge logic incorrectly
>              * detects that ram migration makes no progress. Avoid this by
>              * disabling the throttling logic during the bulk phase of
>              * block migration.
>              */
> if (blk_mig_bulk_active()) {
> return;
> }
>
> if (migrate_auto_converge()) {
> trace_migration_throttle();
> mig_throttle_guest_down(bytes_dirty_period,
> bytes_dirty_threshold);
> } else if (migrate_dirty_limit()) {
> migration_dirty_limit_guest();
> }
> }
>
> Questions:
>
> - Why are we changing blk_mig_bulk_active() position?
> I think that the old code has it in the right place. Additionally,
> you just changed to this version a couple of patches ago.
>
Yes, indeed, this modification makes no sense; I'll fix it in the next version.
>
> > int64_t cpu_index,
> > Error **errp)
> > {
> > + MigrationState *ms = migrate_get_current();
> > +
> > if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
> > return;
> > }
> > @@ -453,6 +455,15 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
> > return;
> > }
> >
> > + if (migration_is_running(ms->state) &&
> > + (!qemu_thread_is_self(&ms->thread)) &&
> > + migrate_dirty_limit() &&
> > + dirtylimit_in_service()) {
> > + error_setg(errp, "can't cancel dirty page limit while"
> > + " migration is running");
>
> Error message is bad or wrong.
> You can cancel the dirty page limit; you just need to be on the main thread.
>
> Or I am missing something?
>
Migration, IMHO, shares the quota dirty rate stored in the global
variable "dirtylimit_state"; if we cancel the dirty limit, the throttle
stops working and the migration will be affected.
>
>
> > + return;
> > + }
> > +
> > dirtylimit_state_lock();
> >
> > if (has_cpu_index) {
> > @@ -488,6 +499,8 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
> > uint64_t dirty_rate,
> > Error **errp)
> > {
> > + MigrationState *ms = migrate_get_current();
> > +
> > if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
> > error_setg(errp, "dirty page limit feature requires KVM with"
> > " accelerator property 'dirty-ring-size' set'")
> > @@ -504,6 +517,15 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
> > return;
> > }
> >
> > + if (migration_is_running(ms->state) &&
> > + (!qemu_thread_is_self(&ms->thread)) &&
> > + migrate_dirty_limit() &&
> > + dirtylimit_in_service()) {
> > + error_setg(errp, "can't cancel dirty page limit while"
> > + " migration is running");
> > + return;
> > + }
>
> If you use such a complex expression twice, I think that creating a
> helper function is a good idea.
>
Ok, got it.
>
> Later, Juan.
>
>
--
Best regards,
Hyman