From: Peter Xu <peterx@redhat.com>
To: bala24 <bala24@linux.vnet.ibm.com>
Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH] migration: calculate expected_downtime with ram_bytes_remaining()
Date: Wed, 4 Apr 2018 09:59:53 +0800
Message-ID: <20180404015953.GH26441@xz-mi>
In-Reply-To: <4b9793192f7007913e7fb9bb210524fb@linux.vnet.ibm.com>

On Tue, Apr 03, 2018 at 11:00:00PM +0530, bala24 wrote:
> On 2018-04-03 11:40, Peter Xu wrote:
> > On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
> > > expected_downtime value is not accurate with dirty_pages_rate *
> > > page_size; using ram_bytes_remaining() would yield the correct
> > > value.
> > > 
> > > Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> > > ---
> > >  migration/migration.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 58bd382730..4e43dc4f92 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -2245,8 +2245,7 @@ static void
> > > migration_update_counters(MigrationState *s,
> > >       * recalculate. 10000 is a small enough number for our purposes
> > >       */
> > >      if (ram_counters.dirty_pages_rate && transferred > 10000) {
> > > -        s->expected_downtime = ram_counters.dirty_pages_rate *
> > > -            qemu_target_page_size() / bandwidth;
> > > +        s->expected_downtime = ram_bytes_remaining() / bandwidth;
> > 
> > This field was removed in e4ed1541ac ("savevm: New save live migration
> > method: pending", 2012-12-20), in which remaining RAM was used.
> > 
> > And it was added back in 90f8ae724a ("migration: calculate
> > expected_downtime", 2013-02-22), in which dirty rate was used.
> > 
> > However, I didn't find a clue as to why we changed from using
> > remaining RAM to using dirty rate...  So I'll leave this question
> > to Juan.
> > 
> > Besides, I'm a bit confused about when we'd want such a value.  AFAIU
> > precopy is mostly used by setting up the target downtime beforehand,
> > so we should already know the downtime in advance.  Then why would we
> > want to observe such a thing?
> 
> Thanks Peter Xu for reviewing,
> 
> I tested precopy migration with a 16M hugepage-backed ppc guest. The
> granularity of page size in migration is 4K, so any page dirtied would
> result in 16M / 4K = 4096 pages being transmitted again; this led the
> migration to continue endlessly.
> 
> default migrate_parameters:
> downtime-limit: 300 milliseconds
> 
> info migrate:
> expected downtime: 1475 milliseconds
> 
> Migration status: active
> total time: 130874 milliseconds
> expected downtime: 1475 milliseconds
> setup: 3475 milliseconds
> transferred ram: 18197383 kbytes
> throughput: 866.83 mbps
> remaining ram: 376892 kbytes
> total ram: 8388864 kbytes
> duplicate: 1678265 pages
> skipped: 0 pages
> normal: 4536795 pages
> normal bytes: 18147180 kbytes
> dirty sync count: 6
> page size: 4 kbytes
> dirty pages rate: 39044 pages
> 
> In order to complete the migration I configured downtime-limit to 1475
> milliseconds, but migration was still endless. Later I calculated the
> expected downtime as remaining ram 376892 kbytes / 866.83 mbps, which
> yielded 3478.34 milliseconds, and configuring that as downtime-limit
> allowed the migration to complete. This led to the conclusion that the
> expected downtime is not accurate.

Hmm, thanks for the information.  I'd say your calculation seems
reasonable to me: it shows how long it would take if we stopped the
VM on the source immediately and migrated the rest.  However, Juan
might have an explanation for the existing algorithm which I would
like to know too, so I'll still put aside the "which one is better"
question.
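For reference, here is a minimal standalone cross-check of the two
estimates against the numbers in the "info migrate" output above. This
is only a sketch, not QEMU code; it assumes decimal kbytes and that
"mbps" means megabits per second, as in your calculation:

    #include <stdio.h>

    int main(void)
    {
        /* Constants copied from the "info migrate" output above. */
        const double dirty_pages_rate = 39044.0;      /* pages/s */
        const double page_size        = 4096.0;       /* bytes */
        const double remaining_ram    = 376892e3;     /* bytes, assuming decimal kbytes */
        const double bandwidth        = 866.83e6 / 8; /* bytes/s, from 866.83 mbps */

        /* Current formula: retransmit one second's worth of dirtied pages. */
        printf("old estimate: %.0f ms\n",
               dirty_pages_rate * page_size / bandwidth * 1000.0); /* ~1476 ms */

        /* Patched formula: transfer all remaining RAM. */
        printf("new estimate: %.0f ms\n",
               remaining_ram / bandwidth * 1000.0);                /* ~3478 ms */
        return 0;
    }

The first matches the reported expected downtime of 1475 milliseconds,
and the second matches your 3478.34 millisecond figure.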

For your use case, you can have a look at either of the ways below to
get a converged migration:

- auto-converge: a migration capability that throttles the CPU
  usage of the guest

- postcopy: that'll let you start the destination VM even before all
  of the RAM has been transferred

Either technique can be configured via the "migrate_set_capability"
HMP command or the "migrate-set-capabilities" QMP command (some
googling will show the detailed steps; see also the sketch below).
Either of the above should help you migrate successfully in this
hard-to-converge scenario, instead of your current approach
(observing the downtime, then setting the downtime).
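For example, a minimal sketch in HMP (the destination URI is a
placeholder):

    (qemu) migrate_set_capability auto-converge on

or, for postcopy:

    (qemu) migrate_set_capability postcopy-ram on
    (qemu) migrate -d tcp:DEST:4444
    (qemu) migrate_start_postcopy

The QMP equivalent for setting a capability looks like:

    { "execute": "migrate-set-capabilities",
      "arguments": { "capabilities": [
          { "capability": "auto-converge", "state": true } ] } }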

Meanwhile, I'm thinking about whether, instead of observing the
downtime in real time, we should introduce a command to stop the VM
immediately and migrate the rest when we want, or a new parameter to
the current "migrate" command.

-- 
Peter Xu


Thread overview: 15+ messages
2018-03-31 18:55 [Qemu-devel] [PATCH] migration: calculate expected_downtime with ram_bytes_remaining() Balamuruhan S
2018-04-03  6:10 ` Peter Xu
2018-04-03 17:30   ` bala24
2018-04-04  1:59     ` Peter Xu [this message]
2018-04-04  9:02   ` Juan Quintela
2018-04-04  9:04 ` Juan Quintela
2018-04-10  9:52   ` Balamuruhan S
2018-04-10 10:52     ` Balamuruhan S
  -- strict thread matches above, loose matches on Subject: below --
2018-04-04  6:25 Balamuruhan S
2018-04-04  8:06 ` Peter Xu
2018-04-04  8:49   ` Balamuruhan S
2018-04-09 18:57     ` Dr. David Alan Gilbert
2018-04-10  1:22       ` David Gibson
2018-04-10 10:02         ` Dr. David Alan Gilbert
2018-04-11  1:28           ` David Gibson
