Re: [Qemu-devel] Block Migration and CPU throttling

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Lieven <pl@kamp.de>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Juan Quintela <quintela@redhat.com>, Fam Zheng <famz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu block <qemu-block@nongnu.org>,
	jjherne@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] Block Migration and CPU throttling
Date: Wed, 7 Feb 2018 18:29:30 +0000
Message-ID: <20180207182930.GV2665@work-vm>
In-Reply-To: <8ed6cb1e-4770-21ec-164e-7142e649eab3@kamp.de>

* Peter Lieven (pl@kamp.de) wrote:
> On 12.12.2017 at 18:05, Dr. David Alan Gilbert wrote:
> > * Peter Lieven (pl@kamp.de) wrote:
> > > On 21.09.2017 at 14:36, Dr. David Alan Gilbert wrote:
> > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > On 19.09.2017 at 16:41, Dr. David Alan Gilbert wrote:
> > > > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > > > On 19.09.2017 at 16:38, Dr. David Alan Gilbert wrote:
> > > > > > > > * Peter Lieven (pl@kamp.de) wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I just noticed that CPU throttling and Block Migration don't work together very well.
> > > > > > > > > During block migration the throttling heuristic detects that we obviously make no progress
> > > > > > > > > in RAM transfer. But the cause is the running block migration, not an excessive dirty page rate.
> > > > > > > > > 
> > > > > > > > > The result is that any VM is throttled by 99% during block migration.
> > > > > > > > Hmm that's unfortunate; do you have a bandwidth set lower than your
> > > > > > > > actual network connection? I'm just wondering if it's actually going
> > > > > > > > between the block and RAM iterative sections or getting stuck in one.
> > > > > > > It happens also if source and dest are on the same machine and speed is set to 100G.
> > > > > > But does it happen if they're not and the speed is set low?
> > > > > Yes, it does. I noticed it in our test environment between different nodes with a 10G
> > > > > > link in between. But it's totally clear why it happens. During block migration we transfer
> > > > > > all dirty memory pages in each round (if there is moderate memory load), but all dirty
> > > > > > pages are obviously more than 50% of the transferred RAM in that round.
> > > > > > More precisely, it is 100%. But the current logic triggers on this condition.
> > > > > 
> > > > > I think I will go forward and send a patch which disables auto converge during
> > > > > block migration bulk stage.
> > > > Yes, that's fair;  it probably would also make sense to throttle the RAM
> > > > migration during the block migration bulk stage, since the chances are
> > > > it's not going to get far.  (I think in the nbd setup, the main
> > > > migration process isn't started until the end of bulk).
> > > Catching up with the idea of delaying ram migration until block bulk has completed.
> > > What do you think is the easiest way to achieve this?
> > <excavates inbox, and notices I never replied>
> > 
> > I think the answer depends on whether we think this is a 'special' case or we
> > need a new general purpose mechanism.
> > 
> > If it was really general then we'd probably want to split the iterative
> > stage in two somehow, and only do RAM in the second half.
> > 
> > But I'm not sure it's worth it; I suspect the easiest way is:
> > 
> >     a) Add a counter in migration/ram.c or in the RAM state somewhere
> >     b) Make ram_save_inhibit increment the counter
> >     c) Check the counter at the head of ram_save_iterate and just exit
> >       if it's non-zero
> >     d) Call ram_save_inhibit from block_save_setup
> >     e) Then release it when you've finished the bulk stage
> > 
> > Make sure you still count the RAM in the pending totals, otherwise
> > migration might think it's finished a bit early.
> 
> Is there any catch I don't see, or is it as easy as this?

Hmm, looks promising doesn't it;  might need an include or two tidied
up, but looks worth a try.   Just be careful that there are no cases
where block migration can't transfer data in that state, otherwise we'll
keep coming back to here and spewing empty sections.

Dave
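
For reference, the trigger condition Peter describes in the quoted thread
(dirty pages exceeding 50% of the RAM transferred in a round) can be modelled
roughly as below. This is a minimal sketch, not the code in migration/ram.c:
both function names and the bulk_active flag are hypothetical, and the real
heuristic may apply further conditions that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the auto-converge trigger described in the thread:
 * the dirty rate counts as "too high" when the bytes dirtied in a round
 * exceed half of the RAM bytes transferred in that round.
 * Hypothetical helper, not a QEMU function.
 */
static bool dirty_rate_too_high(uint64_t dirtied_bytes, uint64_t ram_xfer_bytes)
{
    return dirtied_bytes > ram_xfer_bytes / 2;
}

/*
 * Model of the fix proposed in the thread: never throttle while the block
 * migration bulk stage is active. bulk_active is an assumed input.
 */
static bool should_throttle(uint64_t dirtied_bytes, uint64_t ram_xfer_bytes,
                            bool bulk_active)
{
    if (bulk_active) {
        return false;
    }
    return dirty_rate_too_high(dirtied_bytes, ram_xfer_bytes);
}
```

If every dirty page is resent in each round, dirtied_bytes equals
ram_xfer_bytes, so dirty_rate_too_high() returns true for any non-zero
amount, however little the guest is actually dirtying.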

> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..c67bcf1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2255,6 +2255,13 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      int64_t t0;
>      int done = 0;
> 
> +    if (blk_mig_bulk_active()) {
> +        /* Avoid transferring RAM during bulk phase of block migration as
> +         * the bulk phase will usually take a lot of time and transferring
> +         * RAM updates again and again is pointless. */
> +        goto out;
> +    }
> +
>      rcu_read_lock();
>      if (ram_list.version != rs->last_version) {
>          ram_state_reset(rs);
> @@ -2301,6 +2308,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>       */
>      ram_control_after_iterate(f, RAM_CONTROL_ROUND);
> 
> +out:
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>      ram_counters.transferred += 8;
> 
> 
> Peter
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
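
The patch above checks blk_mig_bulk_active() directly. The counter-based
alternative from steps a) to e) in the quoted text can be modelled as follows.
This is a sketch under assumptions, not QEMU code: ram_save_inhibit() is the
name used in the thread, while ram_save_uninhibit(), the two *_model()
functions and the byte accounting are hypothetical stand-ins. It also
illustrates the two cautions raised in the thread: pending RAM is still
reported while inhibited, and an inhibited iteration writes nothing but the
8-byte end-of-section marker.

```c
#include <assert.h>
#include <stdint.h>

/* a) Counter of callers that currently want RAM transfer paused. */
static unsigned int ram_inhibit_count;

/* b) Increment the counter; step d) would call this from block_save_setup. */
static void ram_save_inhibit(void)
{
    ram_inhibit_count++;
}

/* e) Release once the bulk stage has finished (hypothetical name). */
static void ram_save_uninhibit(void)
{
    assert(ram_inhibit_count > 0);
    ram_inhibit_count--;
}

/* End-of-section marker size, matching "transferred += 8" in the patch. */
#define EOS_MARKER_BYTES 8u

/*
 * c) Model of the check at the head of ram_save_iterate: returns the number
 * of bytes written for this section. While inhibited, only the marker is
 * written, i.e. an "empty section".
 */
static uint64_t ram_iterate_model(uint64_t dirty_bytes)
{
    if (ram_inhibit_count != 0) {
        return EOS_MARKER_BYTES;
    }
    return dirty_bytes + EOS_MARKER_BYTES;
}

/*
 * Pending RAM is reported regardless of the counter, so that migration
 * does not conclude it has finished early.
 */
static uint64_t ram_pending_model(uint64_t dirty_bytes)
{
    return dirty_bytes;
}
```

A counter rather than a boolean would let more than one caller inhibit RAM
transfer independently; whether that generality is needed is the open
question in the quoted discussion.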

Thread overview: 16+ messages
2017-09-19 13:36 [Qemu-devel] Block Migration and CPU throttling Peter Lieven
2017-09-19 14:38 ` Dr. David Alan Gilbert
2017-09-19 14:40   ` Peter Lieven
2017-09-19 14:41     ` Dr. David Alan Gilbert
2017-09-20 19:15       ` Peter Lieven
2017-09-21 12:36         ` Dr. David Alan Gilbert
2017-09-21 12:42           ` Peter Lieven
2017-10-12 13:41           ` Peter Lieven
2017-12-12 17:05             ` Dr. David Alan Gilbert
2017-12-13 16:26               ` Peter Lieven
2018-02-06 11:46               ` Peter Lieven
2018-02-07 18:29                 ` Dr. David Alan Gilbert [this message]
2018-02-07 20:56                   ` Peter Lieven
2018-02-16 20:20                     ` Dr. David Alan Gilbert
2017-09-19 14:41 ` Paolo Bonzini
2017-09-20 19:16   ` Peter Lieven
