Re: Stalls on Live Migration of VMs with a lot of memory


From: Peter Lieven <pl@dlh.net>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Shu Ming <shuming@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: Stalls on Live Migration of VMs with a lot of memory
Date: Wed, 04 Jan 2012 14:08:18 +0100
Message-ID: <4F044F42.2050508@dlh.net>
In-Reply-To: <4F0445EE.9010905@redhat.com>

On 04.01.2012 13:28, Paolo Bonzini wrote:
> On 01/04/2012 12:42 PM, Peter Lieven wrote:
>>>
>> ok, then i misunderstood the ram blocks thing. i thought the guest ram
>> would consist of a collection of ram blocks.
>> then let me describe it differently. would it make sense to process
>> bigger portions of memory (e.g. 1M) in stage 2, to reduce the number of
>> calls to cpu_physical_memory_reset_dirty by running it on bigger
>> portions of memory? we might lose a few dirty pages, but they will be
>> tracked in the next iteration in stage 2 or in stage 3 at least. what
>> would be necessary is that nobody marks a page dirty while i copy the
>> dirty information for the portion of memory i want to process.
>
> Dirty memory tracking is done by the hypervisor and must be done at 
> page granularity.
ok, so unfortunately this is not an option.

thus my only option at the moment is to limit the runtime of the while
loop in stage 2. or are there any post-1.0 patches in git that might
already help?

i tried to limit it to migrate_max_downtime() and this at least resolves
the problem with the vm stalls. however, with that limit migration speed
is very low (approx. 80MB/s on a 10G link).
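
A minimal sketch of the time limit described above, written in Python rather than QEMU's C so it can be run standalone. The names send_dirty_pages, send_page and budget_seconds are hypothetical and are not QEMU functions; budget_seconds stands in for whatever value migrate_max_downtime() would supply.

```python
import time


def send_dirty_pages(dirty_pages, send_page, budget_seconds, clock=time.monotonic):
    """Send pages until the list is empty or the time budget is used up.

    dirty_pages    -- list of page identifiers still to be sent (consumed in place)
    send_page      -- callable that transmits one page (hypothetical stand-in)
    budget_seconds -- maximum time to spend in this pass (assumed to come from
                      something like migrate_max_downtime())
    clock          -- time source, injectable so the loop can be tested
    Returns the number of pages sent in this pass.
    """
    deadline = clock() + budget_seconds
    sent = 0
    # Stop as soon as the budget is exhausted, even if pages remain;
    # the leftover pages are picked up by the next pass.
    while dirty_pages and clock() < deadline:
        send_page(dirty_pages.pop())
        sent += 1
    return sent
```

Bounding each pass this way hands control back to the caller at regular intervals, which matches the observation that the stalls go away. The cost is that each pass may send only a small part of the dirty set, which is one plausible reason for the low throughput.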


>
>>>> - in stage 3 the vm is stopped, right? so there can't be any more
>>>> dirty blocks after scanning the whole memory once?
>>>
>>> No, stage 3 is entered when there are very few dirty memory pages
>>> remaining.  This may happen after scanning the whole memory many
>>> times.  It may even never happen if migration does not converge
>>> because of low bandwidth or too strict downtime requirements.
>>>
>> ok, is there a chance that i lose one final page if it is modified just
>> after i walked over it and i found no other page dirty (so
>> bytes_sent = 0)?
>
> No, of course not.  Stage 3 will send all missing pages while the VM 
> is stopped.  There is a chance that the guest will go crazy and start 
> touching lots of pages at exactly the wrong time, and thus the 
> downtime will be longer than expected.  However, that's a necessary 
> evil; if you cannot accept that, post-copy migration would provide a 
> completely different set of tradeoffs.
i don't suffer from long downtimes in stage 3. my issue is in stage 2.
>
> (BTW, bytes_sent = 0 is very rare).
i know, but when the vm is stopped there is no issue. i misunderstood your
"No, stage 3 is entered ..." ;-)
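
The convergence condition Paolo describes can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU's actual code: it assumes stage 3 is entered once the remaining dirty pages could be sent within the allowed downtime, and that the guest dirties pages at a constant rate. The function name rounds_until_stage3 and all parameters are hypothetical.

```python
def rounds_until_stage3(dirty_pages, dirty_rate, bandwidth, max_downtime,
                        page_size=4096, max_rounds=50):
    """Toy model of stage 2 convergence (assumptions, not QEMU code).

    dirty_pages  -- pages dirty at the start
    dirty_rate   -- pages the guest dirties per second (assumed constant)
    bandwidth    -- migration bandwidth in bytes per second
    max_downtime -- allowed downtime in seconds
    Returns the number of stage 2 rounds before stage 3 can start,
    or None if the model does not converge within max_rounds.
    """
    for rounds in range(max_rounds):
        transfer_time = dirty_pages * page_size / bandwidth
        if transfer_time <= max_downtime:
            # Few enough pages remain to send them while the VM is stopped.
            return rounds
        # While this round is being sent, the guest keeps dirtying pages.
        dirty_pages = int(dirty_rate * transfer_time)
    return None
```

With 1,000,000 dirty pages, 80MB/s of bandwidth and an example downtime limit of 0.03 s, a guest dirtying 1000 pages/s converges after a few rounds in this model. A guest dirtying 20000 pages/s (about 82MB/s, more than the link carries) never does, which is the "may even never happen" case.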

thanks for your help and explanations.

peter
>
> Paolo
