Re: [Qemu-devel] migration: broken ram_save_pending

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Paolo Bonzini <pbonzini@redhat.com>, Alex Graf <agraf@suse.de>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
Date: Thu, 6 Feb 2014 11:24:36 +0000
Message-ID: <20140206112435.GC3013@work-vm>
In-Reply-To: <52F2FD2B.9010504@ozlabs.ru>

* Alexey Kardashevskiy (aik@ozlabs.ru) wrote:
> On 02/06/2014 03:45 AM, Paolo Bonzini wrote:
> > On 05/02/2014 17:42, Dr. David Alan Gilbert wrote:
> >> Because:
> >>     * the code is still running and keeps redirtying a small handful of
> >> pages
> >>     * but because we've underestimated our available bandwidth we never stop
> >>       it and just throw those pages across immediately
> > 
> > Ok, I thought Alexey was saying we are not redirtying that handful of pages.
> 
> 
> Every iteration we read the dirty map from KVM and send all dirty pages
> across the stream.
> 
> 
> > And in turn, this is because the max downtime we have is too low
> > (especially for the default 32 MB/sec bandwidth; that's also pretty
> > low).
> 
> 
> My understanding now is that in order to finish migration QEMU waits for
> the earliest 100ms (BUFFER_DELAY) of continuously low traffic, but due to
> those pages getting dirty every time we read the dirty map, we transfer
> more in these 100ms than we are actually allowed (>32MB/s or 3.2MB/100ms).
> So we transfer-transfer-transfer, detect that we transfer too much, do
> delay() and if max_size (calculated from actual transfer and downtime) for
> the next iteration is less (by luck) than those 96 pages (uncompressed) -
> we finish.

How about turning on some of the debug in migration.c? I suggest not all of
it, but at least this one:

            DPRINTF("transferred %" PRIu64 " time_spent %" PRIu64
                    " bandwidth %g max_size %" PRId64 "\n",
                    transferred_bytes, time_spent, bandwidth, max_size);

and also the s->dirty_bytes_rate value.  It would help check our assumptions.
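
For reference, here is a rough sketch of how I'd expect the values in that
DPRINTF to relate to each other. It is not the code from migration.c; the
function names are made up for illustration, and it assumes time is measured
in milliseconds and that we only complete once the pending data fits into
the downtime budget.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; the names are NOT taken from migration.c. */

/* Observed bandwidth in bytes per millisecond over the last window. */
double observed_bandwidth(uint64_t transferred_bytes, uint64_t time_spent_ms)
{
    assert(time_spent_ms > 0);
    return (double)transferred_bytes / (double)time_spent_ms;
}

/* Most data we could still send within the allowed downtime (assumed in ms). */
int64_t estimate_max_size(double bandwidth, uint64_t max_downtime_ms)
{
    return (int64_t)(bandwidth * (double)max_downtime_ms);
}

/* Returns 1 when the pending data fits into the downtime budget, else 0. */
int can_complete(uint64_t pending_bytes, int64_t max_size)
{
    return max_size >= 0 && pending_bytes < (uint64_t)max_size;
}
```

If the measured bandwidth comes out ten times too low, max_size shrinks by
the same factor, and a small set of redirtied pages that would easily fit in
the real downtime budget never passes the can_complete() test.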

> Increasing speed and/or downtime will help but still - we would not need
> that if migration did not expect all 96 pages to have to be sent but did
> have some smart way to detect that many are empty (so - compressed).

I think the other way would be to keep track of the compression ratio;
if we knew how many pages we'd sent, and how much bandwidth that had used,
we could divide the pending_bytes by that to get a *different* approximation.
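
A minimal sketch of that idea, assuming we kept two extra counters (the
struct and function names here are hypothetical, nothing like them exists in
the tree as far as I know). It scales pending_bytes by the observed wire
bytes per page, which is the same as dividing by the compression ratio, and
falls back to the worst case when nothing has been sent yet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters; these are not fields that exist in QEMU. */
struct xfer_stats {
    uint64_t pages_sent;  /* guest pages handed to the sender so far */
    uint64_t wire_bytes;  /* bytes those pages actually used on the wire */
};

/* Average wire bytes per guest page; worst case (a full page) if no data yet. */
double wire_bytes_per_page(const struct xfer_stats *s, uint64_t page_size)
{
    if (s->pages_sent == 0) {
        return (double)page_size;
    }
    return (double)s->wire_bytes / (double)s->pages_sent;
}

/* Scale the raw pending byte count by the observed wire/raw ratio. */
uint64_t estimate_pending_wire_bytes(const struct xfer_stats *s,
                                     uint64_t pending_bytes,
                                     uint64_t page_size)
{
    assert(page_size > 0);
    double per_page = wire_bytes_per_page(s, page_size);
    return (uint64_t)((double)pending_bytes * per_page / (double)page_size);
}
```

Note this is only an average over what has already been sent; it says nothing
certain about the pages that are still pending.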

However, the problem is that my understanding is we're trying to
_guarantee_ a maximum downtime, and to do that we have to use the calculation
that assumes that all the pages we have are going to take the maximum time
to transfer, and only go into downtime then.

> Literally, move is_zero_range() from ram_save_block() to
> migration_bitmap_sync() and store this bit in some new pages_zero_map, for
> example. But does it make a lot of sense?

The problem is that means checking whether it's zero more often; at the moment
we check it's zero once during sending; to do what you're suggesting would
mean we'd have to check every page is zero, every time we sync, and I think
that's more often than we send.
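
To make the cost concrete, here is a rough sketch of what a zero check at
sync time would look like. page_is_zero() is a naive stand-in, not QEMU's
is_zero_range(), and build_zero_map() and its parameters are hypothetical;
the point is only that every dirty page gets scanned on every call.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive stand-in for a zero-page test; not QEMU's is_zero_range(). */
int page_is_zero(const uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/*
 * Hypothetical sync-time pass: scan every dirty page and record whether it
 * is zero. Returns the number of pages scanned, i.e. the extra work done
 * on each sync.
 */
size_t build_zero_map(const uint8_t *ram, size_t page_size,
                      const uint8_t *dirty, uint8_t *zero_map, size_t npages)
{
    size_t scanned = 0;

    for (size_t i = 0; i < npages; i++) {
        zero_map[i] = 0;
        if (dirty[i]) {
            zero_map[i] = (uint8_t)page_is_zero(ram + i * page_size, page_size);
            scanned++;
        }
    }
    return scanned;
}
```

A page that stays dirty across several syncs before it is sent would be
scanned once per sync here, against once in total when the check is done at
send time.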

Have you tried disabling the call to is_zero_range in arch_init.c's
ram_save_block so that (as long as you have XBZRLE off) we don't do any
compression? If the theory is right then your problem should go away.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 18+ messages
2014-02-04  7:15 [Qemu-devel] migration: broken ram_save_pending Alexey Kardashevskiy
2014-02-04 10:46 ` Paolo Bonzini
2014-02-04 11:59   ` Alexey Kardashevskiy
2014-02-04 12:07     ` Paolo Bonzini
2014-02-04 12:16       ` Alexey Kardashevskiy
2014-02-04 14:00         ` Paolo Bonzini
2014-02-04 22:17           ` Alexey Kardashevskiy
2014-02-05  7:18             ` Paolo Bonzini
2014-02-05  9:09               ` Dr. David Alan Gilbert
2014-02-05 16:35                 ` Paolo Bonzini
2014-02-05 16:42                   ` Dr. David Alan Gilbert
2014-02-05 16:45                     ` Paolo Bonzini
2014-02-06  3:10                       ` Alexey Kardashevskiy
2014-02-06 11:24                         ` Dr. David Alan Gilbert [this message]
2014-02-07  5:39                           ` Alexey Kardashevskiy
2014-02-07  8:55                             ` Dr. David Alan Gilbert
2014-02-06 23:49                         ` Paolo Bonzini
2014-02-07  5:42                           ` Alexey Kardashevskiy
