qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Alex Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
Date: Wed, 5 Feb 2014 09:09:13 +0000	[thread overview]
Message-ID: <20140205090912.GA2398@work-vm>
In-Reply-To: <52F1E5BA.60902@redhat.com>

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> Il 04/02/2014 23:17, Alexey Kardashevskiy ha scritto:
> >>>>> Well, it will fix it in my particular case but in a long run this does not
> >>>>> feel like a fix - there should be a way for migration_thread() to know that
> >>>>> ram_save_iterate() sent all dirty pages it had to send, no?
> >>>
> >>> No, because new pages might be dirtied while ram_save_iterate() was running.
> >
> >I do not get it, sorry. In my example the ram_save_iterate() sends
> >everything in one go but its caller thinks that it did not and tries again.
> 
> It's not that "the caller thinks that it did not".  The caller knows
> what happens, because migration_bitmap_find_and_reset_dirty updates
> the migration_dirty_pages count that ram_save_pending uses.  So
> migration_dirty_pages should be 0 when ram_save_pending is entered.
> 
> However, something gets dirty in between so remaining_size is again
> 393216 when ram_save_pending returns, after the
> migration_bitmap_sync call.  Because of this the migration thread
> thinks that ram_save_iterate() _will_ not send everything in one go.
> 
> At least, this is how I read the code.  Perhaps I'm wrong. ;)

My reading was a bit different.

I think the case Alexey is hitting is:
   1 A few dirtied pages
   2 but because of the hpratio most of the data is actually zero
     - indeed most of the target-page sized chunks are zero
   3 Thus the data compresses very heavily
   4 When the bandwidth/delay calculation happens, it has spent a reasonable
     amount of time transferring a reasonable number of pages but not
     actually many bytes on the wire, so the estimate of the available
     bandwidth is lower than reality.
   5 The max-downtime calculation is a comparison of pending-dirty uncompressed
     bytes with compressed bandwidth

(5) is bound to fail if the compression ratio is particularly high, and
because of the hpratio it is high if we're just dirtying one word in an
entire host page.
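To make the mismatch concrete, here is a toy model of step (5) with made-up
numbers (a 64KiB host page and 4KiB target page, so an hpratio of 16, plus
invented counts and timings; none of these figures come from QEMU itself).
It shows how comparing uncompressed pending bytes against a bandwidth
estimate derived from compressed wire bytes overstates the downtime:

```python
# Toy model of the mismatch in step (5): pending *uncompressed* bytes are
# compared against a bandwidth estimate derived from *compressed* wire bytes.
# All numbers are illustrative, not taken from QEMU.

HOST_PAGE = 64 * 1024        # 64KiB host page (e.g. ppc64)
TARGET_PAGE = 4 * 1024       # 4KiB target page, so hpratio = 16
DIRTY_HOST_PAGES = 96        # guest dirtied one word in each of these pages

# Dirtying one word marks the whole host page dirty (16 target pages), but
# 15 of the 16 target-page chunks are still all-zero and go over the wire
# as tiny zero-page markers (assumed ~9 bytes each here) rather than 4KiB.
pending_bytes = DIRTY_HOST_PAGES * HOST_PAGE             # uncompressed view
wire_bytes = DIRTY_HOST_PAGES * (TARGET_PAGE + 15 * 9)   # 1 real chunk + 15 markers

elapsed = 0.01                                  # seconds spent in the iteration
est_bandwidth = wire_bytes / elapsed            # estimate from bytes on the wire
real_bandwidth = pending_bytes / elapsed        # what the link actually moved

max_downtime = 0.03                             # 30ms allowed downtime
predicted_downtime = pending_bytes / est_bandwidth

print(f"estimated bandwidth: {est_bandwidth:,.0f} B/s")
print(f"real throughput:     {real_bandwidth:,.0f} B/s")
print(f"predicted downtime:  {predicted_downtime * 1000:.1f} ms "
      f"(limit {max_downtime * 1000:.0f} ms)")
```

With these numbers the estimate is roughly 15x below the real throughput, so
the predicted downtime stays well above the limit and the loop never thinks
it can converge, even though the remaining data would actually fit.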

What I'm not too sure of is this: you'd think that if only a few pages were
dirtied, the loop would complete quite quickly and thus the delay would also
be small, so bytes-on-wire would be divided by a small value and the
estimate would not be too bad.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Thread overview: 18+ messages
2014-02-04  7:15 [Qemu-devel] migration: broken ram_save_pending Alexey Kardashevskiy
2014-02-04 10:46 ` Paolo Bonzini
2014-02-04 11:59   ` Alexey Kardashevskiy
2014-02-04 12:07     ` Paolo Bonzini
2014-02-04 12:16       ` Alexey Kardashevskiy
2014-02-04 14:00         ` Paolo Bonzini
2014-02-04 22:17           ` Alexey Kardashevskiy
2014-02-05  7:18             ` Paolo Bonzini
2014-02-05  9:09               ` Dr. David Alan Gilbert [this message]
2014-02-05 16:35                 ` Paolo Bonzini
2014-02-05 16:42                   ` Dr. David Alan Gilbert
2014-02-05 16:45                     ` Paolo Bonzini
2014-02-06  3:10                       ` Alexey Kardashevskiy
2014-02-06 11:24                         ` Dr. David Alan Gilbert
2014-02-07  5:39                           ` Alexey Kardashevskiy
2014-02-07  8:55                             ` Dr. David Alan Gilbert
2014-02-06 23:49                         ` Paolo Bonzini
2014-02-07  5:42                           ` Alexey Kardashevskiy
