From: "Michael S. Tsirkin" <mst@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Wed, 24 Nov 2010 13:14:42 +0200
Message-ID: <20101124111442.GF23493@redhat.com>
In-Reply-To: <m3r5ebdly8.fsf@trasno.mitica>
On Wed, Nov 24, 2010 at 12:01:51PM +0100, Juan Quintela wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Wed, Nov 24, 2010 at 12:03:06AM +0100, Juan Quintela wrote:
> >> From: Juan Quintela <quintela@trasno.org>
> >>
> >> Checking every 64 pages is a random magic number, as good as any other.
> >> We don't want to test too many times, but on the other hand,
> >> qemu_get_clock_ns() is not so expensive either.
> >>
> >
> > Could you please explain what problem this fixes?
> > I would like to see an API that documents the contract
> > we are making with the backend.
>
> buffered_file is an "abstraction" that uses a buffer.
>
> Live migration code (remember it can't sleep, it runs on the main loop)
> stores its "stuff" in that buffer. And a timer writes that buffer to
> the fd that is associated with migration.
>
> This design is due to the main_loop/no threads qemu model.
>
> The buffered_file timer runs every 100ms. And we "try" to measure channel
> bandwidth from there. If we are not able to run the timer, all the
> calculations are wrong, and then stalls happen.
So the problem is the timer in the buffered file abstraction?
Why don't we just flush out data if the buffer is full?
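To make the question concrete, here is a minimal sketch of the flush-when-full alternative. It is an illustration only, not the buffered_file code: every name in it (SketchBuffer, sketch_put, sketch_flush, sketch_count_sink, BUF_CAPACITY) is hypothetical, and it assumes the sink always accepts the whole buffer. A real migration fd can refuse data, and the main loop cannot block waiting for it, which this sketch does not handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_CAPACITY 4096  /* hypothetical size, not taken from QEMU */

/* Hypothetical sink: consumes len bytes; assumed never to block or fail. */
typedef void (*sketch_sink_fn)(const unsigned char *data, size_t len,
                               void *opaque);

typedef struct {
    unsigned char data[BUF_CAPACITY];
    size_t used;        /* bytes currently buffered */
    size_t flushes;     /* number of flushes performed so far */
    sketch_sink_fn sink;
    void *opaque;
} SketchBuffer;

/* Example sink that only counts bytes; opaque points to a size_t total. */
static void sketch_count_sink(const unsigned char *data, size_t len,
                              void *opaque)
{
    (void)data;
    *(size_t *)opaque += len;
}

/* Hand everything buffered so far to the sink. */
static void sketch_flush(SketchBuffer *b)
{
    if (b->used == 0) {
        return;
    }
    b->sink(b->data, b->used, b->opaque);
    b->used = 0;
    b->flushes++;
}

/* Append data, flushing as soon as the buffer fills instead of on a timer. */
static void sketch_put(SketchBuffer *b, const unsigned char *src, size_t len)
{
    while (len > 0) {
        size_t room = BUF_CAPACITY - b->used;
        size_t n = len < room ? len : room;

        memcpy(b->data + b->used, src, n);
        b->used += n;
        src += n;
        len -= n;
        assert(b->used <= BUF_CAPACITY);

        if (b->used == BUF_CAPACITY) {
            sketch_flush(b);
        }
    }
}
```

With this shape the writer never depends on a timer firing, but it also gives up the per-interval accounting that the timer was being used for, so any bandwidth limit would have to be enforced somewhere else.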
>
>
> >> @@ -269,6 +272,19 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
> >> if (bytes_sent == 0) { /* no more blocks */
> >> break;
> >> }
> >> + /* we want to check in the 1st loop, just in case it was the 1st time
> >> + and we had to sync the dirty bitmap.
> >> + qemu_get_clock_ns() is a bit expensive, so we only check each some
> >> + iterations
> >> + */
> >> + if ((i & 63) == 0) {
> >> + uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
> >
> > This adds even more non-determinism to savevm behaviour. If the bandwidth
> > limit is high enough, I expect it to just keep going.
>
> If we find a run of 512MB of zero pages together (and that happens if
> you have a 64GB idle guest), then you can spend more than 3 seconds to
> fill the default bandwidth. After that, everything that uses the main
> loop has had stalls.
>
>
> >> + if (t1 > buffered_file_interval/2) {
> >
> > arch_init should not depend on buffered_file implementation IMO.
> >
> > Also - / 2?
>
> We need to run a timer every 100ms. For times, look at the 0/6 patch.
> We can't spend more than 50ms in each function. It is something that
> should happen for all functions called from io_handlers.
>
> >> + printf("big delay %ld milliseconds, %d iterations\n", t1, i);
> >
> > Is this a debugging aid?
>
> I left that on purpose, to show that it happens a lot. There is no
> DEBUG_ARCH or DEBUG_RAM around; I can create them if you prefer. But
> notice that this is something that shouldn't happen (but it happens).
>
> DPRINTF for that file should be a good idea, will do.
>
> >> + break;
> >> + }
> >> + }
> >> + i++;
> >> }
> >>
> >> t0 = qemu_get_clock_ns(rt_clock) - t0;
> >> --
> >> 1.7.3.2
> >>
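For readers following the hunk above, the pattern under discussion (read the clock only every 64th iteration and leave the loop once a time budget is used up) can be sketched in isolation. This is not the ram_save_live() code: sketch_now_ns(), sketch_bounded_loop(), sketch_countdown_step() and the constants are hypothetical names, and POSIX clock_gettime(CLOCK_MONOTONIC) stands in for qemu_get_clock_ns(rt_clock).

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict C modes */

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_CHECK_MASK 63  /* look at the clock once every 64 iterations */

/* Monotonic time in nanoseconds; a stand-in for qemu_get_clock_ns(). */
static uint64_t sketch_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* One unit of work; returns 0 when there is nothing left to do. */
typedef int (*sketch_step_fn)(void *opaque);

/* Example step: counts a uint64_t down to zero. */
static int sketch_countdown_step(void *opaque)
{
    uint64_t *remaining = opaque;

    if (*remaining == 0) {
        return 0;
    }
    (*remaining)--;
    return 1;
}

/*
 * Run step() until it reports no more work or until more than budget_ms
 * milliseconds have elapsed. The clock is read on the first iteration and
 * then on every 64th one. Returns the value of the iteration counter at
 * the point the loop was left.
 */
static uint64_t sketch_bounded_loop(sketch_step_fn step, void *opaque,
                                    uint64_t budget_ms)
{
    uint64_t t0 = sketch_now_ns();
    uint64_t i = 0;

    assert(step != NULL);

    for (;;) {
        if (step(opaque) == 0) {
            break;  /* no more work */
        }
        if ((i & SKETCH_CHECK_MASK) == 0) {
            uint64_t elapsed_ms = (sketch_now_ns() - t0) / 1000000;

            if (elapsed_ms > budget_ms) {
                break;  /* time budget exhausted, yield to the caller */
            }
        }
        i++;
    }
    return i;
}
```

In the quoted patch the budget is buffered_file_interval/2, i.e. 50ms against a 100ms timer. Passing the budget in as a parameter, as done here, is one way to avoid the caller depending on the buffered_file implementation, which is the objection raised above.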