From: Juan Quintela <quintela@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Wed, 24 Nov 2010 12:01:51 +0100
Message-ID: <m3r5ebdly8.fsf@trasno.mitica>
In-Reply-To: <20101124104010.GA23493@redhat.com> (Michael S. Tsirkin's message of "Wed, 24 Nov 2010 12:40:10 +0200")

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Nov 24, 2010 at 12:03:06AM +0100, Juan Quintela wrote:
>> From: Juan Quintela <quintela@trasno.org>
>> 
>> Checking every 64 pages is a random magic number as good as any other.
>> We don't want to test too many times, but on the other hand,
>> qemu_get_clock_ns() is not so expensive either.
>> 
>
> Could you please explain what's the problem this fixes?
> I would like to see an API that documents the contract
> we are making with the backend.

buffered_file is an "abstraction" that uses a buffer.

The live migration code (remember that it can't sleep, it runs on the
main loop) stores its "stuff" in that buffer, and a timer writes that
buffer to the fd associated with the migration.

This design is due to qemu's main_loop/no-threads model.

The buffered_file timer runs every 100ms, and we "try" to measure
channel bandwidth from there.  If we are not able to run the timer on
time, all the calculations are wrong, and then stalls happen.
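
To make the design above concrete, here is a minimal sketch of a buffer
that the main loop fills and a periodic timer drains.  Every name in it
(SketchBufferedFile, sketch_put, sketch_timer_tick, sketch_bandwidth,
SKETCH_BUF_SIZE) is hypothetical and invented for illustration; this is
not the real buffered_file code, only the shape of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout; not the QEMU buffered_file API. */
#define SKETCH_BUF_SIZE 4096

typedef struct {
    int fd;                              /* fd associated with migration */
    size_t len;                          /* bytes currently queued */
    unsigned char data[SKETCH_BUF_SIZE];
} SketchBufferedFile;

/* Called from the main loop: only copies into the buffer, never writes
 * to the fd.  Returns how many bytes were accepted (may be < size). */
static size_t sketch_put(SketchBufferedFile *s, const void *p, size_t size)
{
    size_t room, n;

    assert(s->len <= SKETCH_BUF_SIZE);
    room = SKETCH_BUF_SIZE - s->len;
    n = size < room ? size : room;
    memcpy(s->data + s->len, p, n);
    s->len += n;
    return n;
}

/* Called by the periodic timer: writes the queued bytes to the fd.
 * Returns the bytes written in this tick, or -1 on a write error. */
static long sketch_timer_tick(SketchBufferedFile *s)
{
    ssize_t w;

    if (s->len == 0) {
        return 0;
    }
    w = write(s->fd, s->data, s->len);
    if (w < 0) {
        return -1;
    }
    /* Keep whatever a short write left behind for the next tick. */
    memmove(s->data, s->data + w, s->len - (size_t)w);
    s->len -= (size_t)w;
    return (long)w;
}

/* Bytes per second, ASSUMING the tick really fired after interval_ms.
 * If the timer was delayed, this estimate is wrong. */
static double sketch_bandwidth(long bytes, unsigned interval_ms)
{
    return bytes * 1000.0 / interval_ms;
}
```

The last function shows where the trouble comes from: the estimate
divides by the nominal interval, so a late timer skews the result.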


>> @@ -269,6 +272,19 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>>          if (bytes_sent == 0) { /* no more blocks */
>>              break;
>>          }
>> +	/* we want to check in the 1st loop, just in case it was the 1st time
>> +           and we had to sync the dirty bitmap.
>> +           qemu_get_clock_ns() is a bit expensive, so we only check each some
>> +           iterations
>> +	*/
>> +        if ((i & 63) == 0) {
>> +            uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
>
> This adds even more non-determinism to savevm behaviour.  If bandwidth
> limit is high enough, I expect it to just keep going.

If we find a run of 512MB of zero pages together (and that happens if
you have a 64GB idle guest), then we can spend more than 3 seconds
filling the default bandwidth.  After that, everything that uses the
main loop has had stalls.


>> +            if (t1 > buffered_file_interval/2) {
>
> arch_init should not depend on buffered_file implementation IMO.
>
> Also - / 2?

We need to run a timer every 100ms.  For timings, look at the 0/6 patch.
We can't spend more than 50ms in each function.  That should hold for
all functions called from io_handlers.
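
As a standalone illustration of that rule, the sketch below runs a work
loop but gives up once it has used its time budget, reading the clock
only every 64 iterations as the patch does.  It uses POSIX
clock_gettime() in place of qemu_get_clock_ns(), and all the sketch_*
names are hypothetical; treat it as an outline of the technique, not as
the ram_save_live() code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (stand-in for qemu_get_clock_ns()). */
static uint64_t sketch_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* One unit of work; returns 0 when there is nothing left to do. */
typedef int (*sketch_work_fn)(void *opaque);

/* Runs do_work until it reports completion or budget_ms is exceeded.
 * The clock is read on the first iteration and then every 64th one.
 * Returns the number of iterations counted before leaving the loop. */
static uint64_t sketch_bounded_loop(sketch_work_fn do_work, void *opaque,
                                    uint64_t budget_ms)
{
    uint64_t t0 = sketch_now_ns();
    uint64_t i = 0;

    for (;;) {
        if (do_work(opaque) == 0) {      /* no more work */
            break;
        }
        if ((i & 63) == 0) {
            uint64_t elapsed_ms = (sketch_now_ns() - t0) / 1000000;

            if (elapsed_ms > budget_ms) {
                break;                   /* give the main loop a turn */
            }
        }
        i++;
    }
    return i;
}

/* Example work item: consume one unit from a counter. */
static int sketch_countdown(void *opaque)
{
    int *left = opaque;

    if (*left == 0) {
        return 0;
    }
    (*left)--;
    return 1;
}
```

With a 100ms timer, the budget passed in would be 50, for example
sketch_bounded_loop(sketch_countdown, &left, 50).  The caller has to
be prepared to be re-entered later to finish the remaining work.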

>> +                printf("big delay %ld milliseconds, %d iterations\n", t1, i);
>
> Is this a debugging aid?

I left that in on purpose, to show that it happens a lot.  There is no
DEBUG_ARCH or DEBUG_RAM around; I can create them if you prefer.  But
notice that this is something that shouldn't happen (but it does).

A DPRINTF for that file would be a good idea, will do.

>> +		break;
>> +	    }
>> +	}
>> +        i++;
>>      }
>> 
>>      t0 = qemu_get_clock_ns(rt_clock) - t0;
>> -- 
>> 1.7.3.2
>> 
