migrate_set_downtime bug

* migrate_set_downtime bug
@ 2009-09-29 13:00 Dietmar Maurer
  2009-09-29 14:36 ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-29 13:00 UTC (permalink / raw)
  To: kvm

Using 0.11.0, live migration works as expected, but the max downtime setting does not seem to work. For example:

# migrate_set_downtime 1

After that, TCP migration has much longer downtimes (up to 20 seconds).

Also, the monitor seems to be locked (it takes up to 10 seconds until I get a monitor prompt).

Does anyone else see this behavior?

- Dietmar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-29 13:00 migrate_set_downtime bug Dietmar Maurer
@ 2009-09-29 14:36 ` Dietmar Maurer
  2009-09-29 15:09   ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-29 14:36 UTC (permalink / raw)
  To: kvm

The bwidth calculation seems to be the problem. The code simply does:

bwidth = (bytes_transferred - bytes_transferred_last) / timediff

but I assume network traffic is buffered, so the calculated bwidth is sometimes much too high.
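
To illustrate the suspected effect, here is a minimal sketch (not the qemu code). It assumes get_clock() reports nanoseconds, so bwidth is in bytes per nanosecond; the helper names and the example numbers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Per-round estimate as in the formula above, in bytes per nanosecond
 * (assumption: timestamps are in nanoseconds). */
double round_bwidth(uint64_t bytes_now, uint64_t bytes_last,
                    int64_t timediff_ns)
{
    assert(timediff_ns > 0 && bytes_now >= bytes_last);
    return (double)(bytes_now - bytes_last) / (double)timediff_ns;
}

/* Time the same number of bytes needs on the wire at a given link rate. */
double wire_time_ns(uint64_t bytes, double link_bytes_per_ns)
{
    assert(link_bytes_per_ns > 0.0);
    return (double)bytes / link_bytes_per_ns;
}
```

If a round copies 1 MiB into a send buffer within 1 ms, round_bwidth(1048576, 0, 1000000) is about 1.05 bytes/ns, while a link that carries 0.125 bytes/ns (1 Gbit/s) needs wire_time_ns(1048576, 0.125), roughly 8.4 ms, to actually send it.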

- Dietmar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-29 14:36 ` Dietmar Maurer
@ 2009-09-29 15:09   ` Dietmar Maurer
  2009-09-29 15:39     ` Anthony Liguori
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-29 15:09 UTC (permalink / raw)
  To: kvm

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

This patch solves the problem by calculating an average bandwidth.

- Dietmar

[-- Attachment #2: migrate.diff --]
[-- Type: application/octet-stream, Size: 1273 bytes --]

Index: qemu-kvm/vl.c
===================================================================
--- qemu-kvm.orig/vl.c	2009-09-29 16:43:45.000000000 +0200
+++ qemu-kvm/vl.c	2009-09-29 16:52:24.000000000 +0200
@@ -3175,9 +3175,10 @@
 static int ram_save_live(QEMUFile *f, int stage, void *opaque)
 {
     ram_addr_t addr;
-    uint64_t bytes_transferred_last;
     double bwidth = 0;
     uint64_t expected_time = 0;
+    static int64_t starttime = 0;
+    double timediff;
 
     if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
         qemu_file_set_error(f);
@@ -3195,10 +3196,9 @@
         cpu_physical_memory_set_dirty_tracking(1);
 
         qemu_put_be64(f, last_ram_offset | RAM_SAVE_FLAG_MEM_SIZE);
-    }
 
-    bytes_transferred_last = bytes_transferred;
-    bwidth = get_clock();
+	starttime = get_clock();
+    }
 
     while (!qemu_file_rate_limit(f)) {
         int ret;
@@ -3209,8 +3209,8 @@
             break;
     }
 
-    bwidth = get_clock() - bwidth;
-    bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
+    timediff = get_clock() - starttime;
+    bwidth = bytes_transferred / timediff;
 
     /* if we haven't transferred anything this round, force expected_time to a
      * a very high value, but without crashing */

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-29 15:09   ` Dietmar Maurer
@ 2009-09-29 15:39     ` Anthony Liguori
  2009-09-29 16:23       ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Anthony Liguori @ 2009-09-29 15:39 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: kvm, Glauber Costa

Dietmar Maurer wrote:
> this patch solves the problem by calculation an average bandwidth.
>   

Can you take a look Glauber?

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-29 15:39     ` Anthony Liguori
@ 2009-09-29 16:23       ` Glauber Costa
  2009-09-29 16:36         ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-09-29 16:23 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Dietmar Maurer, kvm

On Tue, Sep 29, 2009 at 10:39:57AM -0500, Anthony Liguori wrote:
> Dietmar Maurer wrote:
>> this patch solves the problem by calculation an average bandwidth.
>>   
>
> Can you take a look Glauber?
>
> Regards,
>
> Anthony Liguori
>
>> - Dietmar
>>
>>   
>>> -----Original Message-----
>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
>>> Behalf Of Dietmar Maurer
>>> Sent: Dienstag, 29. September 2009 16:37
>>> To: kvm
>>> Subject: RE: migrate_set_downtime bug
>>>
>>> Seems the bwidth calculation is the problem. The code simply does:
>>>
>>> bwidth = (bytes_transferred - bytes_transferred_last) / timediff
>>>
>>> but I assume network traffic is buffered, so calculated bwidth is
>>> sometimes much too high.
On the other hand, you are just calculating the total since the beginning of
migration, which is not right either.

Also, if this is really the case (buffered), then the bandwidth capping part
of migration is also wrong.

Have you compared the reported bandwidth to your actual bandwidth? I suspect
the source of the problem may be that we currently ignore the time we take
to transfer the state of the devices, and maybe it is not negligible.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-29 16:23       ` Glauber Costa
@ 2009-09-29 16:36         ` Dietmar Maurer
  2009-09-30  4:48           ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-29 16:36 UTC (permalink / raw)
  To: Glauber Costa, Anthony Liguori; +Cc: kvm

> Also, if this is really the case (buffered), then the bandwidth capping
> part
> of migration is also wrong.
> 
> Have you compared the reported bandwidth to your actual bandwith ? I
> suspect
> the source of the problem can be that we're currently ignoring the time
> we take
> to transfer the state of the devices, and maybe it is not negligible.
> 

I have a 1 Gbit network (e1000 card) and get values like bwidth=0.98, which is much too high.
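
For scale, a small conversion sketch. It assumes that bwidth is in bytes per nanosecond (because get_clock() is taken to count nanoseconds) and that the e1000 link runs at 1 Gbit/s; the function names are hypothetical.

```c
#include <assert.h>

/* 1 byte/ns = 1e9 bytes/s = 8e9 bit/s, i.e. 8 Gbit/s. */
double bytes_per_ns_to_gbit_per_s(double bwidth)
{
    assert(bwidth >= 0.0);
    return bwidth * 8.0;
}

/* Inverse direction: link rate in Gbit/s to bytes per nanosecond. */
double gbit_per_s_to_bytes_per_ns(double gbit)
{
    assert(gbit >= 0.0);
    return gbit / 8.0;
}
```

Under these assumptions bwidth=0.98 corresponds to about 7.8 Gbit/s, whereas a 1 Gbit/s link can deliver at most 0.125 bytes/ns.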

- Dietmar


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-29 16:36         ` Dietmar Maurer
@ 2009-09-30  4:48           ` Glauber Costa
  2009-09-30  6:58             ` Dietmar Maurer
  2009-09-30  8:55             ` Dietmar Maurer
  0 siblings, 2 replies; 29+ messages in thread
From: Glauber Costa @ 2009-09-30  4:48 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Anthony Liguori, kvm

On Tue, Sep 29, 2009 at 06:36:57PM +0200, Dietmar Maurer wrote:
> > Also, if this is really the case (buffered), then the bandwidth capping
> > part
> > of migration is also wrong.
> > 
> > Have you compared the reported bandwidth to your actual bandwith ? I
> > suspect
> > the source of the problem can be that we're currently ignoring the time
> > we take
> > to transfer the state of the devices, and maybe it is not negligible.
> > 
> 
> I have a 1GB network (e1000 card), and get values like bwidth=0.98 - which is much too high.
The main reason for not using the whole migration time is that it can lead to values
that are not very helpful in situations where the network load changes too much.

Since the problem you pinpointed does exist, I would suggest measuring the average load of the last,
say, 10 iterations. How would that work for you?
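
One way such an average over the last 10 iterations could look, as a sketch only: a small ring buffer of per-round samples. The type and function names are hypothetical and nothing like this exists in vl.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BW_WINDOW 10   /* rounds to average over (the "say, 10" above) */

typedef struct {
    uint64_t bytes[BW_WINDOW];  /* bytes sent in each remembered round */
    int64_t  ns[BW_WINDOW];     /* duration of each remembered round */
    int      count;             /* valid samples, at most BW_WINDOW */
    int      next;              /* slot that the next sample overwrites */
} BwWindow;

void bw_window_init(BwWindow *w)
{
    memset(w, 0, sizeof(*w));
}

/* Record one round; the oldest sample is dropped once the window is full. */
void bw_window_add(BwWindow *w, uint64_t bytes, int64_t ns)
{
    assert(ns > 0);
    w->bytes[w->next] = bytes;
    w->ns[w->next] = ns;
    w->next = (w->next + 1) % BW_WINDOW;
    if (w->count < BW_WINDOW) {
        w->count++;
    }
}

/* Average bandwidth over the window: total bytes / total time. */
double bw_window_avg(const BwWindow *w)
{
    uint64_t total_bytes = 0;
    int64_t total_ns = 0;

    for (int i = 0; i < w->count; i++) {
        total_bytes += w->bytes[i];
        total_ns += w->ns[i];
    }
    return total_ns > 0 ? (double)total_bytes / (double)total_ns : 0.0;
}
```

Dividing total bytes by total time weights long rounds more than short ones, so a single very short round with a burst into the buffer cannot dominate the estimate.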

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-30  4:48           ` Glauber Costa
@ 2009-09-30  6:58             ` Dietmar Maurer
  2009-09-30  8:55             ` Dietmar Maurer
  1 sibling, 0 replies; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-30  6:58 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Anthony Liguori, kvm

> Since the problem you pinpointed do exist, I would suggest measuring
> the average load of the last,
> say, 10 iterations.

The "last 10 iterations" do not define a fixed time span. I think it is much more reasonable to measure the average over the last 10 seconds.

But a migration usually takes only about 10-30 seconds, so do you really want to add the additional complexity?

- Dietmar

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-30  4:48           ` Glauber Costa
  2009-09-30  6:58             ` Dietmar Maurer
@ 2009-09-30  8:55             ` Dietmar Maurer
  2009-09-30 11:23               ` Glauber Costa
  1 sibling, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-30  8:55 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Anthony Liguori, kvm

[-- Attachment #1: Type: text/plain, Size: 1589 bytes --]

Another problem occurs when max_downtime is too short. This can result in a never-ending migration task.

To reproduce, just play a video inside a VM and set max_downtime to 30ns.

Sure, one can argue that this behavior is expected.

But the following would avoid the problem:

+    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
+        return 1;
+    }

Or do you think that is not reasonable?
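
To see why such a short limit may never be met, here is a sketch of the final check in ram_save_live() from the attached diff (expected_time = remaining pages * page size / bwidth). It assumes times in nanoseconds and bwidth in bytes per nanosecond; the function names and example values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated time to send the remaining dirty pages, in nanoseconds. */
uint64_t expected_time_ns(uint64_t remaining_pages, uint64_t page_size,
                          double bwidth)
{
    assert(bwidth > 0.0);
    return (uint64_t)((double)(remaining_pages * page_size) / bwidth);
}

/* Migration may enter its final phase only if the estimate fits the limit. */
int can_finish(uint64_t remaining_pages, uint64_t page_size,
               double bwidth, uint64_t max_downtime_ns)
{
    return expected_time_ns(remaining_pages, page_size, bwidth)
           <= max_downtime_ns;
}
```

With 4096-byte pages and 0.125 bytes/ns (1 Gbit/s), a single remaining page already needs 32768 ns, so a 30 ns limit is met only when no dirty page is left at all.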

- Dietmar

[-- Attachment #2: migrate.diff --]
[-- Type: application/octet-stream, Size: 1584 bytes --]

Index: qemu-kvm/vl.c
===================================================================
--- qemu-kvm.orig/vl.c	2009-09-30 10:35:45.000000000 +0200
+++ qemu-kvm/vl.c	2009-09-30 10:47:05.000000000 +0200
@@ -3175,9 +3175,10 @@
 static int ram_save_live(QEMUFile *f, int stage, void *opaque)
 {
     ram_addr_t addr;
-    uint64_t bytes_transferred_last;
     double bwidth = 0;
     uint64_t expected_time = 0;
+    static int64_t starttime = 0;
+    double timediff;
 
     if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
         qemu_file_set_error(f);
@@ -3195,10 +3196,9 @@
         cpu_physical_memory_set_dirty_tracking(1);
 
         qemu_put_be64(f, last_ram_offset | RAM_SAVE_FLAG_MEM_SIZE);
-    }
 
-    bytes_transferred_last = bytes_transferred;
-    bwidth = get_clock();
+	starttime = get_clock();
+    }
 
     while (!qemu_file_rate_limit(f)) {
         int ret;
@@ -3209,8 +3209,8 @@
             break;
     }
 
-    bwidth = get_clock() - bwidth;
-    bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
+    timediff = get_clock() - starttime;
+    bwidth = bytes_transferred / timediff;
 
     /* if we haven't transferred anything this round, force expected_time to a
      * a very high value, but without crashing */
@@ -3230,6 +3230,10 @@
 
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
+    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
+        return 1;
+    }
+
     expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
 
     return (stage == 2) && (expected_time <= migrate_max_downtime());

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-30  8:55             ` Dietmar Maurer
@ 2009-09-30 11:23               ` Glauber Costa
  2009-09-30 14:11                 ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-09-30 11:23 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Anthony Liguori, kvm

On Wed, Sep 30, 2009 at 10:55:24AM +0200, Dietmar Maurer wrote:
> Another problem occur when max_downtime is too short. This can results in never ending migration task.
> 
> To reproduce just play a video inside a VM and set max_downtime to 30ns
> 
> Sure, one can argument that this behavior is expected.
> 
> But the following would avoid the problem:
> 
> +    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
> +        return 1;
> +    }
why 2 * ? 
This means we'll have to transfer the whole contents of RAM at least twice to hit this condition, right?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-30 11:23               ` Glauber Costa
@ 2009-09-30 14:11                 ` Dietmar Maurer
  2009-09-30 16:39                   ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-30 14:11 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Anthony Liguori, kvm

> On Wed, Sep 30, 2009 at 10:55:24AM +0200, Dietmar Maurer wrote:
> > Another problem occur when max_downtime is too short. This can
> results in never ending migration task.
> >
> > To reproduce just play a video inside a VM and set max_downtime to
> 30ns
> >
> > Sure, one can argument that this behavior is expected.
> >
> > But the following would avoid the problem:
> >
> > +    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
> > +        return 1;
> > +    }
> why 2 * ?
> This means we'll have to transfer the whole contents of RAM at least
> twice to hit this condition, right?

Yes, this is just an arbitrary limit. 

- Dietmar


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-30 14:11                 ` Dietmar Maurer
@ 2009-09-30 16:39                   ` Glauber Costa
  2009-09-30 18:41                     ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-09-30 16:39 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Anthony Liguori, kvm

On Wed, Sep 30, 2009 at 04:11:32PM +0200, Dietmar Maurer wrote:
> > On Wed, Sep 30, 2009 at 10:55:24AM +0200, Dietmar Maurer wrote:
> > > Another problem occur when max_downtime is too short. This can
> > results in never ending migration task.
> > >
> > > To reproduce just play a video inside a VM and set max_downtime to
> > 30ns
> > >
> > > Sure, one can argument that this behavior is expected.
> > >
> > > But the following would avoid the problem:
> > >
> > > +    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
> > > +        return 1;
> > > +    }
> > why 2 * ?
> > This means we'll have to transfer the whole contents of RAM at least
> > twice to hit this condition, right?
> 
> Yes, this is just an arbitrary limit. 
I don't know. If we are going for a limit, I would prefer a limit on pages yet to transfer,
not pages already transferred.

However, the very reason this whole thing was written in the first place was to leave choices
to management tools on top of qemu, not to qemu itself. So I would say yes, if you set a limit of 30ns,
you asked for it never to finish.

Your first patch is okay, though.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-09-30 16:39                   ` Glauber Costa
@ 2009-09-30 18:41                     ` Dietmar Maurer
  2009-10-05 12:17                       ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-09-30 18:41 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Anthony Liguori, kvm

> > > > +    if ((stage == 2) && (bytes_transferred >
> 2*ram_bytes_total())) {
> > > > +        return 1;
> > > > +    }
> > > why 2 * ?
> > > This means we'll have to transfer the whole contents of RAM at
> least
> > > twice to hit this condition, right?
> >
> > Yes, this is just an arbitrary limit.
> I don't know. If we are going for a limit, I would prefere a limit of
> pages yet to transfer,
> not pages already transferred.
> 
> However, the very reason this whole thing was written in the first
> place, was to leave choices
> to management tools ontop of qemu, not qemu itself. So I would say yes,
> if you set limit for 30ns,
> you asked for it never finishing.

I am just thinking of common scenarios like 'maintenance mode', where all VMs should migrate to another host. An endless migration task can make that fail.

For me, it is totally unclear what value I should set for 'max_downtime' to avoid that behavior.

- Dietmar



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-09-30 18:41                     ` Dietmar Maurer
@ 2009-10-05 12:17                       ` Avi Kivity
  2009-10-05 13:04                         ` Glauber Costa
                                           ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Avi Kivity @ 2009-10-05 12:17 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Glauber Costa, Anthony Liguori, kvm

On 09/30/2009 08:41 PM, Dietmar Maurer wrote:
>
> I just think of common scenarios like 'maintanace mode', where all VM should migrate to another host. A endless migrate task can make that fail.
>
> For me, it is totally unclear what value I should set for 'max_downtime' to avoid that behavior?
>
>    

We used to have a heuristic that said 'if an iteration transfers more 
pages than the previous iteration, we've stopped converging'.  Why 
wouldn't that work?
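
A minimal sketch of that heuristic, for illustration only. The state struct and function name are hypothetical and do not come from the qemu source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t prev_pages;  /* pages sent in the previous iteration */
    int      have_prev;   /* 0 until the first iteration was recorded */
} ConvergeState;

/* Returns 1 if this iteration sent more pages than the previous one,
 * i.e. the migration has stopped converging by the rule quoted above. */
int stopped_converging(ConvergeState *s, uint64_t pages_this_iter)
{
    assert(s != NULL);
    int stalled = s->have_prev && pages_this_iter > s->prev_pages;
    s->prev_pages = pages_this_iter;
    s->have_prev = 1;
    return stalled;
}
```

The first iteration never reports a stall because there is nothing to compare against; iterations sending 1000, 600 and then 700 pages would trigger on the third call.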

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-10-05 12:17                       ` Avi Kivity
@ 2009-10-05 13:04                         ` Glauber Costa
  2009-10-05 13:17                           ` Avi Kivity
  2009-10-05 14:01                         ` Dietmar Maurer
  2009-10-05 14:18                         ` Dietmar Maurer
  2 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-10-05 13:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Dietmar Maurer, Anthony Liguori, kvm

On Mon, Oct 05, 2009 at 02:17:30PM +0200, Avi Kivity wrote:
> On 09/30/2009 08:41 PM, Dietmar Maurer wrote:
>>
>> I just think of common scenarios like 'maintanace mode', where all VM should migrate to another host. A endless migrate task can make that fail.
>>
>> For me, it is totally unclear what value I should set for 'max_downtime' to avoid that behavior?
>>
>>    
>
> We used to have a heuristic that said 'if an iteration transfers more  
> pages than the previous iteration, we've stopped converging'.  Why  
> wouldn't that work?
Because it seems people agreed that mgmt tools would be the place for those heuristics.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-10-05 13:04                         ` Glauber Costa
@ 2009-10-05 13:17                           ` Avi Kivity
  2009-10-05 14:09                             ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2009-10-05 13:17 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Dietmar Maurer, Anthony Liguori, kvm

On 10/05/2009 03:04 PM, Glauber Costa wrote:
> On Mon, Oct 05, 2009 at 02:17:30PM +0200, Avi Kivity wrote:
>    
>> On 09/30/2009 08:41 PM, Dietmar Maurer wrote:
>>      
>>> I just think of common scenarios like 'maintanace mode', where all VM should migrate to another host. A endless migrate task can make that fail.
>>>
>>> For me, it is totally unclear what value I should set for 'max_downtime' to avoid that behavior?
>>>
>>>
>>>        
>> We used to have a heuristic that said 'if an iteration transfers more
>> pages than the previous iteration, we've stopped converging'.  Why
>> wouldn't that work?
>>      
> Because it seems people agreed that mgmt tools would be the place for those heuristics.
>    

Heuristics like number of pages, maybe.  But since we don't export 
iteration information, we can't expect management tools to stop the 
guest if migration doesn't converge.

I suppose it could issue a 'stop' after some amount of time (constant * 
memory size / bandwidth).
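
A sketch of how a management tool might compute such a deadline. The function name, the choice of constant and the units (bytes, bytes per second) are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds after which to issue 'stop': constant * memory size / bandwidth. */
double stop_deadline_s(double constant, uint64_t mem_bytes,
                       double bytes_per_s)
{
    assert(constant > 0.0 && bytes_per_s > 0.0);
    return constant * (double)mem_bytes / bytes_per_s;
}
```

For a guest with 8 GiB of RAM, 125000000 bytes/s (1 Gbit/s) and a constant of 2, this gives a deadline of roughly 137 seconds.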

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 12:17                       ` Avi Kivity
  2009-10-05 13:04                         ` Glauber Costa
@ 2009-10-05 14:01                         ` Dietmar Maurer
  2009-10-05 14:06                           ` Avi Kivity
  2009-10-05 14:18                         ` Dietmar Maurer
  2 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-05 14:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Glauber Costa, Anthony Liguori, kvm

> We used to have a heuristic that said 'if an iteration transfers more
> pages than the previous iteration, we've stopped converging'.  Why
> wouldn't that work?

This does not protect you from very long migration times.

- Dietmar


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-10-05 14:01                         ` Dietmar Maurer
@ 2009-10-05 14:06                           ` Avi Kivity
  2009-10-05 14:08                             ` Dietmar Maurer
  2009-10-05 14:11                             ` Dietmar Maurer
  0 siblings, 2 replies; 29+ messages in thread
From: Avi Kivity @ 2009-10-05 14:06 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Glauber Costa, Anthony Liguori, kvm

On 10/05/2009 04:01 PM, Dietmar Maurer wrote:
>> We used to have a heuristic that said 'if an iteration transfers more
>> pages than the previous iteration, we've stopped converging'.  Why
>> wouldn't that work?
>>      
> This does not protect you from very long migration times.
>
>    

Well, if each iteration transfers one page less than the previous one, 
it doesn't.

You can always issue a 'stop' from the monitor.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 14:06                           ` Avi Kivity
@ 2009-10-05 14:08                             ` Dietmar Maurer
  2009-10-05 14:15                               ` Avi Kivity
  2009-10-05 14:11                             ` Dietmar Maurer
  1 sibling, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-05 14:08 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Glauber Costa, Anthony Liguori, kvm

> On 10/05/2009 04:01 PM, Dietmar Maurer wrote:
> >> We used to have a heuristic that said 'if an iteration transfers
> more
> >> pages than the previous iteration, we've stopped converging'.  Why
> >> wouldn't that work?
> >>
> > This does not protect you from very long migration times.
> >
> >
> 
> Well, if each iteration transfers one page less than the previous one,
> it doesn't.

So how long does a migration take in this scenario when you have a VM with 8GB RAM?

- dietmar


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 13:17                           ` Avi Kivity
@ 2009-10-05 14:09                             ` Dietmar Maurer
  2009-10-05 15:32                               ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-05 14:09 UTC (permalink / raw)
  To: Avi Kivity, Glauber Costa; +Cc: Anthony Liguori, kvm

> Heuristics like number of pages, maybe.  But since we don't export
> iteration information, we can't expect management tools to stop the
> guest if migration doesn't converge.
> 
> I suppose it could issue a 'stop' after some amount of time (constant *
> memory size / bandwidth).

'bandwidth' is something that changes dynamically (or by user settings), so why don't we simply abort after some amount of transferred memory (constant * memory size)? This can be implemented by the management application without problems, although it's much easier inside kvm.

- Dietmar



^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 14:06                           ` Avi Kivity
  2009-10-05 14:08                             ` Dietmar Maurer
@ 2009-10-05 14:11                             ` Dietmar Maurer
  1 sibling, 0 replies; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-05 14:11 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Glauber Costa, Anthony Liguori, kvm



> -----Original Message-----
> From: Avi Kivity [mailto:avi@redhat.com]
> Sent: Montag, 05. Oktober 2009 16:06
> To: Dietmar Maurer
> Cc: Glauber Costa; Anthony Liguori; kvm
> Subject: Re: migrate_set_downtime bug
> 
> On 10/05/2009 04:01 PM, Dietmar Maurer wrote:
> >> We used to have a heuristic that said 'if an iteration transfers
> more
> >> pages than the previous iteration, we've stopped converging'.  Why
> >> wouldn't that work?
> >>
> > This does not protect you from very long migration times.
> >
> >
> 
> Well, if each iteration transfers one page less than the previous one,
> it doesn't.

That approach also depends on bandwidth changes.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-10-05 14:08                             ` Dietmar Maurer
@ 2009-10-05 14:15                               ` Avi Kivity
  0 siblings, 0 replies; 29+ messages in thread
From: Avi Kivity @ 2009-10-05 14:15 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Glauber Costa, Anthony Liguori, kvm

On 10/05/2009 04:08 PM, Dietmar Maurer wrote:
>> Well, if each iteration transfers one page less than the previous one,
>> it doesn't.
>>      
> So how long does a migration take in this scenario when you have a VM with 8GB RAM?
>
>    

At 1 Gbps, about 2 years.
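
The figure can be checked with a rough calculation. It assumes 4 KiB pages, a first iteration that sends all of the 8 GiB, every later iteration sending exactly one page less, and 125000000 bytes/s for 1 Gbps. These are assumptions for the estimate, not values taken from qemu, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed if the iterations send n, n - 1, ..., 1 pages,
 * which is n * (n + 1) / 2 pages in total. The intermediate values
 * fit in 64 bits for the 8 GiB case considered here. */
double shrinking_migration_seconds(uint64_t n_pages, uint64_t page_size,
                                   double bytes_per_s)
{
    assert(bytes_per_s > 0.0);
    uint64_t total_pages = n_pages * (n_pages + 1) / 2;
    uint64_t total_bytes = total_pages * page_size;
    return (double)total_bytes / bytes_per_s;
}
```

For 8 GiB (2097152 pages) this comes to about 7.2e7 seconds, or roughly 2.3 years of 365 days, which matches the "about 2 years" above.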

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 12:17                       ` Avi Kivity
  2009-10-05 13:04                         ` Glauber Costa
  2009-10-05 14:01                         ` Dietmar Maurer
@ 2009-10-05 14:18                         ` Dietmar Maurer
  2 siblings, 0 replies; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-05 14:18 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Glauber Costa, Anthony Liguori, kvm

> We used to have a heuristic that said 'if an iteration transfers more
> pages than the previous iteration, we've stopped converging'.  Why
> wouldn't that work?

I agree that this is the 'right' approach - but it is just too difficult to detect that we are not 'converging', and it does not set a limit on migration time.

- Dietmar


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: migrate_set_downtime bug
  2009-10-05 14:09                             ` Dietmar Maurer
@ 2009-10-05 15:32                               ` Glauber Costa
  2009-10-06  8:30                                 ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-10-05 15:32 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Avi Kivity, Anthony Liguori, kvm

On Mon, Oct 05, 2009 at 04:09:43PM +0200, Dietmar Maurer wrote:
> > Heuristics like number of pages, maybe.  But since we don't export
> > iteration information, we can't expect management tools to stop the
> > guest if migration doesn't converge.
> > 
> > I suppose it could issue a 'stop' after some amount of time (constant *
> > memory size / bandwidth).
> 
> 'bandwidth' is something that changes dynamically (or by user settings), so why don't we simply abort after some amount of transferred memory (constant * memory size). This can be implemented by the management application without problems, although it's much easier inside kvm.
> 
Easier, yes.

But then once it is done, people wanting a different behaviour for some valid reason are stuck with that.
This is the very reason we expose information about migration in the monitor to begin with.

Again, I believe the fix for this convergence problem does not belong here.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: migrate_set_downtime bug
  2009-10-05 15:32                               ` Glauber Costa
@ 2009-10-06  8:30                                 ` Dietmar Maurer
  2009-10-06 17:33                                   ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-06  8:30 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Avi Kivity, Anthony Liguori, kvm

> > 'bandwidth' is something that changes dynamically (or by user
> settings), so why don't we simply abort after some amount of
> transferred memory (constant * memory size). This can be implemented by
> the management application without problems, although it's much easier
> inside kvm.
> >
> Easier, yes.
> 
> But then once it is done, people wanting a different behaviour for some
> valid reason are stuck with that.
> This is the very reason we expose information about migration in the
> monitor to begin with.

No problem. Maybe you can just commit the first part of my patch then?

> Again, I believe the fix for this convergence problem does not belong
> here.

The default downtime is set to 30ms. This value triggers the convergence problem quite often. Maybe a longer default is more reasonable.

- Dietmar
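
To see why a short maximum downtime can keep migration from converging, here is a toy model of iterative pre-copy. It is a sketch under simplifying assumptions, not the actual qemu code: the dirty rate and bandwidth are constant, the guest only re-dirties a fixed working set, and the final stage starts once remaining / bandwidth is at most the configured downtime. All names and numbers are hypothetical.

```python
def simulate_precopy(mem_bytes, working_set_bytes, dirty_rate, bandwidth,
                     max_downtime, max_iterations=50):
    """Return the iteration at which migration could finish, or None.

    dirty_rate and bandwidth are in bytes per second, max_downtime in seconds.
    """
    remaining = mem_bytes
    for iteration in range(1, max_iterations + 1):
        expected_downtime = remaining / bandwidth
        if expected_downtime <= max_downtime:
            return iteration
        # Sending 'remaining' takes expected_downtime seconds, during which
        # the running guest dirties more memory, capped by its working set.
        remaining = min(working_set_bytes, dirty_rate * expected_downtime)
    return None


MIB = 1024 ** 2
guest = dict(mem_bytes=1024 * MIB, working_set_bytes=64 * MIB,
             dirty_rate=200_000_000, bandwidth=125_000_000)

print(simulate_precopy(max_downtime=0.03, **guest))  # never converges
print(simulate_precopy(max_downtime=1.0, **guest))   # converges quickly
```

In this model a guest whose working set takes longer than the allowed downtime to transfer never reaches the final stage, however many iterations are run.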
 




* Re: migrate_set_downtime bug
  2009-10-06  8:30                                 ` Dietmar Maurer
@ 2009-10-06 17:33                                   ` Glauber Costa
  2009-10-07  4:42                                     ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-10-06 17:33 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Avi Kivity, Anthony Liguori, kvm

On Tue, Oct 06, 2009 at 10:30:14AM +0200, Dietmar Maurer wrote:
> > > 'bandwidth' is something that changes dynamically (or by user
> > settings), so why don't we simply abort after some amount of
> > transferred memory (constant * memory size). This can be implemented by
> > the management application without problems, although it's much easier
> > inside kvm.
> > >
> > Easier, yes.
> > 
> > But then once it is done, people wanting a different behaviour for some
> > valid reason are stuck with that.
> > This is the very reason we expose information about migration in the
> > monitor to begin with.
> 
> No problem. Maybe you can just commit the first part of my patch then?
Anthony should do it.
Given the circumstances (your method and the current method are both approximations,
yours works where the current one fails, and none of us can come up with a better solution),
I ack it.

> 
> > Again, I believe the fix for this convergence problem does not belong
> > here.
> 
> The default downtime is set to 30ms. This value triggers the convergence problem quite often. Maybe a longer default is more reasonable.
How do you feel about 100 ms?


* RE: migrate_set_downtime bug
  2009-10-06 17:33                                   ` Glauber Costa
@ 2009-10-07  4:42                                     ` Dietmar Maurer
  2009-10-07 12:32                                       ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-07  4:42 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Avi Kivity, Anthony Liguori, kvm

> > The default downtime is set to 30ms. This value triggers the
> convergence problem quite often. Maybe a longer default is more
> reasonable.
> How do you feel about 100 ms?

What is the reasoning behind such short downtimes? Are there any applications that will fail with longer downtimes (say, 1s)?

Note: on a 1Gbit/s net you can transfer only 10MB within 100ms

- Dietmar





* Re: migrate_set_downtime bug
  2009-10-07  4:42                                     ` Dietmar Maurer
@ 2009-10-07 12:32                                       ` Glauber Costa
  2009-10-07 19:40                                         ` Dietmar Maurer
  0 siblings, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2009-10-07 12:32 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Avi Kivity, Anthony Liguori, kvm

On Wed, Oct 07, 2009 at 06:42:48AM +0200, Dietmar Maurer wrote:
> > > The default downtime is set to 30ms. This value triggers the
> > convergence problem quite often. Maybe a longer default is more
> > reasonable.
> > > How do you feel about 100 ms?
> 
> What is the reasoning behind such short downtimes? Are there any applications that will fail with longer downtimes (say, 1s)?
> 
> Note: on a 1Gbit/s net you can transfer only 10MB within 100ms

That accounts for more than 2 thousand pages, which sounds like enough for a first pass to me. For the default case,
it is hard to imagine an application dirtying more than 2k pages per iteration.
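
The numbers in this exchange can be checked with a few lines of arithmetic. The sketch below assumes 4 KiB pages and the raw line rate of the link with no protocol overhead; both are assumptions, and the helper name is hypothetical. The raw figure for 100 ms comes out at 12.5 MB, so the quoted 10MB is a reasonable estimate once overhead is taken into account.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def downtime_budget(link_bits_per_s, downtime_ms, page_size=PAGE_SIZE):
    """Return (bytes, whole pages) transferable within the downtime window."""
    nbytes = link_bits_per_s * downtime_ms // (8 * 1000)
    return nbytes, nbytes // page_size


print(downtime_budget(1_000_000_000, 100))  # (12500000, 3051)
print(downtime_budget(1_000_000_000, 30))   # (3750000, 915)
print(10_000_000 // PAGE_SIZE)              # 2441 pages in the quoted 10MB
```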


* RE: migrate_set_downtime bug
  2009-10-07 12:32                                       ` Glauber Costa
@ 2009-10-07 19:40                                         ` Dietmar Maurer
  0 siblings, 0 replies; 29+ messages in thread
From: Dietmar Maurer @ 2009-10-07 19:40 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Avi Kivity, Anthony Liguori, kvm

> > What is the reasoning behind such short downtimes? Are there any
> applications that will fail with longer downtimes (say, 1s)?
> >
> > Note: on a 1Gbit/s net you can transfer only 10MB within 100ms
> 
> which accounts for more than 2 thousand pages, which sounds like enough
> for a first pass to me. For the default case,
> it is hard to imagine an application dirtying more than 2k pages per
> iteration

Simply encode or decode an MPEG video (or play a video).



Thread overview: 29+ messages
2009-09-29 13:00 migrate_set_downtime bug Dietmar Maurer
2009-09-29 14:36 ` Dietmar Maurer
2009-09-29 15:09   ` Dietmar Maurer
2009-09-29 15:39     ` Anthony Liguori
2009-09-29 16:23       ` Glauber Costa
2009-09-29 16:36         ` Dietmar Maurer
2009-09-30  4:48           ` Glauber Costa
2009-09-30  6:58             ` Dietmar Maurer
2009-09-30  8:55             ` Dietmar Maurer
2009-09-30 11:23               ` Glauber Costa
2009-09-30 14:11                 ` Dietmar Maurer
2009-09-30 16:39                   ` Glauber Costa
2009-09-30 18:41                     ` Dietmar Maurer
2009-10-05 12:17                       ` Avi Kivity
2009-10-05 13:04                         ` Glauber Costa
2009-10-05 13:17                           ` Avi Kivity
2009-10-05 14:09                             ` Dietmar Maurer
2009-10-05 15:32                               ` Glauber Costa
2009-10-06  8:30                                 ` Dietmar Maurer
2009-10-06 17:33                                   ` Glauber Costa
2009-10-07  4:42                                     ` Dietmar Maurer
2009-10-07 12:32                                       ` Glauber Costa
2009-10-07 19:40                                         ` Dietmar Maurer
2009-10-05 14:01                         ` Dietmar Maurer
2009-10-05 14:06                           ` Avi Kivity
2009-10-05 14:08                             ` Dietmar Maurer
2009-10-05 14:15                               ` Avi Kivity
2009-10-05 14:11                             ` Dietmar Maurer
2009-10-05 14:18                         ` Dietmar Maurer
