* [Qemu-devel] Live migration broken when under heavy IO
@ 2009-06-15 20:33 Anthony Liguori
  2009-06-15 20:48 ` Glauber Costa
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Anthony Liguori @ 2009-06-15 20:33 UTC (permalink / raw)
  To: qemu-devel@nongnu.org, kvm-devel

The basic issue is that:

migrate_fd_put_ready():    bdrv_flush_all();

Does:

block.c:

foreach block driver:
   drv->flush(bs);

Which in the case of raw, is just fsync(s->fd).

Any submitted request is not queued or flushed, which will lead to the
request being dropped after the live migration.
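
To make the failure mode concrete outside of qemu, here is a small
standalone program (not qemu code, just an analogy): the worker thread
plays the role of an in-flight aio request, and the fsync() in the main
thread plays the role of bdrv_flush_all().

/* build with: gcc -pthread demo.c */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int fd;

static void *worker(void *arg)
{
    char buf[4096];

    memset(buf, 0xab, sizeof(buf));
    usleep(100 * 1000);                /* request still "in flight"...    */
    pwrite(fd, buf, sizeof(buf), 0);   /* ...and completes after the sync */
    return NULL;
}

int main(void)
{
    pthread_t t;

    fd = open("disk.img", O_RDWR | O_CREAT, 0644);
    pthread_create(&t, NULL, worker, NULL);

    fsync(fd);   /* what the raw driver's flush amounts to: it only covers
                  * data already written, not the request still in flight */
    printf("flushed, but the worker's write hits the image afterwards\n");

    pthread_join(t, NULL);   /* this wait is exactly the step migration skips */
    close(fd);
    return 0;
}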

Is anyone working on fixing this?  Does anyone have a clever idea how to 
fix this without just waiting for all IO requests to complete?

---

Regards,

Anthony Liguori

* Re: [Qemu-devel] Live migration broken when under heavy IO
  2009-06-15 20:33 [Qemu-devel] Live migration broken when under heavy IO Anthony Liguori
@ 2009-06-15 20:48 ` Glauber Costa
  2009-06-16  9:10 ` [Qemu-devel] " Avi Kivity
  2009-06-16 18:19 ` Charles Duffy
  2 siblings, 0 replies; 9+ messages in thread
From: Glauber Costa @ 2009-06-15 20:48 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel

On Mon, Jun 15, 2009 at 03:33:41PM -0500, Anthony Liguori wrote:
> The basic issue is that:
>
> migrate_fd_put_ready():    bdrv_flush_all();
>
> Does:
>
> block.c:
>
> foreach block driver:
>   drv->flush(bs);
>
> Which in the case of raw, is just fsync(s->fd).
>
> Any submitted request is not queued or flushed which will lead to the  
> request being dropped after the live migration.
you mean any request submitted _after_ that is not queued, right?

>
> Is anyone working on fixing this?  Does anyone have a clever idea how to  
> fix this without just waiting for all IO requests to complete?
If I understood you correctly, we could do something along the lines of
dirty tracking for I/O devices.

Use register_savevm_live() instead of register_savevm() for those devices,
and keep doing passes until some criterion lets us reach stage 3. At that
point we flush the remaining requests on the device and mark[1] it
somewhere, and then either stop that device, so that new requests never
arrive, or stop the VM entirely. A rough sketch follows the footnote.

[1] By mark, I mean the verb "to mark", not our dear friend Mark McLoughlin.
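
Very roughly, something like this (pseudo-code; BlockMigState, the
quiesced flag and blk_load are made up, and I'm quoting the
register_savevm_live() signature from memory):

static int blk_save_live(QEMUFile *f, int stage, void *opaque)
{
    BlockMigState *s = opaque;      /* hypothetical per-device state */

    if (stage != 3)
        return 0;                   /* keep iterating in the live phase */

    /* final pass: drain what is in flight, flush, and mark the device
     * quiesced so it refuses new requests until migration finishes */
    qemu_aio_flush();
    bdrv_flush(s->bs);
    s->quiesced = 1;
    return 1;                       /* done */
}

/* at device init time: */
register_savevm_live("block", 0, 1, blk_save_live, NULL, blk_load, s);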

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-15 20:33 [Qemu-devel] Live migration broken when under heavy IO Anthony Liguori
  2009-06-15 20:48 ` Glauber Costa
@ 2009-06-16  9:10 ` Avi Kivity
  2009-06-16  9:13   ` Avi Kivity
  2009-06-16 12:50   ` Anthony Liguori
  2009-06-16 18:19 ` Charles Duffy
  2 siblings, 2 replies; 9+ messages in thread
From: Avi Kivity @ 2009-06-16  9:10 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel

On 06/15/2009 11:33 PM, Anthony Liguori wrote:
> The basic issue is that:
>
> migrate_fd_put_ready():    bdrv_flush_all();
>
> Does:
>
> block.c:
>
> foreach block driver:
>   drv->flush(bs);
>
> Which in the case of raw, is just fsync(s->fd).
>
> Any submitted request is not queued or flushed which will lead to the 
> request being dropped after the live migration.
>
> Is anyone working on fixing this? 

Not to my knowledge.

> Does anyone have a clever idea how to fix this without just waiting 
> for all IO requests to complete?

What's wrong with waiting for requests to complete?  It should take a 
few tens of milliseconds.

We could start throttling requests late in the live stage, but I don't 
really see the point.
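
i.e. roughly the following, where migrate_fd_complete_block() is just a
made-up name for the tail of migrate_fd_put_ready(), and the assumption
is that every device submits its I/O through the qemu aio layer:

static void migrate_fd_complete_block(void)
{
    vm_stop(0);         /* guest stopped, so no new requests arrive */
    qemu_aio_flush();   /* wait for every in-flight aio request     */
    bdrv_flush_all();   /* then push the result to stable storage   */
}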

Isn't virtio migration currently broken due to the qdev changes?

-- 
error compiling committee.c: too many arguments to function

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-16  9:10 ` [Qemu-devel] " Avi Kivity
@ 2009-06-16  9:13   ` Avi Kivity
  2009-06-16 12:50   ` Anthony Liguori
  1 sibling, 0 replies; 9+ messages in thread
From: Avi Kivity @ 2009-06-16  9:13 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel

On 06/16/2009 12:10 PM, Avi Kivity wrote:
>> Does anyone have a clever idea how to fix this without just waiting 
>> for all IO requests to complete?
>
> What's wrong with waiting for requests to complete?  It should take a 
> few tens of milliseconds.
>
> We could start throttling requests late in the live stage, but I don't 
> really see the point.

Well, we can introduce a new live stage, where we migrate RAM and 
complete block requests, but the vm is otherwise stopped.  This allows 
the flush to overlap with sending memory dirtied by the flush, reducing 
some downtime.  Since block requests can dirty large amounts of memory, 
this may be significant.

We can even keep the vcpu alive, only blocking new block requests, but 
that may cause dirty RAM divergence.
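
In outline (the *_pending() predicates and ram_save_pass() are invented
names, this is only meant to show the ordering):

static void migrate_final_live_stage(QEMUFile *f)
{
    vm_stop(0);                          /* vcpus stopped, devices idle  */

    while (blk_requests_pending() ||     /* invented predicate           */
           ram_dirty_pages_pending()) {  /* invented predicate           */
        qemu_aio_wait();                 /* let block requests complete  */
        ram_save_pass(f);                /* stream the RAM they dirty
                                          * (another stage-2 style pass) */
    }

    bdrv_flush_all();
    qemu_savevm_state_complete(f);       /* final, non-live device state */
}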

-- 
error compiling committee.c: too many arguments to function

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-16  9:10 ` [Qemu-devel] " Avi Kivity
  2009-06-16  9:13   ` Avi Kivity
@ 2009-06-16 12:50   ` Anthony Liguori
  2009-06-16 12:54     ` Avi Kivity
  1 sibling, 1 reply; 9+ messages in thread
From: Anthony Liguori @ 2009-06-16 12:50 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel@nongnu.org, kvm-devel

Avi Kivity wrote:
>> Does anyone have a clever idea how to fix this without just waiting 
>> for all IO requests to complete?
>
> What's wrong with waiting for requests to complete?  It should take a 
> few tens of milliseconds.

An alternative would be to attempt to cancel the requests.  This incurs 
no non-deterministic latency.

The tricky bit is that this has to happen at the device layer because 
the opaques cannot be saved in a meaningful way.
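
Roughly what I have in mind at the device level (generic names;
MyDeviceState and the pending array are not a real device, while
bdrv_aio_cancel() is the existing block API):

typedef struct PendingReq {
    int64_t sector;
    int nb_sectors;
    int is_write;
    BlockDriverAIOCB *aiocb;            /* never migrated */
} PendingReq;

static void dev_cancel_requests(MyDeviceState *s)
{
    int i;

    for (i = 0; i < s->nr_pending; i++) {
        bdrv_aio_cancel(s->pending[i].aiocb);
        s->pending[i].aiocb = NULL;
        /* sector/nb_sectors/is_write go into the device's saved state;
         * after loadvm the device resubmits the request itself */
    }
}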

> We could start throttling requests late in the live stage, but I don't 
> really see the point.
>
> Isn't virtio migration currently broken due to the qdev changes?
>


-- 
Regards,

Anthony Liguori

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-16 12:50   ` Anthony Liguori
@ 2009-06-16 12:54     ` Avi Kivity
  2009-06-16 12:57       ` Anthony Liguori
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2009-06-16 12:54 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel

On 06/16/2009 03:50 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>>> Does anyone have a clever idea how to fix this without just waiting 
>>> for all IO requests to complete?
>>
>> What's wrong with waiting for requests to complete?  It should take a 
>> few tens of milliseconds.
>
> An alternative would be to attempt to cancel the requests.  This 
> incurs no non-deterministic latency.

Yes, that's even better (though without linux-aio, it's equivalent).

>
> The tricky bit is that this has to happen at the device layer because 
> the opaques cannot be saved in a meaningful way.
>

Do you mean the device has to record all cancelled requests and replay 
them?  I think we can do it at the block layer (though we have to avoid 
it for nested requests).

-- 
error compiling committee.c: too many arguments to function

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-16 12:54     ` Avi Kivity
@ 2009-06-16 12:57       ` Anthony Liguori
  2009-06-16 13:12         ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Anthony Liguori @ 2009-06-16 12:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel@nongnu.org, kvm-devel

Avi Kivity wrote:
> Yes, that's even better (though without linux-aio, it's equivalent).

Not absolutely equivalent.  There may be queued requests that haven't
yet been dispatched to the thread pool, but yeah, I understand what you
mean.

>>
>> The tricky bit is that this has to happen at the device layer because 
>> the opaques cannot be saved in a meaningful way.
>>
>
> Do you mean the device has to record all cancelled requests and replay 
> them?  I think we can do it at the block layer (though we have to 
> avoid it for nested requests).

In order to complete the requests, you have to call a callback and pass 
an opaque with the results.  The callback/opaque cannot be saved in the 
block layer in a meaningful way.

-- 
Regards,

Anthony Liguori

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-16 12:57       ` Anthony Liguori
@ 2009-06-16 13:12         ` Avi Kivity
  0 siblings, 0 replies; 9+ messages in thread
From: Avi Kivity @ 2009-06-16 13:12 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel

On 06/16/2009 03:57 PM, Anthony Liguori wrote:
>>>
>>> The tricky bit is that this has to happen at the device layer 
>>> because the opaques cannot be saved in a meaningful way.
>>>
>>
>> Do you mean the device has to record all cancelled requests and 
>> replay them?  I think we can do it at the block layer (though we have 
>> to avoid it for nested requests).
>
> In order to complete the requests, you have to call a callback and 
> pass an opaque with the results.  The callback/opaque cannot be saved 
> in the block layer in a meaningful way.
>

You're right, of course.  I guess we'll have to cancel any near-term
cancellation plans.

We could change the opaque to be something pre-registered (e.g. the
device state object, which we don't need to save/restore) and additionally
pass an integer request tag.  These would be migratable.  The device
would be responsible for saving tags and their associated information
(perhaps through a common API).
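
For example (nothing like this exists today, it is only meant to
illustrate the shape of such an API):

typedef void BlockTagCompletionFunc(void *dev_state, uint32_t tag, int ret);

/* the device registers its completion handler and state object once */
void bdrv_register_tag_handler(BlockDriverState *bs,
                               BlockTagCompletionFunc *cb,
                               void *dev_state);

/* like bdrv_aio_writev(), but the completion is identified by the
 * (pre-registered dev_state, integer tag) pair instead of a
 * callback/opaque pointer, so the pending set can be migrated */
BlockDriverAIOCB *bdrv_aio_writev_tagged(BlockDriverState *bs,
                                         int64_t sector_num,
                                         QEMUIOVector *qiov,
                                         int nb_sectors,
                                         uint32_t tag);

On the destination, the device walks its saved tag table, resubmits each
request, and the completion comes back keyed by (dev_state, tag).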

-- 
error compiling committee.c: too many arguments to function

* [Qemu-devel] Re: Live migration broken when under heavy IO
  2009-06-15 20:33 [Qemu-devel] Live migration broken when under heavy IO Anthony Liguori
  2009-06-15 20:48 ` Glauber Costa
  2009-06-16  9:10 ` [Qemu-devel] " Avi Kivity
@ 2009-06-16 18:19 ` Charles Duffy
  2 siblings, 0 replies; 9+ messages in thread
From: Charles Duffy @ 2009-06-16 18:19 UTC (permalink / raw)
  To: qemu-devel

I'm not sure if this is related, but to add a data point --

I'm seeing what appears to be occasional dropped I/O on non-live 
migration -- that is to say, stop, migrate-to-disk, shuffle files 
around, migrate-from-disk, cont. I have not yet been able to reproduce 
this with cache=off.

As I understand it, the bulk of the discussion in this thread is about I/O 
submitted by the guest after the migration starts; since I'm stopping the 
guest CPU _before_ initiating migration, there appear to be outstanding 
issues beyond that scenario as well.

[qemu-kvm-0.10.5, IDE, qcow2]
