* [Qemu-devel] Live migration broken when under heavy IO
From: Anthony Liguori @ 2009-06-15 20:33 UTC
To: qemu-devel@nongnu.org, kvm-devel
The basic issue is that:
migrate_fd_put_ready(): bdrv_flush_all();
Does:
block.c:
    foreach block driver:
        drv->flush(bs);
Which, in the case of raw, is just fsync(s->fd).
Any submitted request is not queued or flushed, which will lead to the
request being dropped after the live migration.
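
To make the gap concrete, a toy model (not the real QEMU structures;
in_flight stands for requests a device has handed to the aio layer that
have not yet completed):

    struct raw_state {
        int fd;          /* backing file                                 */
        int in_flight;   /* aio requests submitted but not yet completed */
    };

    /* what bdrv_flush_all() boils down to for raw today */
    static void flush_today(struct raw_state *s)
    {
        fsync(s->fd);    /* covers completed writes only; says nothing
                            about s->in_flight                           */
    }

    /* what migration needs before the final device state goes out */
    static void drain_then_flush(struct raw_state *s)
    {
        while (s->in_flight > 0) {
            qemu_aio_wait();  /* completions decrement s->in_flight */
        }
        fsync(s->fd);
    }
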
Is anyone working on fixing this? Does anyone have a clever idea how to
fix this without just waiting for all IO requests to complete?
---
Regards,
Anthony Liguori
* Re: [Qemu-devel] Live migration broken when under heavy IO
From: Glauber Costa @ 2009-06-15 20:48 UTC
To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel
On Mon, Jun 15, 2009 at 03:33:41PM -0500, Anthony Liguori wrote:
> The basic issue is that:
>
> migrate_fd_put_ready(): bdrv_flush_all();
>
> Does:
>
> block.c:
>
> foreach block driver:
> drv->flush(bs);
>
> Which in the case of raw, is just fsync(s->fd).
>
> Any submitted request is not queued or flushed which will lead to the
> request being dropped after the live migration.
you mean any request submitted _after_ that is not queued, right?
>
> Is anyone working on fixing this? Does anyone have a clever idea how to
> fix this without just waiting for all IO requests to complete?
If I understood you correctly, we could do something along the lines of dirty
tracking for I/O devices.
Use register_savevm_live() instead of register_savevm() for those, and
keep doing passes until we reach stage 3, based on some convergence
criterion. We can then just flush the remaining requests on that device
and mark[1] it somewhere.
We can then either stop that device, so that new requests never arrive,
or stop the VM entirely.
[1] By mark, I mean the verb "to mark", not our dear friend Mark McLaughing.
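
A rough sketch of what I mean (from memory, so the prototype may be off;
BlkDevice, blk_pending(), blk_drain_some() and blk_drain_all() are made up):

    static int blk_save_live(QEMUFile *f, int stage, void *opaque)
    {
        BlkDevice *dev = opaque;              /* hypothetical device state */

        if (stage == 2) {                     /* iterative passes          */
            blk_drain_some(dev);              /* complete a bit more I/O   */
            return blk_pending(dev) == 0;     /* 1 == converged            */
        }
        if (stage == 3) {                     /* final pass, guest stopped */
            blk_drain_all(dev);               /* flush whatever is left    */
            dev->quiesced = 1;                /* "mark" it                 */
        }
        return 1;
    }

    /* registered with something like: */
    register_savevm_live("blk", 0, 1, blk_save_live, NULL, blk_load, dev);
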
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Avi Kivity @ 2009-06-16 9:10 UTC
To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel
On 06/15/2009 11:33 PM, Anthony Liguori wrote:
> The basic issue is that:
>
> migrate_fd_put_ready(): bdrv_flush_all();
>
> Does:
>
> block.c:
>
> foreach block driver:
> drv->flush(bs);
>
> Which in the case of raw, is just fsync(s->fd).
>
> Any submitted request is not queued or flushed which will lead to the
> request being dropped after the live migration.
>
> Is anyone working on fixing this?
Not to my knowledge.
> Does anyone have a clever idea how to fix this without just waiting
> for all IO requests to complete?
What's wrong with waiting for requests to complete? It should take a
few tens of milliseconds.
We could start throttling requests late in the live stage, but I don't
really see the point.
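
i.e. at the point where we currently flush, just (assuming qemu_aio_flush()
keeps its "wait for everything outstanding" meaning):

    /* migrate_fd_put_ready(), before the non-live state is sent */
    qemu_aio_flush();    /* block until every submitted aio request completes */
    bdrv_flush_all();    /* then fsync the now-completed writes               */
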
Isn't virtio migration currently broken due to the qdev changes?
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Avi Kivity @ 2009-06-16 9:13 UTC
To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel
On 06/16/2009 12:10 PM, Avi Kivity wrote:
>> Does anyone have a clever idea how to fix this without just waiting
>> for all IO requests to complete?
>
> What's wrong with waiting for requests to complete? It should take a
> few tens of milliseconds.
>
> We could start throttling requests late in the live stage, but I don't
> really see the point.
Well, we can introduce a new live stage, where we migrate RAM and
complete block requests, but the vm is otherwise stopped. This allows
the flush to overlap with sending memory dirtied by the flush, reducing
some of the downtime. Since block requests can dirty large amounts of memory,
this may be significant.
We can even keep the vcpu alive, only blocking new block requests, but
that may cause dirty RAM divergence.
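
In pseudo-C (none of these helpers exists, it's only to show the ordering):

    vm_stop(0);                            /* vcpus stopped, no new guest I/O */
    while (blk_requests_outstanding()) {   /* made-up predicate               */
        qemu_aio_wait();                   /* let in-flight block I/O finish  */
        ram_save_pass(f);                  /* stream the RAM it dirties       */
    }
    bdrv_flush_all();
    /* then send the final RAM/device state as usual */
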
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Anthony Liguori @ 2009-06-16 12:50 UTC
To: Avi Kivity; +Cc: qemu-devel@nongnu.org, kvm-devel
Avi Kivity wrote:
>> Does anyone have a clever idea how to fix this without just waiting
>> for all IO requests to complete?
>
> What's wrong with waiting for requests to complete? It should take a
> few tens of milliseconds.
An alternative would be to attempt to cancel the requests. This incurs
no non-deterministic latency.
The tricky bit is that this has to happen at the device layer because
the opaques cannot be saved in a meaningful way.
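
Roughly, each device would walk its own in-flight list (DevState, InflightReq
and the list are whatever the device already tracks; only bdrv_aio_cancel()
is meant to be the real call):

    static void dev_cancel_inflight(DevState *s)
    {
        InflightReq *req;

        for (req = s->inflight; req; req = req->next) {
            bdrv_aio_cancel(req->acb);    /* block layer forgets the request */
            /* sector, length and direction stay in the device's own state
               and get migrated, so the destination can resubmit them        */
        }
    }
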
> We could start throttling requests late in the live stage, but I don't
> really see the point.
>
> Isn't virtio migration currently broken due to the qdev changes?
>
--
Regards,
Anthony Liguori
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Avi Kivity @ 2009-06-16 12:54 UTC
To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel
On 06/16/2009 03:50 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>>> Does anyone have a clever idea how to fix this without just waiting
>>> for all IO requests to complete?
>>
>> What's wrong with waiting for requests to complete? It should take a
>> few tens of milliseconds.
>
> An alternative would be to attempt to cancel the requests. This
> incurs no non-deterministic latency.
Yes, that's even better (though without linux-aio, it's equivalent).
>
> The tricky bit is that this has to happen at the device layer because
> the opaques cannot be saved in a meaningful way.
>
Do you mean the device has to record all cancelled requests and replay
them? I think we can do it at the block layer (though we have to avoid
it for nested requests).
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Anthony Liguori @ 2009-06-16 12:57 UTC
To: Avi Kivity; +Cc: qemu-devel@nongnu.org, kvm-devel
Avi Kivity wrote:
> Yes, that's even better (though without linux-aio, it's equivalent).
Not absolutely equivalent. There may be queued requests that haven't
yet been dispatched to the thread pool, but yeah, I understand what you
mean.
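
i.e. with the thread pool there are really two buckets (just sketching the
distinction, not the actual structures):

    struct aio_pool {
        struct req *queued;      /* accepted, no worker has picked it up yet:
                                    trivially cancellable                     */
        struct req *dispatched;  /* a worker is inside pread()/pwrite() now:
                                    can only be waited for                    */
    };
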
>>
>> The tricky bit is that this has to happen at the device layer because
>> the opaques cannot be saved in a meaningful way.
>>
>
> Do you mean the device has to record all cancelled requests and replay
> them? I think we can do it at the block layer (though we have to
> avoid it for nested requests).
In order to complete the requests, you have to call a callback and pass
an opaque with the results. The callback/opaque cannot be saved in the
block layer in a meaningful way.
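
For reference, the shape of the completion path (from memory, modulo the
exact prototypes):

    typedef void BlockDriverCompletionFunc(void *opaque, int ret);

    /* a device submits something like: */
    acb = bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
                          virtio_blk_rw_complete,    /* cb     */
                          req);                      /* opaque */

    /* 'req' points into live device state; the generic block layer has no
       way to serialize it and rebuild it on the destination, so completion
       has to be resolved by the device itself.                              */
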
--
Regards,
Anthony Liguori
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Avi Kivity @ 2009-06-16 13:12 UTC
To: Anthony Liguori; +Cc: qemu-devel@nongnu.org, kvm-devel
On 06/16/2009 03:57 PM, Anthony Liguori wrote:
>>>
>>> The tricky bit is that this has to happen at the device layer
>>> because the opaques cannot be saved in a meaningful way.
>>>
>>
>> Do you mean the device has to record all cancelled requests and
>> replay them? I think we can do it at the block layer (though we have
>> to avoid it for nested requests).
>
> In order to complete the requests, you have to call a callback and
> pass an opaque with the results. The callback/opaque cannot be saved
> in the block layer in a meaningful way.
>
You're right, of course. I guess we'll have to cancel any near-term
cancellation plans.
We could change the opaque to be something pre-registered (e.g. the
device state object, which we don't need to save/restore) and additionally
pass an integer request tag. These would be migratable. The device
would be responsible for saving tags and their associated information
(perhaps through a common API).
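
Roughly (nothing below exists, it's only to pin the idea down): the aio calls
would take (device state, tag) instead of (cb, opaque), and the device would
keep a small migratable table keyed by tag:

    struct blk_req {
        uint64_t sector;
        uint32_t nb_sectors;
        uint8_t  is_write;
        uint8_t  in_use;
    };

    struct dev_state {
        struct blk_req reqs[MAX_TAGS];   /* index == tag; saved and restored
                                            wholesale with the device        */
    };

    /* completion comes back as (dev, tag, ret): the device looks the tag up,
       completes the guest request and clears in_use.  After migration, any
       slot still marked in_use is simply resubmitted on the destination.    */
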
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] Re: Live migration broken when under heavy IO
From: Charles Duffy @ 2009-06-16 18:19 UTC
To: qemu-devel
I'm not sure whether this is related, but for what it's worth --
I'm seeing what appears to be occasional dropped I/O on non-live
migration -- that is to say, stop, migrate-to-disk, shuffle files
around, migrate-from-disk, cont. I have not yet been able to reproduce
this with cache=off.
As I understand it, the bulk of the discussion in this thread is about I/O
submitted by the guest after the migration starts; since I'm stopping the
guest CPU _before_ initiating migration, there appear to be outstanding
issues that are not limited to that scenario.
[qemu-kvm-0.10.5, IDE, qcow2]