* [Qemu-devel] Live migration sequence
From: Pavel Fedin @ 2015-10-08 11:39 UTC
To: 'QEMU'
Hello!
I would like to clarify the exact live migration sequence in QEMU.
There are pre_save and post_load callbacks for VMState structures. Is there any defined order
in which they are called relative to the migration of memory contents? In other words, is there
any guarantee that pre_save is called before RAM migrates, and post_load after RAM migrates?
The answer to this question is important for developing vITS live migration, where I have to dump
the internal ITS state into in-memory tables before the migration starts and then load it back
into the cache on the destination.
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence
From: Dr. David Alan Gilbert @ 2015-10-09 15:29 UTC
To: Pavel Fedin; +Cc: 'QEMU'
* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> I would like to clarify the exact live migration sequence in QEMU.
>
> There are pre_save and post_load callbacks for VMState structures. Is there any defined order
> in which they are called relative to the migration of memory contents? In other words, is there
> any guarantee that pre_save is called before RAM migrates, and post_load after RAM migrates?
The pre_load/pre_save and post_load callbacks relate to the particular VMState they are attached to;
so if you use them on the VMState of a particular device, the only thing you know is that pre_save
is called just before the system writes that device's description out, and on loading, pre_load is
called just before it reads the data and post_load just after it has read the data.
Ordering relative to RAM is a separate question; in general, RAM is loaded before all
of the non-iterative devices.
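For reference, a minimal sketch of how these hooks hang off a device's VMStateDescription (the
ITSState fields and callback bodies are purely illustrative, and the exact callback signatures
vary a little between QEMU versions):

#include "migration/vmstate.h"

/* Hypothetical device state; the fields are illustrative only. */
typedef struct ITSState {
    uint64_t ctlr;
    uint64_t baser[8];
} ITSState;

/* Called just before this device's section is written out. */
static void its_pre_save(void *opaque)
{
    ITSState *s = opaque;
    /* e.g. flush cached state so the fields below are up to date */
    (void)s;
}

/* Called just after this device's section has been read in. */
static int its_post_load(void *opaque, int version_id)
{
    ITSState *s = opaque;
    /* e.g. rebuild caches from the freshly loaded fields */
    (void)s;
    return 0;
}

static const VMStateDescription vmstate_its = {
    .name = "arm-its",
    .version_id = 1,
    .minimum_version_id = 1,
    .pre_save = its_pre_save,
    .post_load = its_post_load,
    .fields = (VMStateField[]) {
        VMSTATE_UINT64(ctlr, ITSState),
        VMSTATE_UINT64_ARRAY(baser, ITSState, 8),
        VMSTATE_END_OF_LIST()
    },
};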
> The answer to this question is important for developing vITS live migration, where I have to dump
> the internal ITS state into in-memory tables before the migration starts and then load it back
> into the cache on the destination.
What's an ITS ?
A related question: how big are the tables, and can they change during the iterated part
of the migration?
Dave
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Live migration sequence
From: Pavel Fedin @ 2015-10-13 10:06 UTC
To: 'Dr. David Alan Gilbert'; +Cc: 'QEMU'
Hello!
Sorry for the delayed reply.
> What's an ITS ?
Interrupt Translation Service. In short, it's the component responsible for handling PCIe MSI-X
interrupts on the ARM64 architecture.
> A related question: how big are the tables, and can they change during the iterated part
> of the migration?
Tables are something like 64K each. They hold mappings between device/event IDs and actual IRQ
numbers.
Unfortunately I don't know how to answer the second part of the question, about the iterated
part. Can you explain in detail what it is and how it works?
Or, to put the question the other way around: imagine that in pre_save I tell my emulated
device to flush its cached state into RAM-based tables, and in post_load I tell the device to
re-read the data from RAM into its cache. What do I need in order to make these tables in RAM
migrate correctly?
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence
From: Dr. David Alan Gilbert @ 2015-10-13 11:05 UTC
To: Pavel Fedin
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah,
kvmarm, christoffer.dall
* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> Sorry for the delayed reply.
> > What's an ITS ?
>
> Interrupt Translation Service. In short, it's the component responsible for handling PCIe MSI-X
> interrupts on the ARM64 architecture.
OK; I asked Peter (cc'd) to explain a bit more about the ITS to me.
> > A related question: how big are the tables, and can they change during the iterated part
> > of the migration?
>
> Tables are something like 64K each. They hold mappings between device/event IDs and actual IRQ
> numbers.
>
> Unfortunately I don't know how to answer the second part of the question, about the iterated
> part. Can you explain in detail what it is and how it works?
QEMU migrates state in two different ways:
a) Like a device, where at the end of the migration it transmits all the information.
b) Like RAM (iterated), where it sends the contents across while the guest is still running;
changes that are made to the RAM are then transmitted over and over again
until the amount of changed RAM is small. Then we stop the guest, transmit those last few
changes, and do (a).
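To make the interplay concrete, here is a self-contained toy model of (a) vs (b); none of these
functions are QEMU APIs, they only mimic the control flow described above:

#include <stdio.h>

static unsigned long dirty_ram_pages = 100000;   /* pretend guest workload */

/* Path (b): iterated.  Send what is currently dirty; the guest keeps
 * running and keeps dirtying pages, so we only converge gradually. */
static void send_dirty_ram(void)
{
    dirty_ram_pages = dirty_ram_pages / 4 + 50;
    printf("sent a round of RAM, %lu pages still dirty\n", dirty_ram_pages);
}

/* Path (a): non-iterative.  Runs once, with the guest stopped. */
static void send_device_state(void)
{
    printf("CPUs stopped: sending the last dirty pages and all device state\n");
}

int main(void)
{
    const unsigned long threshold = 100;   /* "small enough" amount of dirty RAM */

    while (dirty_ram_pages > threshold) {
        send_dirty_ram();                  /* guest still running */
    }
    /* Stop the guest, push the final few dirty pages, then save every
     * device's state (VMState, pre_save/post_load and all) in one go. */
    send_device_state();
    return 0;
}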
> Or, to put the question the other way around: imagine that in pre_save I tell my emulated
> device to flush its cached state into RAM-based tables, and in post_load I tell the device to
> re-read the data from RAM into its cache. What do I need in order to make these tables in RAM
> migrate correctly?
At 64k that's pretty small; however, Peter explained to me that it's per-CPU, so it could
potentially be huge. If it were only small I'd do it just like a device, which is nice
and simple; but since it can get big it's going to be more interesting, and
since it's part of guest RAM you need to be careful.
The pre_load/post_load callbacks are all relative to a particular device; they're
not hooked around when other state (RAM) gets migrated, so it sounds like you
need another hook.
If I understand correctly, what you need is a hook to dump the state
into guest RAM, but then you also need to keep that state updated in guest
RAM during the migration.
Some thoughts:
a) There is a migration state notifier list - see add_migration_state_change_notifier (spice calls it)
- but I don't think it's called in the right places for your needs; we
could add some more places that get called. (A sketch of registering such a notifier follows below.)
b) Once you're in the device state saving (a above) you must not change guest RAM,
because at that point the migration code won't send any new changes across
to the destination. So any syncs you're going to do have to happen before/at
the time we stop the CPU and do the final RAM sync. On the plus side, when
you're loading the device state in (a) you can be sure the RAM contents are there.
c) Watch out for the size of that final sync; if you have lots of these ITSes
and they all update their 64k pages at the point we stop the CPUs, then you're
going to generate a lot of RAM that needs syncing.
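The sketch mentioned in (a): registering a migration state change notifier, assuming the
2015-era API that spice uses in ui/spice-core.c; all the its_* names are hypothetical:

#include "qemu/notify.h"
#include "migration/migration.h"

static void its_flush_tables_to_guest_ram(void)
{
    /* hypothetical: write the device's cached state into the guest's
     * in-RAM tables */
}

static void its_migration_state_notify(Notifier *notifier, void *data)
{
    MigrationState *s = data;

    if (migration_in_setup(s)) {
        /* Migration is starting: dump the cached ITS state into the
         * in-guest-RAM tables (per point (b), it then has to be kept
         * in sync until the final RAM pass). */
        its_flush_tables_to_guest_ram();
    } else if (migration_has_failed(s)) {
        /* Migration aborted: the device can go back to cache-only mode. */
    }
}

static Notifier its_migration_state = {
    .notify = its_migration_state_notify,
};

static void its_register_migration_notifier(void)   /* e.g. from realize */
{
    add_migration_state_change_notifier(&its_migration_state);
}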
Dave
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Live migration sequence
From: Pavel Fedin @ 2015-10-13 12:02 UTC
To: 'Dr. David Alan Gilbert'
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah,
kvmarm, christoffer.dall
Hello!
> b) Once you're in the device state saving (a above) you must not change guest RAM,
> because at that point the migration code won't send any new changes across
> to the destination. So any syncs you're going to do have to happen before/at
> the time we stop the CPU and do the final RAM sync. On the plus side, when
> you're loading the device state in (a) you can be sure the RAM contents are there.
This is good. I think in this case I can teach the kernel (we're talking about the accelerated
in-kernel irqchip implementation here) to flush the ITS caches when a CPU is stopped. That will do the job.
> c) Watch out for the size of that final sync; if you have lots of these ITSes
> and they all update their 64k pages at the point we stop the CPUs, then you're
> going to generate a lot of RAM that needs syncing.
Well, reducing downtime would be the next task. :) First I'd like to get it working at all.
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence
From: Peter Maydell @ 2015-10-13 12:04 UTC
To: Pavel Fedin
Cc: Juan Quintela, Marc Zyngier, QEMU, Dr. David Alan Gilbert,
Amit Shah, kvmarm@lists.cs.columbia.edu, Christoffer Dall
On 13 October 2015 at 13:02, Pavel Fedin <p.fedin@samsung.com> wrote:
> Hello!
>
>> b) Once you're in the device state saving (a above) you must not change guest RAM,
>> because at that point the migration code won't send any new changes across
>> to the destination. So any syncs you're going to do have to happen before/at
>> the time we stop the CPU and do the final RAM sync. On the plus side, when
>> you're loading the device state in (a) you can be sure the RAM contents are there.
>
> This is good. I think in this case I can teach the kernel (we're talking about the accelerated
> in-kernel irqchip implementation here) to flush the ITS caches when a CPU is stopped. That will do the job.
Our idea at the discussion at Connect was to have an ioctl to request
a flush, rather than to do it automatically when a CPU is stopped
(you probably don't want to flush when only one CPU in an SMP system
is stopped, for instance). But we wanted to get the basic no-ITS
ABI sorted out and agreed first, so those details aren't in the
patch Christoffer sent out the other day.
It will probably be more efficient if we agree on the ABI details
here before doing the implementation rather than afterwards.
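For illustration only, one common shape for such a request is a device attribute ioctl on the
ITS device fd; the group and attribute names and values below are hypothetical, since the ABI
had not been agreed at this point:

#include <linux/kvm.h>
#include <sys/ioctl.h>

#define KVM_DEV_ARM_ITS_CTRL_GROUP    42   /* hypothetical group   */
#define KVM_DEV_ARM_ITS_FLUSH_TABLES   1   /* hypothetical command */

/* Ask the in-kernel ITS to write its cached state into the
 * guest-RAM tables, so a subsequent RAM sync migrates it. */
static int its_request_flush(int its_dev_fd)
{
    struct kvm_device_attr attr = {
        .group = KVM_DEV_ARM_ITS_CTRL_GROUP,
        .attr  = KVM_DEV_ARM_ITS_FLUSH_TABLES,
    };

    return ioctl(its_dev_fd, KVM_SET_DEVICE_ATTR, &attr);
}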
thanks
-- PMM
* Re: [Qemu-devel] Live migration sequence
From: Pavel Fedin @ 2015-10-13 12:41 UTC
To: 'Peter Maydell'
Cc: 'Juan Quintela', 'Marc Zyngier', 'QEMU',
'Dr. David Alan Gilbert', 'Amit Shah', kvmarm,
'Christoffer Dall'
Hello!
> Our idea at the discussion at Connect was to have an ioctl to request
> a flush, rather than to do it automatically when a CPU is stopped
> (you probably don't want to flush when only one CPU in an SMP system
> is stopped, for instance).
Yes, you're right. It looks like this would be a more complicated but better way to do it. I'll look at the migration state notifier mechanism.
> But we wanted to get the basic no-ITS ABI sorted out and agreed first
Yes, this is a good idea; otherwise there would be too many changes. But here at Samsung we have a project with deadlines, so please understand that I am trying to do things quickly.
Of course, as you can see, as soon as I make something I post it, at least as an RFC, because I don't want to end up with something misdesigned that will be dropped afterwards. Cooperation with the open-source community is one of the goals of our project.
Thank you very much for your cooperation and explanations.
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence
From: Pavel Fedin @ 2015-10-16 7:24 UTC
To: 'Dr. David Alan Gilbert'
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah,
kvmarm, christoffer.dall
Hello!
> Some thoughts:
> a) There is a migration state notifier list - see add_migration_state_change_notifier (spice
> calls it)
> - but I don't think it's called in the right places for your needs; we
> could add some more places that get called.
I am now trying to add one more state, something like MIGRATION_STATUS_FINISHING. It would mean that CPUs are stopped.
Could you explain the migration code to me a bit? Where is the iteration loop, and where are the CPUs stopped? I am looking at migration.c but
can't say that I understand a good portion of it. :)
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence
From: Dr. David Alan Gilbert @ 2015-10-16 17:11 UTC
To: Pavel Fedin
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah,
kvmarm, christoffer.dall
* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> > Some thoughts:
> > a) There is a migration state notifier list - see add_migration_state_change_notifier (spice
> > calls it)
> > - but I don't think it's called in the right places for your needs; we
> > could add some more places that get called.
>
> I am now trying to add one more state, something like MIGRATION_STATUS_FINISHING. It would mean that CPUs are stopped.
> Could you explain the migration code to me a bit? Where is the iteration loop, and where are the CPUs stopped? I am looking at migration.c but
> can't say that I understand a good portion of it. :)
The outgoing side of migration comes in through migrate_fd_connect, which does
all the setup and then starts 'migration_thread'. The big while loop in there does
most of the work, and each iteration normally ends up calling either
qemu_savevm_state_iterate
or
migration_completion.
migration_completion calls vm_stop_force_state to stop the CPUs,
and then qemu_savevm_state_complete to save all the remaining devices
out.
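A condensed, self-contained outline of that flow; only qemu_savevm_state_iterate(),
vm_stop_force_state() and qemu_savevm_state_complete() name real QEMU functions of that era
(with simplified signatures here), everything else is a stand-in:

#include <stdbool.h>
#include <stddef.h>

typedef struct QEMUFile QEMUFile;                                /* opaque, as in QEMU */
typedef struct { QEMUFile *file; bool active; } MigrationState;  /* stand-in */

enum { RUN_STATE_FINISH_MIGRATE };                               /* stand-in constant */

/* Simplified stubs for the real functions named above: */
static void qemu_savevm_state_iterate(QEMUFile *f)   { (void)f; }
static void vm_stop_force_state(int run_state)       { (void)run_state; }
static void qemu_savevm_state_complete(QEMUFile *f)  { (void)f; }
static bool dirty_ram_still_large(MigrationState *s) { (void)s; return false; }

static void migration_completion(MigrationState *s)
{
    vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);   /* CPUs stop here */
    qemu_savevm_state_complete(s->file);             /* final dirty RAM plus all
                                                        non-iterative device state */
}

/* migrate_fd_connect() spawns this as the "migration" thread: */
static void *migration_thread(void *opaque)
{
    MigrationState *s = opaque;

    while (s->active) {
        if (dirty_ram_still_large(s)) {
            qemu_savevm_state_iterate(s->file);      /* another pass over dirty RAM */
        } else {
            migration_completion(s);                 /* converged: finish up */
            break;
        }
    }
    return NULL;
}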
Dave
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK