* [Qemu-devel] Live migration sequence

From: Pavel Fedin
Date: 2015-10-08 11:39 UTC
To: 'QEMU'

Hello!

I would like to clarify what the exact live migration sequence in QEMU is.

I mean - there are pre_save and post_load callbacks for VMState structures. Is
there any determined order of calling them relative to memory contents
migration? In other words, is there any guarantee that pre_save is called
before RAM migrates, and post_load is called after RAM migrates?

The answer to this question is important for developing vITS live migration,
where I have to dump internal ITS state into in-memory tables before the
migration starts, and then get it back into the cache on the destination.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence

From: Dr. David Alan Gilbert
Date: 2015-10-09 15:29 UTC
To: Pavel Fedin; Cc: 'QEMU'

* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> I would like to clarify what the exact live migration sequence in QEMU is.
>
> I mean - there are pre_save and post_load callbacks for VMState structures.
> Is there any determined order of calling them relative to memory contents
> migration? In other words, is there any guarantee that pre_save is called
> before RAM migrates, and post_load is called after RAM migrates?

The pre_load/pre_save and post_load callbacks relate to the particular VMState
the functions are attached to; so if you use them on the VMState of a
particular device, the only thing you know is that pre_save is called just
before the system writes that device's description out; and on loading,
pre_load is called just before it reads the data, and post_load just after it
has read the data.

Ordering relative to RAM is a separate question; in general RAM is normally
loaded before all of the non-iterative devices.

> The answer to this question is important for developing vITS live migration,
> where I have to dump internal ITS state into in-memory tables before the
> migration starts, and then get it back into the cache on the destination.

What's an ITS?

A related question: how big are the tables, and can they change during the
iterated part of the migrate?

Dave

> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
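A minimal sketch of the hooks discussed above, assuming a hypothetical device
and the callback signatures of the QEMU 2.4 era (the device and field names
are illustrative, not actual vITS code):

#include "migration/vmstate.h"

typedef struct MyDevState {
    uint32_t ctrl_reg;       /* illustrative migratable fields */
    uint64_t table_base;
} MyDevState;

/* Called just before this device's section is written out: a natural place
 * to fold cached state back into the fields (or guest RAM) that migrate. */
static void mydev_pre_save(void *opaque)
{
    MyDevState *s = opaque;
    (void)s;
}

/* Called just after this device's section has been read in: rebuild any
 * caches from the migrated fields / guest RAM here. */
static int mydev_post_load(void *opaque, int version_id)
{
    MyDevState *s = opaque;
    (void)s;
    return 0;
}

static const VMStateDescription vmstate_mydev = {
    .name = "mydev",
    .version_id = 1,
    .minimum_version_id = 1,
    .pre_save = mydev_pre_save,
    .post_load = mydev_post_load,
    .fields = (VMStateField[]) {
        VMSTATE_UINT32(ctrl_reg, MyDevState),
        VMSTATE_UINT64(table_base, MyDevState),
        VMSTATE_END_OF_LIST()
    }
};

As described above, these hooks only bracket this one device's section of the
migration stream; they say nothing about when RAM is transferred.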
* Re: [Qemu-devel] Live migration sequence

From: Pavel Fedin
Date: 2015-10-13 10:06 UTC
To: 'Dr. David Alan Gilbert'; Cc: 'QEMU'

Hello!

Sorry for the delayed reply.

> What's an ITS?

Interrupt Translation Service. In short, it's the component responsible for
handling PCIe MSI-X interrupts on the ARM64 architecture.

> A related question: how big are the tables, and can they change during the
> iterated part of the migrate?

Tables are something like 64K each. They hold mappings between device/event
IDs and actual IRQ numbers.

Unfortunately I don't know how to answer the second part of the question,
about the iterated part. Can you explain in detail what it is and how it
works?

Or, well, we could put the question the other way: imagine that in pre_save I
tell my emulated device to flush its cached state into RAM-based tables. In
post_load I could tell the device to re-read data from RAM into its cache. So,
what do I need in order to make these tables in RAM migrate correctly?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence

From: Dr. David Alan Gilbert
Date: 2015-10-13 11:05 UTC
To: Pavel Fedin
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah, kvmarm,
    christoffer.dall

* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> Sorry for the delayed reply.
>
> > What's an ITS?
>
> Interrupt Translation Service. In short, it's the component responsible for
> handling PCIe MSI-X interrupts on the ARM64 architecture.

OK; I asked Peter (cc'd) to explain a bit more about the ITS to me.

> > A related question: how big are the tables, and can they change during
> > the iterated part of the migrate?
>
> Tables are something like 64K each. They hold mappings between device/event
> IDs and actual IRQ numbers.
>
> Unfortunately I don't know how to answer the second part of the question,
> about the iterated part. Can you explain in detail what it is and how it
> works?

QEMU migrates stuff in two different ways:

a) Like a device, where at the end of the migration it transmits all the
   information.
b) Like RAM (iterated) - in this it sends the contents across but does so
   while the guest is still running; changes that are made to the RAM are
   transmitted over and over again until the amount of changed RAM is small;
   then we stop the guest, transmit those last few changes, and then do (a).

> Or, well, we could put the question the other way: imagine that in pre_save
> I tell my emulated device to flush its cached state into RAM-based tables.
> In post_load I could tell the device to re-read data from RAM into its
> cache. So, what do I need in order to make these tables in RAM migrate
> correctly?

At 64k that's pretty small; however Peter explained to me that it's per-CPU,
so it could potentially be huge. If it was only small I'd do it just like a
device, which is nice and simple; but since it can get big it's going to be
more interesting, and since it's part of guest RAM you need to be careful.

The pre_load/post_load hooks are all relative to a particular device; they're
not hooked around when other stuff (RAM) gets migrated; it sounds like you
need another hook. If I understand correctly, what you need is a hook to dump
the state into guest RAM, but then you also need to keep the state updated in
guest RAM during the migration.

Some thoughts (a sketch of (a) follows this message):

a) There is a migration state notifier list - see
   add_migration_state_change_notifier (spice calls it) - but I don't think
   it's called in the right places for your needs; we could add some more
   places where it gets called.
b) Once you're in the device state saving ((a) above) you must not change
   guest RAM, because at that point the migration code won't send any new
   changes across to the destination. So any syncs you're going to do have to
   happen before/at the time we stop the CPU and do the final RAM sync. On
   the plus side, when you're loading the device state in (a) you can be sure
   the RAM contents are there.
c) Watch out for the size of that final sync; if you have lots of these ITS
   and they all update their 64k page at the point we stop the CPU, then
   you're going to generate a lot of RAM that needs syncing.

Dave

> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
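A minimal sketch of thought (a) above, modeled on the existing spice usage of
add_migration_state_change_notifier; the ITS-specific names are hypothetical:

#include "migration/migration.h"
#include "qemu/notify.h"

/* Hypothetical hook: react to migration state changes for the ITS. */
static void its_migration_state_notify(Notifier *notifier, void *data)
{
    MigrationState *s = data;

    if (migration_in_setup(s)) {
        /* Migration is starting: e.g. begin keeping the RAM tables in sync. */
    } else if (migration_has_finished(s)) {
        /* Migration completed on the source side. */
    } else if (migration_has_failed(s)) {
        /* Migration failed or was cancelled: resume normal operation. */
    }
}

static Notifier its_migration_state = {
    .notify = its_migration_state_notify,
};

static void its_register_migration_notifier(void)
{
    add_migration_state_change_notifier(&its_migration_state);
}

As noted in the message above, the existing notification points may not be
where the ITS needs them, so additional call sites might have to be added.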
* Re: [Qemu-devel] Live migration sequence

From: Pavel Fedin
Date: 2015-10-13 12:02 UTC
To: 'Dr. David Alan Gilbert'
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah, kvmarm,
    christoffer.dall

Hello!

> b) Once you're in the device state saving ((a) above) you must not change
>    guest RAM, because at that point the migration code won't send any new
>    changes across to the destination. So any syncs you're going to do have
>    to happen before/at the time we stop the CPU and do the final RAM sync.
>    On the plus side, when you're loading the device state in (a) you can be
>    sure the RAM contents are there.

This is good. I think in this case I can teach the kernel (here we are talking
about the accelerated in-kernel irqchip implementation) to flush ITS caches
when a CPU is stopped. This will do the job.

> c) Watch out for the size of that final sync; if you have lots of these ITS
>    and they all update their 64k page at the point we stop the CPU, then
>    you're going to generate a lot of RAM that needs syncing.

Well, reducing downtime would be the next task. :) First I'd like to get it
working at all.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence

From: Peter Maydell
Date: 2015-10-13 12:04 UTC
To: Pavel Fedin
Cc: Juan Quintela, Marc Zyngier, QEMU, Dr. David Alan Gilbert, Amit Shah,
    kvmarm@lists.cs.columbia.edu, Christoffer Dall

On 13 October 2015 at 13:02, Pavel Fedin <p.fedin@samsung.com> wrote:
> Hello!
>
>> b) Once you're in the device state saving ((a) above) you must not change
>>    guest RAM, because at that point the migration code won't send any new
>>    changes across to the destination. So any syncs you're going to do have
>>    to happen before/at the time we stop the CPU and do the final RAM sync.
>>    On the plus side, when you're loading the device state in (a) you can
>>    be sure the RAM contents are there.
>
> This is good. I think in this case I can teach the kernel (here we are
> talking about the accelerated in-kernel irqchip implementation) to flush
> ITS caches when a CPU is stopped. This will do the job.

Our idea at the discussion at Connect was to have an ioctl to request a
flush, rather than to do it automatically when a CPU is stopped (you probably
don't want to flush when only one CPU in an SMP system is stopped, for
instance).

But we wanted to get the basic no-ITS ABI sorted out and agreed first, so
those details aren't in the patch Christoffer sent out the other day. It will
probably be more efficient if we agree on the ABI details here before doing
the implementation rather than afterwards.

thanks
-- PMM
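No such ioctl existed at this point and the ABI was still being discussed; the
sketch below only illustrates one plausible shape for a userspace-requested
flush, reusing the KVM device-attribute mechanism that the vGIC already uses
for its control group. The attribute ID is a placeholder, not a real kernel
definition:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Placeholder value: a real ID would come from the agreed kernel ABI. */
#define ITS_FLUSH_TABLES_ATTR_PLACEHOLDER 0

static int its_request_table_flush(int its_dev_fd)
{
    struct kvm_device_attr attr = {
        .group = KVM_DEV_ARM_VGIC_GRP_CTRL,  /* existing vGIC control group */
        .attr  = ITS_FLUSH_TABLES_ATTR_PLACEHOLDER,
        .addr  = 0,
    };

    /* Ask the in-kernel ITS to write its cached state back into the
     * guest-RAM tables so that RAM migration picks it up. */
    return ioctl(its_dev_fd, KVM_SET_DEVICE_ATTR, &attr);
}

The appeal of an explicit request like this, per the message above, is that
userspace decides when to flush instead of tying it to stopping any one CPU.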
* Re: [Qemu-devel] Live migration sequence

From: Pavel Fedin
Date: 2015-10-13 12:41 UTC
To: 'Peter Maydell'
Cc: 'Juan Quintela', 'Marc Zyngier', 'QEMU', 'Dr. David Alan Gilbert',
    'Amit Shah', kvmarm, 'Christoffer Dall'

Hello!

> Our idea at the discussion at Connect was to have an ioctl to request a
> flush, rather than to do it automatically when a CPU is stopped (you
> probably don't want to flush when only one CPU in an SMP system is stopped,
> for instance).

Yes, you're right. This looks like a more complicated but better way to do it.
I'll look at the migration state notifier mechanism.

> But we wanted to get the basic no-ITS ABI sorted out and agreed first

Yes, this is a good idea; otherwise there are too many changes. But here at
Samsung we have a project, and it has deadlines, so please understand that I
try to do things quickly. Of course, you can see that as soon as I make
something, I post it at least as RFCs, because I don't want to end up with
something misdesigned which would be dropped afterwards. Cooperation with OSS
is one of the goals of our project.

Thank you very much for your cooperation and explanations.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
* Re: [Qemu-devel] Live migration sequence 2015-10-13 11:05 ` Dr. David Alan Gilbert 2015-10-13 12:02 ` Pavel Fedin @ 2015-10-16 7:24 ` Pavel Fedin 2015-10-16 17:11 ` Dr. David Alan Gilbert 1 sibling, 1 reply; 9+ messages in thread From: Pavel Fedin @ 2015-10-16 7:24 UTC (permalink / raw) To: 'Dr. David Alan Gilbert' Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah, kvmarm, christoffer.dall Hello! > Some thoughts: > a) There is a migration state notifier list - see add_migration_state_change_notifier (spice > calls it) > - but I don't think it's called in the right places for your needs; we > could add some more places that gets called. I am now trying to add one more state, something like MIGRATION_STATUS_FINISHING. It would mean that CPUs are stopped. Can you explain me migration code a bit? Where is iteration loop, and where are CPUs stopped? I am looking at migration.c but cannot say that i understand some good portion of it. :) Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] Live migration sequence

From: Dr. David Alan Gilbert
Date: 2015-10-16 17:11 UTC
To: Pavel Fedin
Cc: peter.maydell, quintela, marc.zyngier, 'QEMU', amit.shah, kvmarm,
    christoffer.dall

* Pavel Fedin (p.fedin@samsung.com) wrote:
> Hello!
>
> > Some thoughts:
> > a) There is a migration state notifier list - see
> >    add_migration_state_change_notifier (spice calls it) - but I don't
> >    think it's called in the right places for your needs; we could add
> >    some more places where it gets called.
>
> I am now trying to add one more state, something like
> MIGRATION_STATUS_FINISHING. It would mean that the CPUs are stopped.
>
> Can you explain the migration code to me a bit? Where is the iteration
> loop, and where are the CPUs stopped? I am looking at migration.c but
> cannot say that I understand a good portion of it. :)

The outgoing side of migration comes in at migrate_fd_connect, which does all
the setup and then starts 'migration_thread'. The big while loop in there
does most of the work, and on each pass normally ends up calling either
qemu_savevm_state_iterate or migration_completion.

migration_completion calls vm_stop_force_state to stop the CPU, and then
qemu_savevm_state_complete to save all the remaining devices out.

Dave

> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
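A heavily simplified pseudocode sketch of that flow, roughly following the
QEMU 2.4-era migration code (rate limiting, error handling, state transitions
and return-value checks are all omitted):

/* Pseudocode only: not the real migration_thread(), just its overall shape. */
static void migration_thread_sketch(MigrationState *s, QEMUFile *f,
                                    uint64_t max_size)
{
    qemu_savevm_state_begin(f, &s->params);               /* setup phase */

    for (;;) {
        uint64_t pending = qemu_savevm_state_pending(f, max_size);

        if (pending > max_size) {
            /* Guest is still dirtying too much RAM: send another pass. */
            qemu_savevm_state_iterate(f);
        } else {
            /* Little enough left: this is migration_completion(). */
            vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); /* CPUs stop here */
            qemu_savevm_state_complete(f);  /* final RAM + remaining devices */
            break;
        }
    }
}

So any last ITS flush into guest RAM would have to happen no later than the
point where vm_stop_force_state runs, before the final RAM sync and the
non-iterative device save.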
Thread overview: 9+ messages

2015-10-08 11:39 [Qemu-devel] Live migration sequence  Pavel Fedin
2015-10-09 15:29 ` Dr. David Alan Gilbert
2015-10-13 10:06 ` Pavel Fedin
2015-10-13 11:05 ` Dr. David Alan Gilbert
2015-10-13 12:02 ` Pavel Fedin
2015-10-13 12:04 ` Peter Maydell
2015-10-13 12:41 ` Pavel Fedin
2015-10-16  7:24 ` Pavel Fedin
2015-10-16 17:11 ` Dr. David Alan Gilbert