qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/7] Optional toplevel sections
@ 2014-10-15  7:55 Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 1/7] migration: Create optional sections Juan Quintela
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

Hi

by popular demand, and after too many time, this series.  This is an
RFC to know what people think about how to use them, the interface
proposed, whatever.

* simplify optional subsections moving the "needed" function to
  vmstate description.  I think that this simplification makes sense
  by itself, it is indipendent of the rest of the patches.

* runstate: To make an example of an optional section, I decided to
  use current runstate.  Right now, we have a problem when:
  - we start destination without -S
  - we run migration, and it causes an ioerror on source, but migration finishes
  - we try to run migration on destination anyways, when it is
    possible that we could get disk corruption (the ioerror was there for a reason)
  Luiz: You can see any obvious improvement about how we use runstates?
  Laine: Could you told me if you (libvirt) like this or would want
     something a bit different?

* I sent that option indpendently for new machine types.

* For old machine types I use this as one example of optional section.
  We only sent it when the state is different from "running" or "paused".

  So, the only case where we fail is if we migrate to an old qemu and
  there is one error.

* On the runstate subsection "postload" we can send any event for
  anything that libvirt wants when migration finishes.
  Laine, can you told us what libvirt would preffer for this?

Kevin: You asked for optional sections in the past for the block
   layer, would this proposal be enough for you?

Please review, comment.

Thanks, Juan.

PD.  Yes, on proper submission, patches 6 and 7 are on the wrong order.

Juan Quintela (7):
  migration: Create optional sections
  runstate: Add runstate store
  runstate: create runstate_index function
  runstate: migration allows more transitions now
  migration: create now section to store global state
  global_state: Make section optional
  vmstate: Create optional sections

 cpus.c                        |  11 ++--
 docs/migration.txt            |  11 ++--
 exec.c                        |  11 ++--
 hw/acpi/ich9.c                |  10 ++--
 hw/acpi/piix4.c               |  10 ++--
 hw/block/fdc.c                |  37 +++++--------
 hw/char/serial.c              |  41 ++++++---------
 hw/display/qxl.c              |  11 ++--
 hw/display/vga.c              |  11 ++--
 hw/i386/pc_piix.c             |   2 +
 hw/ide/core.c                 |  32 +++++-------
 hw/ide/pci.c                  |  16 +++---
 hw/input/pckbd.c              |  22 ++++----
 hw/input/ps2.c                |  11 ++--
 hw/isa/lpc_ich9.c             |  10 ++--
 hw/net/e1000.c                |  11 ++--
 hw/net/rtl8139.c              |  11 ++--
 hw/net/vmxnet3.c              |  12 ++---
 hw/pci-host/piix.c            |  10 ++--
 hw/scsi/scsi-bus.c            |  11 ++--
 hw/timer/hpet.c               |  11 ++--
 hw/timer/mc146818rtc.c        |  23 ++++-----
 hw/usb/hcd-ohci.c             |  11 ++--
 hw/usb/redirect.c             |  34 ++++++------
 hw/virtio/virtio.c            |  10 ++--
 include/migration/migration.h |   4 ++
 include/migration/vmstate.h   |  10 ++--
 include/sysemu/sysemu.h       |   2 +
 migration.c                   | 117 ++++++++++++++++++++++++++++++++++++++++--
 savevm.c                      |  14 +++--
 target-arm/machine.c          |  26 ++++------
 target-i386/machine.c         |  71 ++++++++++---------------
 target-ppc/machine.c          |  62 +++++++++-------------
 vl.c                          |  26 ++++++++++
 vmstate.c                     |  27 +++++++---
 35 files changed, 393 insertions(+), 356 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 1/7] migration: Create optional sections
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 2/7] runstate: Add runstate store Juan Quintela
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

We create optional sections with this patch.  But we already have
optional subsections.  Instead of having two mechanism that do the
same, we can just generalize it.

For subsections we just change:

- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
  it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
  VMStateDescription

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cpus.c                      | 11 +++----
 docs/migration.txt          | 11 +++----
 exec.c                      | 11 +++----
 hw/acpi/ich9.c              | 10 +++----
 hw/acpi/piix4.c             | 10 +++----
 hw/block/fdc.c              | 37 +++++++++--------------
 hw/char/serial.c            | 41 ++++++++++----------------
 hw/display/qxl.c            | 11 +++----
 hw/display/vga.c            | 11 +++----
 hw/ide/core.c               | 32 ++++++++------------
 hw/ide/pci.c                | 16 ++++------
 hw/input/pckbd.c            | 22 +++++++-------
 hw/input/ps2.c              | 11 +++----
 hw/isa/lpc_ich9.c           | 10 +++----
 hw/net/e1000.c              | 11 +++----
 hw/net/rtl8139.c            | 11 +++----
 hw/net/vmxnet3.c            | 12 +++-----
 hw/pci-host/piix.c          | 10 +++----
 hw/scsi/scsi-bus.c          | 11 +++----
 hw/timer/hpet.c             | 11 +++----
 hw/timer/mc146818rtc.c      | 23 +++++++--------
 hw/usb/hcd-ohci.c           | 11 +++----
 hw/usb/redirect.c           | 34 ++++++++++------------
 hw/virtio/virtio.c          | 10 +++----
 include/migration/vmstate.h |  8 ++---
 savevm.c                    | 10 +++----
 target-arm/machine.c        | 26 +++++++----------
 target-i386/machine.c       | 71 ++++++++++++++++++---------------------------
 target-ppc/machine.c        | 62 +++++++++++++++------------------------
 vmstate.c                   | 16 +++++-----
 30 files changed, 228 insertions(+), 353 deletions(-)

diff --git a/cpus.c b/cpus.c
index 0c33458..f67c400 100644
--- a/cpus.c
+++ b/cpus.c
@@ -466,6 +466,7 @@ static const VMStateDescription icount_vmstate_timers = {
     .name = "timer/icount",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = icount_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT64(qemu_icount_bias, TimersState),
         VMSTATE_INT64(qemu_icount, TimersState),
@@ -483,13 +484,9 @@ static const VMStateDescription vmstate_timers = {
         VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &icount_vmstate_timers,
-            .needed = icount_state_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &icount_vmstate_timers,
+        NULL
     }
 };

diff --git a/docs/migration.txt b/docs/migration.txt
index 0492a45..ebe656b 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -257,6 +257,7 @@ const VMStateDescription vmstate_ide_drive_pio_state = {
     .minimum_version_id = 1,
     .pre_save = ide_drive_pio_pre_save,
     .post_load = ide_drive_pio_post_load,
+    .needed = ide_drive_pio_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(req_nb_sectors, IDEState),
         VMSTATE_VARRAY_INT32(io_buffer, IDEState, io_buffer_total_len, 1,
@@ -279,13 +280,9 @@ const VMStateDescription vmstate_ide_drive = {
         .... several fields ....
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_ide_drive_pio_state,
-            .needed = ide_drive_pio_state_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ide_drive_pio_state,
+        NULL
     }
 };

diff --git a/exec.c b/exec.c
index 759055d..226ba95 100644
--- a/exec.c
+++ b/exec.c
@@ -450,6 +450,7 @@ static const VMStateDescription vmstate_cpu_common_exception_index = {
     .name = "cpu_common/exception_index",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = cpu_common_exception_index_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(exception_index, CPUState),
         VMSTATE_END_OF_LIST()
@@ -467,13 +468,9 @@ const VMStateDescription vmstate_cpu_common = {
         VMSTATE_UINT32(interrupt_request, CPUState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_cpu_common_exception_index,
-            .needed = cpu_common_exception_index_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_cpu_common_exception_index,
+        NULL
     }
 };

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 7b14bbb..a78fc5b 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -151,6 +151,7 @@ static const VMStateDescription vmstate_memhp_state = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .needed = vmstate_test_use_memhp,
     .fields      = (VMStateField[]) {
         VMSTATE_MEMORY_HOTPLUG(acpi_memory_hotplug, ICH9LPCPMRegs),
         VMSTATE_END_OF_LIST()
@@ -174,12 +175,9 @@ const VMStateDescription vmstate_ich9_pm = {
         VMSTATE_UINT32(smi_sts, ICH9LPCPMRegs),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_memhp_state,
-            .needed = vmstate_test_use_memhp,
-        },
-        VMSTATE_END_OF_LIST()
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_memhp_state,
+        NULL
     }
 };

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index b72b34e..9675077 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -260,6 +260,7 @@ static const VMStateDescription vmstate_memhp_state = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .needed = vmstate_test_use_memhp,
     .fields      = (VMStateField[]) {
         VMSTATE_MEMORY_HOTPLUG(acpi_memory_hotplug, PIIX4PMState),
         VMSTATE_END_OF_LIST()
@@ -298,12 +299,9 @@ static const VMStateDescription vmstate_acpi = {
                             vmstate_test_use_acpi_pci_hotplug),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_memhp_state,
-            .needed = vmstate_test_use_memhp,
-        },
-        VMSTATE_END_OF_LIST()
+    .subsections = (const VMStateDescription *[]) {
+         &vmstate_memhp_state,
+         NULL
     }
 };

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 6c86a6b..d39f44b 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -672,6 +672,7 @@ static const VMStateDescription vmstate_fdrive_media_changed = {
     .name = "fdrive/media_changed",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fdrive_media_changed_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(media_changed, FDrive),
         VMSTATE_END_OF_LIST()
@@ -689,6 +690,7 @@ static const VMStateDescription vmstate_fdrive_media_rate = {
     .name = "fdrive/media_rate",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fdrive_media_rate_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(media_rate, FDrive),
         VMSTATE_END_OF_LIST()
@@ -706,6 +708,7 @@ static const VMStateDescription vmstate_fdrive_perpendicular = {
     .name = "fdrive/perpendicular",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fdrive_perpendicular_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(perpendicular, FDrive),
         VMSTATE_END_OF_LIST()
@@ -729,19 +732,11 @@ static const VMStateDescription vmstate_fdrive = {
         VMSTATE_UINT8(sect, FDrive),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_fdrive_media_changed,
-            .needed = &fdrive_media_changed_needed,
-        } , {
-            .vmsd = &vmstate_fdrive_media_rate,
-            .needed = &fdrive_media_rate_needed,
-        } , {
-            .vmsd = &vmstate_fdrive_perpendicular,
-            .needed = &fdrive_perpendicular_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_fdrive_media_changed,
+        &vmstate_fdrive_media_rate,
+        &vmstate_fdrive_perpendicular,
+        NULL
     }
 };

@@ -772,6 +767,7 @@ static const VMStateDescription vmstate_fdc_reset_sensei = {
     .name = "fdc/reset_sensei",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fdc_reset_sensei_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(reset_sensei, FDCtrl),
         VMSTATE_END_OF_LIST()
@@ -789,6 +785,7 @@ static const VMStateDescription vmstate_fdc_result_timer = {
     .name = "fdc/result_timer",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fdc_result_timer_needed,
     .fields = (VMStateField[]) {
         VMSTATE_TIMER(result_timer, FDCtrl),
         VMSTATE_END_OF_LIST()
@@ -832,16 +829,10 @@ static const VMStateDescription vmstate_fdc = {
                              vmstate_fdrive, FDrive),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_fdc_reset_sensei,
-            .needed = fdc_reset_sensei_needed,
-        } , {
-            .vmsd = &vmstate_fdc_result_timer,
-            .needed = fdc_result_timer_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_fdc_reset_sensei,
+        &vmstate_fdc_result_timer,
+        NULL
     }
 };

diff --git a/hw/char/serial.c b/hw/char/serial.c
index ebcacdc..4e4be01 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -631,6 +631,7 @@ const VMStateDescription vmstate_serial_thr_ipending = {
     .name = "serial/thr_ipending",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_thr_ipending_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(thr_ipending, SerialState),
         VMSTATE_END_OF_LIST()
@@ -647,6 +648,7 @@ const VMStateDescription vmstate_serial_tsr = {
     .name = "serial/tsr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_tsr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(tsr_retry, SerialState),
         VMSTATE_UINT8(thr, SerialState),
@@ -666,6 +668,7 @@ const VMStateDescription vmstate_serial_recv_fifo = {
     .name = "serial/recv_fifo",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_recv_fifo_needed,
     .fields = (VMStateField[]) {
         VMSTATE_STRUCT(recv_fifo, SerialState, 1, vmstate_fifo8, Fifo8),
         VMSTATE_END_OF_LIST()
@@ -682,6 +685,7 @@ const VMStateDescription vmstate_serial_xmit_fifo = {
     .name = "serial/xmit_fifo",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_xmit_fifo_needed,
     .fields = (VMStateField[]) {
         VMSTATE_STRUCT(xmit_fifo, SerialState, 1, vmstate_fifo8, Fifo8),
         VMSTATE_END_OF_LIST()
@@ -698,6 +702,7 @@ const VMStateDescription vmstate_serial_fifo_timeout_timer = {
     .name = "serial/fifo_timeout_timer",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_fifo_timeout_timer_needed,
     .fields = (VMStateField[]) {
         VMSTATE_TIMER(fifo_timeout_timer, SerialState),
         VMSTATE_END_OF_LIST()
@@ -714,6 +719,7 @@ const VMStateDescription vmstate_serial_timeout_ipending = {
     .name = "serial/timeout_ipending",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = serial_timeout_ipending_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(timeout_ipending, SerialState),
         VMSTATE_END_OF_LIST()
@@ -729,6 +735,7 @@ static bool serial_poll_needed(void *opaque)
 const VMStateDescription vmstate_serial_poll = {
     .name = "serial/poll",
     .version_id = 1,
+    .needed = serial_poll_needed,
     .minimum_version_id = 1,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(poll_msl, SerialState),
@@ -757,31 +764,15 @@ const VMStateDescription vmstate_serial = {
         VMSTATE_UINT8_V(fcr_vmstate, SerialState, 3),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_serial_thr_ipending,
-            .needed = &serial_thr_ipending_needed,
-        } , {
-            .vmsd = &vmstate_serial_tsr,
-            .needed = &serial_tsr_needed,
-        } , {
-            .vmsd = &vmstate_serial_recv_fifo,
-            .needed = &serial_recv_fifo_needed,
-        } , {
-            .vmsd = &vmstate_serial_xmit_fifo,
-            .needed = &serial_xmit_fifo_needed,
-        } , {
-            .vmsd = &vmstate_serial_fifo_timeout_timer,
-            .needed = &serial_fifo_timeout_timer_needed,
-        } , {
-            .vmsd = &vmstate_serial_timeout_ipending,
-            .needed = &serial_timeout_ipending_needed,
-        } , {
-            .vmsd = &vmstate_serial_poll,
-            .needed = &serial_poll_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_serial_thr_ipending,
+        &vmstate_serial_tsr,
+        &vmstate_serial_recv_fifo,
+        &vmstate_serial_xmit_fifo,
+        &vmstate_serial_fifo_timeout_timer,
+        &vmstate_serial_timeout_ipending,
+        &vmstate_serial_poll,
+        NULL
     }
 };

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 93b3518..47a1f3a 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -2216,6 +2216,7 @@ static VMStateDescription qxl_vmstate_monitors_config = {
     .name               = "qxl/monitors-config",
     .version_id         = 1,
     .minimum_version_id = 1,
+    .needed = qxl_monitors_config_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(guest_monitors_config, PCIQXLDevice),
         VMSTATE_END_OF_LIST()
@@ -2249,13 +2250,9 @@ static VMStateDescription qxl_vmstate = {
         VMSTATE_UINT64(guest_cursor, PCIQXLDevice),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &qxl_vmstate_monitors_config,
-            .needed = qxl_monitors_config_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &qxl_vmstate_monitors_config,
+        NULL
     }
 };

diff --git a/hw/display/vga.c b/hw/display/vga.c
index 19e7f23..db20c2a 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -2039,6 +2039,7 @@ const VMStateDescription vmstate_vga_endian = {
     .name = "vga.endian",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = vga_endian_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_BOOL(big_endian_fb, VGACommonState),
         VMSTATE_END_OF_LIST()
@@ -2082,13 +2083,9 @@ const VMStateDescription vmstate_vga_common = {
         VMSTATE_UINT32(vbe_bank_mask, VGACommonState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_vga_endian,
-            .needed = vga_endian_state_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_vga_endian,
+        NULL
     }
 };

diff --git a/hw/ide/core.c b/hw/ide/core.c
index ae85428..d38e84f 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2448,6 +2448,7 @@ static const VMStateDescription vmstate_ide_atapi_gesn_state = {
     .name ="ide_drive/atapi/gesn_state",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ide_atapi_gesn_needed,
     .fields = (VMStateField[]) {
         VMSTATE_BOOL(events.new_media, IDEState),
         VMSTATE_BOOL(events.eject_request, IDEState),
@@ -2459,6 +2460,7 @@ static const VMStateDescription vmstate_ide_tray_state = {
     .name = "ide_drive/tray_state",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ide_tray_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_BOOL(tray_open, IDEState),
         VMSTATE_BOOL(tray_locked, IDEState),
@@ -2472,6 +2474,7 @@ static const VMStateDescription vmstate_ide_drive_pio_state = {
     .minimum_version_id = 1,
     .pre_save = ide_drive_pio_pre_save,
     .post_load = ide_drive_pio_post_load,
+    .needed = ide_drive_pio_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(req_nb_sectors, IDEState),
         VMSTATE_VARRAY_INT32(io_buffer, IDEState, io_buffer_total_len, 1,
@@ -2513,19 +2516,11 @@ const VMStateDescription vmstate_ide_drive = {
         VMSTATE_UINT8_V(cdrom_changed, IDEState, 3),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_ide_drive_pio_state,
-            .needed = ide_drive_pio_state_needed,
-        }, {
-            .vmsd = &vmstate_ide_tray_state,
-            .needed = ide_tray_state_needed,
-        }, {
-            .vmsd = &vmstate_ide_atapi_gesn_state,
-            .needed = ide_atapi_gesn_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ide_drive_pio_state,
+        &vmstate_ide_tray_state,
+        &vmstate_ide_atapi_gesn_state,
+        NULL
     }
 };

@@ -2533,6 +2528,7 @@ static const VMStateDescription vmstate_ide_error_status = {
     .name ="ide_bus/error",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ide_error_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(error_status, IDEBus),
         VMSTATE_END_OF_LIST()
@@ -2548,13 +2544,9 @@ const VMStateDescription vmstate_ide_bus = {
         VMSTATE_UINT8(unit, IDEBus),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_ide_error_status,
-            .needed = ide_error_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ide_error_status,
+        NULL
     }
 };

diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index 2397f35..07f66e0 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -402,6 +402,7 @@ static const VMStateDescription vmstate_bmdma_current = {
     .name = "ide bmdma_current",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ide_bmdma_current_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(cur_addr, BMDMAState),
         VMSTATE_UINT32(cur_prd_last, BMDMAState),
@@ -415,6 +416,7 @@ static const VMStateDescription vmstate_bmdma_status = {
     .name ="ide bmdma/status",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ide_bmdma_status_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(status, BMDMAState),
         VMSTATE_END_OF_LIST()
@@ -435,16 +437,10 @@ static const VMStateDescription vmstate_bmdma = {
         VMSTATE_UINT8(unit, BMDMAState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_bmdma_current,
-            .needed = ide_bmdma_current_needed,
-        }, {
-            .vmsd = &vmstate_bmdma_status,
-            .needed = ide_bmdma_status_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_bmdma_current,
+        &vmstate_bmdma_status,
+        NULL
     }
 };

diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index 2b0cd3d..8121b5b 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -385,23 +385,24 @@ static int kbd_outport_post_load(void *opaque, int version_id)
     return 0;
 }

+static bool kbd_outport_needed(void *opaque)
+{
+    KBDState *s = opaque;
+    return s->outport != kbd_outport_default(s);
+}
+
 static const VMStateDescription vmstate_kbd_outport = {
     .name = "pckbd_outport",
     .version_id = 1,
     .minimum_version_id = 1,
     .post_load = kbd_outport_post_load,
+    .needed = kbd_outport_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(outport, KBDState),
         VMSTATE_END_OF_LIST()
     }
 };

-static bool kbd_outport_needed(void *opaque)
-{
-    KBDState *s = opaque;
-    return s->outport != kbd_outport_default(s);
-}
-
 static int kbd_post_load(void *opaque, int version_id)
 {
     KBDState *s = opaque;
@@ -424,12 +425,9 @@ static const VMStateDescription vmstate_kbd = {
         VMSTATE_UINT8(pending, KBDState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_kbd_outport,
-            .needed = kbd_outport_needed,
-        },
-        VMSTATE_END_OF_LIST()
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_kbd_outport,
+        NULL
     }
 };

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index a466e25..ac9dd7c 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -663,6 +663,7 @@ static const VMStateDescription vmstate_ps2_keyboard_ledstate = {
     .version_id = 3,
     .minimum_version_id = 2,
     .post_load = ps2_kbd_ledstate_post_load,
+    .needed = ps2_keyboard_ledstate_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(ledstate, PS2KbdState),
         VMSTATE_END_OF_LIST()
@@ -703,13 +704,9 @@ static const VMStateDescription vmstate_ps2_keyboard = {
         VMSTATE_INT32_V(scancode_set, PS2KbdState,3),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_ps2_keyboard_ledstate,
-            .needed = ps2_keyboard_ledstate_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ps2_keyboard_ledstate,
+        NULL
     }
 };

diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 177023b..82312b5 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -625,6 +625,7 @@ static const VMStateDescription vmstate_ich9_rst_cnt = {
     .name = "ICH9LPC/rst_cnt",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = ich9_rst_cnt_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(rst_cnt, ICH9LPCState),
         VMSTATE_END_OF_LIST()
@@ -644,12 +645,9 @@ static const VMStateDescription vmstate_ich9_lpc = {
         VMSTATE_UINT32(sci_level, ICH9LPCState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_ich9_rst_cnt,
-            .needed = ich9_rst_cnt_needed
-        },
-        { 0 }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ich9_rst_cnt,
+        NULL
     }
 };

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 272df00..9bcf3b8 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1367,6 +1367,7 @@ static const VMStateDescription vmstate_e1000_mit_state = {
     .name = "e1000/mit_state",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = e1000_mit_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(mac_reg[RDTR], E1000State),
         VMSTATE_UINT32(mac_reg[RADV], E1000State),
@@ -1454,13 +1455,9 @@ static const VMStateDescription vmstate_e1000 = {
         VMSTATE_UINT32_SUB_ARRAY(mac_reg, E1000State, VFTA, 128),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_e1000_mit_state,
-            .needed = e1000_mit_state_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_e1000_mit_state,
+        NULL
     }
 };

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 6e59f38..50a1f1c 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3270,6 +3270,7 @@ static const VMStateDescription vmstate_rtl8139_hotplug_ready ={
     .name = "rtl8139/hotplug_ready",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = rtl8139_hotplug_ready_needed,
     .fields = (VMStateField[]) {
         VMSTATE_END_OF_LIST()
     }
@@ -3366,13 +3367,9 @@ static const VMStateDescription vmstate_rtl8139 = {
         VMSTATE_UINT32_V(cplus_enabled, RTL8139State, 4),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_rtl8139_hotplug_ready,
-            .needed = rtl8139_hotplug_ready_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_rtl8139_hotplug_ready,
+        NULL
     }
 };

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index f246fa1..036d3a5 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2230,6 +2230,7 @@ static const VMStateDescription vmxstate_vmxnet3_mcast_list = {
     .version_id = 1,
     .minimum_version_id = 1,
     .pre_load = vmxnet3_mcast_list_pre_load,
+    .needed = vmxnet3_mc_list_needed,
     .fields = (VMStateField[]) {
         VMSTATE_VBUFFER_UINT32(mcast_list, VMXNET3State, 0, NULL, 0,
             mcast_list_buff_size),
@@ -2474,14 +2475,9 @@ static const VMStateDescription vmstate_vmxnet3 = {

             VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmxstate_vmxnet3_mcast_list,
-            .needed = vmxnet3_mc_list_needed
-        },
-        {
-            /* empty element. */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmxstate_vmxnet3_mcast_list,
+        NULL
     }
 };

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 1530038..f7f60dc 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -579,6 +579,7 @@ static const VMStateDescription vmstate_piix3_rcr = {
     .name = "PIIX3/rcr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = piix3_rcr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(rcr, PIIX3State),
         VMSTATE_END_OF_LIST()
@@ -597,12 +598,9 @@ static const VMStateDescription vmstate_piix3 = {
                               PIIX_NUM_PIRQS, 3),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_piix3_rcr,
-            .needed = piix3_rcr_needed,
-        },
-        { 0 }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_piix3_rcr,
+        NULL
     }
 };

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 0f3e039..14e0232 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1965,6 +1965,7 @@ static const VMStateDescription vmstate_scsi_sense_state = {
     .name = "SCSIDevice/sense",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = scsi_sense_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8_SUB_ARRAY(sense, SCSIDevice,
                                 SCSI_SENSE_BUF_SIZE_OLD,
@@ -1995,13 +1996,9 @@ const VMStateDescription vmstate_scsi_device = {
         },
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_scsi_sense_state,
-            .needed = scsi_sense_state_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_scsi_sense_state,
+        NULL
     }
 };

diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index e160e8f..2f4913d 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -282,6 +282,7 @@ static const VMStateDescription vmstate_hpet_rtc_irq_level = {
     .name = "hpet/rtc_irq_level",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = hpet_rtc_irq_level_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(rtc_irq_level, HPETState),
         VMSTATE_END_OF_LIST()
@@ -321,13 +322,9 @@ static const VMStateDescription vmstate_hpet = {
                                     vmstate_hpet_timer, HPETTimer),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_hpet_rtc_irq_level,
-            .needed = hpet_rtc_irq_level_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_hpet_rtc_irq_level,
+        NULL
     }
 };

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index f18d128..8c9d46e 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -733,22 +733,23 @@ static int rtc_post_load(void *opaque, int version_id)
     return 0;
 }

+static bool rtc_irq_reinject_on_ack_count_needed(void *opaque)
+{
+    RTCState *s = (RTCState *)opaque;
+    return s->irq_reinject_on_ack_count != 0;
+}
+
 static const VMStateDescription vmstate_rtc_irq_reinject_on_ack_count = {
     .name = "irq_reinject_on_ack_count",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = rtc_irq_reinject_on_ack_count_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT16(irq_reinject_on_ack_count, RTCState),
         VMSTATE_END_OF_LIST()
     }
 };

-static bool rtc_irq_reinject_on_ack_count_needed(void *opaque)
-{
-    RTCState *s = (RTCState *)opaque;
-    return s->irq_reinject_on_ack_count != 0;
-}
-
 static const VMStateDescription vmstate_rtc = {
     .name = "mc146818rtc",
     .version_id = 3,
@@ -770,13 +771,9 @@ static const VMStateDescription vmstate_rtc = {
         VMSTATE_UINT64_V(next_alarm_time, RTCState, 3),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_rtc_irq_reinject_on_ack_count,
-            .needed = rtc_irq_reinject_on_ack_count_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_rtc_irq_reinject_on_ack_count,
+        NULL
     }
 };

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 9a84eb6..7d5e983 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -2014,6 +2014,7 @@ static const VMStateDescription vmstate_ohci_eof_timer = {
     .version_id = 1,
     .minimum_version_id = 1,
     .pre_load = ohci_eof_timer_pre_load,
+    .needed = ohci_eof_timer_needed,
     .fields = (VMStateField[]) {
         VMSTATE_TIMER(eof_timer, OHCIState),
         VMSTATE_END_OF_LIST()
@@ -2061,13 +2062,9 @@ static const VMStateDescription vmstate_ohci_state = {
         VMSTATE_BOOL(async_complete, OHCIState),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_ohci_eof_timer,
-            .needed = ohci_eof_timer_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_ohci_eof_timer,
+        NULL
     }
 };

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index e2c9896..e2d7bac 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -2256,40 +2256,42 @@ static const VMStateInfo usbredir_ep_bufpq_vmstate_info = {


 /* For endp_data migration */
+static bool usbredir_bulk_receiving_needed(void *priv)
+{
+    struct endp_data *endp = priv;
+
+    return endp->bulk_receiving_started;
+}
+
 static const VMStateDescription usbredir_bulk_receiving_vmstate = {
     .name = "usb-redir-ep/bulk-receiving",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = usbredir_bulk_receiving_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(bulk_receiving_started, struct endp_data),
         VMSTATE_END_OF_LIST()
     }
 };

-static bool usbredir_bulk_receiving_needed(void *priv)
+static bool usbredir_stream_needed(void *priv)
 {
     struct endp_data *endp = priv;

-    return endp->bulk_receiving_started;
+    return endp->max_streams;
 }

 static const VMStateDescription usbredir_stream_vmstate = {
     .name = "usb-redir-ep/stream-state",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = usbredir_stream_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(max_streams, struct endp_data),
         VMSTATE_END_OF_LIST()
     }
 };

-static bool usbredir_stream_needed(void *priv)
-{
-    struct endp_data *endp = priv;
-
-    return endp->max_streams;
-}
-
 static const VMStateDescription usbredir_ep_vmstate = {
     .name = "usb-redir-ep",
     .version_id = 1,
@@ -2317,16 +2319,10 @@ static const VMStateDescription usbredir_ep_vmstate = {
         VMSTATE_INT32(bufpq_target_size, struct endp_data),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &usbredir_bulk_receiving_vmstate,
-            .needed = usbredir_bulk_receiving_needed,
-        }, {
-            .vmsd = &usbredir_stream_vmstate,
-            .needed = usbredir_stream_needed,
-        }, {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &usbredir_bulk_receiving_vmstate,
+        &usbredir_stream_vmstate,
+        NULL
     }
 };

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2c236bf..abdf5c5 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -887,6 +887,7 @@ static const VMStateDescription vmstate_virtio_device_endian = {
     .name = "virtio/device_endian",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = &virtio_device_endian_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT8(device_endian, VirtIODevice),
         VMSTATE_END_OF_LIST()
@@ -901,12 +902,9 @@ static const VMStateDescription vmstate_virtio = {
     .fields = (VMStateField[]) {
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_virtio_device_endian,
-            .needed = &virtio_device_endian_needed
-        },
-        { 0 }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_virtio_device_endian,
+        NULL
     }
 };

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 9a001bd..3556924 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -119,11 +119,6 @@ typedef struct {
     bool (*field_exists)(void *opaque, int version_id);
 } VMStateField;

-typedef struct VMStateSubsection {
-    const VMStateDescription *vmsd;
-    bool (*needed)(void *opaque);
-} VMStateSubsection;
-
 struct VMStateDescription {
     const char *name;
     int unmigratable;
@@ -134,8 +129,9 @@ struct VMStateDescription {
     int (*pre_load)(void *opaque);
     int (*post_load)(void *opaque, int version_id);
     void (*pre_save)(void *opaque);
+    bool (*needed)(void *opaque);
     VMStateField *fields;
-    const VMStateSubsection *subsections;
+    const VMStateDescription **subsections;
 };

 #ifdef CONFIG_USER_ONLY
diff --git a/savevm.c b/savevm.c
index 2d8eb96..eed43e6 100644
--- a/savevm.c
+++ b/savevm.c
@@ -263,11 +263,11 @@ static void dump_vmstate_vmsf(FILE *out_file, const VMStateField *field,
 }

 static void dump_vmstate_vmss(FILE *out_file,
-                              const VMStateSubsection *subsection,
+                              const VMStateDescription **subsection,
                               int indent)
 {
-    if (subsection->vmsd != NULL) {
-        dump_vmstate_vmsd(out_file, subsection->vmsd, indent, true);
+    if (*subsection != NULL) {
+        dump_vmstate_vmsd(out_file, *subsection, indent, true);
     }
 }

@@ -308,12 +308,12 @@ static void dump_vmstate_vmsd(FILE *out_file,
         fprintf(out_file, "\n%*s]", indent, "");
     }
     if (vmsd->subsections != NULL) {
-        const VMStateSubsection *subsection = vmsd->subsections;
+        const VMStateDescription **subsection = vmsd->subsections;
         bool first;

         fprintf(out_file, ",\n%*s\"Subsections\": [\n", indent, "");
         first = true;
-        while (subsection->vmsd != NULL) {
+        while (*subsection != NULL) {
             if (!first) {
                 fprintf(out_file, ",\n");
             }
diff --git a/target-arm/machine.c b/target-arm/machine.c
index ddb7d05..07803d0 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -40,6 +40,7 @@ static const VMStateDescription vmstate_vfp = {
     .name = "cpu/vfp",
     .version_id = 3,
     .minimum_version_id = 3,
+    .needed = vfp_needed,
     .fields = (VMStateField[]) {
         VMSTATE_FLOAT64_ARRAY(env.vfp.regs, ARMCPU, 64),
         /* The xregs array is a little awkward because element 1 (FPSCR)
@@ -72,6 +73,7 @@ static const VMStateDescription vmstate_iwmmxt = {
     .name = "cpu/iwmmxt",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = iwmmxt_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64_ARRAY(env.iwmmxt.regs, ARMCPU, 16),
         VMSTATE_UINT32_ARRAY(env.iwmmxt.cregs, ARMCPU, 16),
@@ -91,6 +93,7 @@ static const VMStateDescription vmstate_m = {
     .name = "cpu/m",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = m_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(env.v7m.other_sp, ARMCPU),
         VMSTATE_UINT32(env.v7m.vecbase, ARMCPU),
@@ -114,6 +117,7 @@ static const VMStateDescription vmstate_thumb2ee = {
     .name = "cpu/thumb2ee",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = thumb2ee_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(env.teecr, ARMCPU),
         VMSTATE_UINT32(env.teehbr, ARMCPU),
@@ -265,21 +269,11 @@ const VMStateDescription vmstate_arm_cpu = {
         VMSTATE_TIMER(gt_timer[GTIMER_VIRT], ARMCPU),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection[]) {
-        {
-            .vmsd = &vmstate_vfp,
-            .needed = vfp_needed,
-        } , {
-            .vmsd = &vmstate_iwmmxt,
-            .needed = iwmmxt_needed,
-        } , {
-            .vmsd = &vmstate_m,
-            .needed = m_needed,
-        } , {
-            .vmsd = &vmstate_thumb2ee,
-            .needed = thumb2ee_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_vfp,
+        &vmstate_iwmmxt,
+        &vmstate_m,
+        &vmstate_thumb2ee,
+        NULL
     }
 };
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 0dd49f0..b8f39b4 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -358,6 +358,7 @@ static const VMStateDescription vmstate_steal_time_msr = {
     .name = "cpu/steal_time_msr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = steal_time_msr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.steal_time_msr, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -368,6 +369,7 @@ static const VMStateDescription vmstate_async_pf_msr = {
     .name = "cpu/async_pf_msr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = async_pf_msr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.async_pf_en_msr, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -378,6 +380,7 @@ static const VMStateDescription vmstate_pv_eoi_msr = {
     .name = "cpu/async_pv_eoi_msr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = pv_eoi_msr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.pv_eoi_en_msr, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -396,6 +399,7 @@ static const VMStateDescription vmstate_fpop_ip_dp = {
     .name = "cpu/fpop_ip_dp",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fpop_ip_dp_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT16(env.fpop, X86CPU),
         VMSTATE_UINT64(env.fpip, X86CPU),
@@ -416,6 +420,7 @@ static const VMStateDescription vmstate_msr_tsc_adjust = {
     .name = "cpu/msr_tsc_adjust",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = tsc_adjust_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.tsc_adjust, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -434,6 +439,7 @@ static const VMStateDescription vmstate_msr_tscdeadline = {
     .name = "cpu/msr_tscdeadline",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = tscdeadline_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.tsc_deadline, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -460,6 +466,7 @@ static const VMStateDescription vmstate_msr_ia32_misc_enable = {
     .name = "cpu/msr_ia32_misc_enable",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = misc_enable_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_ia32_misc_enable, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -470,6 +477,7 @@ static const VMStateDescription vmstate_msr_ia32_feature_control = {
     .name = "cpu/msr_ia32_feature_control",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = feature_control_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_ia32_feature_control, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -504,6 +512,7 @@ static const VMStateDescription vmstate_msr_architectural_pmu = {
     .name = "cpu/msr_architectural_pmu",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = pmu_enable_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_fixed_ctr_ctrl, X86CPU),
         VMSTATE_UINT64(env.msr_global_ctrl, X86CPU),
@@ -539,6 +548,7 @@ static const VMStateDescription vmstate_mpx = {
     .name = "cpu/mpx",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = mpx_needed,
     .fields = (VMStateField[]) {
         VMSTATE_BND_REGS(env.bnd_regs, X86CPU, 4),
         VMSTATE_UINT64(env.bndcs_regs.cfgu, X86CPU),
@@ -560,6 +570,7 @@ static const VMStateDescription vmstate_msr_hypercall_hypercall = {
     .name = "cpu/msr_hyperv_hypercall",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = hyperv_hypercall_enable_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_hv_guest_os_id, X86CPU),
         VMSTATE_UINT64(env.msr_hv_hypercall, X86CPU),
@@ -579,6 +590,7 @@ static const VMStateDescription vmstate_msr_hyperv_vapic = {
     .name = "cpu/msr_hyperv_vapic",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = hyperv_vapic_enable_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_hv_vapic, X86CPU),
         VMSTATE_END_OF_LIST()
@@ -596,6 +608,7 @@ static bool hyperv_time_enable_needed(void *opaque)
 static const VMStateDescription vmstate_msr_hyperv_time = {
     .name = "cpu/msr_hyperv_time",
     .version_id = 1,
+    .needed = hyperv_time_enable_needed,
     .minimum_version_id = 1,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(env.msr_hv_tsc, X86CPU),
@@ -705,48 +718,20 @@ VMStateDescription vmstate_x86_cpu = {
         VMSTATE_END_OF_LIST()
         /* The above list is not sorted /wrt version numbers, watch out! */
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_async_pf_msr,
-            .needed = async_pf_msr_needed,
-        } , {
-            .vmsd = &vmstate_pv_eoi_msr,
-            .needed = pv_eoi_msr_needed,
-        } , {
-            .vmsd = &vmstate_steal_time_msr,
-            .needed = steal_time_msr_needed,
-        } , {
-            .vmsd = &vmstate_fpop_ip_dp,
-            .needed = fpop_ip_dp_needed,
-        }, {
-            .vmsd = &vmstate_msr_tsc_adjust,
-            .needed = tsc_adjust_needed,
-        }, {
-            .vmsd = &vmstate_msr_tscdeadline,
-            .needed = tscdeadline_needed,
-        }, {
-            .vmsd = &vmstate_msr_ia32_misc_enable,
-            .needed = misc_enable_needed,
-        }, {
-            .vmsd = &vmstate_msr_ia32_feature_control,
-            .needed = feature_control_needed,
-        }, {
-            .vmsd = &vmstate_msr_architectural_pmu,
-            .needed = pmu_enable_needed,
-        } , {
-            .vmsd = &vmstate_mpx,
-            .needed = mpx_needed,
-        }, {
-            .vmsd = &vmstate_msr_hypercall_hypercall,
-            .needed = hyperv_hypercall_enable_needed,
-        }, {
-            .vmsd = &vmstate_msr_hyperv_vapic,
-            .needed = hyperv_vapic_enable_needed,
-        }, {
-            .vmsd = &vmstate_msr_hyperv_time,
-            .needed = hyperv_time_enable_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_async_pf_msr,
+        &vmstate_pv_eoi_msr,
+        &vmstate_steal_time_msr,
+        &vmstate_fpop_ip_dp,
+        &vmstate_msr_tsc_adjust,
+        &vmstate_msr_tscdeadline,
+        &vmstate_msr_ia32_misc_enable,
+        &vmstate_msr_ia32_feature_control,
+        &vmstate_msr_architectural_pmu,
+        &vmstate_mpx,
+        &vmstate_msr_hypercall_hypercall,
+        &vmstate_msr_hyperv_vapic,
+        &vmstate_msr_hyperv_time,
+        NULL
     }
 };
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index c801b82..3c10a41 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -207,6 +207,7 @@ static const VMStateDescription vmstate_fpu = {
     .name = "cpu/fpu",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = fpu_needed,
     .fields = (VMStateField[]) {
         VMSTATE_FLOAT64_ARRAY(env.fpr, PowerPCCPU, 32),
         VMSTATE_UINTTL(env.fpscr, PowerPCCPU),
@@ -225,6 +226,7 @@ static const VMStateDescription vmstate_altivec = {
     .name = "cpu/altivec",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = altivec_needed,
     .fields = (VMStateField[]) {
         VMSTATE_AVR_ARRAY(env.avr, PowerPCCPU, 32),
         VMSTATE_UINT32(env.vscr, PowerPCCPU),
@@ -243,6 +245,7 @@ static const VMStateDescription vmstate_vsx = {
     .name = "cpu/vsx",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = vsx_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64_ARRAY(env.vsr, PowerPCCPU, 32),
         VMSTATE_END_OF_LIST()
@@ -263,6 +266,7 @@ static const VMStateDescription vmstate_tm = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .needed = tm_needed,
     .fields      = (VMStateField []) {
         VMSTATE_UINTTL_ARRAY(env.tm_gpr, PowerPCCPU, 32),
         VMSTATE_AVR_ARRAY(env.tm_vsr, PowerPCCPU, 64),
@@ -296,6 +300,7 @@ static const VMStateDescription vmstate_sr = {
     .name = "cpu/sr",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = sr_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINTTL_ARRAY(env.sr, PowerPCCPU, 32),
         VMSTATE_END_OF_LIST()
@@ -345,6 +350,7 @@ static const VMStateDescription vmstate_slb = {
     .name = "cpu/slb",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = slb_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32_EQUAL(env.slb_nr, PowerPCCPU),
         VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, MAX_SLB_ENTRIES),
@@ -377,6 +383,7 @@ static const VMStateDescription vmstate_tlb6xx = {
     .name = "cpu/tlb6xx",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = tlb6xx_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
         VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlb6, PowerPCCPU,
@@ -423,6 +430,7 @@ static const VMStateDescription vmstate_pbr403 = {
     .name = "cpu/pbr403",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = pbr403_needed,
     .fields = (VMStateField[]) {
         VMSTATE_UINTTL_ARRAY(env.pb, PowerPCCPU, 4),
         VMSTATE_END_OF_LIST()
@@ -433,6 +441,7 @@ static const VMStateDescription vmstate_tlbemb = {
     .name = "cpu/tlb6xx",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = tlbemb_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
         VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlbe, PowerPCCPU,
@@ -442,13 +451,9 @@ static const VMStateDescription vmstate_tlbemb = {
         /* 403 protection registers */
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_pbr403,
-            .needed = pbr403_needed,
-        } , {
-            /* empty */
-        }
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_pbr403,
+        NULL
     }
 };

@@ -477,6 +482,7 @@ static const VMStateDescription vmstate_tlbmas = {
     .name = "cpu/tlbmas",
     .version_id = 1,
     .minimum_version_id = 1,
+    .needed = tlbmas_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
         VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlbm, PowerPCCPU,
@@ -527,38 +533,18 @@ const VMStateDescription vmstate_ppc_cpu = {
         VMSTATE_UINT32_EQUAL(env.nb_BATs, PowerPCCPU),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (VMStateSubsection []) {
-        {
-            .vmsd = &vmstate_fpu,
-            .needed = fpu_needed,
-        } , {
-            .vmsd = &vmstate_altivec,
-            .needed = altivec_needed,
-        } , {
-            .vmsd = &vmstate_vsx,
-            .needed = vsx_needed,
-        } , {
-            .vmsd = &vmstate_sr,
-            .needed = sr_needed,
-        } , {
+    .subsections = (const VMStateDescription *[]) {
+        &vmstate_fpu,
+        &vmstate_altivec,
+        &vmstate_vsx,
+        &vmstate_sr,
 #ifdef TARGET_PPC64
-            .vmsd = &vmstate_tm,
-            .needed = tm_needed,
-        } , {
-            .vmsd = &vmstate_slb,
-            .needed = slb_needed,
-        } , {
+        &vmstate_tm,
+        &vmstate_slb,
 #endif /* TARGET_PPC64 */
-            .vmsd = &vmstate_tlb6xx,
-            .needed = tlb6xx_needed,
-        } , {
-            .vmsd = &vmstate_tlbemb,
-            .needed = tlbemb_needed,
-        } , {
-            .vmsd = &vmstate_tlbmas,
-            .needed = tlbmas_needed,
-        } , {
-            /* empty */
-        }
+        &vmstate_tlb6xx,
+        &vmstate_tlbemb,
+        &vmstate_tlbmas,
+        NULL
     }
 };
diff --git a/vmstate.c b/vmstate.c
index ef2f87b..bb02b7d 100644
--- a/vmstate.c
+++ b/vmstate.c
@@ -171,11 +171,11 @@ void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
 }

 static const VMStateDescription *
-    vmstate_get_subsection(const VMStateSubsection *sub, char *idstr)
+vmstate_get_subsection(const VMStateDescription **sub, char *idstr)
 {
-    while (sub && sub->needed) {
-        if (strcmp(idstr, sub->vmsd->name) == 0) {
-            return sub->vmsd;
+    while (sub && *sub && (*sub)->needed) {
+        if (strcmp(idstr, (*sub)->name) == 0) {
+            return *sub;
         }
         sub++;
     }
@@ -226,11 +226,11 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 static void vmstate_subsection_save(QEMUFile *f, const VMStateDescription *vmsd,
                                     void *opaque)
 {
-    const VMStateSubsection *sub = vmsd->subsections;
+    const VMStateDescription **sub = vmsd->subsections;

-    while (sub && sub->needed) {
-        if (sub->needed(opaque)) {
-            const VMStateDescription *vmsd = sub->vmsd;
+    while (sub && *sub && (*sub)->needed) {
+        if ((*sub)->needed(opaque)) {
+            const VMStateDescription *vmsd = *sub;
             uint8_t len;

             qemu_put_byte(f, QEMU_VM_SUBSECTION);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 1/7] migration: Create optional sections Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-20 10:24   ` Dr. David Alan Gilbert
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 3/7] runstate: create runstate_index function Juan Quintela
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

This allows us to store the current state to send it through migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/sysemu/sysemu.h |  1 +
 vl.c                    | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index d8539fd..ae217da 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -28,6 +28,7 @@ bool runstate_check(RunState state);
 void runstate_set(RunState new_state);
 int runstate_is_running(void);
 bool runstate_needs_reset(void);
+int runstate_store(char *str, int size);
 typedef struct vm_change_state_entry VMChangeStateEntry;
 typedef void VMChangeStateHandler(void *opaque, int running, RunState state);

diff --git a/vl.c b/vl.c
index 964d634..ce8e28b 100644
--- a/vl.c
+++ b/vl.c
@@ -677,6 +677,16 @@ bool runstate_check(RunState state)
     return current_run_state == state;
 }

+int runstate_store(char *str, int size)
+{
+    const char *state = RunState_lookup[current_run_state];
+
+    if (strlen(state)+1 > size)
+        return -1;
+    strncpy(str, state, strlen(state)+1);
+    return 0;
+}
+
 static void runstate_init(void)
 {
     const RunStateTransition *p;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 3/7] runstate: create runstate_index function
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 1/7] migration: Create optional sections Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 2/7] runstate: Add runstate store Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now Juan Quintela
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

Given a string state, we need a way to get the RunState for that string.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/sysemu/sysemu.h |  1 +
 vl.c                    | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index ae217da..2a6e669 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -29,6 +29,7 @@ void runstate_set(RunState new_state);
 int runstate_is_running(void);
 bool runstate_needs_reset(void);
 int runstate_store(char *str, int size);
+RunState runstate_index(char *str);
 typedef struct vm_change_state_entry VMChangeStateEntry;
 typedef void VMChangeStateHandler(void *opaque, int running, RunState state);

diff --git a/vl.c b/vl.c
index ce8e28b..c4bc43c 100644
--- a/vl.c
+++ b/vl.c
@@ -687,6 +687,19 @@ int runstate_store(char *str, int size)
     return 0;
 }

+RunState runstate_index(char *str)
+{
+    RunState i = 0;
+
+    while (i != RUN_STATE_MAX) {
+        if (strcmp(str, RunState_lookup[i]) == 0) {
+            return i;
+        }
+        i++;
+    }
+    return -1;
+}
+
 static void runstate_init(void)
 {
     const RunStateTransition *p;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (2 preceding siblings ...)
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 3/7] runstate: create runstate_index function Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-15 14:42   ` Eric Blake
  2014-11-03 12:46   ` Amit Shah
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 5/7] migration: create now section to store global state Juan Quintela
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

Next commit would allow to move from incoming migration to error happening on source.

Should we add more states to this transition?  Luiz?

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 vl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/vl.c b/vl.c
index c4bc43c..1788b6a 100644
--- a/vl.c
+++ b/vl.c
@@ -618,6 +618,8 @@ static const RunStateTransition runstate_transitions_def[] = {

     { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+    { RUN_STATE_INMIGRATE, RUN_STATE_INTERNAL_ERROR },
+    { RUN_STATE_INMIGRATE, RUN_STATE_IO_ERROR },

     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 5/7] migration: create now section to store global state
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (3 preceding siblings ...)
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-20 10:52   ` Dr. David Alan Gilbert
  2014-10-20 11:11   ` Kevin Wolf
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 6/7] global_state: Make section optional Juan Quintela
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

This includes a new section that for now just stores the current qemu state.

Right now, there are only one way to control what is the state of the
target after migration.

- If you run the target qemu with -S, it would start stopped.
- If you run the target qemu without -S, it would run just after migration finishes.

The problem here is what happens if we start the target without -S and
there happens one error during migration that puts current state as
-EIO.  Migration would ends (notice that the error happend doing block
IO, network IO, i.e. nothing related with migration), and when
migration finish, we would just "continue" running on destination,
probably hanging the guest/corruption data, whatever.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/migration.h |  4 ++
 migration.c                   | 88 +++++++++++++++++++++++++++++++++++++++++--
 vl.c                          |  1 +
 3 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3cb5ba8..bc1069b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -174,4 +174,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                              ram_addr_t offset, size_t size,
                              int *bytes_sent);

+void register_global_state(void);
+void global_state_store(void);
+char *global_state_get_runstate(void);
+
 #endif
diff --git a/migration.c b/migration.c
index 8d675b3..6f7e50e 100644
--- a/migration.c
+++ b/migration.c
@@ -112,10 +112,20 @@ static void process_incoming_migration_co(void *opaque)
         exit(EXIT_FAILURE);
     }

-    if (autostart) {
+    /* runstate == "" means that we haven't received it through the
+     * wire, so we obey autostart.  runstate == runing means that we
+     * need to run it, we need to make sure that we do it after
+     * everything else has finished.  Every other state change is done
+     * at the post_load function */
+
+    if (strcmp(global_state_get_runstate(), "running") == 0 ) {
         vm_start();
-    } else {
-        runstate_set(RUN_STATE_PAUSED);
+    } else if (strcmp(global_state_get_runstate(), "") == 0 ) {
+        if (autostart) {
+            vm_start();
+        } else {
+            runstate_set(RUN_STATE_PAUSED);
+        }
     }
 }

@@ -608,6 +618,7 @@ static void *migration_thread(void *opaque)
                 qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
                 old_vm_running = runstate_is_running();

+                global_state_store();
                 ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
                 if (ret >= 0) {
                     qemu_file_set_rate_limit(s->file, INT64_MAX);
@@ -699,3 +710,74 @@ void migrate_fd_connect(MigrationState *s)
     qemu_thread_create(&s->thread, "migration", migration_thread, s,
                        QEMU_THREAD_JOINABLE);
 }
+
+typedef struct {
+    int32_t size;
+    uint8_t runstate[100];
+} GlobalState;
+
+static GlobalState global_state;
+
+void global_state_store(void)
+{
+    if (runstate_store((char*)global_state.runstate,
+                       sizeof(global_state.runstate)) == -1) {
+        printf("Runstate is too big\n");
+        exit(-1);
+    }
+}
+
+char *global_state_get_runstate(void)
+{
+    return (char *)global_state.runstate;
+}
+
+static int global_state_post_load(void *opaque, int version_id)
+{
+    GlobalState *s = opaque;
+    int ret = 0;
+    char *runstate = (char*)s->runstate;
+
+    printf("loaded state: %s\n", runstate);
+
+    if (strcmp(runstate, "running") != 0) {
+
+        RunState r = runstate_index(runstate);
+
+        if (r == -1) {
+            printf("Unknown received state %s\n", runstate);
+            return -1;
+        }
+        ret = vm_stop_force_state(r);
+    }
+
+   return ret;
+}
+
+static void global_state_pre_save(void *opaque)
+{
+    GlobalState *s = opaque;
+
+    s->size = strlen((char*)s->runstate) + 1;
+    printf("saved state: %s\n", s->runstate);
+}
+
+static const VMStateDescription vmstate_globalstate = {
+    .name = "globalstate",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = global_state_post_load,
+    .pre_save = global_state_pre_save,
+    .fields = (VMStateField[]) {
+        VMSTATE_INT32(size, GlobalState),
+        VMSTATE_BUFFER(runstate, GlobalState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+void register_global_state(void)
+{
+    /* We would use it independently that we receive it */
+    strcpy((char*)&global_state.runstate, "");
+    vmstate_register(NULL, 0, &vmstate_globalstate, &global_state);
+}
diff --git a/vl.c b/vl.c
index 1788b6a..75e855e 100644
--- a/vl.c
+++ b/vl.c
@@ -4511,6 +4511,7 @@ int main(int argc, char **argv, char **envp)
         return 0;
     }

+    register_global_state();
     if (incoming) {
         Error *local_err = NULL;
         qemu_start_incoming_migration(incoming, &local_err);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 6/7] global_state: Make section optional
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (4 preceding siblings ...)
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 5/7] migration: create now section to store global state Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections Juan Quintela
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

This section would be sent:

a- for all new machine types
b- for old achine types if section state is different form {running,paused}
   that were the only giving us troubles.

So, in new qemus: it is alwasy there.  In old qemus: they are only
there if it an error has happened, basically stoping on target.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 hw/i386/pc_piix.c             |  2 ++
 include/migration/migration.h |  2 +-
 migration.c                   | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 4384633..078fbfb 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -52,6 +52,7 @@
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
+#include "migration/migration.h"

 #define MAX_IDE_BUS 2

@@ -323,6 +324,7 @@ static void pc_compat_2_0(MachineState *machine)
     legacy_acpi_table_size = 6652;
     smbios_legacy_mode = true;
     has_reserved_memory = false;
+    global_state_set_optional();
     pc_set_legacy_acpi_data_size();
 }

diff --git a/include/migration/migration.h b/include/migration/migration.h
index bc1069b..475ed48 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -177,5 +177,5 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
 void register_global_state(void);
 void global_state_store(void);
 char *global_state_get_runstate(void);
-
+void global_state_set_optional(void);
 #endif
diff --git a/migration.c b/migration.c
index 6f7e50e..7379073 100644
--- a/migration.c
+++ b/migration.c
@@ -712,6 +712,7 @@ void migrate_fd_connect(MigrationState *s)
 }

 typedef struct {
+    bool optional;
     int32_t size;
     uint8_t runstate[100];
 } GlobalState;
@@ -732,6 +733,33 @@ char *global_state_get_runstate(void)
     return (char *)global_state.runstate;
 }

+void global_state_set_optional(void)
+{
+    global_state.optional = true;
+}
+
+static bool global_state_needed(void *opaque)
+{
+    GlobalState *s = opaque;
+    char *runstate = (char*)s->runstate;
+
+    /* If it is not optional, it is mandatory */
+
+    if (s->optional == false) {
+        return true;
+    }
+
+    /* If state is running or paused, it is not needed */
+
+    if (strcmp(runstate, "running") == 0 ||
+        strcmp(runstate, "paused") == 0) {
+        return false;
+    }
+
+    /* for any other state it is needed */
+    return true;
+}
+
 static int global_state_post_load(void *opaque, int version_id)
 {
     GlobalState *s = opaque;
@@ -759,7 +787,7 @@ static void global_state_pre_save(void *opaque)
     GlobalState *s = opaque;

     s->size = strlen((char*)s->runstate) + 1;
-    printf("saved state: %s\n", s->runstate);
+    printf("saved state: %s optional (%d)\n", s->runstate, s->optional);
 }

 static const VMStateDescription vmstate_globalstate = {
@@ -768,6 +796,7 @@ static const VMStateDescription vmstate_globalstate = {
     .minimum_version_id = 1,
     .post_load = global_state_post_load,
     .pre_save = global_state_pre_save,
+    .needed = global_state_needed,
     .fields = (VMStateField[]) {
         VMSTATE_INT32(size, GlobalState),
         VMSTATE_BUFFER(runstate, GlobalState),
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (5 preceding siblings ...)
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 6/7] global_state: Make section optional Juan Quintela
@ 2014-10-15  7:55 ` Juan Quintela
  2014-10-15 14:45   ` Eric Blake
  2014-10-15 14:57 ` [Qemu-devel] [RFC 0/7] Optional toplevel sections Eric Blake
  2014-10-20 11:29 ` Kevin Wolf
  8 siblings, 1 reply; 26+ messages in thread
From: Juan Quintela @ 2014-10-15  7:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, laine, lcapitulino

To make sections optional, we need to do it at the beggining of the code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/vmstate.h |  2 ++
 savevm.c                    |  4 ++++
 vmstate.c                   | 11 +++++++++++
 3 files changed, 17 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 3556924..a5155be 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -766,6 +766,8 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
 void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
                         void *opaque);

+bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
+
 int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
                                    const VMStateDescription *vmsd,
                                    void *base, int alias_id,
diff --git a/savevm.c b/savevm.c
index eed43e6..3884e0b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -727,6 +727,10 @@ void qemu_savevm_state_complete(QEMUFile *f)
         if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
             continue;
         }
+        if (se->vmsd && !vmstate_save_needed(se->vmsd, se->opaque)) {
+            continue;
+        }
+
         trace_savevm_section_start(se->idstr, se->section_id);
         /* Section type */
         qemu_put_byte(f, QEMU_VM_SECTION_FULL);
diff --git a/vmstate.c b/vmstate.c
index bb02b7d..b8914ac 100644
--- a/vmstate.c
+++ b/vmstate.c
@@ -131,6 +131,17 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
     return 0;
 }

+
+bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque)
+{
+    if (vmsd->needed && !vmsd->needed(opaque)) {
+        /* optional section not needed */
+        return false;
+    }
+    return true;
+}
+
+
 void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
                         void *opaque)
 {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now Juan Quintela
@ 2014-10-15 14:42   ` Eric Blake
  2014-10-15 15:50     ` Juan Quintela
  2014-11-03 12:46   ` Amit Shah
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Blake @ 2014-10-15 14:42 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: kwolf, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 1066 bytes --]

On 10/15/2014 01:55 AM, Juan Quintela wrote:
> Next commit would allow to move from incoming migration to error happening on source.
> 
> Should we add more states to this transition?  Luiz?

Does this mean we can finally do S3 migration without forcing the guest
to wake up from S3 on the destination?

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  vl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index c4bc43c..1788b6a 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -618,6 +618,8 @@ static const RunStateTransition runstate_transitions_def[] = {
> 
>      { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
> +    { RUN_STATE_INMIGRATE, RUN_STATE_INTERNAL_ERROR },
> +    { RUN_STATE_INMIGRATE, RUN_STATE_IO_ERROR },
> 
>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections Juan Quintela
@ 2014-10-15 14:45   ` Eric Blake
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Blake @ 2014-10-15 14:45 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: kwolf, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 500 bytes --]

On 10/15/2014 01:55 AM, Juan Quintela wrote:
> To make sections optional, we need to do it at the beggining of the code.

s/beggining/beginning/

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/migration/vmstate.h |  2 ++
>  savevm.c                    |  4 ++++
>  vmstate.c                   | 11 +++++++++++
>  3 files changed, 17 insertions(+)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (6 preceding siblings ...)
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections Juan Quintela
@ 2014-10-15 14:57 ` Eric Blake
  2014-10-15 15:59   ` Juan Quintela
  2014-10-20 11:29 ` Kevin Wolf
  8 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2014-10-15 14:57 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: kwolf, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 5018 bytes --]

On 10/15/2014 01:55 AM, Juan Quintela wrote:
> Hi
> 
> by popular demand, and after too many time, this series.  This is an
> RFC to know what people think about how to use them, the interface
> proposed, whatever.
> 
> * simplify optional subsections moving the "needed" function to
>   vmstate description.  I think that this simplification makes sense
>   by itself, it is indipendent of the rest of the patches.
> 
> * runstate: To make an example of an optional section, I decided to
>   use current runstate.  Right now, we have a problem when:
>   - we start destination without -S
>   - we run migration, and it causes an ioerror on source, but migration finishes
>   - we try to run migration on destination anyways, when it is
>     possible that we could get disk corruption (the ioerror was there for a reason)
>   Luiz: You can see any obvious improvement about how we use runstates?
>   Laine: Could you told me if you (libvirt) like this or would want
>      something a bit different?

Right now, libvirt always uses -S, then just calls 'cont' on the
destination to resume the CPUs (if the migration was live and the source
was in the running state).  But if we start passing other states, 'cont'
might not be the right thing to do.  For example, if the guest is at S3
on the source, how do we transfer from in migration to S3 at the
destination?  Do we need a new monitor command that says to put the
guest into the same state that migration said it should be in (and the
command fails if migration was from an older source that did not send
the subsection)?

How can libvirt introspect that the destination qemu is new enough to
understand the subsections, and/or that the source qemu is new enough to
send the subsections?

> 
> * I sent that option indpendently for new machine types.
> 
> * For old machine types I use this as one example of optional section.
>   We only sent it when the state is different from "running" or "paused".
> 
>   So, the only case where we fail is if we migrate to an old qemu and
>   there is one error.

Seems reasonable.

> 
> * On the runstate subsection "postload" we can send any event for
>   anything that libvirt wants when migration finishes.
>   Laine, can you told us what libvirt would preffer for this?

So you're thinking that an event on the destination emitted stating that
'incoming migration is complete, and requests the following state' is
sufficient for libvirt to know how to put the domain into that state?

> 
> Kevin: You asked for optional sections in the past for the block
>    layer, would this proposal be enough for you?
> 
> Please review, comment.
> 
> Thanks, Juan.
> 
> PD.  Yes, on proper submission, patches 6 and 7 are on the wrong order.
> 
> Juan Quintela (7):
>   migration: Create optional sections
>   runstate: Add runstate store
>   runstate: create runstate_index function
>   runstate: migration allows more transitions now
>   migration: create now section to store global state
>   global_state: Make section optional
>   vmstate: Create optional sections
> 
>  cpus.c                        |  11 ++--
>  docs/migration.txt            |  11 ++--
>  exec.c                        |  11 ++--
>  hw/acpi/ich9.c                |  10 ++--
>  hw/acpi/piix4.c               |  10 ++--
>  hw/block/fdc.c                |  37 +++++--------
>  hw/char/serial.c              |  41 ++++++---------
>  hw/display/qxl.c              |  11 ++--
>  hw/display/vga.c              |  11 ++--
>  hw/i386/pc_piix.c             |   2 +
>  hw/ide/core.c                 |  32 +++++-------
>  hw/ide/pci.c                  |  16 +++---
>  hw/input/pckbd.c              |  22 ++++----
>  hw/input/ps2.c                |  11 ++--
>  hw/isa/lpc_ich9.c             |  10 ++--
>  hw/net/e1000.c                |  11 ++--
>  hw/net/rtl8139.c              |  11 ++--
>  hw/net/vmxnet3.c              |  12 ++---
>  hw/pci-host/piix.c            |  10 ++--
>  hw/scsi/scsi-bus.c            |  11 ++--
>  hw/timer/hpet.c               |  11 ++--
>  hw/timer/mc146818rtc.c        |  23 ++++-----
>  hw/usb/hcd-ohci.c             |  11 ++--
>  hw/usb/redirect.c             |  34 ++++++------
>  hw/virtio/virtio.c            |  10 ++--
>  include/migration/migration.h |   4 ++
>  include/migration/vmstate.h   |  10 ++--
>  include/sysemu/sysemu.h       |   2 +
>  migration.c                   | 117 ++++++++++++++++++++++++++++++++++++++++--
>  savevm.c                      |  14 +++--
>  target-arm/machine.c          |  26 ++++------
>  target-i386/machine.c         |  71 ++++++++++---------------
>  target-ppc/machine.c          |  62 +++++++++-------------
>  vl.c                          |  26 ++++++++++
>  vmstate.c                     |  27 +++++++---
>  35 files changed, 393 insertions(+), 356 deletions(-)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now
  2014-10-15 14:42   ` Eric Blake
@ 2014-10-15 15:50     ` Juan Quintela
  0 siblings, 0 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15 15:50 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, qemu-devel, laine, lcapitulino

Eric Blake <eblake@redhat.com> wrote:
> On 10/15/2014 01:55 AM, Juan Quintela wrote:
>> Next commit would allow to move from incoming migration to error
>> happening on source.
>> 
>> Should we add more states to this transition?  Luiz?
>
> Does this mean we can finally do S3 migration without forcing the guest
> to wake up from S3 on the destination?

I haven't tested, but it should just work (famous last words).

Would test.

Later, Juan.

>
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  vl.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/vl.c b/vl.c
>> index c4bc43c..1788b6a 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -618,6 +618,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>> 
>>      { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
>>      { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
>> +    { RUN_STATE_INMIGRATE, RUN_STATE_INTERNAL_ERROR },
>> +    { RUN_STATE_INMIGRATE, RUN_STATE_IO_ERROR },
>> 
>>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
>>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
>> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-15 14:57 ` [Qemu-devel] [RFC 0/7] Optional toplevel sections Eric Blake
@ 2014-10-15 15:59   ` Juan Quintela
  2014-10-15 16:37     ` Eric Blake
  2014-10-17 13:41     ` Paolo Bonzini
  0 siblings, 2 replies; 26+ messages in thread
From: Juan Quintela @ 2014-10-15 15:59 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, qemu-devel, laine, lcapitulino

Eric Blake <eblake@redhat.com> wrote:
> On 10/15/2014 01:55 AM, Juan Quintela wrote:
>> Hi
>> 
>> by popular demand, and after too many time, this series.  This is an
>> RFC to know what people think about how to use them, the interface
>> proposed, whatever.
>> 
>> * simplify optional subsections moving the "needed" function to
>>   vmstate description.  I think that this simplification makes sense
>>   by itself, it is indipendent of the rest of the patches.
>> 
>> * runstate: To make an example of an optional section, I decided to
>>   use current runstate.  Right now, we have a problem when:
>>   - we start destination without -S
>>   - we run migration, and it causes an ioerror on source, but migration finishes
>>   - we try to run migration on destination anyways, when it is
>>     possible that we could get disk corruption (the ioerror was there for a reason)
>>   Luiz: You can see any obvious improvement about how we use runstates?
>>   Laine: Could you told me if you (libvirt) like this or would want
>>      something a bit different?
>
> Right now, libvirt always uses -S, then just calls 'cont' on the
> destination to resume the CPUs (if the migration was live and the source
> was in the running state).  But if we start passing other states, 'cont'
> might not be the right thing to do.  For example, if the guest is at S3
> on the source, how do we transfer from in migration to S3 at the
> destination?

It should be transparent (I haven't tested, this series was to ask for
comments before investing more time on it).

> Do we need a new monitor command that says to put the
> guest into the same state that migration said it should be in (and the
> command fails if migration was from an older source that did not send
> the subsection)?

I think that you don't need the command.
Target is started "paused" (-S) or "running" (nothing).


source old: target old: no changes
source old: target new: the same as now, no changes
source new: target new: we set the right state.  And if it is not
"running" we don't run on destination, independent of what is happening.
source new: target old: if source is in state "running" or "paused", no
                        change.  If source is in error state, we sent
                        the section and migration gets aborted (target
                        don't understand it)

source new: target new, running with old machine type:
if state is "running" or "paused", nothing is sent.
if state is "error", target is set to "error".

So, I think that we get all the cases possible right, no?


> How can libvirt introspect that the destination qemu is new enough to
> understand the subsections, and/or that the source qemu is new enough to
> send the subsections?

A new qemu_option value would make for you?

>> 
>> * I sent that option indpendently for new machine types.
>> 
>> * For old machine types I use this as one example of optional section.
>>   We only sent it when the state is different from "running" or "paused".
>> 
>>   So, the only case where we fail is if we migrate to an old qemu and
>>   there is one error.
>
> Seems reasonable.
>
>> 
>> * On the runstate subsection "postload" we can send any event for
>>   anything that libvirt wants when migration finishes.
>>   Laine, can you told us what libvirt would preffer for this?
>
> So you're thinking that an event on the destination emitted stating that
> 'incoming migration is complete, and requests the following state' is
> sufficient for libvirt to know how to put the domain into that state?

It is up to libvirt what to do.

My idea here is that, if you don't use libvirt, you just start without
-S.

If you use libvirt, and you *don't* need to do anything special to run
after migration, you shouldn't use -S.  And I would emit an event saying
"migration was finished".

But what I want to know is _what_ events are you interested in?

Later, Juan.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-15 15:59   ` Juan Quintela
@ 2014-10-15 16:37     ` Eric Blake
  2014-10-17 13:41     ` Paolo Bonzini
  1 sibling, 0 replies; 26+ messages in thread
From: Eric Blake @ 2014-10-15 16:37 UTC (permalink / raw)
  To: quintela; +Cc: kwolf, qemu-devel, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 3267 bytes --]

On 10/15/2014 09:59 AM, Juan Quintela wrote:

>> Do we need a new monitor command that says to put the
>> guest into the same state that migration said it should be in (and the
>> command fails if migration was from an older source that did not send
>> the subsection)?
> 
> I think that you don't need the command.
> Target is started "paused" (-S) or "running" (nothing).

Libvirt will _always_ pass -S.  This is because there is a need for
additional handshaking from destination back to source to let the source
know that the destination is ready to take over operation.

> 
> 
> source old: target old: no changes
> source old: target new: the same as now, no changes
> source new: target new: we set the right state.  And if it is not
> "running" we don't run on destination, independent of what is happening.
> source new: target old: if source is in state "running" or "paused", no
>                         change.  If source is in error state, we sent
>                         the section and migration gets aborted (target
>                         don't understand it)
> 
> source new: target new, running with old machine type:
> if state is "running" or "paused", nothing is sent.
> if state is "error", target is set to "error".
> 
> So, I think that we get all the cases possible right, no?

Only if the existing 'cont' is changed to do something other than put
the destination into 'running', which doesn't sound good; or if we add
some new way for the destination to resume to the state passed in migration.

> 
> 
>> How can libvirt introspect that the destination qemu is new enough to
>> understand the subsections, and/or that the source qemu is new enough to
>> send the subsections?
> 
> A new qemu_option value would make for you?

Or even the existence of a new QMP command in parallel to 'cont' that
has semantics of 'please restore this guest to the state it had on
incoming migration'.

> If you use libvirt, and you *don't* need to do anything special to run
> after migration, you shouldn't use -S.  And I would emit an event saying
> "migration was finished".
> 

No, libvirt will ALWAYS use -S.  So what we need is the hooks for using
-S and still relying on the migration stream rather than the current
status quo of a blind 'cont' (or nothing, if libvirt knows the source
was also in the paused state).  In fact, it is more confusing than that:
libvirt has a live migration mode that will auto-fallback to paused
migration if live migration wasn't converging fast enough.  That is, on
the source, we intentionally pause the source to quit waiting for
convergence, but on the destination we then 'cont' to wake the guest
back up.  In _that_ scenario, the migration stream will contain data
that the guest is paused, but we WANT the destination to be running.

> But what I want to know is _what_ events are you interested in?

Really, an event that the destination is ready to be woken up, and some
indication of the destination having received state in the migration
stream (so that it will wake up to the correct state).

> 
> Later, Juan.
> 
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-15 15:59   ` Juan Quintela
  2014-10-15 16:37     ` Eric Blake
@ 2014-10-17 13:41     ` Paolo Bonzini
  2014-10-20 14:59       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2014-10-17 13:41 UTC (permalink / raw)
  To: quintela, Eric Blake; +Cc: kwolf, qemu-devel, laine, lcapitulino

Il 15/10/2014 17:59, Juan Quintela ha scritto:
> My idea here is that, if you don't use libvirt, you just start without
> -S.

If you don't use libvirt or any other QEMU management layer, you're not
going to do migration except for debugging purposes.  There's just too
much state going on to be able to do it reliably.

> If you use libvirt, and you *don't* need to do anything special to run
> after migration, you shouldn't use -S.

Is this a real requirement, or just "it sounds nicer that way"?  How
much time really passes between the end of migration and the issuing of
the "-cont" command?

And the $1,000,000 questionL.aAre you _absolutely_ sure that an
automatic restart is entirely robust against a failure of the connection
between the two libvirtd instances?  Could you end up with the VM
running on two hosts?  Using -S gets QEMU completely out of the
equation, which is nice.

By the way, some of the states (I can think of io-error, guest-panicked,
watchdog) can be detected on the destination and restored.  Migrating a
machine with io-error state is definitely something that you want to do
no matter what versions of QEMU you have.  It may be the only way to
recover for a network partition like this:

           DISK
          /    \
         |      \
         X       |
         |       |
        SRC --- DEST

(not impossible: e.g. the SRC->DISK is fibre channel, but the SRC->DEST
link is Ethernet.  Or you have a replicated disk setup, some daemon
fails in SRC's replica but not DEST's).

> And I would emit an event saying
> "migration was finished".

The event should be emitted nevertheless. :)

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 2/7] runstate: Add runstate store Juan Quintela
@ 2014-10-20 10:24   ` Dr. David Alan Gilbert
  2014-10-20 10:52     ` Juan Quintela
  0 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2014-10-20 10:24 UTC (permalink / raw)
  To: Juan Quintela; +Cc: kwolf, qemu-devel, laine, lcapitulino

* Juan Quintela (quintela@redhat.com) wrote:
> This allows us to store the current state to send it through migration.

Why store the runstate as a string?  The later code then ends up doing
string compares and things - why not just use the enum value?

Dave

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/sysemu/sysemu.h |  1 +
>  vl.c                    | 10 ++++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index d8539fd..ae217da 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -28,6 +28,7 @@ bool runstate_check(RunState state);
>  void runstate_set(RunState new_state);
>  int runstate_is_running(void);
>  bool runstate_needs_reset(void);
> +int runstate_store(char *str, int size);
>  typedef struct vm_change_state_entry VMChangeStateEntry;
>  typedef void VMChangeStateHandler(void *opaque, int running, RunState state);
> 
> diff --git a/vl.c b/vl.c
> index 964d634..ce8e28b 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -677,6 +677,16 @@ bool runstate_check(RunState state)
>      return current_run_state == state;
>  }
> 
> +int runstate_store(char *str, int size)
> +{
> +    const char *state = RunState_lookup[current_run_state];
> +
> +    if (strlen(state)+1 > size)
> +        return -1;
> +    strncpy(str, state, strlen(state)+1);
> +    return 0;
> +}
> +
>  static void runstate_init(void)
>  {
>      const RunStateTransition *p;
> -- 
> 2.1.0
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] migration: create now section to store global state
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 5/7] migration: create now section to store global state Juan Quintela
@ 2014-10-20 10:52   ` Dr. David Alan Gilbert
  2014-10-20 11:11   ` Kevin Wolf
  1 sibling, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2014-10-20 10:52 UTC (permalink / raw)
  To: Juan Quintela; +Cc: kwolf, qemu-devel, laine, lcapitulino

* Juan Quintela (quintela@redhat.com) wrote:
> This includes a new section that for now just stores the current qemu state.
> 
> Right now, there are only one way to control what is the state of the
> target after migration.
> 
> - If you run the target qemu with -S, it would start stopped.
> - If you run the target qemu without -S, it would run just after migration finishes.
> 
> The problem here is what happens if we start the target without -S and
> there happens one error during migration that puts current state as
> -EIO.  Migration would ends (notice that the error happend doing block
> IO, network IO, i.e. nothing related with migration), and when
> migration finish, we would just "continue" running on destination,
> probably hanging the guest/corruption data, whatever.

A couple of questions:
   1) Does the ordering of loading this state matter - lets say that the source
     was in an error state, then all the other device states get loaded and then
     it loads your global_state which tells the destination is in error - is that
     too late ? Could the device emulation have already started doing some IO
     or something to the devices which it wouldn't have done if it knew there was
     already a problem?

   2) What's the advantage of the optional section over using the 'command' sections
      I use in postcopy; http://lists.gnu.org/archive/html/qemu-devel/2014-10/msg00337.html ?

Dave

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/migration/migration.h |  4 ++
>  migration.c                   | 88 +++++++++++++++++++++++++++++++++++++++++--
>  vl.c                          |  1 +
>  3 files changed, 90 insertions(+), 3 deletions(-)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 3cb5ba8..bc1069b 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -174,4 +174,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
>                               ram_addr_t offset, size_t size,
>                               int *bytes_sent);
> 
> +void register_global_state(void);
> +void global_state_store(void);
> +char *global_state_get_runstate(void);
> +
>  #endif
> diff --git a/migration.c b/migration.c
> index 8d675b3..6f7e50e 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -112,10 +112,20 @@ static void process_incoming_migration_co(void *opaque)
>          exit(EXIT_FAILURE);
>      }
> 
> -    if (autostart) {
> +    /* runstate == "" means that we haven't received it through the
> +     * wire, so we obey autostart.  runstate == runing means that we
> +     * need to run it, we need to make sure that we do it after
> +     * everything else has finished.  Every other state change is done
> +     * at the post_load function */
> +
> +    if (strcmp(global_state_get_runstate(), "running") == 0 ) {
>          vm_start();
> -    } else {
> -        runstate_set(RUN_STATE_PAUSED);
> +    } else if (strcmp(global_state_get_runstate(), "") == 0 ) {
> +        if (autostart) {
> +            vm_start();
> +        } else {
> +            runstate_set(RUN_STATE_PAUSED);
> +        }
>      }
>  }
> 
> @@ -608,6 +618,7 @@ static void *migration_thread(void *opaque)
>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>                  old_vm_running = runstate_is_running();
> 
> +                global_state_store();
>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>                  if (ret >= 0) {
>                      qemu_file_set_rate_limit(s->file, INT64_MAX);
> @@ -699,3 +710,74 @@ void migrate_fd_connect(MigrationState *s)
>      qemu_thread_create(&s->thread, "migration", migration_thread, s,
>                         QEMU_THREAD_JOINABLE);
>  }
> +
> +typedef struct {
> +    int32_t size;
> +    uint8_t runstate[100];
> +} GlobalState;
> +
> +static GlobalState global_state;
> +
> +void global_state_store(void)
> +{
> +    if (runstate_store((char*)global_state.runstate,
> +                       sizeof(global_state.runstate)) == -1) {
> +        printf("Runstate is too big\n");
> +        exit(-1);
> +    }
> +}
> +
> +char *global_state_get_runstate(void)
> +{
> +    return (char *)global_state.runstate;
> +}
> +
> +static int global_state_post_load(void *opaque, int version_id)
> +{
> +    GlobalState *s = opaque;
> +    int ret = 0;
> +    char *runstate = (char*)s->runstate;
> +
> +    printf("loaded state: %s\n", runstate);
> +
> +    if (strcmp(runstate, "running") != 0) {
> +
> +        RunState r = runstate_index(runstate);
> +
> +        if (r == -1) {
> +            printf("Unknown received state %s\n", runstate);
> +            return -1;
> +        }
> +        ret = vm_stop_force_state(r);
> +    }
> +
> +   return ret;
> +}
> +
> +static void global_state_pre_save(void *opaque)
> +{
> +    GlobalState *s = opaque;
> +
> +    s->size = strlen((char*)s->runstate) + 1;
> +    printf("saved state: %s\n", s->runstate);
> +}
> +
> +static const VMStateDescription vmstate_globalstate = {
> +    .name = "globalstate",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .post_load = global_state_post_load,
> +    .pre_save = global_state_pre_save,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_INT32(size, GlobalState),
> +        VMSTATE_BUFFER(runstate, GlobalState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
> +void register_global_state(void)
> +{
> +    /* We would use it independently that we receive it */
> +    strcpy((char*)&global_state.runstate, "");
> +    vmstate_register(NULL, 0, &vmstate_globalstate, &global_state);
> +}
> diff --git a/vl.c b/vl.c
> index 1788b6a..75e855e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4511,6 +4511,7 @@ int main(int argc, char **argv, char **envp)
>          return 0;
>      }
> 
> +    register_global_state();
>      if (incoming) {
>          Error *local_err = NULL;
>          qemu_start_incoming_migration(incoming, &local_err);
> -- 
> 2.1.0
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-20 10:24   ` Dr. David Alan Gilbert
@ 2014-10-20 10:52     ` Juan Quintela
  2014-10-20 15:18       ` Eric Blake
  0 siblings, 1 reply; 26+ messages in thread
From: Juan Quintela @ 2014-10-20 10:52 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: kwolf, qemu-devel, laine, lcapitulino

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> This allows us to store the current state to send it through migration.
>
> Why store the runstate as a string?  The later code then ends up doing
> string compares and things - why not just use the enum value?

How do you know that it has the same values both sides?  As far as I can
see, all interaction with the outside is done with strings (i.e. QMP).

But it is easier for me if I can sent the numeric value.

Libvirt folks?
Luiz?

What should I do?

Later, Juan.

>
> Dave
>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  include/sysemu/sysemu.h |  1 +
>>  vl.c                    | 10 ++++++++++
>>  2 files changed, 11 insertions(+)
>> 
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index d8539fd..ae217da 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -28,6 +28,7 @@ bool runstate_check(RunState state);
>>  void runstate_set(RunState new_state);
>>  int runstate_is_running(void);
>>  bool runstate_needs_reset(void);
>> +int runstate_store(char *str, int size);
>>  typedef struct vm_change_state_entry VMChangeStateEntry;
>>  typedef void VMChangeStateHandler(void *opaque, int running, RunState state);
>> 
>> diff --git a/vl.c b/vl.c
>> index 964d634..ce8e28b 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -677,6 +677,16 @@ bool runstate_check(RunState state)
>>      return current_run_state == state;
>>  }
>> 
>> +int runstate_store(char *str, int size)
>> +{
>> +    const char *state = RunState_lookup[current_run_state];
>> +
>> +    if (strlen(state)+1 > size)
>> +        return -1;
>> +    strncpy(str, state, strlen(state)+1);
>> +    return 0;
>> +}
>> +
>>  static void runstate_init(void)
>>  {
>>      const RunStateTransition *p;
>> -- 
>> 2.1.0
>> 
>> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] migration: create now section to store global state
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 5/7] migration: create now section to store global state Juan Quintela
  2014-10-20 10:52   ` Dr. David Alan Gilbert
@ 2014-10-20 11:11   ` Kevin Wolf
  1 sibling, 0 replies; 26+ messages in thread
From: Kevin Wolf @ 2014-10-20 11:11 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, laine, lcapitulino

Am 15.10.2014 um 09:55 hat Juan Quintela geschrieben:
> This includes a new section that for now just stores the current qemu state.
> 
> Right now, there are only one way to control what is the state of the
> target after migration.
> 
> - If you run the target qemu with -S, it would start stopped.
> - If you run the target qemu without -S, it would run just after migration finishes.
> 
> The problem here is what happens if we start the target without -S and
> there happens one error during migration that puts current state as
> -EIO.  Migration would ends (notice that the error happend doing block
> IO, network IO, i.e. nothing related with migration), and when
> migration finish, we would just "continue" running on destination,
> probably hanging the guest/corruption data, whatever.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/migration/migration.h |  4 ++
>  migration.c                   | 88 +++++++++++++++++++++++++++++++++++++++++--
>  vl.c                          |  1 +
>  3 files changed, 90 insertions(+), 3 deletions(-)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 3cb5ba8..bc1069b 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -174,4 +174,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
>                               ram_addr_t offset, size_t size,
>                               int *bytes_sent);
> 
> +void register_global_state(void);
> +void global_state_store(void);
> +char *global_state_get_runstate(void);
> +
>  #endif
> diff --git a/migration.c b/migration.c
> index 8d675b3..6f7e50e 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -112,10 +112,20 @@ static void process_incoming_migration_co(void *opaque)
>          exit(EXIT_FAILURE);
>      }
> 
> -    if (autostart) {
> +    /* runstate == "" means that we haven't received it through the
> +     * wire, so we obey autostart.  runstate == runing means that we
> +     * need to run it, we need to make sure that we do it after
> +     * everything else has finished.  Every other state change is done
> +     * at the post_load function */
> +
> +    if (strcmp(global_state_get_runstate(), "running") == 0 ) {
>          vm_start();

Does this mean that -S is now ignored in the common case? Wouldn't it be
better to change only the case without -S? Otherwise I guess libvirt
will get quite confused.

> -    } else {
> -        runstate_set(RUN_STATE_PAUSED);
> +    } else if (strcmp(global_state_get_runstate(), "") == 0 ) {
> +        if (autostart) {
> +            vm_start();
> +        } else {
> +            runstate_set(RUN_STATE_PAUSED);
> +        }
>      }
>  }
> 
> @@ -608,6 +618,7 @@ static void *migration_thread(void *opaque)
>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>                  old_vm_running = runstate_is_running();
> 
> +                global_state_store();
>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>                  if (ret >= 0) {
>                      qemu_file_set_rate_limit(s->file, INT64_MAX);
> @@ -699,3 +710,74 @@ void migrate_fd_connect(MigrationState *s)
>      qemu_thread_create(&s->thread, "migration", migration_thread, s,
>                         QEMU_THREAD_JOINABLE);
>  }
> +
> +typedef struct {
> +    int32_t size;
> +    uint8_t runstate[100];
> +} GlobalState;
> +
> +static GlobalState global_state;
> +
> +void global_state_store(void)
> +{
> +    if (runstate_store((char*)global_state.runstate,
> +                       sizeof(global_state.runstate)) == -1) {
> +        printf("Runstate is too big\n");
> +        exit(-1);
> +    }
> +}

Not sure if the concept of a single GlobalStore that calls all the
individual handlers for each piece of global state is optimal.

Can't we use something like vmstate_register()? Perhaps even the same
function, just with dev == NULL? (Actually, you even do this below, to
register the global state. So I guess I'm only disagreeing on the
granularity of having only a single section with a single handler
function for the whole global state.)

> +char *global_state_get_runstate(void)
> +{
> +    return (char *)global_state.runstate;
> +}
> +
> +static int global_state_post_load(void *opaque, int version_id)
> +{
> +    GlobalState *s = opaque;
> +    int ret = 0;
> +    char *runstate = (char*)s->runstate;
> +
> +    printf("loaded state: %s\n", runstate);
> +
> +    if (strcmp(runstate, "running") != 0) {
> +
> +        RunState r = runstate_index(runstate);
> +
> +        if (r == -1) {
> +            printf("Unknown received state %s\n", runstate);
> +            return -1;
> +        }
> +        ret = vm_stop_force_state(r);
> +    }
> +
> +   return ret;
> +}
> +
> +static void global_state_pre_save(void *opaque)
> +{
> +    GlobalState *s = opaque;
> +
> +    s->size = strlen((char*)s->runstate) + 1;
> +    printf("saved state: %s\n", s->runstate);
> +}
> +
> +static const VMStateDescription vmstate_globalstate = {
> +    .name = "globalstate",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .post_load = global_state_post_load,
> +    .pre_save = global_state_pre_save,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_INT32(size, GlobalState),
> +        VMSTATE_BUFFER(runstate, GlobalState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};

Could this be optional and suppressed for "running"? Then we should be
able to keep compatibility with older versions in the common case.

> +
> +void register_global_state(void)
> +{
> +    /* We would use it independently that we receive it */
> +    strcpy((char*)&global_state.runstate, "");
> +    vmstate_register(NULL, 0, &vmstate_globalstate, &global_state);
> +}
> diff --git a/vl.c b/vl.c
> index 1788b6a..75e855e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4511,6 +4511,7 @@ int main(int argc, char **argv, char **envp)
>          return 0;
>      }
> 
> +    register_global_state();
>      if (incoming) {
>          Error *local_err = NULL;
>          qemu_start_incoming_migration(incoming, &local_err);
> -- 
> 2.1.0

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
                   ` (7 preceding siblings ...)
  2014-10-15 14:57 ` [Qemu-devel] [RFC 0/7] Optional toplevel sections Eric Blake
@ 2014-10-20 11:29 ` Kevin Wolf
  8 siblings, 0 replies; 26+ messages in thread
From: Kevin Wolf @ 2014-10-20 11:29 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, laine, lcapitulino

Am 15.10.2014 um 09:55 hat Juan Quintela geschrieben:
> Hi
> 
> by popular demand, and after too many time, this series.  This is an
> RFC to know what people think about how to use them, the interface
> proposed, whatever.
> [...]
> 
> Kevin: You asked for optional sections in the past for the block
>    layer, would this proposal be enough for you?

I know I've asked in more than one occasion, and of course I don't
remember all the details any more. Anyway, I remember two cases offhand:

* qcow2 with patches like Delayed COW keeps internal block layer state
  in memory that might need to be migrated. This series looks fine for
  this case in principle, we'd just need to find a way to distinguish
  the affected BlockDriverStates. We can probably take a node-name if it
  exists (with Jeff's auto-naming patches not a problem, because then it
  would always exist)

  How do devices solve this? Do they use something like a qdev path to
  identify to which device a given section belongs?

* When a VM is stopped after an I/O error, we need to migrate the
  information about pending requests (bdrv_drain_all doesn't complete
  the failed requests). Currently we do this in device code, but it
  would be very nice to make this common block layer functionality.

  The problem here is that bdrv_aio_readv/writev get an opaque pointer
  back to the device, which of course becomes meaningless during
  migration.

  So this one is tricky even if we have optional top-level sections.

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections
  2014-10-17 13:41     ` Paolo Bonzini
@ 2014-10-20 14:59       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2014-10-20 14:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, quintela, qemu-devel, lcapitulino, laine

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> Il 15/10/2014 17:59, Juan Quintela ha scritto:
> > My idea here is that, if you don't use libvirt, you just start without
> > -S.
> 
> If you don't use libvirt or any other QEMU management layer, you're not
> going to do migration except for debugging purposes.  There's just too
> much state going on to be able to do it reliably.

I'm not sure that's entirely true - while I agree that most users will
use libvirt, migration with shared disk is pretty easy; the only thing
that you need to do is bring up the tap on the destination, and I'm
not sure libvirt gets the timing ideal for it.

Dave


> 
> > If you use libvirt, and you *don't* need to do anything special to run
> > after migration, you shouldn't use -S.
> 
> Is this a real requirement, or just "it sounds nicer that way"?  How
> much time really passes between the end of migration and the issuing of
> the "-cont" command?
> 
> And the $1,000,000 questionL.aAre you _absolutely_ sure that an
> automatic restart is entirely robust against a failure of the connection
> between the two libvirtd instances?  Could you end up with the VM
> running on two hosts?  Using -S gets QEMU completely out of the
> equation, which is nice.
> 
> By the way, some of the states (I can think of io-error, guest-panicked,
> watchdog) can be detected on the destination and restored.  Migrating a
> machine with io-error state is definitely something that you want to do
> no matter what versions of QEMU you have.  It may be the only way to
> recover for a network partition like this:
> 
>            DISK
>           /    \
>          |      \
>          X       |
>          |       |
>         SRC --- DEST
> 
> (not impossible: e.g. the SRC->DISK is fibre channel, but the SRC->DEST
> link is Ethernet.  Or you have a replicated disk setup, some daemon
> fails in SRC's replica but not DEST's).
> 
> > And I would emit an event saying
> > "migration was finished".
> 
> The event should be emitted nevertheless. :)
> 
> Paolo
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-20 10:52     ` Juan Quintela
@ 2014-10-20 15:18       ` Eric Blake
  2014-10-22 11:18         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2014-10-20 15:18 UTC (permalink / raw)
  To: quintela, Dr. David Alan Gilbert; +Cc: kwolf, qemu-devel, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 1543 bytes --]

On 10/20/2014 04:52 AM, Juan Quintela wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>> * Juan Quintela (quintela@redhat.com) wrote:
>>> This allows us to store the current state to send it through migration.
>>
>> Why store the runstate as a string?  The later code then ends up doing
>> string compares and things - why not just use the enum value?
> 
> How do you know that it has the same values both sides?  As far as I can
> see, all interaction with the outside is done with strings (i.e. QMP).

If it's part of the migration stream, then it is not something visible
in QMP, and it is your own fault if you ever change the enum values in
such a way that the migration stream is incompatible between versions.
I think using an enum in the migration stream is just fine, and more
efficient.

> 
> But it is easier for me if I can sent the numeric value.
> 
> Libvirt folks?

As far as I can tell, libvirt is unimpacted by HOW it is represented in
the migration stream, only that the destination is able to inform
libvirt what state was received as part of migration, with libvirt
having an easy way to then get back into that state (of course, libvirt
should also still have the option to choose a different state than what
just got migrated, as in the case where the user pauses the source in
order to avoid convergence problems but wants the destination to start
running again).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-20 15:18       ` Eric Blake
@ 2014-10-22 11:18         ` Dr. David Alan Gilbert
  2014-10-22 11:40           ` Markus Armbruster
  0 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2014-10-22 11:18 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, lcapitulino, qemu-devel, laine, quintela

* Eric Blake (eblake@redhat.com) wrote:
> On 10/20/2014 04:52 AM, Juan Quintela wrote:
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >> * Juan Quintela (quintela@redhat.com) wrote:
> >>> This allows us to store the current state to send it through migration.
> >>
> >> Why store the runstate as a string?  The later code then ends up doing
> >> string compares and things - why not just use the enum value?
> > 
> > How do you know that it has the same values both sides?  As far as I can
> > see, all interaction with the outside is done with strings (i.e. QMP).
> 
> If it's part of the migration stream, then it is not something visible
> in QMP, and it is your own fault if you ever change the enum values in
> such a way that the migration stream is incompatible between versions.
> I think using an enum in the migration stream is just fine, and more
> efficient.

I think the question here really comes from RunState being an enum defined
in qapi-schema.json; so we could use that directly in the migration stream
if we were guaranteed that the encoding of that enum wasn't going to change.
Does qapi make any guarantees about the enum encoding?

Dave

> 
> > 
> > But it is easier for me if I can sent the numeric value.
> > 
> > Libvirt folks?
> 
> As far as I can tell, libvirt is unimpacted by HOW it is represented in
> the migration stream, only that the destination is able to inform
> libvirt what state was received as part of migration, with libvirt
> having an easy way to then get back into that state (of course, libvirt
> should also still have the option to choose a different state than what
> just got migrated, as in the case where the user pauses the source in
> order to avoid convergence problems but wants the destination to start
> running again).
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-22 11:18         ` Dr. David Alan Gilbert
@ 2014-10-22 11:40           ` Markus Armbruster
  2014-10-22 15:52             ` Eric Blake
  0 siblings, 1 reply; 26+ messages in thread
From: Markus Armbruster @ 2014-10-22 11:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: kwolf, quintela, qemu-devel, lcapitulino, laine

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Eric Blake (eblake@redhat.com) wrote:
>> On 10/20/2014 04:52 AM, Juan Quintela wrote:
>> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>> >> * Juan Quintela (quintela@redhat.com) wrote:
>> >>> This allows us to store the current state to send it through migration.
>> >>
>> >> Why store the runstate as a string?  The later code then ends up doing
>> >> string compares and things - why not just use the enum value?
>> > 
>> > How do you know that it has the same values both sides?  As far as I can
>> > see, all interaction with the outside is done with strings (i.e. QMP).
>> 
>> If it's part of the migration stream, then it is not something visible
>> in QMP, and it is your own fault if you ever change the enum values in
>> such a way that the migration stream is incompatible between versions.
>> I think using an enum in the migration stream is just fine, and more
>> efficient.
>
> I think the question here really comes from RunState being an enum defined
> in qapi-schema.json; so we could use that directly in the migration stream
> if we were guaranteed that the encoding of that enum wasn't going to change.
> Does qapi make any guarantees about the enum encoding?

qapi-code-gen.txt in master is silent on the matter:

    === Enumeration types ===

    An enumeration type is a dictionary containing a single key whose value is a
    list of strings.  An example enumeration is:

     { 'enum': 'MyEnum', 'data': [ 'value1', 'value2', 'value3' ] }

Eric's "drop qapi nested structs" series improves the spec a lot.  With
it applied, this reads:

    === Enumeration types ===

    Usage: { 'enum': 'str', 'data': [ 'str' ] }

    An enumeration type is a dictionary containing a single 'data' key
    whose value is a list of strings.  An example enumeration is:

     { 'enum': 'MyEnum', 'data': [ 'value1', 'value2', 'value3' ] }

    Nothing prevents an empty enumeration, although it is probably not
    useful.  The list of strings should be lower case; if an enum name
    represents multiple words, use '-' between words.  The string 'max' is
    not allowed as an enum value, and values should not be repeated.

    The enumeration values are passed as strings over the QMP protocol,
    but are encoded as C enum integral values in generated code.  While
    the C code starts numbering at 0, it is better to use explicit
    comparisons to enum values than implicit comparisons to 0; the C code
    will also include a generated enum member ending in _MAX for tracking
    the size of the enum, useful when using common functions for
    converting between strings and enum values.  Since the wire format
    always passes by name, it is acceptable to reorder or add new
    enumeration members in any location without breaking QMP clients;
    however, removing enum values would break compatibility.  For any
    complex type that has a field that will only contain a finite set of
    string values, using an enum type for that field is better than
    open-coding the field to be type 'str'.

I figure relying on the QAPI code generator assigning values 0, 1, 2 in
order is fair.  If you want to guarantee that more explicitly in the
spec, patch welcome (on top of Eric's, please).

A test or build-time assertion checking the values don't change would be
prudent.  A comment next to the enum definition warning against
incompatible changes wouldn't hurt.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store
  2014-10-22 11:40           ` Markus Armbruster
@ 2014-10-22 15:52             ` Eric Blake
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Blake @ 2014-10-22 15:52 UTC (permalink / raw)
  To: Markus Armbruster, Dr. David Alan Gilbert
  Cc: kwolf, quintela, qemu-devel, laine, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 2961 bytes --]

On 10/22/2014 05:40 AM, Markus Armbruster wrote:
>> I think the question here really comes from RunState being an enum defined
>> in qapi-schema.json; so we could use that directly in the migration stream
>> if we were guaranteed that the encoding of that enum wasn't going to change.
>> Does qapi make any guarantees about the enum encoding?
> 
> qapi-code-gen.txt in master is silent on the matter:
> 
>     === Enumeration types ===
> 
>     An enumeration type is a dictionary containing a single key whose value is a
>     list of strings.  An example enumeration is:
> 
>      { 'enum': 'MyEnum', 'data': [ 'value1', 'value2', 'value3' ] }
> 
> Eric's "drop qapi nested structs" series improves the spec a lot.

And I still need to find time to supply the next revision of that series...

>     The enumeration values are passed as strings over the QMP protocol,
>     but are encoded as C enum integral values in generated code.  While
>     the C code starts numbering at 0, it is better to use explicit
>     comparisons to enum values than implicit comparisons to 0; the C code
>     will also include a generated enum member ending in _MAX for tracking
>     the size of the enum, useful when using common functions for
>     converting between strings and enum values.  Since the wire format
>     always passes by name, it is acceptable to reorder or add new
>     enumeration members in any location without breaking QMP clients;
>     however, removing enum values would break compatibility.  For any
>     complex type that has a field that will only contain a finite set of
>     string values, using an enum type for that field is better than
>     open-coding the field to be type 'str'.
> 
> I figure relying on the QAPI code generator assigning values 0, 1, 2 in
> order is fair.  If you want to guarantee that more explicitly in the
> spec, patch welcome (on top of Eric's, please).

Yes, the generator guarantees that the order that enums are listed in a
.json file will be the 0, 1, 2, ... values assigned in the C code.  I'll
add that in my revision.  You cannot rely on the C values being stable
unless the .json file takes care to not reorder names for the enum
members; if we have enums where the C code is going to rely on a
particular ordering, we probably ought to document that in the .json
file (since MOST enums are designed for the C code to use them solely by
symbolic name and not by specific value).

> 
> A test or build-time assertion checking the values don't change would be
> prudent.  A comment next to the enum definition warning against
> incompatible changes wouldn't hurt.

Indeed, if you are going to rely on something that the generator happens
to give but does not guarantee, then additional build-time checking and
documentation is called for.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now
  2014-10-15  7:55 ` [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now Juan Quintela
  2014-10-15 14:42   ` Eric Blake
@ 2014-11-03 12:46   ` Amit Shah
  1 sibling, 0 replies; 26+ messages in thread
From: Amit Shah @ 2014-11-03 12:46 UTC (permalink / raw)
  To: Juan Quintela; +Cc: kwolf, qemu-devel, laine, lcapitulino

On (Wed) 15 Oct 2014 [09:55:07], Juan Quintela wrote:
> Next commit would allow to move from incoming migration to error happening on source.
> 
> Should we add more states to this transition?  Luiz?

Yes; e.g. migrating in suspend-to-ram state.

		Amit

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-11-03 12:46 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-15  7:55 [Qemu-devel] [RFC 0/7] Optional toplevel sections Juan Quintela
2014-10-15  7:55 ` [Qemu-devel] [PATCH 1/7] migration: Create optional sections Juan Quintela
2014-10-15  7:55 ` [Qemu-devel] [PATCH 2/7] runstate: Add runstate store Juan Quintela
2014-10-20 10:24   ` Dr. David Alan Gilbert
2014-10-20 10:52     ` Juan Quintela
2014-10-20 15:18       ` Eric Blake
2014-10-22 11:18         ` Dr. David Alan Gilbert
2014-10-22 11:40           ` Markus Armbruster
2014-10-22 15:52             ` Eric Blake
2014-10-15  7:55 ` [Qemu-devel] [PATCH 3/7] runstate: create runstate_index function Juan Quintela
2014-10-15  7:55 ` [Qemu-devel] [PATCH 4/7] runstate: migration allows more transitions now Juan Quintela
2014-10-15 14:42   ` Eric Blake
2014-10-15 15:50     ` Juan Quintela
2014-11-03 12:46   ` Amit Shah
2014-10-15  7:55 ` [Qemu-devel] [PATCH 5/7] migration: create now section to store global state Juan Quintela
2014-10-20 10:52   ` Dr. David Alan Gilbert
2014-10-20 11:11   ` Kevin Wolf
2014-10-15  7:55 ` [Qemu-devel] [PATCH 6/7] global_state: Make section optional Juan Quintela
2014-10-15  7:55 ` [Qemu-devel] [PATCH 7/7] vmstate: Create optional sections Juan Quintela
2014-10-15 14:45   ` Eric Blake
2014-10-15 14:57 ` [Qemu-devel] [RFC 0/7] Optional toplevel sections Eric Blake
2014-10-15 15:59   ` Juan Quintela
2014-10-15 16:37     ` Eric Blake
2014-10-17 13:41     ` Paolo Bonzini
2014-10-20 14:59       ` Dr. David Alan Gilbert
2014-10-20 11:29 ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).