From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50435) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1atSsT-0003n6-Ps for qemu-devel@nongnu.org; Fri, 22 Apr 2016 00:39:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1atSsQ-0007IQ-Gq for qemu-devel@nongnu.org; Fri, 22 Apr 2016 00:39:37 -0400 Date: Fri, 22 Apr 2016 14:28:45 +1000 From: David Gibson Message-ID: <20160422042845.GE15176@voom.fritz.box> References: <1460752385-13259-1-git-send-email-duanj@linux.vnet.ibm.com> <1460752385-13259-5-git-send-email-duanj@linux.vnet.ibm.com> <20160420051433.GG1133@voom> <57190C42.5080605@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="imjhCm/Pyz7Rq5F2" Content-Disposition: inline In-Reply-To: <57190C42.5080605@linux.vnet.ibm.com> Subject: Re: [Qemu-devel] [PATCH 4/5] Migration: migrate ccs_list in spapr state List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jianjun Duan Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org, mdroth@linux.vnet.ibm.com --imjhCm/Pyz7Rq5F2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Apr 21, 2016 at 10:22:10AM -0700, Jianjun Duan wrote: >=20 >=20 > On 04/19/2016 10:14 PM, David Gibson wrote: > >On Fri, Apr 15, 2016 at 01:33:04PM -0700, Jianjun Duan wrote: > >>ccs_list in spapr state maintains the device tree related > >>information on the rtas side for hotplugged devices. In racing > >>situations between hotplug events and migration operation, a rtas > >>hotplug event could be migrated from the source guest to target > >>guest, or the source guest could have not yet finished fetching > >>the device tree when migration is started, the target will try > >>to finish fetching the device tree. By migrating ccs_list, the > >>target can fetch the device tree properly. > >> > >>We tracked the size of ccs_list queue, and introduced a dynamic > >>cache for ccs_list to get around having to create VMSD for the > >>queue. There also existence tests in place for the newly added > >>fields in the spapr state VMSD to make sure forward migration > >>is not broken. > >> > >>Signed-off-by: Jianjun Duan > >So there's problems here at a couple of levels. > > > >1. Even more so that the indicators, this is transitional state, which > >it would be really nice if we can avoid migrating. At the very least > >the new state should go into a subsection with an is_needed function > >which will skip it if we don't have a configure connector in progress. > >That means we'll be able to backwards migrate as long as we're not in > >the middle of a hotplug, which won't be possible with this version. > > > >Again, if there's some way we can defer or retry the operation instead > >of migrating the interim state, that would be even better. > I am not sure why transitional state should not be migrated. Because every extra piece of state to migrate is something which can go wrong. Getting migration working properly is a difficult and fragile process as it is - every extra bit of state we add makes it harder. > I think the > fact that it changes is the reason why it should be migrated. As for > backward migration, I thought about it, but decided to leave it later > to beat the time. We can do it later, or do it this time and delayed the > patches more. I agree that subsection seems to be the solution here. Leaving backwards migration until later - after you've already introduced a new stream version - will make implementing it much, much more difficult. > >2. Having to copy the list of elements out into an array just for > >migration is pretty horrible. I'm almost include to suggest we should > >add list migration into the savevm core rather than this. > I thought about a general approach of adding something to savevm to > handle the queue. But the queue used here uses macro and doesn't really > support polymorphism. And we need to use QTAILQ_FOREACH(var, head, fiel= d) > to access it. It is not easy as we may need to modify the macro definition > to carry > type name of the elements of the queue. >=20 > Also I am following Alexey's code here ([PATCH qemu v15 07/17] spapr_iomm= u: > Migrate full state). > It was reviewed by you. Yeah. It's not incorrect, but it's ugly and every extra time I see it, the impetus to find a better way increases. > >>--- > >> hw/ppc/spapr.c | 60 ++++++++++++++++++++++++++++++++++++= ++++++++- > >> hw/ppc/spapr_rtas.c | 2 ++ > >> include/hw/ppc/spapr.h | 11 +++++++++ > >> include/migration/vmstate.h | 8 +++++- > >> 4 files changed, 79 insertions(+), 2 deletions(-) > >> > >>diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >>index af4745c..eab95f0 100644 > >>--- a/hw/ppc/spapr.c > >>+++ b/hw/ppc/spapr.c > >>@@ -1245,10 +1245,31 @@ static bool spapr_vga_init(PCIBus *pci_bus, Err= or **errp) > >> } > >> } > >>+static void spapr_pre_save(void *opaque) > >>+{ > >>+ sPAPRMachineState *spapr =3D (sPAPRMachineState *)opaque; > >>+ sPAPRConfigureConnectorState *ccs; > >>+ sPAPRConfigureConnectorStateCache *ccs_cache; > >>+ > >>+ /* Copy ccs_list to ccs_list_cache */ > >>+ spapr->ccs_list_cache =3D g_new0(sPAPRConfigureConnectorStateCache, > >>+ spapr->ccs_list_num); > >>+ ccs_cache =3D spapr->ccs_list_cache; > >>+ QTAILQ_FOREACH(ccs, &spapr->ccs_list, next) { > >>+ ccs_cache->drc_index =3D ccs->drc_index; > >>+ ccs_cache->fdt_offset =3D ccs->fdt_offset; > >>+ ccs_cache->fdt_depth =3D ccs->fdt_depth; > >>+ ccs_cache++; > >>+ } > >>+} > >>+ > >> static int spapr_post_load(void *opaque, int version_id) > >> { > >> sPAPRMachineState *spapr =3D (sPAPRMachineState *)opaque; > >> int err =3D 0; > >>+ sPAPRConfigureConnectorState *ccs; > >>+ sPAPRConfigureConnectorStateCache *ccs_cache =3D spapr->ccs_list_c= ache; > >>+ int index =3D 0; > >> /* In earlier versions, there was no separate qdev for the PAPR > >> * RTC, so the RTC offset was stored directly in sPAPREnvironment. > >>@@ -1258,6 +1279,19 @@ static int spapr_post_load(void *opaque, int ver= sion_id) > >> err =3D spapr_rtc_import_offset(spapr->rtc, spapr->rtc_offset= ); > >> } > >>+ if (version_id < 4) { > >>+ return err; > >>+ } > >>+ /* Copy ccs_list_cache to ccs_list */ > >>+ for (index =3D 0; index < spapr->ccs_list_num; index ++) { > >>+ ccs =3D g_new0(sPAPRConfigureConnectorState, 1); > >>+ ccs->drc_index =3D (ccs_cache + index)->drc_index; > >>+ ccs->fdt_offset =3D (ccs_cache + index)->fdt_offset; > >>+ ccs->fdt_depth =3D (ccs_cache + index)->fdt_depth; > >>+ QTAILQ_INSERT_TAIL(&spapr->ccs_list, ccs, next); > >>+ } > >>+ g_free(spapr->ccs_list_cache); > >>+ > >> return err; > >> } > >>@@ -1266,10 +1300,28 @@ static bool version_before_3(void *opaque, int = version_id) > >> return version_id < 3; > >> } > >>+static bool version_ge_4(void *opaque, int version_id) > >>+{ > >>+ return version_id >=3D 4; > >>+} > >>+ > >>+static const VMStateDescription vmstate_spapr_ccs_cache =3D { > >>+ .name =3D "spaprconfigureconnectorstate", > >>+ .version_id =3D 1, > >>+ .minimum_version_id =3D 1, > >>+ .fields =3D (VMStateField[]) { > >>+ VMSTATE_UINT32(drc_index, sPAPRConfigureConnectorStateCache), > >>+ VMSTATE_INT32(fdt_offset, sPAPRConfigureConnectorStateCache), > >>+ VMSTATE_INT32(fdt_depth, sPAPRConfigureConnectorStateCache), > >>+ VMSTATE_END_OF_LIST() > >>+ }, > >>+}; > >>+ > >> static const VMStateDescription vmstate_spapr =3D { > >> .name =3D "spapr", > >>- .version_id =3D 3, > >>+ .version_id =3D 4, > >> .minimum_version_id =3D 1, > >>+ .pre_save =3D spapr_pre_save, > >> .post_load =3D spapr_post_load, > >> .fields =3D (VMStateField[]) { > >> /* used to be @next_irq */ > >>@@ -1279,6 +1331,12 @@ static const VMStateDescription vmstate_spapr = =3D { > >> VMSTATE_UINT64_TEST(rtc_offset, sPAPRMachineState, version_be= fore_3), > >> VMSTATE_PPC_TIMEBASE_V(tb, sPAPRMachineState, 2), > >>+ /* RTAS state */ > >>+ VMSTATE_INT32_TEST(ccs_list_num, sPAPRMachineState, version_ge= _4), > >You don't generally need to write your own version test functions for sp= ecific-version. Instead you can just use VMSTATE_INT32_V. > I agree. I realized that after the code was posted here. Ok. > >>+ VMSTATE_STRUCT_VARRAY_ALLOC_TEST(ccs_list_cache, sPAPRMachineS= tate, > >>+ version_ge_4, ccs_list_num, 1, > >>+ vmstate_spapr_ccs_cache, > >>+ sPAPRConfigureConnectorStateC= ache), > >> VMSTATE_END_OF_LIST() > >> }, > >> }; > >>diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c > >>index f073258..9cfd559 100644 > >>--- a/hw/ppc/spapr_rtas.c > >>+++ b/hw/ppc/spapr_rtas.c > >>@@ -70,6 +70,7 @@ static void spapr_ccs_add(sPAPRMachineState *spapr, > >> { > >> g_assert(!spapr_ccs_find(spapr, ccs->drc_index)); > >> QTAILQ_INSERT_HEAD(&spapr->ccs_list, ccs, next); > >>+ spapr->ccs_list_num++; > >> } > >> static void spapr_ccs_remove(sPAPRMachineState *spapr, > >>@@ -77,6 +78,7 @@ static void spapr_ccs_remove(sPAPRMachineState *spapr, > >> { > >> QTAILQ_REMOVE(&spapr->ccs_list, ccs, next); > >> g_free(ccs); > >>+ spapr->ccs_list_num--; > >> } > >> void spapr_ccs_reset_hook(void *opaque) > >>diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > >>index 815d5ee..c8be926 100644 > >>--- a/include/hw/ppc/spapr.h > >>+++ b/include/hw/ppc/spapr.h > >>@@ -11,6 +11,8 @@ struct VIOsPAPRBus; > >> struct sPAPRPHBState; > >> struct sPAPRNVRAM; > >> typedef struct sPAPRConfigureConnectorState sPAPRConfigureConnectorSt= ate; > >>+typedef struct sPAPRConfigureConnectorStateCache > >>+ sPAPRConfigureConnectorStateCache; > >> typedef struct sPAPREventLogEntry sPAPREventLogEntry; > >> #define HPTE64_V_HPTE_DIRTY 0x0000000000000040ULL > >>@@ -75,6 +77,9 @@ struct sPAPRMachineState { > >> /* RTAS state */ > >> QTAILQ_HEAD(, sPAPRConfigureConnectorState) ccs_list; > >>+ /* Temporary cache for migration purposes */ > >>+ int32_t ccs_list_num; > >>+ sPAPRConfigureConnectorStateCache *ccs_list_cache; > >> /*< public >*/ > >> char *kvm_type; > >>@@ -589,6 +594,12 @@ struct sPAPRConfigureConnectorState { > >> QTAILQ_ENTRY(sPAPRConfigureConnectorState) next; > >> }; > >>+struct sPAPRConfigureConnectorStateCache { > >>+ uint32_t drc_index; > >>+ int fdt_offset; > >>+ int fdt_depth; > >>+}; > >>+ > >> void spapr_ccs_reset_hook(void *opaque); > >> #define TYPE_SPAPR_RTC "spapr-rtc" > >>diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h > >>index 1622638..7966979 100644 > >>--- a/include/migration/vmstate.h > >>+++ b/include/migration/vmstate.h > >>@@ -549,9 +549,10 @@ extern const VMStateInfo vmstate_info_bitmap; > >> .offset =3D offsetof(_state, _field), = \ > >> } > >>-#define VMSTATE_STRUCT_VARRAY_ALLOC(_field, _state, _field_num, _versi= on, _vmsd, _type) {\ > >>+#define VMSTATE_STRUCT_VARRAY_ALLOC_TEST(_field, _state, _test, _field= _num, _version, _vmsd, _type) { \ > >> .name =3D (stringify(_field)), = \ > >> .version_id =3D (_version), = \ > >>+ .field_exists =3D (_test), = \ > >> .vmsd =3D &(_vmsd), = \ > >> .num_offset =3D vmstate_offset_value(_state, _field_num, int32_t)= , \ > >> .size =3D sizeof(_type), = \ > >>@@ -677,6 +678,11 @@ extern const VMStateInfo vmstate_info_bitmap; > >> VMSTATE_STRUCT_ARRAY_TEST(_field, _state, _num, NULL, _version, = \ > >> _vmsd, _type) > >>+#define VMSTATE_STRUCT_VARRAY_ALLOC(_field, _state, _field_num, _versi= on, \ > >>+ _vmsd, _type) = \ > >>+ VMSTATE_STRUCT_VARRAY_ALLOC_TEST(_field, _state, NULL, _field_num,= \ > >>+ _version, _vmsd, _type) > >>+ > >> #define VMSTATE_BUFFER_UNSAFE_INFO(_field, _state, _version, _info, _= size) \ > >> VMSTATE_BUFFER_UNSAFE_INFO_TEST(_field, _state, NULL, _version, _= info, \ > >> _size) >=20 --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --imjhCm/Pyz7Rq5F2 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJXGah9AAoJEGw4ysog2bOS57wP/Rinzz42ix2Bjo0Mbxt72wfQ IYt9Z/f6lG7G9ws+v8aju9lZ4ukCEXaZS/Gn67bpIXLZFTeZ2UeUIA0OfzztNnAW lVrB3T3XBZ9F150Ll+iMtEbjcdYIT7FmXxfYSnDqaxJOfPm//ZlCrjSSGW5lwuFH WHPtOgeD1vhaWouNFUUrgpRT2kluuS/JLKWiKpYfjLW7+q10N8coKr1E1xPNu+jp FcVl4RDDiXiiHr7EG5iFh8yEOUW/op8DO2o8yB3LaZ14R1r8Fh6pcVBmJtp7lLDA GJMUc+Fke2HzIhWYlBZYQms7YwvCiJoVVaFCkb38goYxexrasMSdOi1aOQ1BS+uj xoV9Bbdpc7I/aWoCiNKwMWqu3Y10uXF0naLBmwl2IQZf2+SJLXUxSVu3a1omYEs/ vosLLkFNvQZKpI3Ya8ZpSorvmYspsLSQPS9YOJbJgU5WgnaGv7Ue/2G0hCkUVqse lyTezQT/oxg3o1xSgi5bBsSU9I3UEbJ+Trd88JTMWOhcMShZ5liugzFvIJJMDkst NmYJ5plU3XeWLHEgqrQKUnHwJ9uHuowwv2z/ecoYdFmrbnEwzzl2sOBcrAptsvGR LYPQNjdGg/OzJuEB9thPH+DItLYPfs2+ruDlD+G+xuj0duhC8lUlciBDgJf11JH5 YjZR5namhwfTvmoEcTGq =vIuo -----END PGP SIGNATURE----- --imjhCm/Pyz7Rq5F2--